Assign ops papercut with custom view types

bluss · May 23, 2020, 9:18am

The assign ops (Like +=, *= and so on) require the receiver to be a place and this leads to awkwardness with composite types like matrices and views thereof.

In ndarray we can use a method like .column_mut(0) to get a mutable view of the first column. Basically this is the multidimensional array version of making a mutable slice. Arrays and array views support arithmetic operations, so we'd like to be able to use the += operator.

What we want to write (this doesn't compile!)

values.column_mut(0) += 1.

What we have to write:

*&mut values.column_mut(0) += 1.;

(See this playground link - full example)

Can we do something about the rules for assignment operators to solve this papercut?

Users are unlikely to even find the one-liner solution to this, and the other one with a separate variable is also a papercut-type problem.

On the wishlist would be that

A. we can define += so that it can be used on temporary views like this.
B. Further, even on temporary views that are not even mutable. We want to fully support ArrayView<Cell<f64>, _> in arithmetic operations in the future - we definitely want to support += on shared views of Cell elements.

bluss · May 23, 2020, 9:23am

Assign ops on temporaries are misleading and don't always make sense:

string.len() += 1;
// What? Allowing this is nonsense and doesn't do what you'd think

So a rule would be needed that could allow this for view-type temporaries while still excluding other kinds of receivers, like the return value of String::len.

RustyYato · May 23, 2020, 10:23am

I don't see what the difference is between a view type and a normal temporary. At least, not a machine checkable difference.

Ixrec · May 23, 2020, 10:50am

IIUC this only makes basic conceptual sense if the "view type" returned by the method contains &mut Ts to some Ts inside a fully owned non-view type elsewhere. So cases like string.len() seem fairly easy to exclude or at least lint against on the basis that len() returns a usize, and it makes no sense to try and mutate a temporary unnamed usize since usize doesn't contain any &muts.

bluss · May 23, 2020, 1:17pm

Well, to solve the problem we can find out what changes are necessary to make to Rust, and make them. So if some kind of opt-in is needed, that could be implemented, for example. I'm open to any ideas about what would be a good way to solve the papercut.

bluss · May 23, 2020, 1:20pm

&mut or & to interior mutable data conceptually yes

With that said, ArrayViewMut does not contain any mutable references. It's implemented using raw pointers. (Edit: Ok, my bad, it does contain a PhantomData<&mut T>, so maybe that works!)

RustyYato · May 23, 2020, 3:03pm

This could work, maybe some marker trait or attribute? If we go the attribute route, we should allow assign ops on temporaries, but lint against it unless a type has the marker. If we go the trait route, we could make assign ops work on temporaries whose type implements the marker trait.

I think I lean more towards an attribute, something like #[place_type] (name bikeshed), where any type that has this attribute can use the sugar, temporary assign_op value; to mean the same as *&mut temporary assign_op value or *&temporary assign_op value depending on context.

H2CO3 · May 23, 2020, 4:11pm

I have a problem with the basic premise:

// I wish this could be written:
values.column_mut(0) += 1.;

I find this absolutely confusing. Exactly because it's assigning to a temporary. I don't think this should be allowed at all. To me, allowing this would be a strangeness paper cut.

At least I would expect a deref to be necessary:

*values.column_mut(0) += 1.;

I reckon this alternative could be made work today with sufficiently smart use of DerefMut. For starters, here's an extension method returning a mutably dereffable proxy type that in turn allows you to write a single prefix * only: playground

(To be clear, I'm not suggesting that every x_mut() method of Array should be duplicated! I'm saying that with a similar approach, the views could be made smarter to allow for the appropriate dereffing. Maybe just allow array views to deref to themselves…?)

bluss · May 23, 2020, 5:26pm

I really appreciate to get an example of a workaround, and that will be remembered as an alternative.

It seems like a clever hack, but it does not seem right that DerefMut should be implemented, especially not an impl that just derefs back to Self; this is not a reference type or a smart pointer, but a view; Self-deref is a weird case so I'd be afraid Rust might as well deprecate it in the future.

I don't think that values.column_mut(0) += 1.; is unreasonable when this already works:

let mut col = values.column_mut(0);
col += 1.;

Dereferencing with * is used to change the logical type (for example x: &mut T to *x: T) and we would break that logic with the workaround.

CAD97 · May 23, 2020, 5:50pm

Hey @myrrlyn, do you want to come advertise how you've done this for bitvec?

To be frank, I think ndarray shouldn't continue using ArrayBase the way it is, where you have Array = ArrayBase<Owned<_>, _>, ArrayView = ArrayBase<View<_>, _>, ArrayViewMut = ArrayBase<ViewMut<_>, _>, and so on, if it wants ideal rust-integrated sugar/semantics.

Instead, Array should use normal ownership rules:

type Array<A, D> = Box<RawArray<A, D>>
type ArcArray<A, D> = Arc<RawArray<A, D>>
type ArrayView<'a, A, D> = &'a RawArray<A, D>
type ArrayViewMut<'a, A, D> = &'a mut RawArray<A, D>
type CowArray<'a, A, D> = Cow<'a, RawArray<A, D>>

"How are you going to do striding views, then?" you are probably asking. The answer is by (ab)using fat pointers (link is to a quick/dirty externalized example for column_mut). You have something like

#[repr(C)] // transparent with `[()]` as the ZST
struct RawArraySlice<A, D> {
    base: RawArray<A, D>,
    unsize: [()],
}

which then means &'a RawArraySlice<A, D> is a fat reference to RawArray<A, D> where you can stuff whatever usize-sized payload into the fat pointer you want without causing any problems.

Yes, I understand there are further restrictions on ndarray that make this technique not a simple slot in (such as the fact that OwnedRepr is basically Vec<_>/Box<[_]>, so there's already an unsized part), and would certainly involve some duplication of definitions from ArrayBase. But if it can be made to work, I definitely think it will on average lead to much better ergonomics than having the ownership information internal to the ArrayBase type.

HeroicKatora · May 23, 2020, 6:00pm

If we had also user-defined pointer metadata/custom DSTs, I concur. The method would then return a specialized reference kind that points to a wrapper around [T] and implements the operator. That is indeed what might most cleanly capture the intended semantics once and if stabilized. However, given that column_mut does not return a pointer type, this is the way the operator can be overwritten, though, and with the regular alternative in two statements given by

H2CO3 · May 23, 2020, 6:15pm

Views are conceptually reference types though, albeit obviously custom view types are not the same as e.g. Rust's reference-to-slice type, &[T].

How would you deprecate it? Deref<Target=Self> is a perfectly valid although most often not very useful implementation. Actively prohibiting it would require yet another special case in the language – please, let's not go there, regardless of its usefulness in this case or others.

Either way, I don't appreciate why implementing Deref for something that is a view into an owned object wouldn't make sense.

There's a big difference between col and values.column_mut(0);: the former is a place expression ("lvalue") while the latter is a temporary ("rvalue"). What you are saying is akin to arguing that 0 += 1; should compile because let mut x = 0; x += 1; works.

H2CO3 · May 23, 2020, 6:16pm

I'm sorry, what do you mean by pointer tags?

bluss · May 23, 2020, 6:45pm

The information a one-dimensional array view stores is one pointer, one size and one stride for 3 usizes worth of data in total; this doesn't fit into a pointer and an usize payload.

(For a two-dimensional array view, that is one pointer and two sizes and two strides for 5 usizes worth of data in total, etc..)

Furthermore dynamic dimension array views store one pointer, a (small)vector of sizes and a (small)vector of strides; this is not Copy so is not generally compatible with any custom DST scheme.

It's a really interesting topic anyhow; I've made PoC's like that that store the 2d view information as 2 sizes and 2 strides in a u64 (usize), and while that works, it does not support big enough arrays and is not something that I'd ship - unclear to me if it's even valid in Rust.

josh · May 23, 2020, 6:57pm

Assignment operators allow indexing on the left-hand side. Could you create a columns() method returning a helper type that allows this to work?

values.columns()[0] += 1.0;

If this works, then you could also support more complex indexing as well:

values.columns()[0..=1] += 1.0;

HeroicKatora · May 23, 2020, 7:10pm

With tag I meant the extra dynamic pointer metadata added by custom DSTs as proposed in this RFC.

bluss · May 23, 2020, 10:07pm

It's possible that this would work for columns! (We do get into the DST hacks again with this - and there I'd be interested to look at sources that state what it's valid to do with fat pointer metadata)

The column views in the original post are my go-to example, but in general for ndarray we would like this to apply to all array view access. So my general problem extends to more than columns - anything that returns a view, also if it would be a two-dimensional view, like this:

// Increment the values in the upper 2 x 2 corner of the matrix
values.slice_mut(s![..2, ..2]) += 1.;

As you can see we have our own slicing syntax to support multidimensional slicing.

bluss · May 24, 2020, 9:04pm

I'm trying out suggestions and deref-to-self is unfortunately unworkable and leads to this (very many times over):

error[E0055]: reached the recursion limit while auto-dereferencing `ArrayBase<ViewRepr<&A>, D>`
   --> src\arrayformat.rs:127:37
    |
127 |     format_array_inner(array.view().into_dyn(), f, format, fmt_opt, 0, array.ndim())
    |                                     ^^^^^^^^ deref recursion limit reached
    |

H2CO3 · May 24, 2020, 9:18pm

Yeah, that's exactly one of the things where it can go wrong. Still not sure why this is relevant to its deprecation though?

lordan · May 25, 2020, 12:44pm

This looks to me like another case where method-style macros would be nice:

values.slice_mut![..2, ..2] += 1.;

Topic		Replies	Views
Thought: OpAssign operators on non-lvalues language design	3	933	March 25, 2019
Nascent Idea: allow `.0`, `.1`, etc on arrays language design	32	2430	May 4, 2019
View types based on pattern matching language design	1	1041	September 21, 2022
Pre-RFC: Assign `Copy` types to `Cell` without `set()`	14	1575	May 12, 2022
More musings about slices and arrays language design	13	1429	January 3, 2021

Assign ops papercut with custom view types

Related topics