Assign ops papercut with custom view types

The assign ops (Like +=, *= and so on) require the receiver to be a place and this leads to awkwardness with composite types like matrices and views thereof.

In ndarray we can use a method like .column_mut(0) to get a mutable view of the first column. Basically this is the multidimensional array version of making a mutable slice. Arrays and array views support arithmetic operations, so we'd like to be able to use the += operator.

What we want to write (this doesn't compile!)

values.column_mut(0) += 1.

What we have to write:

*&mut values.column_mut(0) += 1.;

(See this playground link - full example)

Can we do something about the rules for assignment operators to solve this papercut?

Users are unlikely to even find the one-liner solution to this, and the other one with a separate variable is also a papercut-type problem.

On the wishlist would be that

A. we can define += so that it can be used on temporary views like this.
B. Further, even on temporary views that are not even mutable. We want to fully support ArrayView<Cell<f64>, _> in arithmetic operations in the future - we definitely want to support += on shared views of Cell elements.

Assign ops on temporaries are misleading and don't always make sense:

string.len() += 1;
// What? Allowing this is nonsense and doesn't do what you'd think

So a rule would be needed that could allow this for view-type temporaries while still excluding other kinds of receivers, like the return value of String::len.

1 Like

I don't see what the difference is between a view type and a normal temporary. At least, not a machine checkable difference.

1 Like

IIUC this only makes basic conceptual sense if the "view type" returned by the method contains &mut Ts to some Ts inside a fully owned non-view type elsewhere. So cases like string.len() seem fairly easy to exclude or at least lint against on the basis that len() returns a usize, and it makes no sense to try and mutate a temporary unnamed usize since usize doesn't contain any &muts.

Well, to solve the problem we can find out what changes are necessary to make to Rust, and make them. So if some kind of opt-in is needed, that could be implemented, for example. I'm open to any ideas about what would be a good way to solve the papercut.

1 Like

&mut or & to interior mutable data conceptually yes :slight_smile:

With that said, ArrayViewMut does not contain any mutable references. It's implemented using raw pointers. (Edit: Ok, my bad, it does contain a PhantomData<&mut T>, so maybe that works!)

This could work, maybe some marker trait or attribute? If we go the attribute route, we should allow assign ops on temporaries, but lint against it unless a type has the marker. If we go the trait route, we could make assign ops work on temporaries whose type implements the marker trait.

I think I lean more towards an attribute, something like #[place_type] (name bikeshed), where any type that has this attribute can use the sugar, temporary assign_op value; to mean the same as *&mut temporary assign_op value or *&temporary assign_op value depending on context.

I have a problem with the basic premise:

// I wish this could be written:
values.column_mut(0) += 1.;

I find this absolutely confusing. Exactly because it's assigning to a temporary. I don't think this should be allowed at all. To me, allowing this would be a strangeness paper cut.

At least I would expect a deref to be necessary:

*values.column_mut(0) += 1.;

I reckon this alternative could be made work today with sufficiently smart use of DerefMut. For starters, here's an extension method returning a mutably dereffable proxy type that in turn allows you to write a single prefix * only: playground

(To be clear, I'm not suggesting that every x_mut() method of Array should be duplicated! I'm saying that with a similar approach, the views could be made smarter to allow for the appropriate dereffing. Maybe just allow array views to deref to themselves…?)

2 Likes

I really appreciate to get an example of a workaround, and that will be remembered as an alternative.

It seems like a clever hack, but it does not seem right that DerefMut should be implemented, especially not an impl that just derefs back to Self; this is not a reference type or a smart pointer, but a view; Self-deref is a weird case so I'd be afraid Rust might as well deprecate it in the future.

I don't think that values.column_mut(0) += 1.; is unreasonable when this already works:

let mut col = values.column_mut(0);
col += 1.;

Dereferencing with * is used to change the logical type (for example x: &mut T to *x: T) and we would break that logic with the workaround.

1 Like

Hey @myrrlyn, do you want to come advertise how you've done this for bitvec?

To be frank, I think ndarray shouldn't continue using ArrayBase the way it is, where you have Array = ArrayBase<Owned<_>, _>, ArrayView = ArrayBase<View<_>, _>, ArrayViewMut = ArrayBase<ViewMut<_>, _>, and so on, if it wants ideal rust-integrated sugar/semantics.

Instead, Array should use normal ownership rules:

  • type Array<A, D> = Box<RawArray<A, D>>
  • type ArcArray<A, D> = Arc<RawArray<A, D>>
  • type ArrayView<'a, A, D> = &'a RawArray<A, D>
  • type ArrayViewMut<'a, A, D> = &'a mut RawArray<A, D>
  • type CowArray<'a, A, D> = Cow<'a, RawArray<A, D>>

"How are you going to do striding views, then?" you are probably asking. The answer is by (ab)using fat pointers (link is to a quick/dirty externalized example for column_mut). You have something like

#[repr(C)] // transparent with `[()]` as the ZST
struct RawArraySlice<A, D> {
    base: RawArray<A, D>,
    unsize: [()],
}

which then means &'a RawArraySlice<A, D> is a fat reference to RawArray<A, D> where you can stuff whatever usize-sized payload into the fat pointer you want without causing any problems.

Yes, I understand there are further restrictions on ndarray that make this technique not a simple slot in (such as the fact that OwnedRepr is basically Vec<_>/Box<[_]>, so there's already an unsized part), and would certainly involve some duplication of definitions from ArrayBase. But if it can be made to work, I definitely think it will on average lead to much better ergonomics than having the ownership information internal to the ArrayBase type.

1 Like

If we had also user-defined pointer metadata/custom DSTs, I concur. The method would then return a specialized reference kind that points to a wrapper around [T] and implements the operator. That is indeed what might most cleanly capture the intended semantics once and if stabilized. However, given that column_mut does not return a pointer type, this is the way the operator can be overwritten, though, and with the regular alternative in two statements given by

Views are conceptually reference types though, albeit obviously custom view types are not the same as e.g. Rust's reference-to-slice type, &[T].

How would you deprecate it? Deref<Target=Self> is a perfectly valid although most often not very useful implementation. Actively prohibiting it would require yet another special case in the language – please, let's not go there, regardless of its usefulness in this case or others.

Either way, I don't appreciate why implementing Deref for something that is a view into an owned object wouldn't make sense.

There's a big difference between col and values.column_mut(0);: the former is a place expression ("lvalue") while the latter is a temporary ("rvalue"). What you are saying is akin to arguing that 0 += 1; should compile because let mut x = 0; x += 1; works.

I'm sorry, what do you mean by pointer tags?

The information a one-dimensional array view stores is one pointer, one size and one stride for 3 usizes worth of data in total; this doesn't fit into a pointer and an usize payload.

(For a two-dimensional array view, that is one pointer and two sizes and two strides for 5 usizes worth of data in total, etc..)

Furthermore dynamic dimension array views store one pointer, a (small)vector of sizes and a (small)vector of strides; this is not Copy so is not generally compatible with any custom DST scheme.

It's a really interesting topic anyhow; I've made PoC's like that that store the 2d view information as 2 sizes and 2 strides in a u64 (usize), and while that works, it does not support big enough arrays and is not something that I'd ship - unclear to me if it's even valid in Rust.

Assignment operators allow indexing on the left-hand side. Could you create a columns() method returning a helper type that allows this to work?

values.columns()[0] += 1.0;

If this works, then you could also support more complex indexing as well:

values.columns()[0..=1] += 1.0;
3 Likes

With tag I meant the extra dynamic pointer metadata added by custom DSTs as proposed in this RFC.

1 Like

It's possible that this would work for columns! (We do get into the DST hacks again with this - and there I'd be interested to look at sources that state what it's valid to do with fat pointer metadata)

The column views in the original post are my go-to example, but in general for ndarray we would like this to apply to all array view access. So my general problem extends to more than columns - anything that returns a view, also if it would be a two-dimensional view, like this:

// Increment the values in the upper 2 x 2 corner of the matrix
values.slice_mut(s![..2, ..2]) += 1.;

As you can see we have our own slicing syntax to support multidimensional slicing.

1 Like

I'm trying out suggestions and deref-to-self is unfortunately unworkable and leads to this (very many times over):

error[E0055]: reached the recursion limit while auto-dereferencing `ArrayBase<ViewRepr<&A>, D>`
   --> src\arrayformat.rs:127:37
    |
127 |     format_array_inner(array.view().into_dyn(), f, format, fmt_opt, 0, array.ndim())
    |                                     ^^^^^^^^ deref recursion limit reached
    |

Yeah, that's exactly one of the things where it can go wrong. Still not sure why this is relevant to its deprecation though?

This looks to me like another case where method-style macros would be nice:

values.slice_mut![..2, ..2] += 1.;
1 Like