Implicit buffers, overloading the assignment operator


#1

Let’s start with something trivial, to make a point about why overloading the assignment operator could be useful. It is purely about ergonomics, and it may not even be justified, as problems emerge that overloading alone cannot solve.

Suppose we want to write the result of a matrix*vector multiplication into a buffer. One would write it this way:

    let y = &mut [0; 4];
    let x = &[1, 2, 3, 4];
    let A = &matrix!(
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]
    );

    mul_matrix_vector(y, A, x);

But if the multiplication A*x returns some wrapper

    struct MulMatrixVec<'a> { A: &'a Matrix<i32, 4, 4>, x: &'a [i32; 4] }

and one could overload the assignment operator, it would be possible to write:

    y = A*x;

So far, that is only half baked. In general, it would be nice to build an AST on the right-hand side at compile time and then flush it into a buffer via the best combination of BLAS operations, maybe with some BLAS extensions.

Now suppose such a value shall be passed as an argument to a function. One would first have to write it into some buffer. Again, writing things inline is either impossible or inelegant, as expressions would become cluttered with buffers. The general problem emerges of how buffers could be stated implicitly.

Maybe by introducing an extension language:

    la!{y = f(A*x);}

That could be too much for procedural macros, as an AST with type information might be needed.

Otherwise readability suffers a lot:

    mul_matrix_vector(buffer,A,x);
    f(y,buffer);

Abstracting from linear algebra, the general problem is how buffers could be stated implicitly, or how pretty expressions could be written that compile into allocation-free code.

I have the idea that a buffer could be bound as a closure binding, i.e. by partial application. But again, what comes out will probably be half baked, complicated and clumsy.


#2

Maybe there’s a way to do something inspired by C++ expression templates:

Then you can have A * x return some sort of multiplication proxy class that would itself be Into<Vector>, so the whole code would be something like y = (a*x + b).into();, with nothing actually evaluated until the .into(). Rust should make this much nicer, in fact, since the borrow checker can keep you from accidentally keeping things borrowed too long, which is a hazard for C++ expression templates when you try to do things like x = a*x;.


#3

Overloading the assignment operator and adding C++-like constructors have both been discussed in the past.

One of Rust’s fundamental building blocks is that assignment is nothing more than a move (via memcpy or the like). I think giving up this property would be way too much of a price to pay for some specialized use cases that would be somewhat easier to write.

That in itself doesn’t need overloading =. It only needs overloading the operators that appear on the RHS, which are already overloadable today (multiplication, addition and subtraction being the most common operations in linear algebra). You can then perform complex symbolic manipulations by making these operators return AST builders, or partial ASTs, or something along these lines. Then, in the end, you would indeed need one explicit function call, e.g., to put the result in a buffer. (Not necessarily From/Into, as they force a return-by-value, and one usually uses buffers precisely to avoid intermediate allocations or to reuse an existing allocation.)

So you don’t get everything with this approach, but you get almost everything. And that one call at the end shouldn’t be too much of a burden – while it also has the added value that you see where the actual computation happens (and e.g. where the real performance bottleneck might be). (Conversely, overloading = would mean that for every assignment in the code, you would now wonder if it’s a simple memcpy or something which will take half a day to run and launch the nukes.)


#4

I think perhaps this is the bit we should focus on solving… Could we make macros 2.0 or proc macros more like tactics in some way, such that you could get the type information? Needless to say, this is far from a trivial problem: you need to start thinking about the phase ordering of the compiler. Maybe there needs to be an opt-in (to avoid slowing down all the other macros that don’t need it)? But this problem has been solved before, for example in theorem provers or in the programming language Idris.

Agreed. As nice as overloading = might be for self-referential structs or something like that, one of the things that Edward Kmett raises in Type Classes vs. the World is that coherence removes the burden of reasoning globally about things and of living in constant fear of incoherent behavior — i.e. having to think about the provenance of a Set w.r.t. its ordering. I think overloading = is similar in this respect: it weighs down global reasoning, and you need to live in fear whenever you see place = expr;


#5

That would be great. I suspect there are several similar problems that would benefit from access to type information. (I certainly remember running into one while implementing a derive, but I can’t recall anymore what it was…)


#6

Absolutely! I think, for example, that if I could determine whether an enum references itself co-recursively, then I could have #[derive(Arbitrary)] (coming soon in proptest) work for such types, so that you more or less never have to write out Arbitrary impls manually or do the dance of composing Strategys for the type.