Add operator @ for matrix multiplication

Heh. I actually genuinely enjoyed writing in Agda, where you can - and idiomatically do - make custom infix operators out of any Unicode math symbol you want, but I would absolutely not try to insert that as a feature into one's average programming language. :joy:

9 Likes

I do not own, and do not plan on owning, an Optimus Maximus for programming purposes! :rofl:

Easy:

  • tensor product: v ⊗ w
  • cross product: v × w
  • ordinary multiplication, but more fancy: a ⋅ b

Hard to type? Not that bad, IDEs will handle it eventually, and also you’re reading code way more often than writing it anyways, so it doesn’t matter if it’s hard to type. /s


Edit: For exponentiation better keep using a method. (Proper) superscript text is not something Unicode supports (yet…)

9 Likes

Hah, I would totally buy one of those provided I could have it in a split ergonomic form factor, maybe ZSA would be interested in dusting off the concept...

1 Like

I don't know what the current price is (they're sold out, so not listed), but IIRC when they first came out they were about USD $1100. As cool as they are I couldn't justify it. That, and like you implied, since they aren't ergonomic, I'm not interested.

Yeah, I can imagine how many tiny expensive parts that would have needed and how finicky it must have been to assemble.

I don't think I'd actually propose this, but I wouldn't mind having x² work. (Realistically, though, .powi(2) is fine.)

3 Likes

Yes. I think the main issue here is that the "*" operator in programming languages does not necessarily have the same semantics as the mathematical multiplication sign. I've never seen the mathematical dot sign used to describe elementwise multiplication, whereas this appears to be quite common in programming. There is a conflict between matrix/inner and elementwise multiplication, so you need to inspect your types to reason about which one is actually used. I personally feel it is inappropriate to map matrix/inner products to "*", because it is always unclear which one is taken.
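To make the conflict concrete, here is a minimal sketch in today's Rust. The `Mat2` type and the `element_mul` method are hypothetical names made up for illustration, not taken from any existing crate. It shows that `std::ops::Mul` lets a type give `*` only one meaning, so the other product has to be spelled some other way:

```rust
use std::ops::Mul;

/// Hypothetical 2x2 matrix type, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Mat2([[f64; 2]; 2]);

impl Mat2 {
    /// Elementwise (Hadamard) product, spelled out as a named method.
    fn element_mul(self, rhs: Mat2) -> Mat2 {
        let mut out = [[0.0; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                out[i][j] = self.0[i][j] * rhs.0[i][j];
            }
        }
        Mat2(out)
    }
}

/// `*` gets exactly one meaning for `Mat2 * Mat2`; this sketch picks the
/// matrix product, but nothing at the call site tells the reader that.
impl Mul for Mat2 {
    type Output = Mat2;

    fn mul(self, rhs: Mat2) -> Mat2 {
        let mut out = [[0.0; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                for k in 0..2 {
                    out[i][j] += self.0[i][k] * rhs.0[k][j];
                }
            }
        }
        Mat2(out)
    }
}

fn main() {
    let a = Mat2([[1.0, 2.0], [3.0, 4.0]]);
    let b = Mat2([[5.0, 6.0], [7.0, 8.0]]);

    // Same operands, two different "multiplications".
    assert_eq!(a * b, Mat2([[19.0, 22.0], [43.0, 50.0]]));
    assert_eq!(a.element_mul(b), Mat2([[5.0, 12.0], [21.0, 32.0]]));
}
```

A library author could just as well have made `*` the elementwise product, which is exactly why the reader has to check the type's documentation.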

Yes, definitely a point can be made that Rust is not Python. One important argument raised during the introduction of "@" was that Python is often used by scientists with very limited programming experience, and that matrix operations are more prevalent than bitwise operations in Python. Of course, duck typing also plays a big role there. It should also be noted that Rust has no "raise to the power of" operator, while Python does. None of these arguments apply to a non-science-focused systems programming language like Rust, although in theory one could imagine a trait abstracting over all matrix types.

The nice thing about having two different multiplication operators, e.g. "*" and "@", would be that it makes the distinction clear: "*" would simply not be defined.

All in all, I personally like the idea of having a dedicated dot product/matrix multiply operator, and "@" seems to be the most obvious choice. I can certainly see some downsides, but it would definitely make scientific code easier to understand and would also give other use cases one extra operator for relatively little cost. Tensor products do not need an extra operator, as notation like "v @ v.T" also does the job, and calculations with objects of more than two dimensions are best written using some "einsum" macro, because things would get confusing otherwise. The cross product is non-commutative, so a two-parameter function is just as good as an operator.

I think just introducing one extra operator would also be a much better option than custom operators. These would have a huge cost by destroying the context-free nature of the Rust grammar for maybe one or two applications.

// Are trigraphs available??/

4 Likes

No. They are not. :laughing:

1 Like

k#mat_mul. There is no RFC, but it is mentioned on the edition page:

  • k#keyword to allow writing keywords that don't exist yet in the current edition. For example, while async is not a keyword in edition 2015, this prefix would've allowed us to accept k#async in edition 2015 without having to wait for edition 2018 to reserve async as a keyword.

The RFC is for reserving the form ident#ident (along with ident"..." and ident'...').

1 Like

That was https://github.com/rust-lang/rfcs/pull/3098#issuecomment-949047674, which was closed since the critical part for the edition was reserving the lexical space, not defining a specific way to use some of that space.

2 Likes

For typing symbols directly, compose keys are an excellent solution. I don't have € on my keyboard but typing <compose> C = does the job. This can get quirky quickly.

So a better solution at the language level is having common escape sequences, translated by rustfmt/rustc/IDE. I don't have ⊂ on my keyboard, but (in Lean) \subset gets translated just fine.

1 Like

I can't call it "obvious" by any measure.

The very fact that 2D arrays can be interpreted in several different ways, and that a different kind of multiplication makes sense for each of those interpretations, makes "the" multiplicative operation highly ambiguous by default.

I work with linear algebra a lot, and I have several years of experience in Python and the SciPy ecosystem, and I still have no idea which of the two most prominent semantics to expect from a "new and improved" N-dimensional array/tensor/matrix/vector library that provides "the" multiplication operator.

In this light, I find these operator-related proposals quite uncomfortable, and I find some people's obsession with infix symbols equally incomprehensible. It seems to me it should be clear that such levels of ambiguity are bad, and we should be fighting ambiguity rather than purposefully making the situation worse over time.

I also think that the judgement based on purely surface-syntax-level "beauty" (i.e., an infix operator is easier on one's eyes) is just misguided, because it completely fails to consider that understanding semantics is also part of the readability of code, and yeah, I would much rather see foo.matrix_mul(bar) and foo.element_mul(bar) even though they are longer, because they ultimately communicate the intent significantly better.

1 Like

In this context I'd just like to mention that other old suggestion for some syntax that would allow functions to be used in infix form, e.g.:

let x = a `matmul` b;
let y = a `dotmul` b;

That IMHO would be much more readable and avoid ambiguity. Although it would still not answer the question about precedence and associativity.

You can get close to that syntax already with traits and with impl blocks for structs/enums:

let x = a.matmul(b);
let y = a.dotmul(b);

which is also unambiguous. That leads to the question: what would you desugar your example into? If it desugars into my example, then the benefit is small. If it desugars into something else, we'd need to evaluate the cost/benefit tradeoff there.
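For reference, a minimal sketch of the trait route on stable Rust. The `MatOps` trait and the `M2` alias are hypothetical, and reading `dotmul` as the elementwise product is an assumption, since the thread does not define it:

```rust
/// Hypothetical 2x2 integer matrix, used only for illustration.
type M2 = [[i32; 2]; 2];

/// Hypothetical extension trait providing the two named products.
trait MatOps: Sized {
    /// Matrix product (rows of `self` times columns of `rhs`).
    fn matmul(self, rhs: Self) -> Self;
    /// Elementwise product; interpreting `dotmul` this way is an assumption.
    fn dotmul(self, rhs: Self) -> Self;
}

impl MatOps for M2 {
    fn matmul(self, rhs: M2) -> M2 {
        let mut out = [[0; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                for k in 0..2 {
                    out[i][j] += self[i][k] * rhs[k][j];
                }
            }
        }
        out
    }

    fn dotmul(self, rhs: M2) -> M2 {
        let mut out = [[0; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                out[i][j] = self[i][j] * rhs[i][j];
            }
        }
        out
    }
}

fn main() {
    let a: M2 = [[1, 2], [3, 4]];
    let b: M2 = [[0, 1], [1, 0]];

    // Arrays of `i32` are `Copy`, so `a` and `b` stay usable after each call.
    let x = a.matmul(b);
    let y = a.dotmul(b);

    assert_eq!(x, [[2, 1], [4, 3]]);
    assert_eq!(y, [[0, 2], [3, 0]]);
}
```

Because the trait is local, it can be implemented for a foreign type such as an array, so the method syntax is available without wrapping the data in a new struct.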

6 Likes

The issue with this is that you need to memorize those key sequences. That is a very high barrier to entry for new Rustaceans. Consider LaTeX sequences (I think that's where you got \subset from). Here is the Comprehensive LaTeX Symbol List. It's 105 pages long, and even if a quarter of it is just explanatory text, that still leaves more than 75 pages of symbols and escape sequences to select from. I don't want to have to memorize sequences to program, especially sequences/keywords/etc. that I don't use regularly.

My personal rule of thumb is that if at least 10% of the code uses it, or if there is no way to achieve the effect via functions or macros, then it can be assigned a keyword or symbol. Otherwise I'm very hesitant with something like this.

If you absolutely must have @ as a symbol, there is a way of doing it, but it is a seriously ugly hack. Write a macro that is able to parse your code and rewrite it to use mat_mul(). The interface would look something like the following (I am barely literate when it comes to writing macros; if this syntax wouldn't work, someone please correct me).

// Name comes from the Unix 
// [sed](https://www.gnu.org/software/sed/manual/sed.html)
// program
sed!(s, a @ b, a.mat_mul(b), g, // Not real regex! Legibility purposes..
    // All of your code here
)

The output would be exactly what you expect, and (modulo how complex your regex is) end users would be able to mentally parse your code into the right thing.

Please don't do this though; I don't have a regex engine in my head that can translate this into something sensible on the fly, especially if you choose to nest such regular expressions within one another.
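For what it's worth, a much narrower version of this hack can be sketched with a plain `macro_rules!` macro. The `matmul!` macro, the `Mat2` type, and its `mat_mul` method are all hypothetical names for illustration. Note the restriction: `macro_rules!` does not allow an `expr` fragment to be followed by `@`, so each operand must be a single token tree (an identifier, a literal, or a parenthesized expression):

```rust
/// Hypothetical 2x2 integer matrix with a named `mat_mul` method.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Mat2([[i32; 2]; 2]);

impl Mat2 {
    fn mat_mul(self, rhs: Mat2) -> Mat2 {
        let mut out = [[0; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                for k in 0..2 {
                    out[i][j] += self.0[i][k] * rhs.0[k][j];
                }
            }
        }
        Mat2(out)
    }
}

/// Rewrites `lhs @ rhs` into `lhs.mat_mul(rhs)`.
/// Each operand is matched as one token tree, so anything more complex
/// than an identifier has to be wrapped in parentheses.
macro_rules! matmul {
    ($lhs:tt @ $rhs:tt) => {
        $lhs.mat_mul($rhs)
    };
}

fn main() {
    let a = Mat2([[1, 2], [3, 4]]);
    let b = Mat2([[0, 1], [1, 0]]);

    // Expands to `a.mat_mul(b)`.
    let c = matmul!(a @ b);
    assert_eq!(c, Mat2([[2, 1], [4, 3]]));

    // Chaining needs explicit parentheses; there is no precedence or
    // associativity rule for `@` here.
    let d = matmul!((matmul!(a @ b)) @ b);
    assert_eq!(d, a);
}
```

Even this tiny version shows the cost: nesting is noisy, and the reader still has to know what the macro expands to.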

1 Like

As a point of comparison (I agree that Rust should absolutely not do this), Agda's tooling has a command specifically for looking up the shortcut to write a symbol you see in the code you're reading. Modern IDE support would probably display this information in a tooltip when you mouse over a symbol, and likewise display a suggestion-box of likely symbols when you start writing a symbol by typing \. Tooling like this significantly reduces the barrier, but only reduces it – it's still a heavy price to pay. The price is worth it for Agda, which expects its users to be doing heavy mathematics a lot of the time, but wouldn't be worth it for one's average programming language.

2 Likes

Just to clarify: this is not my proposal. It is something that was suggested a long while ago in a similar thread, to pretty much the same counter-argument as yours. I merely wanted to highlight that there are options other than cryptic operator symbols. (another example: this classic paper)

The problems with precedence and associativity are readability & ambiguity problems. That's probably the biggest reason to avoid supporting arbitrary infix operators, and why method call syntax is usually preferable.

6 Likes