Add operator @ for matrix multiplication

pipehappy · January 26, 2022, 8:47am

Python chooses @ for matrix multiplication. How about also giving it to rustaceans?

H2CO3 · January 26, 2022, 8:50am

This has been discussed before, please use the search. The short answer is that it would have more drawbacks than benefits.

scottmcm · January 26, 2022, 8:51am

Can you speak to why you think it's needed? What's wrong with overloading * to work with matrices too? We have static types, so we don't need different operators for different types.
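A minimal sketch of what overloading `*` for matrices could look like, assuming a hypothetical fixed-size `Matrix` type built on const generics (the type and its layout are made up for illustration, not taken from any crate). Because the dimensions are part of the type, a shape mismatch would be a compile-time error rather than something a separate operator has to signal.

```rust
use std::ops::Mul;

/// Hypothetical R×C matrix of f64, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize>(pub [[f64; C]; R]);

/// `*` as matrix multiplication: (R×K) * (K×C) -> (R×C).
impl<const R: usize, const K: usize, const C: usize> Mul<Matrix<K, C>> for Matrix<R, K> {
    type Output = Matrix<R, C>;

    fn mul(self, rhs: Matrix<K, C>) -> Matrix<R, C> {
        let mut out = [[0.0_f64; C]; R];
        for i in 0..R {
            for j in 0..C {
                for k in 0..K {
                    // Row i of the left operand times column j of the right operand.
                    out[i][j] += self.0[i][k] * rhs.0[k][j];
                }
            }
        }
        Matrix(out)
    }
}
```

With this in place, `a * b` for two `Matrix<2, 2>` values is the matrix product, and `Matrix<2, 3> * Matrix<2, 3>` simply does not type-check.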

pipehappy · January 26, 2022, 9:11am

@H2CO3 I see there is the earlier thread "Add a matrix multiplication operator like Python?", which is about mat_mul not being commonly used. Would you please elaborate on the drawbacks?

@scottmcm

> Can you speak to why you think it's needed?

Why not? For Python people switching to Rust, it's a welcome sign.

> What's wrong with overloading * to work with matrices too?

Maybe a half-second of hesitation for people new to a library: is it element-wise or matrix multiplication?

scottmcm · January 26, 2022, 9:14am

This is an argument for adding every popular feature from every language ever, and thus isn't persuasive for anything.

toc · January 26, 2022, 9:18am

Even with normal linear algebra matrices, per-element-pair multiplication is an important operation, and thus one must still pick between "dot multiplication" and "matrix multiplication".

I agree that wanton operator overloading can help create unreadable messes, but esoteric operators (perhaps ones that you cannot see a use for) can be extremely helpful in growing a language and making it useful and convenient for sometimes very specific purposes. I personally can see a use for really quite a lot of operators, and I use decent portions of that list on a regular basis.

steffahn · January 26, 2022, 9:22am

Is that so? I don't remember coming across this operation over a 2 semester linear algebra course.

I'd be interested in what its applications are (for 2-dimensional matrices specifically).

pipehappy · January 26, 2022, 9:22am

Err. I'm confused here. If you can add every popular feature together and they fit well, then it's probably a good one.

jhpratt · January 26, 2022, 9:34am

The burden for new features is generally on those requesting them. Keep in mind that Rust is its own language. Python having an operator is not sufficient reason for Rust to implement it, no matter how "popular" it is there.

Why do you believe this is a common enough operation to warrant its own operator, and why do existing options not work?

CAD97 · January 26, 2022, 10:17am

@pipehappy helpful note: you can highlight text in a post, and then there will be a Quote button pop up which will insert a quote of the text you're replying to (like in @jhpratt's post) to make it clear what part of your post is the quote you're replying to, and what part is your reply. Alternatively, you can start a line with > to insert a block quote without a forum quote.

Also interesting to note is that there isn't a matrix type in std, so adding @ as an operator would be adding an operator without any implementations on std types.

What you want to search for to see the downsides @H2CO3 alludes to is the more general idea of "custom operators." The key arguments against custom operators are that they make it harder to read code when using operators not drilled into our heads since grade school (this is arguable; well used/known operators in a specific domain can make certain things easier to read), but also specifically about binding strength / precedence.

With the standard arithmetic operators std defines, we (mostly, barring horrible gotcha posts that are deliberately ambiguous to gain interactions and go viral) agree on the order of operations between them. For custom operators, you either need to:

have no precedence, and always require bracketing, removing the benefit of using operators;
allow the custom operator impl to define binding strength, and somehow deal with that in the parser; or
predefine a binding strength (probably just straight left-associative) for custom operators and just be wrong for a number of them.

And defining precedence isn't even good enough, because you can still get multiple valid parses if you allow the same operator to be defined over different types with different associativity/binding strength. (Orphan rules may prevent this from being problematic? I'm really not certain.)

Consider as an example, A @ B @ C with A @ B => T, B @ C => U, T @ C => Q, and A @ U => Z. Either (A @ B) @ C or A @ (B @ C) are valid parses, depending on the different associativity and binding strength where all of these operators are defined.
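The ambiguity can be sketched in today's Rust by standing in a method for the operator. Everything here is hypothetical: the `At` trait and its `at` method play the role of a user-definable `@`, and the unit structs mirror the names in the example above. Both groupings type-check, and they produce different types.

```rust
/// Hypothetical stand-in for a user-definable `@` operator.
pub trait At<Rhs> {
    type Output;
    fn at(self, rhs: Rhs) -> Self::Output;
}

#[derive(Debug, PartialEq)] pub struct A;
#[derive(Debug, PartialEq)] pub struct B;
#[derive(Debug, PartialEq)] pub struct C;
#[derive(Debug, PartialEq)] pub struct T;
#[derive(Debug, PartialEq)] pub struct U;
#[derive(Debug, PartialEq)] pub struct Q;
#[derive(Debug, PartialEq)] pub struct Z;

// A @ B => T
impl At<B> for A { type Output = T; fn at(self, _: B) -> T { T } }
// B @ C => U
impl At<C> for B { type Output = U; fn at(self, _: C) -> U { U } }
// T @ C => Q
impl At<C> for T { type Output = Q; fn at(self, _: C) -> Q { Q } }
// A @ U => Z
impl At<U> for A { type Output = Z; fn at(self, _: U) -> Z { Z } }

/// (A @ B) @ C: valid, and yields a Q.
pub fn left_grouping() -> Q {
    let t: T = A.at(B);
    t.at(C)
}

/// A @ (B @ C): equally valid, but yields a Z.
pub fn right_grouping() -> Z {
    let u: U = B.at(C);
    A.at(u)
}
```

With method calls the grouping is explicit in the code. With an infix `@`, the parser would have to pick one grouping before any of these types are known.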

So, we pull back from "full custom operators" to just some small defined set with defined precedence and associativity... but... whose? That's the other problem with operators beyond the basic ones: there's less than no practical agreement on what their associativity and relative binding strength is (and far from agreement on usage).

Adding one operator on its own can seem harmless enough, but as soon as you ask "why this operator and not others" then complications come from every direction. And you can't pick a reasonable subset of operators (where e.g. for identifiers, Unicode defines what an identifier should be defined as),^[1] beyond the simple mathematic operators we already have, without playing favorites between equally reasonable options.

So the burden falls on proving that adding something new to the language is worth the extra cost of teaching it, and with the downsides of not having a clearly best design, it's very hard to pass.

[1] Actually, the same UAX#31 defines PATTERN_SYNTAX as a potential set of ~~characters~~ codepoints to be used for syntactical elements in a code language, to leave all others as non-semantic to the language. This can be used to define e.g. what codepoints are allowed in custom operators in a mostly natural-language-agnostic manner, but does nothing to resolve the inherent difficulty of actually supporting custom operators without a) them being abused and b) having to deal with all of the issues of associativity and binding strength.

As a side note, Haskell, which uses custom operators a lot, created its own search engine to search for its custom operators. That's another big downside of custom operators: searchability in conventional search engines is next to none.

pipehappy · January 26, 2022, 11:18am

@CAD97 Thanks! This is quite informative.

> Also interesting to note is that there isn't a matrix type in std, so adding @ as an operator would be adding an operator without any implementations on std types.

This looks like an ecosystem perspective and depends on how you envision it. Matlab and Mathematica have it built in, and there are implementations in libraries, like NumPy.

> The key arguments against custom operators are that they make it harder to read code when using operators not drilled into our heads since grade school

Agree. Something that may help here is the assumption that people will not use hard-to-read code on a daily basis.

> also specifically about binding strength / precedence.

With a specific use case in mind, can we just go with straight left-associative?

> beyond the simple mathematic operators we already have, without playing favorites between equally reasonable options.

No, I feel there are no equally reasonable options. Someone is favored, others are not.

> So the burden falls on proving that adding something new to the language is worth the extra cost of teaching it, and with the downsides of not having a clearly best design,

Using @ rather than something like .* reduces the education cost.

> Python having an operator is not sufficient reason for Rust to implement it, no matter how "popular" it is there.

Agree. The Python part is about why '@' may be the option rather than others.

> Why do you believe this is a common enough operation to warrant its own operator, and why do existing options not work?

I can't tell whether it's common enough or not. I don't have numbers. Let's put the 'why' part like this:

Adding operator @ allows an implementation to support both element-wise and dot product at the operator level in the same unit of code.

scottmcm · January 26, 2022, 11:39am

This gets to CAD's point, though: Why stop there? If element-wise and dot product are worth having operators, why not cross product too? Or tensor product?

And the calculus is different for a unityped language like Python and a multityped language like Rust. In Rust we can have https://doc.rust-lang.org/nightly/std/simd/struct.Simd.html#impl-Mul<Simd<f32%2C%20LANES>> for element-wise multiplication, for example.
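A rough sketch of that point, assuming two hypothetical wrapper types over the same 2×2 data (`Lanes` and `Mat2` are made-up names, and `Lanes` only imitates the element-wise behaviour of a SIMD vector; it is not `std::simd`). The static type of the operands decides which multiplication `*` means.

```rust
use std::ops::Mul;

/// Hypothetical wrapper whose `*` is element-wise, like a SIMD vector.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Lanes(pub [[f64; 2]; 2]);

/// Hypothetical wrapper whose `*` is the matrix product.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Mat2(pub [[f64; 2]; 2]);

impl Mul for Lanes {
    type Output = Lanes;
    fn mul(self, rhs: Lanes) -> Lanes {
        let mut out = [[0.0_f64; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                // Multiply matching positions only.
                out[i][j] = self.0[i][j] * rhs.0[i][j];
            }
        }
        Lanes(out)
    }
}

impl Mul for Mat2 {
    type Output = Mat2;
    fn mul(self, rhs: Mat2) -> Mat2 {
        let mut out = [[0.0_f64; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                for k in 0..2 {
                    // Row i of `self` times column j of `rhs`.
                    out[i][j] += self.0[i][k] * rhs.0[k][j];
                }
            }
        }
        Mat2(out)
    }
}
```

The same data wrapped as `Lanes` or as `Mat2` gives different results for `x * x`, and mixing the two wrappers in one expression is a type error.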

pipehappy · January 26, 2022, 12:21pm

> This gets to CAD's point, though: Why stop there? If element-wise and dot product are worth having operators, why not cross product too? Or tensor product?

If you want to go ahead, sure you can. But do you?

CAD97 · January 26, 2022, 12:43pm

"Common grade-level operators" is a strongly supported stopping point for adding a fixed list of operators into the language.

If you want to add any more, you need to provide why the new set of operators is more justified than this preexisting set. (And you also need to provide why it's better than "your set, but with my least favorite new operator swapped out for my favorite.")

pipehappy · January 26, 2022, 1:56pm

From the dialog, it seems there are two parts to the discussion:

Can we have one more operator in std::ops, intended to support both the dot and element-wise products together with std::ops::Mul?

The current spec sees * as arithmetic multiplication. By the principle of least surprise, a tensor library author may overload std::ops::Mul as the element-wise product. Meanwhile, the dot product is also a common operation that an author may want to overload std::ops::Mul for. Adding one more operator allows these two operations to each have their own symbol to overload. Without it, people will fall back to a function call.

One discussion is about popularity: if no one uses it, then don't do it. The argument has two parts. First, there are languages supporting both products at the operator level, like Python, Matlab, and Mathematica. One common label they share is that they are heavy on math modeling. The second part is: does Rust fit here? I think Rust fits and people will use it.

The other discussion is about other tensor operations. It's a subjective topic, but there is common behavior and expectation. People seem to feel that syntactic sugar for two kinds of product is enough, as suggested by other implementations.

Another concern is about use cases besides multiplication, which may cause hard-to-read code. Readability has higher priority these days, so hard-to-read code won't be popular. If hard-to-read code does become popular, then that's a sublanguage.

That one more operator can be @.

The bottom line is that other symbols would fit too. But 1. @ is already in the language, and its description is in the bottom section of that chapter in the book, and 2. it's used as the dot product in other languages, thus less surprise.

elidupree · January 26, 2022, 2:24pm

To my mind, the principle of least surprise suggests that neither form of multiplication should use *, because either a reader or writer could mistakenly assume which form of multiplication it means.

steffahn · January 26, 2022, 2:24pm

For the record, @ already has a meaning in Rust, as a binary operator in patterns. The fact that this meaning has nothing to do with matrix multiplication is rather an argument against using @.
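For readers who haven't met it, here is a small example of the existing meaning: in a pattern, `name @ subpattern` binds the matched value to `name` while also testing it against the subpattern. The `describe` function is just an illustrative name.

```rust
/// Classify a number, using `@` to bind the matched value to a name
/// while also testing it against a range pattern.
pub fn describe(n: u32) -> String {
    match n {
        0 => "zero".to_string(),
        // `small` is bound to `n` only if `n` falls in 1..=9.
        small @ 1..=9 => format!("single digit {}", small),
        big => format!("large {}", big),
    }
}
```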

Also note that you seem to be using the term “dot product” wrong. It’s IMO rather unfortunate that NumPy’s .dot method in Python, as well as Rust’s ndarray, uses the name “dot” for matrix multiplication (even though, for two vectors, the dot product [aka “scalar product”] can be the same as matrix multiplication, if the left one is made into a row-vector and the second into a column-vector).

Also note that I’ve personally never seen “@” being used for matrix multiplication before in any programming language that I knew (I obviously don’t know Python all that well).

elidupree · January 26, 2022, 2:31pm

It's even worse than that! The dot product has a scalar output, while matrix multiplication between a row vector and a column vector produces a 1x1 matrix, which, for us programmers, is different than a scalar. Moreover, if you multiply together two 1x1 matrices, all 3 options are valid (dot product, element-wise multiplication, and matrix multiplication...)
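The type-level difference can be made concrete with a small sketch. The `Row3`, `Col3`, and `Mat1x1` types and the `dot` function are hypothetical names chosen for illustration: the dot product returns a bare `f64`, while row-times-column returns a 1×1 matrix that wraps the same number.

```rust
use std::ops::Mul;

/// Hypothetical 1×3 row vector.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Row3(pub [f64; 3]);

/// Hypothetical 3×1 column vector.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Col3(pub [f64; 3]);

/// Hypothetical 1×1 matrix, deliberately distinct from a plain f64.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Mat1x1(pub [[f64; 1]; 1]);

/// Dot product of two plain vectors: the result is a scalar.
pub fn dot(a: [f64; 3], b: [f64; 3]) -> f64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Matrix product of a 1×3 row and a 3×1 column: the result is a 1×1 matrix.
impl Mul<Col3> for Row3 {
    type Output = Mat1x1;
    fn mul(self, rhs: Col3) -> Mat1x1 {
        Mat1x1([[dot(self.0, rhs.0)]])
    }
}
```

The number inside is the same in both cases, but a caller that expects an `f64` cannot use a `Mat1x1` without an explicit conversion.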

steffahn · January 26, 2022, 2:44pm

I know. I didn’t want to go into this level of detail. I had a professor for linear algebra who cared a lot about differentiating things like matrices from vectors or functions. Linear transformations are not the same as matrices, 1×n matrices or m×1 matrices are not the same as vectors. It’s often formally inconsistent or at least quite nontrivial to “identify” things in ways it’s often done in mathematics.

Still, it is nonetheless common to do these “identifications” in mathematics, and it’s not really problematic if it just means that you treat these things as “almost the same” and mean to have implicit conversions between them when it’s necessary; but this approach only works when there’s a human around that can “intuitively” tell you where implicit conversions are necessary, it’s the same as lots of other ambiguities in standard mathematical notation. If you do want to apply mathematical notation/convention to programming, you need to disambiguate by introducing more/different notation, and/or by carefully overloading based on consistent and unsurprising rules, and there’s often multiple ways to design a system like this.

I don’t know how Python handles it, but I wouldn’t be surprised if, through the magic of dynamic typing, 1×1 matrices would automatically become scalars.

For ndarray, there are separate vector and matrix types, so the overloaded dot operation is just a trait operation that does vector-vector multiplication yielding a scalar, or matrix-vector or vector-matrix multiplication yielding a vector, or matrix-matrix multiplication.

Note that if you do interpret 1x1 matrices as basically-the-same as scalars, then while of course all 3 options are valid, the three operations also produce the same result.

I do consider element-wise multiplication as quite unnatural if you’re working with actual matrices in the context of linear algebra, not just with “arbitrary” multi-dimensional array data. In this sense, I do find it quite logical how ndarray makes * be element-wise on its Array2 type, while nalgebra uses matrix multiplication on its DMatrix type, and other Matrix<…> types (all are two-dimensional, but many are fixed-size).
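A std-only sketch of the first convention, assuming a hypothetical `Grid2` type (this imitates the split described above and is not ndarray's actual `Array2`): `*` is element-wise, and the matrix product is spelled as a named method, here called `mat_mul`.

```rust
use std::ops::Mul;

/// Hypothetical 2×2 array type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Grid2(pub [[f64; 2]; 2]);

/// `*` is element-wise, as you would expect for generic array data.
impl Mul for Grid2 {
    type Output = Grid2;
    fn mul(self, rhs: Grid2) -> Grid2 {
        let mut out = self.0;
        for i in 0..2 {
            for j in 0..2 {
                out[i][j] *= rhs.0[i][j];
            }
        }
        Grid2(out)
    }
}

impl Grid2 {
    /// Matrix product, spelled as a named method instead of an operator.
    pub fn mat_mul(self, rhs: Grid2) -> Grid2 {
        let mut out = [[0.0_f64; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                for k in 0..2 {
                    out[i][j] += self.0[i][k] * rhs.0[k][j];
                }
            }
        }
        Grid2(out)
    }
}
```

Both operations live on the same type, and a reader can tell them apart at the call site: `a * b` versus `a.mat_mul(b)`.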

ckaran · January 26, 2022, 7:54pm

Everyone else has made really good points on why this is probably not a good idea, so I'm going to come at the problem from a completely different angle: how common is @ on keyboards world-wide?

I ask because of the problems that APL ran into that were solved by making special keyboards so you could type the symbols without having to memorize what keys mapped to what operators. The @ symbol is probably a safe bet if for no other reason than everyone's email address has the symbol in it, but assuming that your proposal was accepted, which symbol is used for the next operator? And the one after that? Etc.?

If you decide to give up @ and other single character operators, you might use something like #k"mat_mul"¹ to create an operator, but at that point I'd really prefer to see a.mat_mul(b).

Also, please don't say 'Unicode'. Yes, I know that every possible math symbol is going to be encoded in there somewhere, but if it isn't on my keyboard, I don't want to deal with it.

¹There was an RFC at one point about reserving a namespace for new keywords, but I can't find it. I hope I got it right!
