Matrix type

Does Rust have a matrix type? If yes, how do we implement it? Or can we define it like in C?

There's no built-in matrix type, but nalgebra and ndarray are commonly used, and there are more options out there for different niches as well.

7 Likes

Thanks for this!! But is there any reason Rust did not allow the creation of a matrix like C++ or Python?

Rust has [[f32; 4]; 4], which is exactly as much of a built-in matrix type as C++ has.

(Well, excluding valarray, which as far as I can tell nobody ever uses.)
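To make that concrete, here's a minimal sketch using nothing but the nested array type (illustration only, not a recommended matrix API):

fn main() {
    // A 4×4 matrix as a plain nested array, analogous to float m[4][4] in C.
    let mut m: [[f32; 4]; 4] = [[0.0; 4]; 4];
    for i in 0..4 {
        m[i][i] = 1.0; // turn it into the identity matrix
    }
    assert_eq!(m[2][2], 1.0);
}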

3 Likes

You might also recognize ndarray as being similar to numpy.ndarray, but that's not built-in to Python either.

4 Likes

Thanks a lot for the help. I'm quite new to Rust and have not been able to find this solution anywhere in the documentation. Is there any source I can refer to for these kinds of things?

For these kinds of problems (the standard library does not contain this functionality, so are there crates I could use instead) you can always consult the Rust Cookbook.

It even has a page about linear algebra and thus matrices:

I searched for "rust matrix library" and nalgebra was the first result.

Lib.rs also has pretty good library discovery and search.

3 Likes

This thread probably should have been on the users forum, since it's about using Rust.

Specifically, let matrix: [[f32; 4]; 4] in Rust is essentially std::array<std::array<float, 4>, 4> matrix in C++ or float matrix[4][4] in C.

This is generic type composition. The array type syntax [T; N] represents a family of types, with T being any type and N being any unsigned integer. So the "trick" is to make an array of arrays. You can do the same thing with the other arrayish types as well (slices &[&[f32]], vectors Vec<Vec<f32>>), or even mix them (e.g. &[Vec<[f32; 4]>]).
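For instance, a rough sketch of a few such compositions (nothing here is special-cased by the language):

fn main() {
    // Fixed-size: an array of arrays.
    let fixed: [[f32; 3]; 2] = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]];

    // Growable: a vector of vectors (rows could even have different lengths).
    let mut grid: Vec<Vec<f32>> = vec![vec![0.0; 3]; 2];
    grid[1][2] = fixed[0][0];

    // Borrowed view: a slice of the fixed-size rows.
    let rows: &[[f32; 3]] = &fixed;
    assert_eq!(rows[1][0], 4.0);
}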

That's why there isn't really documentation on two dimensional arrays/slices/vectors – they aren't in any way interesting; they're just composed types.

5 Likes

Not mentioned previously, but there's also the mint library, which adds just minimal type signatures meant to offer interoperability between different crates. nalgebra is an "all batteries included" solution, but it can be slow to compile, though it has almost everything you'll ever need and is very composable. The glam crate has more discrete implementations for vectors and matrices, which are ideal for rendering and such. So if you're not making a library that allows switching type representations, go with that.

I usually default to glam unless I need to generify the number representation in data structures that use vectors and matrices, which is simpler to do with nalgebra.

There are also a bunch of other crates which overload operators and make linear algebra types more convenient to use, but these two are the most common in the wild AFAICT.
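For a taste of what the crate route looks like, here's a minimal nalgebra sketch (the version below is an assumption; check crates.io for the current one):

// Cargo.toml: nalgebra = "0.32"   <- version is an assumption
use nalgebra::{Matrix3, Vector3};

fn main() {
    // Matrix3::new takes its nine elements in row-major order.
    let m = Matrix3::new(
        1.0, 0.0, 0.0,
        0.0, 2.0, 0.0,
        0.0, 0.0, 3.0,
    );
    let v = Vector3::new(1.0, 1.0, 1.0);
    println!("{}", m * v); // matrix-vector product via operator overloading
}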

Got it. But I'm just trying to understand why there isn't a built-in type like C++. Does it have something to do with memory safety? Matrices are extremely important in image processing and a built-in type would help.

As has been said previously, C++ doesn't have a built-in matrix type. What type is it that you think is a built-in matrix type?

1 Like

To answer this very generally: it is hard to make something that can be generally used by most libraries and users as a piece of "vocabulary" of the language. Such things need to be approached very carefully. If there isn't really one definitively best way of doing it, or it doesn't directly expose OS or computational resources, it isn't automatically a candidate for the standard library.

Some people want to treat matrices as big grids of numbers. Some people want them to be abstract mathematical objects. Some people want to be able to fluently use sparse representations. When a faster version of matrix multiplication drops, is std allowed to change its implementation? What if that wildly changes the numeric stability for some downstream users? Will some users be forced to migrate to a matrix crate anyway, while everybody else continues using the subpar std version?

Rust exposes a kind of matrix, [[T; N]; M], with a memory layout and a minimally general API that basically everybody can agree upon, so it can be shared between libraries. Rust is also sufficiently expressive that matrix implementations outside of std can be just as powerful as one implemented in std; there are no extremely convenient magic powers that a std::matrix would have over some library.

3 Likes

One thing C (not C++) has over Rust is multidimensional arrays with variable lengths. So in C you can do this:

#include <stddef.h>

/* Accumulates A*B into C; the caller is expected to zero-initialize C. */
void matmul(size_t m, size_t n, size_t k,
    const float A[m][k],
    const float B[k][n],
    float C[m][n]
) {
    for(size_t i=0; i<m; i++)
    for(size_t j=0; j<n; j++)
    for(size_t h=0; h<k; h++)
        C[i][j] += A[i][h] * B[h][j];
}

(So, that's taking three integer dimensions and three pointers as arguments.)

In Rust it's easy (and much more enjoyable) to write very similar code with const generics:

fn matmul<const M: usize, const N: usize, const K: usize>(
    a: &[[f32; K]; M],
    b: &[[f32; N]; K],
    c: &mut [[f32; N]; M],
) {
    // Same triple loop as the C version; the dimensions are checked at compile time.
    for i in 0..M { for j in 0..N { for h in 0..K {
        c[i][j] += a[i][h] * b[h][j];
    }}}
}

... but if you'd like to have variable dimensions like the above C function, you have to throw all that out and start from raw pointers. (Okay, you could still use flat slices, but that's awful for abstraction.)
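For completeness, here's a sketch of that flat-slice fallback with runtime dimensions (the function name and the row-major stride layout are my own choices, not from any particular crate):

// Row-major flat slices with explicit strides; accumulates a*b into c,
// which the caller is expected to zero-initialize.
fn matmul_dyn(m: usize, n: usize, k: usize, a: &[f32], b: &[f32], c: &mut [f32]) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for i in 0..m {
        for j in 0..n {
            for h in 0..k {
                c[i * n + j] += a[i * k + h] * b[h * n + j];
            }
        }
    }
}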

One potential solution to this might go something like "relax the implicit T: Sized bound in [T] to allow T = [U] where every element has the same length", although that still couldn't properly express signatures where different slices should have the same length.

1 Like

That's roughly equivalent to taking a slice of slices, but it's a bit uglier to use since Vec<Vec<T>> can't coerce to &[&[T]]. Instead you can take impl IntoIterator<Item = impl IntoIterator<Item = T>> or the like, but it's a bit confusing.
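Something along these lines, where the nested impl IntoIterator bound accepts Vec<Vec<f32>>, arrays of arrays, and other owned nestings (sum_all is a made-up example, not an established pattern):

fn sum_all(rows: impl IntoIterator<Item = impl IntoIterator<Item = f32>>) -> f32 {
    // Flatten the rows into one stream of elements and add them up.
    rows.into_iter().flatten().sum()
}

fn main() {
    let v = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let a = [[1.0_f32, 2.0], [3.0, 4.0]];
    assert_eq!(sum_all(v), 10.0);
    assert_eq!(sum_all(a), 10.0);
}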

They are optional, though. In particular, MSVC doesn't support them.

Huh, looks like it was included in the C99 standard (not optional), then C11 made it opt-out. Of course, "the standard requires feature X" and "the compiler supports feature X" are very different things... Especially when it's one of the few features of C that aren't part of C++. C23 is again limiting that opt-out to stack variables, so code like my example should be supported.

That aligns well with my experience that the biggest problem of C's VLA is that float foo[n]; just works for a variable n (it stack allocates 4*n bytes). That makes it extremely easy to accidentally overflow your stack, especially for anyone coming to C from managed languages where similar simple code is the normal thing that always works (heap allocation).

It's semantically similar, but &[&[T]] is quite bad for any kind of efficiency (The double indirection adds latency to element access. Harder to eliminate bounds checking since the slices could be different lengths. If you just have the data in one slice, are you going to construct a new Vec just to hold the row-references?). IntoIterator can work for abstraction, but you miss out on the slice-specific APIs, like split_at_mut, sort and binary_search.

It's semantically similar, but &[&[T]] is quite bad for any kind of efficiency (The double indirection adds latency to element access. Harder to eliminate bounds checking since the slices could be different lengths.)

Semi-related, but there's an idea I've had for user-defined unsized / fat-pointer types which would handle this.

Suppose you have the following type:

#[derive(Clone, Copy)]
struct Matrix<const W: usize, const H: usize> {
    values: [[f32; W]; H],
}

The idea is that you could express the type of dynamically-sized matrices using something like the following syntax:

dyn<const W: usize, const H: usize> Matrix<W, H>

More generally, for any type with generic parameters you could use this dyn<..> construct to lift some or all of those parameters into the pointer metadata and carry them around at runtime. With this, [T] would become syntax sugar for dyn<const LEN: usize> [T; LEN] and dyn Trait would become syntax sugar for dyn<T: Trait> T.

There are some problems with what exactly would be supported on these types. Not all of a generic type's associated methods/types etc. would be able to carry across to its dyn<..> types. And whether a method is dyn-compatible would depend not just on its type signature but also on its implementation. E.g.:

impl<const W: usize, const H: usize> Matrix<W, H> {
    // This would be usable on dynified matrices since the compiler could
    // generate a version that reads W and H at runtime from the self pointer.
    pub fn mul_scalar(&mut self, scalar: f32) {
        for i in 0..H {
            for j in 0..W {
                self.values[i][j] *= scalar;
            }
        }
    }

    // This would not be. At least not without dynamically-sized return types.
    pub fn transpose(self) -> Matrix<H, W> {
        todo!()
    }

    // This would not be. At least not without dynamically-sized stack values.
    pub fn rank(&self) -> usize {
        let mut tmp = *self;
        todo!("gaussian elimination on tmp")
    }
}

Still, I think this could be worth exploring. It feels like the "most natural" way to generalize [T] and dyn Trait to user-defined types. And it would allow doing some things that I've wanted before, e.g. have a slice of dyn Trait values which are all the same type: dyn<T: Trait> [T].

4 Likes

It's seen as a problem in lower-level use-cases too, e.g. the Linux kernel uses this:

# Variable Length Arrays (VLAs) should not be used anywhere in the kernel
KBUILD_CFLAGS += -Wvla

(with optional -Werror set elsewhere)