background & motivation
As an engineer working in advanced robotics, I have to write a lot of computation code involving 3D geometry, statistics, computer vision and machine learning, using vectors, matrices, raw buffers, series interpolation, etc. I was working with Python, but for code performing heavy computations and complex algorithms I needed a second, faster language. Among the languages I knew, I chose Rust, and I have used it for the last 2 years.
Did I find Rust good for it? I'm writing this report as feedback on what a scientist programmer can think of Rust. It is not about the features and frameworks available for Rust in this field (I know they will grow with time), but about the language itself and its capabilities. I did not find a similar synthesis on the internet, despite the several places I saw mentioning the following points.
no genericity of size
problem
This problem mostly affects code working with multiple dimensions, or with buffers of multiple channels (in fact most scientific applications involve one or the other). In such code, writing generic types with a generic size is extremely useful for readability, and optimizing performance requires static sizes. However, a lot of operations need to move between dimensions and produce objects whose dimensionality is deduced from the inputs (think of concatenation or homogeneous matrices for instance). The problem is that Rust does not allow deducing const generics expressions.
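Here is a minimal sketch of the limitation (the function name and types are mine, just for illustration): concatenating two statically-sized vectors should produce a vector of size N + M, but stable Rust rejects the const expression in the return type.

```rust
// stable Rust rejects this signature with "generic parameters may not
// be used in const operations" on the `N + M` in the return type
fn concat<const N: usize, const M: usize>(a: [f32; N], b: [f32; M]) -> [f32; N + M] {
    todo!()
}
```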
There are a few ways to deal with this problem, but none is satisfying:
- crates featuring matrix and vector types often rely on type-based dimensions and use macros to recursively generate rules of dimension concatenation, like `nalgebra` does. This is extremely verbose and hard to master for anyone wanting to write code using such generics. It also cannot cover all the possible dimension combinations, so it usually stops at dimension increments of 1
- some crates restrict themselves to low dimensions (up to 4) and duplicate their operator implementations 4 times, like `cgmath`
- some crates restrict themselves to dynamic dimensions, like `ndarray` does, making dynamic allocations necessary in most operations, preventing some compiler AVX optimizations, and bringing a memory and performance overhead for small objects
- my personal workaround is to always have 2 linear algebra crates in my projects: `nalgebra` for small arrays, and `ndarray` for large arrays. Unfortunately their APIs and feature sets do not match, so this introduces a lot of conversions and confusion (see the sketch below)
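To make that last point concrete, here is roughly what the dual-crate workaround looks like, assuming both `nalgebra` and `ndarray` as dependencies; crossing the boundary means copying through a plain slice, since neither crate knows the other's types.

```rust
use nalgebra::Vector3;
use ndarray::Array1;

fn main() {
    // small fixed-size object: stack-allocated, statically sized
    let small = Vector3::new(1.0f64, 2.0, 3.0);
    // crossing into ndarray means an explicit copy through a slice
    let large: Array1<f64> = Array1::from(small.as_slice().to_vec());
    assert_eq!(large.sum(), 6.0);
}
```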
solution
I have nothing to propose beyond what is currently on the tracks: a way to express relations between `usize` generics in Rust is already being tested, but it unfortunately does not seem to progress fast. C++ actually features this, allowing the existence of user-friendly libraries like `eigen`.
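For reference, here is a sketch of what the in-progress feature enables on nightly (this relies on the unstable generic_const_exprs feature, which is incomplete and whose syntax may still change):

```rust
#![feature(generic_const_exprs)]

// the same concatenation as above, accepted once the compiler can
// reason about `N + M` as a deduced const expression
fn concat<const N: usize, const M: usize>(a: [f32; N], b: [f32; M]) -> [f32; N + M] {
    let mut out = [0.0; N + M];
    out[..N].copy_from_slice(&a);
    out[N..].copy_from_slice(&b);
    out
}

fn main() {
    let v = concat([1.0, 2.0], [3.0, 4.0, 5.0]); // statically sized [f32; 5]
    assert_eq!(v.len(), 5);
}
```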
no genericity of numbers
problem
When I speak of numbers, I mean integers and floating point numbers. Each of them is a primitive type implementing almost exactly the same set of methods as the others, but there is no trait to generalize over them:
- there are 10 different integer types (and there might be more in other crates), but no trait `Int`
- there are 2 different float types (there might be more in the future, in core or in other crates), but no trait `Float`
This means any function performing calculations has to be specific to one combination of integers and floats. In practice, because of this:
- some crates choose a specific numeric precision: they pick between `f32` and `f64` and then use the matching signed and unsigned integers. For instance `parry3d` or `collider`, but many others do so as well
- some other crates implement the same functions 2 or more times, once per primitive type
- some crates rely on `num_traits`, which tries to provide traits for numbers, but these traits often have fewer methods than the bare floats and ints. Using them leads to some functions using only methods from `core` and others using only methods from the traits, which brings confusion (see the example below). And as it is a third-party crate and not the standard library, any crate relying on it may be incompatible with another crate using a different dependency for its numeric traits
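As an illustration of that last workaround, a generic function written against `num_traits` compiles fine, but only the methods mirrored by the trait are available:

```rust
use num_traits::Float; // third-party trait, not part of std

// generic over f32 and f64, but `sqrt` here resolves through the
// trait, not through the primitives' inherent methods
fn norm<T: Float>(x: T, y: T) -> T {
    (x * x + y * y).sqrt()
}

fn main() {
    assert_eq!(norm(3.0f32, 4.0f32), 5.0f32);
    assert_eq!(norm(3.0f64, 4.0f64), 5.0f64);
}
```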
proposal
Traits are meant to generalize properties of types wherever possible, so I suggest adding new traits to standard Rust. They would not be intended to capture the mathematical details of the primitives, but simply their structural similarities:
- `Int` providing their current methods to the primitive integers, implementable by any user type strictly being an integer according to the mathematical definition
- `Float` providing their current methods to the primitive floats, implementable by any user type strictly being a floating point number according to the mathematical definition
- `Unsigned` providing `Int`, only for unsigned ints
- `Signed` providing `Int`, only for signed ints
On the other hand, the following traits could stay in dedicated crates, as they cover more mathematically-specific aspects of the numbers and do not provide methods redundant with `core`:
- `One` providing a constant `1` (neutral element for `Mul`)
- `Zero` providing a constant `0` (neutral element for `Add`)
- `Number` providing `Add<Self> + Sub<Self> + Mul<Self> + Div<Self> + One + Zero`
Of course, programmers should be advised to always prefer generics over dynamic dispatch for numbers, so that the compiler can inline the matching processor instructions instead of calling through a virtual table.
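To fix ideas, here is a rough sketch of what such traits could look like; the names and exact method sets are illustrative, not an actual std API:

```rust
use core::ops::{Add, Div, Mul, Sub};

// structural similarity only: every primitive number is copyable and
// supports the four arithmetic operators plus the two constants
pub trait Number:
    Copy + Add<Output = Self> + Sub<Output = Self> + Mul<Output = Self> + Div<Output = Self>
{
    const ZERO: Self;
    const ONE: Self;
}

// mirrors a few of the floats' existing inherent methods
pub trait Float: Number {
    fn sqrt(self) -> Self;
    fn powi(self, n: i32) -> Self;
}

impl Number for f32 {
    const ZERO: Self = 0.0;
    const ONE: Self = 1.0;
}

impl Float for f32 {
    fn sqrt(self) -> Self { f32::sqrt(self) }
    fn powi(self, n: i32) -> Self { f32::powi(self, n) }
}

// generic numeric code then monomorphizes to the primitive's native
// instructions, with no vtable involved
fn lerp<T: Number>(a: T, b: T, t: T) -> T {
    a + (b - a) * t
}
```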
no genericity of literals
problem
Assuming we had traits for numbers (like those proposed above, or those provided by `num_traits`), writing code with them runs into the following problem: literals are not supported for generics.
```rust
pub fn hermite3<T: Float>(a: (T,T), b: (T,T), x: T) -> T
{
    let x2 = x.powi(2);
    let x3 = x.powi(3);
    // none of these lines is accepted by the compiler, because the
    // literal constants are f32 or f64, not compatible with T
    (a.0 * (2.*x3 - 3.*x2 + 1.))
    + (b.0 * (-2.*x3 + 3.*x2))
    + (a.1 * (x3 - 2.*x2 + x))
    + (b.1 * (x3 - x2))
}
```
The problem here is that most numeric computation code uses hardcoded constants in math expressions (and that is the right way to do it, not an ugly habit), preventing it from using generic numbers.
To overcome this limitation, the community developed tools like `numeric_literals` to trick the Rust parser in an ugly way: these crates rewrite the AST to replace literals with conversion expressions.

```rust
#[replace_float_literals(T::from_float(literal))]
pub fn hermite3<T: Float>(a: (T,T), b: (T,T), x: T) -> T {...}
```
There are still limitations though:
- it needs a conversion attribute on each function
- we can only hope the compiler will optimize out the conversion and not keep it at runtime
- we can no longer mix multiple number types `T` and `V` in the same function, because all literals are now converted to `T`
proposal
As a minimalistic change, Rust could implicitly determine the type of a literal constant from the type of its operand, then parse the literal accordingly, as it already does when a `_T` type suffix is present.
As a more generic and aesthetic solution, Rust could use a compile-time constant constructor `T::from_literal(&str)` for every type supporting literals. Primitive types would then implement a trait `FromLiteral` providing this method. The compiler would determine the type of a literal constant from the `_T` suffix if present, or from the type of its operands.
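A hypothetical sketch of that second option follows; neither `FromLiteral` nor `from_literal` exist in Rust today, this only illustrates the proposed shape:

```rust
pub trait FromLiteral: Sized {
    // the compiler would invoke this at compile time on the literal's
    // source text, once the target type has been inferred
    fn from_literal(literal: &str) -> Self;
}

impl FromLiteral for f32 {
    fn from_literal(literal: &str) -> Self {
        // a const evaluation in the real proposal; a runtime parse
        // stands in for it in this sketch
        literal.parse().expect("malformed literal")
    }
}
```

With this in place, a literal like `2.` in `hermite3` would elaborate to `T::from_literal("2.")`, evaluated at compile time, and generic code could keep its hardcoded constants.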
difficulty of conversions
problem
The following implementation exists: `From<f32> for f64`. However, `From<f64> for f32` does not, and there is not even a `TryFrom<f64> for f32`: a lossy `as` cast has to be used instead.
Likewise, `From<i32> for f32` (and from bigger ints) and `From<usize> for f32` (and from bigger ints) do not exist either.
Because of this, a lot of code using floats for computations but involving integers (e.g. for array sizes or indices) tends to rely on `as` conversions, which are neither generic nor really recommended in idiomatic Rust. Because `as` is not generic, it does not extend to structures containing the numbers we want to cast: `[u32]` doesn't cast to `[f32]` the way `u32` does; likewise an `ndarray` of `u32` doesn't cast to an `ndarray` of `f32`.
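In practice this means every container crossing has to spell the cast out element by element:

```rust
fn main() {
    let indices: Vec<u32> = vec![1, 2, 3];
    // no generic conversion available: the cast is written manually
    // for each element, for every container type
    let weights: Vec<f32> = indices.iter().map(|&i| i as f32).collect();
    assert_eq!(weights, [1.0, 2.0, 3.0]);
}
```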
It is easy to understand why such conversions are not implemented in core: they would be lossy. However:
- using integers, every computation is exact (except when overflowing), so any lossy operation is a non-intuitive step and thus must be explicit, with its failure cases reported in a `Result`. So between integers of different sizes, `TryFrom` is relevant (see the example below)
- using floating points, however, every computation is lossy; this is not unexpected but desired behavior. It is the tradeoff for handling computations at any order of magnitude. Nothing unexpected can come from this loss of precision, because the IEEE standard specifies the behavior. So lossy computation is not only usual but systematic, and `TryFrom` between floats and ints is not relevant
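The integer side already works this way today:

```rust
use std::convert::TryFrom;

fn main() {
    // between integers the failure case is meaningful, so a Result fits
    assert!(u8::try_from(300i32).is_err()); // 300 does not fit in a u8
    assert_eq!(u8::try_from(255i32).unwrap(), 255u8);
}
```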
proposal
Since, from the moment the programmer is using a float, precision loss is always expected and no corner case can happen, I consider `From` should be available from any primitive number to any float. It would nicely complete the genericity of numbers and would avoid writing `_ as _` everywhere, replaced by `T::from` in a more idiomatic and functional style.
If `From` must remain only for exact conversions (which is only a convention at the moment), then a `Lossy` conversion trait (one that does not return a `Result`) should be added to standard Rust and implemented from any primitive number to any float. I personally think using `From` would be much clearer than yet another conversion trait.
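A hypothetical sketch of that alternative, written as a user trait (the name `Lossy` simply follows the proposal above; nothing like it exists in std):

```rust
pub trait Lossy<T> {
    fn lossy(self) -> T;
}

impl Lossy<f32> for u32 {
    fn lossy(self) -> f32 { self as f32 }
}

impl Lossy<f32> for usize {
    fn lossy(self) -> f32 { self as f32 }
}

// call sites no longer need `as` casts scattered everywhere
fn mean(values: &[u32]) -> f32 {
    let sum: f32 = values.iter().map(|&v| v.lossy()).sum();
    sum / values.len().lossy()
}

fn main() {
    assert_eq!(mean(&[1, 2, 3]), 2.0);
}
```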
no function overloading
problem
This point is only an issue because of the lack of genericity of numbers: overloaded functions are the way some other languages deal with the previous limitations.
Function overloading allows the same function name to be used by different implementations with different signatures. It is relevant when an operation has to be implemented for different types with conceptually the same meaning, but with different implementations:
```rust
// not valid Rust: two functions sharing one name, as overloading
// would allow
fn foo(value: f32) -> f32 {...}
fn foo(value: f64) -> f64 {...}

foo(1.2f32);
foo(1.2f64);
```
No genericity here, but at least if you want to write a function `bar` implemented for several float types using `foo`, you can copy-paste `bar`'s implementation changing only the argument types. (Not very aesthetic, but as I said it is a workaround; C works like this.) The point is that it allows providing functions supporting different types without generics, which could be useful in Rust if the genericity of numbers is not fixed.
For base math functions, Rust uses methods instead of overloaded functions, pretending this removes the need for overloading and genericity. But it is a lure, since from other crates we cannot add inherent methods to foreign types such as those of `std`; only trait methods, which every caller then has to import.
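To be precise about what is possible today: an extension trait can graft a method onto a foreign primitive, but the trait has to be in scope at every call site (the `Cube` trait here is mine, just for illustration):

```rust
trait Cube {
    fn cube(self) -> Self;
}

impl Cube for f32 {
    fn cube(self) -> Self { self * self * self }
}

fn main() {
    // compiles only because `Cube` is imported in this scope
    assert_eq!(2.0f32.cube(), 8.0);
}
```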
Using overloaded functions with multiple arguments, we could also cover the issue of conversions:
```rust
// with function overloading: one overload for each combination of
// number precisions (not valid Rust)
fn pow(a: f32, b: f32) -> f32
fn pow(a: f32, b: i32) -> f32
fn pow(a: f64, b: f64) -> f64
fn pow(a: f64, b: i32) -> f64

pow(1f32, 0.5f32);

// without overloading, as in standard Rust, names are duplicated for
// each combination
impl f32 {
    fn powf(self, b: f32) -> f32
    fn powi(self, b: i32) -> f32
}
impl f64 {
    fn powf(self, b: f64) -> f64
    fn powi(self, b: i32) -> f64
}

1f32.powf(0.5f32);
```
proposal
Function overloading would not really be needed if the genericity of numbers and conversions were fixed. Function overloading is less idiomatic and much less powerful than traits, so I would prefer the previous issues to be fixed rather than function overloading to be added. So this can stay the way it is.
conclusion
To the question I raised in the introduction, did I find Rust good for science? My answer is another question: did the Rust developers forget that numeric computation is half the use cases of a fast compiled language?
I think Rust truly lacks the following features:
- genericity of size
- genericity of numbers
- genericity of literals
- lossy conversions for floating points
Despite the fact that many crates are trying to patch these gaps, it is still a pain to write numeric computational code in Rust. These issues and their workarounds greatly increase the complexity of the code, and this is something a scientist really does not need when implementing algorithms that are already complex. Of course, I am also using Rust for networking and data management code, which Rust is really good at. Sadly, at the moment, it is easier to write computational code in C or C++.
I personally will continue to use Rust for computational code, only because I like not having to fear memory corruption and do not want to mix more languages in the same projects. One may be tempted to argue that the pain Rust brings to computational code is the price of safety (it is the mainstream answer when someone points out difficulties with Rust). But looking at the proposals above: none of them has anything to do with safety, hence they could be implemented in Rust without sacrificing it.
Rust has great potential in computation thanks to its safety, performance and abstraction capabilities. However, if the Rust core team never improves these points, I really fear that this wonderful language will not be able to become a first-choice language for numeric computation, and will never be able to fully replace C++.