My wish list for Rust 2024 and beyond

I'm going to share a wish list of things I'd like to see happen in the Rust language, in the 2024 edition and beyond.

I've been programming for more than 17 years (or much longer, if I count my very early beginnings).

I have worked mainly with these languages (I'll only mention the ones I've used a lot):

In this order: Basic > Cobol > QBasic > PureBasic > VB > Fortran > C > C++ > ASM (x32/x86-64) > C# > Javascript > Java > Sql/PLSql > Python > Groovy > Typescript > Go > MQL4/5 > Zig > Rust

I'm not going to count HTML, CSS/SASS, shell languages, etc...

I'll just write down what comes to mind, and I'll add more ideas to this post whenever I remember a feature.


Each language has its advantages and disadvantages, and what I'd like Rust to do is incorporate some of the strengths of each of these languages:

Basic

  • goto ( just kidding :laughing: )
  • nothing really striking here

Cobol

C

  • more freedom with Unsafe
  • ternary expression (I know there's been a debate on this subject)
  • generate a definition file (like .h) and compile the implementation (for those who want to share proprietary libraries without disclosing the source code)

C++

  • operator overloading
  • inheritance (inheriting only the fields of the structure, without the implementation)

Java

  • enrich Macros to bring them closer to the way Annotations work (ex: annotations in function arguments)
  • allow Macros to determine type, rather than acting as a simple text parser
  • Wildcards
  • Reflection : I know it's difficult for a language without GC, but at least simulate it by adding Metadata as needed.
  • the same function name but with different parameters (I know it's difficult for static linking, but it could be allowed only for non-external functions)
  • provide the essential features needed to power enterprise frameworks that can compete with Spring/Quarkus, for example.

Go

  • compiler speed
  • rich standard library

Typescript

  • Default function arguments
  • Nullish coalescing operator (??); for Rust it would be a "Nonish" coalescing operator :laughing:

I'm a great believer in Rust's potential, and I'm sure that by version 2.0, it'll be a must-have language.

I think Rust should not only focus on the needs of system developers, but also offer flexibility to companies developing business applications.

Rust has become my favorite language for starting up projects, even if for professional needs it lacks fluidity, and especially advanced crates (e.g. Oracle support for Sqlx, better integration with APM, etc.).

(I know that some concepts are debatable, but the purpose of this post is not to start a debate, but just to share some thoughts.)

:pray:

Hm, could you expand on this? Arguably pure unsafe Rust has fewer restrictions than C (e.g. untyped pointers, no UB with signed arithmetic, etc). I think the problem comes when you need to uphold the guarantees of safe types and code. But I'm not sure how to solve that besides weakening those safe guarantees?

3 Likes

I've had cases where, even when specifying "unsafe", the compiler wouldn't allow a certain action. I'll have to look for such cases in my old code, but for the moment I don't remember them.

I believe that we already have this, e.g. with core::ops::Add<T, Output=N> and the other traits of that kind. Or did you mean something else?
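As a minimal sketch of what that looks like in practice (the Meters type is made up purely for illustration), implementing Add is what gives a user-defined type the + operator:

```rust
use std::ops::Add;

// Hypothetical newtype, used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

// Implementing `Add` is what makes `a + b` compile for `Meters`.
impl Add for Meters {
    type Output = Meters;

    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

fn main() {
    let total = Meters(1.5) + Meters(2.0);
    println!("{:?}", total);
}
```

The same pattern applies to Sub, Mul, Neg, Index and the other traits in core::ops.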

There is a reason why the trait system was chosen in favour of inheritance. I personally prefer the trait system much, much more than classical class inheritance.

This is probably not possible. As far as I understand, macros are expanded at a phase in which the compiler has no knowledge of types yet. Behaviour that depends on the type should instead be handled by specialization. This feature is currently in development but can be tried out on nightly.

We'd all love that!

I am not sure that this is possible for generic functions. Consider

fn take<T>(t: T) {}

I do not see a good way to define a default value for this function argument. Even if the Default trait is implemented, it is not correct to assume that this is the default value of this function argument.
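For non-generic parameters, one pattern that is commonly used today to approximate default arguments is an options struct with a Default impl, combined with struct update syntax. A rough sketch follows; ConnectOptions, connect and the default values are all hypothetical names chosen for the example:

```rust
// Hypothetical options struct; field names and defaults are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct ConnectOptions {
    port: u16,
    retries: u32,
    verbose: bool,
}

impl Default for ConnectOptions {
    fn default() -> Self {
        ConnectOptions { port: 8080, retries: 3, verbose: false }
    }
}

// Stand-in for a function whose trailing parameters have "defaults".
fn connect(host: &str, opts: ConnectOptions) -> String {
    format!(
        "{}:{} (retries={}, verbose={})",
        host, opts.port, opts.retries, opts.verbose
    )
}

fn main() {
    // Override only what differs from the defaults.
    let opts = ConnectOptions { retries: 5, ..Default::default() };
    println!("{}", connect("localhost", opts));

    // Or take every default.
    println!("{}", connect("localhost", ConnectOptions::default()));
}
```

This is more verbose than real default arguments, but the defaults live in one place and every call site stays explicit about what it overrides.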

5 Likes

There is something similar: When generating documentation, you have the option to not include the source code. It is of course not the same as a header file.

1 Like

Header files aren't very useful without a stable ABI (which is in progress, so you can expect a kind of header after it's available). And if the C ABI works for you, you can use extern "C".
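To make the extern "C" route a bit more concrete, here is a small sketch. Point and point_add are invented for the example, and a real library would additionally have to export the symbol under an unmangled name and be built as a cdylib or staticlib, which is left out here:

```rust
// `#[repr(C)]` gives the struct a C-compatible field layout.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// `extern "C"` selects the C calling convention for this function.
// The matching declaration in a hand-written C header would be roughly:
//     struct Point point_add(struct Point a, struct Point b);
pub extern "C" fn point_add(a: Point, b: Point) -> Point {
    Point { x: a.x + b.x, y: a.y + b.y }
}

fn main() {
    // The function is still callable from Rust like any other function.
    let p = point_add(Point { x: 1, y: 2 }, Point { x: 3, y: 4 });
    println!("{:?}", p);
}
```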

6 Likes

I'm going to explore this option, it looks very interesting.

But you end up having to nest structures, and above all having to redo the same implementations, at best doing something like:


struct A {
    value: u32,
}

struct B {
    a: A,
}

trait GetValue {
    fn get_value(&self) -> u32;
}

impl GetValue for A {
    fn get_value(&self) -> u32 {
        self.value
    }
}

impl GetValue for B {
    #[inline]
    fn get_value(&self) -> u32 {
        self.a.value
    }
}

I'd just like there to be a little control over the types, a kind of pre-analysis, because based on just the name, you can have crates that use the same name and end up with bizarre behavior.

the default value will be set on non-generic parameters (as other languages do)

interesting, I'll see how it looks

If you want to save space, write a macro that does it for you. It is even possible to have n-layered nesting of the GetValue trait. However, I think that the more idiomatic approach (in my opinion) would be to do

impl GetValue for B {
    fn get_value(&self) -> u32 {
        <A as GetValue>::get_value(&self.a)
    }
}

This implementation only uses information about types and traits and no underlying memory layout of struct A which might change in the future.
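As a sketch of the macro route mentioned above: delegate_get_value is a hypothetical helper name, and C is an extra struct added only to show the n-layered case.

```rust
struct A {
    value: u32,
}

struct B {
    a: A,
}

// Extra layer, added only to show that the delegation nests.
struct C {
    b: B,
}

trait GetValue {
    fn get_value(&self) -> u32;
}

impl GetValue for A {
    fn get_value(&self) -> u32 {
        self.value
    }
}

// Hypothetical helper: implements `GetValue` for `$outer` by forwarding
// to the `GetValue` impl of the field named `$field`.
macro_rules! delegate_get_value {
    ($outer:ty, $field:ident) => {
        impl GetValue for $outer {
            fn get_value(&self) -> u32 {
                GetValue::get_value(&self.$field)
            }
        }
    };
}

delegate_get_value!(B, a);
delegate_get_value!(C, b);

fn main() {
    let c = C { b: B { a: A { value: 7 } } };
    println!("{}", c.get_value());
}
```

Each layer only relies on the trait impl of the layer below it, so a change to the fields of A does not ripple up into B or C.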

But this is the role of the type system. Macros can work similarly but ultimately the type system is what should take care of these problems.

1 Like

wrt. Inheritance (interface inheritance) most of the discussion on this topic (as far as I am aware) has been happening in Efficient code reuse · Issue #349 · rust-lang/rfcs (github.com).

You can implement AsRef/AsMut to regain some of the interface/struct reuse without reimplementing the interface: Rust Playground
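The playground code is not reproduced here, but a minimal sketch of the idea could look like the following. It assumes A and B structs like the ones earlier in the thread; the extra field and the free function get_value are made up for the example:

```rust
struct A {
    value: u32,
}

struct B {
    a: A,
    extra: u32,
}

// `AsRef` has no blanket reflexive impl, so `A` needs its own.
impl AsRef<A> for A {
    fn as_ref(&self) -> &A {
        self
    }
}

// `B` exposes its embedded `A` instead of re-implementing A's interface.
impl AsRef<A> for B {
    fn as_ref(&self) -> &A {
        &self.a
    }
}

// Written once against `A`; works for anything that can hand out an `&A`.
fn get_value(x: &impl AsRef<A>) -> u32 {
    x.as_ref().value
}

fn main() {
    let a = A { value: 1 };
    let b = B { a: A { value: 2 }, extra: 40 };
    println!("{} {} {}", get_value(&a), get_value(&b), b.extra);
}
```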

Nesting (composition) is almost always preferred even in other languages because it tends to avoid tight coupling.

2 Likes

I don't end up writing inheritance-like code in Rust very much, but I usually reach for this sort of pattern when I do:

struct Base<Ext:?Sized = ()> {
    value: u32,
    ext: Ext
}

struct NameExt {
    name: String
}

type NamedBase = Base<NameExt>;

impl<Ext:?Sized> Base<Ext> {
    fn get_value(&self) -> u32 { self.value }
}

impl NamedBase {
    fn get_name(&self)->&str { self.ext.name.as_str() }
}

This way, NamedBase is-a Base in a very real sense that the Rust compiler understands. It's also pretty close to how inheritance is implemented in C++. If you want various Exts to influence the behavior of Base's methods, you can define a trait for them and choose between static and virtual dispatch as appropriate for your application:

trait BaseExt { ... }
impl<Ext:?Sized> Base<Ext> { 
    fn requires_delegate(&self) where Ext:BaseExt { ... }
}
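To show the static versus virtual dispatch choice with something that compiles, here is one possible filling-in of that outline. The label and describe methods and the two describe_* functions are invented for the example and are not part of the pattern itself:

```rust
// Hypothetical extension trait; `label` is an invented method.
trait BaseExt {
    fn label(&self) -> String;
}

struct Base<Ext: ?Sized = ()> {
    value: u32,
    ext: Ext,
}

struct NameExt {
    name: String,
}

impl BaseExt for NameExt {
    fn label(&self) -> String {
        self.name.clone()
    }
}

impl<Ext: ?Sized> Base<Ext> {
    fn get_value(&self) -> u32 {
        self.value
    }

    // Only available when the extension implements `BaseExt`.
    fn describe(&self) -> String
    where
        Ext: BaseExt,
    {
        format!("{}={}", self.ext.label(), self.get_value())
    }
}

// Static dispatch: monomorphized per extension type.
fn describe_static<Ext: BaseExt>(base: &Base<Ext>) -> String {
    base.describe()
}

// Virtual dispatch: `&Base<NameExt>` coerces to `&Base<dyn BaseExt>`.
fn describe_dyn(base: &Base<dyn BaseExt>) -> String {
    base.describe()
}

fn main() {
    let named = Base { value: 3, ext: NameExt { name: String::from("n") } };
    println!("{}", describe_static(&named));
    println!("{}", describe_dyn(&named));
}
```

The coercion to Base<dyn BaseExt> works because Ext is only used in the last field of Base.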

Edit: Thinking about this some more, the most significant drawbacks to this approach are:

  1. All types are final by default, and must opt-in to be extended. This is different from C++, but feels like it fits the Rust design philosophy
  2. The extension trait can be quite awkward, especially if the method definitions need access to the base object's properties. This can be improved with something like object-safe arbitrary_self_types: This lets the trait include methods that take self: &Base<Self> and then allows them to be called via &Base<dyn BaseExt>
4 Likes

I did an experiment, simulating C++-style inheritance and Rust-style inheritance. Here are the performance results (in Debug mode, because Release mode will pre-calculate the result with its optimizations)

use std::time::{Duration, Instant};

struct Base {
    a: u64,
    b: u64,
}

struct Srust {
    base: Base,
    c: u64,
    d: u64,
}

struct Scpp {
    a: u64,
    b: u64,
    c: u64,
    d: u64,
}

fn test_perf_rust_style(max: u64) -> Duration {
    let mut count = 0;
    let start_time = Instant::now();

    for i in 0..max {
        let sr = Srust {
            base: Base { a: i + 1, b: i + 3 },
            c: i + 3,
            d: i + 2,
        };

        count = sr.base.a + sr.base.b + i;
    }
    let end_time = Instant::now();
    end_time.duration_since(start_time)
}

fn test_perf_cpp_style(max: u64) -> Duration {
    let mut count = 0;
    let start_time = Instant::now();

    for i in 0..max {
        let sr = Scpp {
            a: i + 1,
            b: i + 3,
            c: i + 3,
            d: i + 2,
        };

        count = sr.a + sr.b + i;
    }
    let end_time = Instant::now();
    end_time.duration_since(start_time)
}

fn main() {
    const MAX: u64 = 1000000000;
    let elapsed_time2 = test_perf_rust_style(MAX).as_millis();
    let elapsed_time1 = test_perf_cpp_style(MAX).as_millis();

    println!("test_perf_cpp_style : {} ms", elapsed_time1);
    println!("test_perf_rust_style : {} ms", elapsed_time2);

    let diff: f64 =
        ((elapsed_time2 as f64 - elapsed_time1 as f64) / elapsed_time2 as f64) * 100.0f64;
    println!("Diff : {} %", diff);
}

here are the results:

Finished dev [unoptimized + debuginfo] target(s) in 0.12s
Running `target/debug/fn_test`
test_perf_cpp_style : 7133 ms
test_perf_rust_style : 7579 ms
Diff : 5.8846813563794695 %

we lose about 6% performance because we don't have native inheritance in Rust :face_with_raised_eyebrow:

Respectfully, you can't compare debug mode results, especially between languages. They are entirely meaningless and not representative of the real world.

If you need to prevent the compiler from optimizing everything away, you can use utilities like std::hint::black_box. Then you might get meaningful results.

6 Likes

it's the same language, both functions are in Rust and run in the same mode. :face_with_raised_eyebrow:

I tried with black_box, but the compiler kept optimizing the code and gave me 0ms, so I simply disabled optimization in the Cargo.toml file.

let elapsed_time2 = black_box(test_perf_rust_style(black_box(MAX)).as_millis());
let elapsed_time1 = black_box(test_perf_cpp_style(MAX).as_millis());

I've deactivated optimization in release mode, so we have neither debugging info nor optimization.

[profile.release]
opt-level = 0

the result:

Finished release [unoptimized] target(s) in 0.00s
Running `target/release/fn_test`
test_perf_cpp_style : 3648 ms
test_perf_rust_style : 3867 ms
Diff : 5.663304887509697 %

+/- same result :bar_chart:

* whether in debug or in release, you'll always get the same percentage difference, or at the very most a very small variation, since both functions are tested in the same mode.

Turning the optimizer off completely doesn't really provide much insight, as it has an important role in the performance picture. The key to using black_box correctly is that it simulates I/O without the time penalty, so you should be "printing" the output of each calculation with black_box to ensure that the calculation actually happens.

It's also extremely hard to get reliable benchmarking results for such a lightweight operation. I tried to clean up your benchmark a bit, and add a couple more test cases, but I just can't get any significant results out of it: there are order-of-magnitude swings between the min and max measurements when repeated, resulting in really wide error bars:
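A rough sketch of what that means for one of the loops from the earlier benchmark. This is not the cleaned-up benchmark itself; the function name bench_nested and the choice to sum all four fields are assumptions made for the example, and black_box is only a best-effort hint to the optimizer:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

struct Base {
    a: u64,
    b: u64,
}

struct Srust {
    base: Base,
    c: u64,
    d: u64,
}

// Hypothetical benchmark loop: returns the checksum and the elapsed time.
fn bench_nested(max: u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut sum = 0u64;
    for i in 0..max {
        // Hide the input so the loop cannot be constant-folded.
        let i = black_box(i);
        let s = Srust {
            base: Base { a: i + 1, b: i + 3 },
            c: i + 3,
            d: i + 2,
        };
        // "Print" the result so the calculation cannot be discarded.
        sum = sum.wrapping_add(black_box(s.base.a + s.base.b + s.c + s.d));
    }
    (sum, start.elapsed())
}

fn main() {
    let (sum, elapsed) = bench_nested(1_000_000);
    println!("checksum {} in {:?}", sum, elapsed);
}
```

With this in place the optimizer can stay enabled, which is closer to how the code would actually run.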

         nop:   31 -   61 -  146   p95 interval: ( -6.6, 143.6)
        flat:   72 -  109 -  602   p95 interval: (-116.7, 453.6)
 base_inside:   72 -   93 -  260   p95 interval: (  4.4, 233.1)
base_outside:   72 -  123 -  231   p95 interval: ( 21.5, 247.6)
3 Likes

This topic seems to have shifted a lot since yesterday. But I totally agree with @2e71828. If you argue about performance, use release mode, preferably on a well-controlled, well-cooled device running at a fixed frequency and power draw.

For micro-benchmarks on Linux, the path of the directory can be embedded into the application (depending on the optimization). Thus, loading this path can sometimes be enough overhead to matter by comparison.

Benchmarking is really hard.

5 Likes

This already exists:

let x = if x < y { x } else { y };

(BTW, the name "ternary expression" is a horrible name for this; "conditional expression" is better)
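Because if/else is an expression, it can also be used directly where a value is expected, for example as an argument. A small sketch (min_label is an invented function name):

```rust
fn min_label(x: i32, y: i32) -> String {
    // The `if`/`else` expression sits directly in argument position.
    format!("min = {}", if x < y { x } else { y })
}

fn main() {
    println!("{}", min_label(3, 7));

    // For this particular case the standard library already has a helper.
    println!("min = {}", std::cmp::min(3, 7));
}
```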

4 Likes

Thanks for the effort you've put into writing this test, even if I think you've used heavy artillery for a simple test.

there's one little detail I didn't understand in your test (maybe I'm wrong) but the Flat version takes longer?! is that normal?

I know this version exists, but it's heavy compared to a simple one :

let x = (x < y)?x:y;

(and especially less readable if you had to integrate it as an input parameter to a function)

Not only is this not really a great reason to duplicate a feature (unless you were to also remove the previous notation), it would be ambiguous because ? already has another meaning, for example:

foo() ? - bar() ? -1 : 2

could be equivalent to:

if foo() { -bar()? - 1 } else { 2 }

or

if foo()? - bar() { -1 } else { 2 }
4 Likes

The results are pretty unstable, so you can't really draw a conclusion about speed one way or another. My suspicion is that all of the methods are exactly equivalent, and any difference is just measurement noise.

All of the layouts are essentially the same: 4 words in a contiguous block of memory. The high-level names assigned to each of those words shouldn't affect codegen very much, if at all.
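One quick way to check the size part of that claim, using the structs from the earlier benchmark (this only compares sizes, not the generated code):

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Base {
    a: u64,
    b: u64,
}

// "Rust-style": the base is embedded as a field.
#[allow(dead_code)]
struct Srust {
    base: Base,
    c: u64,
    d: u64,
}

// "C++-style": all fields flattened into one struct.
#[allow(dead_code)]
struct Scpp {
    a: u64,
    b: u64,
    c: u64,
    d: u64,
}

fn main() {
    // Both are four u64 values back to back.
    println!("Srust: {} bytes", size_of::<Srust>());
    println!("Scpp:  {} bytes", size_of::<Scpp>());
}
```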

4 Likes

Whatever the playground does to run the code seems to add significant overhead and variance to the results. Compiling and running your benchmarks natively does yield better results (even with that same loop length of just 1000 iterations):

         nop:   27 -   28 -   37   p95 interval: ( 23.3,  35.2)
        flat:   82 -   84 -   93   p95 interval: ( 78.6,  93.5)
 base_inside:   82 -   84 -  102   p95 interval: ( 75.3,  97.5)
base_outside:   82 -   84 -   94   p95 interval: ( 78.4,  94.2)

(These are a bit slower because my computer probably is, but the same benchmark on the playground is 20% slower on the fastest measurements.)

Also, as you suspect, the three variations do end up with identical machine code, at least for this benchmark (on x86 Linux).

5 Likes