Option for rustc to build the smallest executable possible?

I was reading this article: https://lifthrasiir.github.io/rustlog/why-is-a-rust-executable-large.html and it kind of makes sense why Rust executables are relatively big without any optimization (actually a bit scary when I compare them to Go executables, which include a GC…), but I am still surprised that the --release flag does not make them much smaller out of the box.

I think most people expect the --release flag to produce the smallest optimized executable, obviously without debug symbols and the like.

I am wondering whether there is any plan to add another build option to rustc (something not too obscure) that would produce the smallest possible Rust executable out of the box.

I am pretty new to Rust, and I originally posted this here: https://github.com/rust-lang/rust/issues/56687#issuecomment-445974716

Since binary size and runtime performance (not to mention compile-time performance) are often in direct conflict, my expectation has always been that --release on its own would optimize for runtime performance, and you’d have to explicitly ask for size optimization.

I do think it’d be reasonable for some of the size optimizations in that post to fall under the default behavior of an “optimize for size” compilation flag. Apparently rustc has such a flag, though I can’t find any documentation on the current status of it or how to actually pass it. But it seems reasonable to me for such a flag to enable a more aggressive level of LTO and symbol stripping than you’d get from regular --release, presumably at the cost of compile time.

Of course, there are also several things in that post which we probably shouldn’t lump into that flag:

  • no_std and panic=abort obviously can’t work as implicit optimizations (though most programs that actually need to be this tiny are probably targeting embedded systems and will want to do these two things anyway)
  • changing the default allocator seems like way too big a semantic change, especially since its effect on runtime performance depends heavily on the program. But that’s conveniently moot because apparently we recently made the system allocator the default anyway
  • executable compressors just seem out of scope for a compiler (they’re separate tools for a reason, right?), though I’m not too familiar with them

There’s opt-level at s or z, to optimize for size.

Since that article, we have also switched to the system allocator by default on all platforms.
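For Cargo users, the size-oriented settings mentioned in this thread can be collected in the release profile. This is a hedged sketch rather than a recommended default: `opt-level`, `lto`, `codegen-units`, `panic` and `strip` are real Cargo profile keys, but this particular combination is an assumption, `strip` needs a reasonably recent toolchain, and `panic = "abort"` changes program behavior, so it has to be a deliberate choice.

```toml
[profile.release]
opt-level = "z"      # optimize for size ("s" is the less aggressive variant)
lto = true           # link-time optimization across crates; slower builds
codegen-units = 1    # a single unit gives the optimizer more to work with
panic = "abort"      # no unwinding machinery; changes panic semantics
strip = true         # remove symbols from the final binary
```

With plain rustc, the first setting corresponds to `rustc -C opt-level=z main.rs`.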


Prefer pseudo-trait-objects over monomorphization?


What are pseudo-trait-objects (as opposed to trait objects)?


I believe Soni is referring to the compiler choosing to compile generic code not via monomorphization (generating N separate functions for foo<i32>, foo<MyStruct>, etc.) but as one function that takes its generic parameter as a runtime argument wrapped in some sort of dynamic dispatch machinery (which would make it conceptually similar to a trait object). The thread “Idea: polymorphic baseline codegen” is one past discussion on the subject.

In the context of this thread, ignoring all the implementation challenges and cases where it would be semantically incorrect anyway, I just don’t know if the compiler would be able to tell when dynamically dispatching generic code is a win for size. It obviously isn’t an unconditional win, because monomorphization enables all sorts of other optimizations, so it’s entirely possible all the monomorphizations you care about optimize down to almost nothing and end up being faster and smaller than the dynamic dispatch machinery would. I suspect it’s something a human would have to opt in to.
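The trade-off described above can be sketched by hand in today's Rust. This is only an illustration under assumed names (`describe_generic`, `describe_dyn` and `describe_hybrid` are hypothetical), not what the compiler would generate: the generic function is compiled once per concrete type, the `dyn` version is compiled once and dispatches through a vtable, and the hybrid keeps a generic signature while forwarding to the single non-generic body.

```rust
use std::fmt::Display;

// Generic version: monomorphized, so the compiler emits one copy
// for every concrete `T` this is called with.
fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

// Trait-object version: compiled once; formatting goes through
// the vtable of `dyn Display` at run time.
fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {}", value)
}

// Hybrid: a thin generic shim that forwards to the shared body,
// so only the small wrapper is duplicated per type.
fn describe_hybrid<T: Display>(value: T) -> String {
    describe_dyn(&value)
}

fn main() {
    println!("{}", describe_generic(1));
    println!("{}", describe_generic("one"));
    println!("{}", describe_dyn(&1));
    println!("{}", describe_hybrid("one"));
}
```

Whether any of these actually produces a smaller binary depends on the program; for a body this small, inlining may well make the generic version the smallest.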

I’d be curious to hear if there are any embedded developers that have actually used trait objects over generics in real code to reduce binary size.


I’ve tried to capture how to minimize the size of Rust binaries on my min-sized-rust repo. It’s still a work in progress.


Just an FYI (because this confused me a little too), that is not stable just yet. Will be in 1.32.


Thanks man, this is really awesome!


You can have more control over code generation by using abstract data types.

pub mod stack {
    pub struct Stack<T> {
        v: Vec<T>
    }
    impl<T> Stack<T> {
        pub fn new() -> Self {
            Stack{v: Vec::new()}
        }
        pub fn push(&mut self, x: T) {
            self.v.push(x);
        }
        pub fn pop(&mut self) -> Option<T> {
            self.v.pop()
        }
    }
}

pub mod stack_polymorphic {
    use std::any::Any;
    use std::marker::PhantomData;
    pub struct Stack<T> {
        v: Vec<Box<dyn Any>>,
        _marker: PhantomData<T>
    }
    impl<T: 'static> Stack<T> {
        pub fn new() -> Self {
            Self{v: Vec::new(), _marker: PhantomData}
        }
        pub fn push(&mut self, x: T) {
            self.v.push(Box::new(x));
        }
        pub fn pop(&mut self) -> Option<T> {
            match self.v.pop() {
                None => None,
                Some(x) => {
                    // push() only ever stores values of type T,
                    // so this downcast cannot fail.
                    if let Ok(x) = x.downcast::<T>() {
                        Some(*x)
                    }else{
                        unreachable!()
                    }
                }
            }
        }
    }
}

type StackADT<T> = stack_polymorphic::Stack<T>;

fn main() {
    let mut a: StackADT<i32> = StackADT::new();
    a.push(1);
    a.push(2);
    println!("{:?}",a.pop());
    println!("{:?}",a.pop());
}

One might define more variants:

type FastStackADT<T> = stack::Stack<T>;
type MediumStackADT<T> = stack::Stack<T>;
type SmallStackADT<T> = stack_polymorphic::Stack<T>;
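One way to choose between such variants at build time is conditional compilation. The sketch below is an assumption layered on the code above: the Cargo feature name `small-code` and the module names `fast` and `small` are hypothetical, and the feature would have to be declared in the crate's Cargo.toml. Both modules are condensed copies of the two stacks so the block stands alone.

```rust
#[allow(dead_code)]
mod fast {
    // Monomorphized stack: one copy of the Vec code per element type.
    pub struct Stack<T> { v: Vec<T> }
    impl<T> Stack<T> {
        pub fn new() -> Self { Stack { v: Vec::new() } }
        pub fn push(&mut self, x: T) { self.v.push(x); }
        pub fn pop(&mut self) -> Option<T> { self.v.pop() }
    }
}

#[allow(dead_code)]
mod small {
    use std::any::Any;
    use std::marker::PhantomData;
    // Type-erased stack: the underlying Vec<Box<dyn Any>> is shared by all T.
    pub struct Stack<T> { v: Vec<Box<dyn Any>>, _marker: PhantomData<T> }
    impl<T: 'static> Stack<T> {
        pub fn new() -> Self { Stack { v: Vec::new(), _marker: PhantomData } }
        pub fn push(&mut self, x: T) { self.v.push(Box::new(x)); }
        pub fn pop(&mut self) -> Option<T> {
            // Only values of type T are ever pushed, so the downcast succeeds.
            self.v.pop().map(|b| *b.downcast::<T>().expect("only T is pushed"))
        }
    }
}

// Hypothetical feature flag: pick the type-erased variant when it is enabled.
#[cfg(feature = "small-code")]
type StackADT<T> = small::Stack<T>;
#[cfg(not(feature = "small-code"))]
type StackADT<T> = fast::Stack<T>;

// Calling code is written once against the alias and works with either variant.
fn drain_demo() -> Vec<i32> {
    let mut s: StackADT<i32> = StackADT::new();
    s.push(1);
    s.push(2);
    let mut out = Vec::new();
    while let Some(x) = s.pop() {
        out.push(x);
    }
    out
}

fn main() {
    println!("{:?}", drain_demo());
}
```

Because both variants expose the same `new`, `push` and `pop`, switching is a build-configuration change rather than a code change.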