Since binary size and runtime performance (not to mention compile-time performance) are often in direct conflict, my expectation has always been that --release on its own would optimize for runtime performance, and you’d have to explicitly ask for size optimization.
I do think it’d be reasonable for some of the size optimizations in that post to fall under the default behavior of an “optimize for size” compilation flag. Apparently rustc has such a flag, though I can’t find any documentation on the current status of it or how to actually pass it. But it seems reasonable to me for such a flag to enable a more aggressive level of LTO and symbol stripping than you’d get from regular --release, presumably at the cost of compile time.
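For concreteness, here’s roughly what opting into size optimization looks like with Cargo today, as best I can tell (a sketch, not authoritative; which of these options are available depends on your toolchain version):

```toml
# Cargo.toml — a size-focused release profile
[profile.release]
opt-level = "z"     # optimize for size rather than speed ("s" is a milder variant)
lto = true          # full link-time optimization across crates
codegen-units = 1   # fewer codegen units = more optimization opportunity, slower builds
strip = true        # strip symbols from the final binary
```

This matches my expectation above: everything here trades compile time (and potentially runtime performance) for binary size, which is why it’s opt-in rather than part of plain --release.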
Of course, there are also several things in that post which we probably shouldn’t lump into that flag:
- no_std and panic=abort obviously can’t work as implicit optimizations (though most programs that actually need to be this tiny are probably targeting embedded systems and will want to do these two things anyway)
- changing the default allocator seems like way too big a semantic change, especially since its effect on runtime performance depends heavily on the program. But that’s conveniently moot because apparently we recently made the system allocator the default anyway
- executable compressors just seem out of scope for a compiler (they’re separate tools for a reason, right?), though I’m not too familiar with them
I believe Soni is referring to the compiler choosing to compile generic code not via monomorphization (generating N separate functions for foo&lt;i32&gt;, foo&lt;MyStruct&gt;, etc.) but as one function that takes its generic parameter as a runtime argument wrapped in some sort of dynamic dispatch machinery (which would make it conceptually similar to a trait object). “Idea: polymorphic baseline codegen” is one past thread on the subject.
In the context of this thread, ignoring all the implementation challenges and the cases where it would be semantically incorrect anyway, I just don’t know if the compiler would be able to tell when dynamically dispatching generic code is a win for size. It obviously isn’t an unconditional win, because monomorphization enables all sorts of other optimizations, so it’s entirely possible that all the monomorphizations you care about optimize down to almost nothing and end up being both faster and smaller than the dynamic dispatch machinery would be. I suspect it’s something a human would have to opt in to.
I’d be curious to hear whether there are any embedded developers who have actually used trait objects instead of generics in real code to reduce binary size.