I think “MIR-only rlibs” + ThinLTO + incr. comp. will get us pretty far already and all of those will happen sooner or later within the current architecture. Parallelizing type-checking seems to be the hardest item on this list.
@jntrnr So, has any work gone into a second edition compiler? Are there any crates/github projects etc. working on this?
While this proposal seems to be the most-liked idea to date, I do not think it is a viable undertaking without Mozilla’s and the core team’s express support and stewardship.
I don’t think it would be wise to fragment the community’s efforts without a clear plan for transitioning from one code base to another. I think we are best off with one reference implementation for the time being. Rust isn’t like C, which has been stable for decades; it is still undergoing fantastic growth and development. I’d hate to see this flourishing thing we’ve got lose pace now!
I’d be grateful if anyone from the core team would share their views. Sorry if I’ve missed any comments.
@DanielFath - sadly, it’s more of a “fever dream” than something currently being planned. That said, it’s my hope that some of this can be investigated (eg splitting up the compiler into crates that others can consume). Other changes, like the deeper parallel/lazy fixes, would be pretty invasive and would likely have to show strong promise before anyone attempted them.
@repax - right, I don’t think that’d be wise either as a general rule. It’s tempting to think about a second compiler being built that could use the newer understanding of the algorithms that are coming along (e.g., just start with NLL and the new trait system from the beginning). That said, I can’t think of any cases where an alternate compiler became the primary one: CPython is still the main implementation, the Ruby 1.9->2.0 transition was a core effort, and the same goes for C# and Roslyn. So I agree it would need to be core-driven and part of the planning, which this isn’t.
One question that bothers me: do we really need all this high-tech machinery to make the compiler fast? My understanding is that compilers which are praised for speed (Go, Java, D, C) don’t do much in terms of laziness. (Actually, I have no idea if this is indeed true; I’d be glad to hear a more informed opinion!)
Maybe it would be better to start off a bit more conservatively before considering a second edition. If the LLVM steps are taking up a lot of time, it might be worth starting a larger effort to look into what we can do to reduce the amount and complexity of the output we hand to LLVM. I think there is a lot of room for improvement there. Given the generic nature of the standard library, there is a lot of indirection, which ends up as a fair bit of code for the LLVM optimizer to clean up.
As for parallelization, there is work being done on that in the existing compiler: https://github.com/rust-lang/rust/pull/43506
Laziness is generally coming from compilers built to support IDEs from the start, like C#'s Roslyn compiler and the TypeScript compiler.
Compilers that are generally “fast enough” like the Go compiler can provide IDE support using just straight-line compilation alone. Response times aren’t quite as good as lazy compilers, but may be okay for most use cases.
It’s in general just a fever dream of a post where I imagine some pretty heavy-weight ways to improve things, with the hope we generate some ideas to use later on.
@oln - I’d love for someone to just look into how we could reduce the complexity of what we give LLVM. I know a few people have talked about looking into this, but I don’t know if anyone has really dedicated the time to do it yet.
At least for C and Go, one aspect that probably helps a lot is that the type system is pretty much trivial. Rust’s powerful generic types and trait system provide lots of benefits, but they also make type-checking and compilation a much harder problem.
The Java compiler doesn’t have to generate machine code; it targets JVM bytecode. That’s a much simpler target, and it also means the compiler doesn’t do many of the interesting (and expensive) analyses and optimizations that LLVM does; with Java, these are done at run time by the JVM’s JIT.
I don’t know much about D, so I cannot comment on that.
Type checking and trait resolution are by far not the biggest contributors to Rust compile time. There is room for improvement, but fixing them on their own isn’t going to move the needle on the “Rust is slow to compile” meme.
Yeah, maybe that deserves a separate topic. Though, to give some perspective, I compared the LLVM IR output of a Rust and a C++ version of a very simple code example: as noted in the gist, the output from the Rust version is about 5 times larger. (Assuming the sed command I used does what I think it does.)
Now, the number of lines of code may not be the best way to judge this. Ideally one would check the pre-optimization output for an optimized build, though I didn’t find a good way to do that (there probably is one). I don’t think rustc does much in the way of optimization itself at the moment (the only thing I can find for sure is removing redundant `*&`), so this may at least give an estimate.
I think part of the difference in the size of the output is the fact that Rust currently does a lot more things than C++ through function calls. For example, checking whether a pointer is null: in C++ one would usually write either `if (!ptr)` or `if (ptr == nullptr)`, while the Rust code does this with `ptr::is_null()`, which in turn calls `ptr::null()` to get the null value for a pointer. I don’t know the innards of the compiler well enough to say how large an impact all the extra inlining LLVM has to do is, though.
Maybe doing some of this work a bit earlier in the pipeline, e.g. at the MIR level, would be helpful. (I don’t think that works cross-crate, though.)
EDIT: Alternatively, something like what’s described in this issue might be helpful: https://github.com/rust-lang/rust/issues/14527. EDIT 2: It may actually not be too helpful, given that inlinable functions tend to be generic, though the issue does shed some light on why the optimizer has to do so much inlining work.
Making that work requires storing MIR within the rlibs. I know that’s been discussed, but I don’t know if it’s been implemented.
Doesn’t look like it has so far.
I wonder if it would be possible to “pre-optimize”, or at least “pre-inline”, the internals of the functions stored in Rust libraries, so that the compiler doesn’t have to inline, e.g., all the calls to `ptr::is_null()` and similar that are used inside other functions in the stdlib. For instance, I had a look at the output from the LLVM optimizer when compiling a library I’m working on (`RUSTFLAGS="-C remark=inline"`) and found 42 instances of the compiler inlining various versions of `ptr::is_null`, even though neither the library nor any of its (2) dependencies call the function directly.
A similar project for Scala: https://github.com/twitter/reasonable-scala/blob/master/README.md
I agree with the opt-in.
On this note: if you want to limit the number of cores used system-wide, and your OS uses systemd, you can use `CPUAffinity=`. For other variants, see this answer.
A personal note from when I accidentally used CPUAffinity myself:
WARNING: CPUAffinity in /etc/systemd/system.conf is system-global! If you set `1 2 3`, you never get to use cpu0, so be sure `0 1 2 3` is set, or cpu0 will always sit unused (as reported by top and the task manager). The documentation says it “configures the initial CPU affinity for the init process”, which implies exactly this: all children of init inherit its affinity, and since everything is a child of init, every process is stuck with that affinity!
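For reference, a sketch of the setting in question (adjust the CPU list to your machine; the key lives in the `[Manager]` section per the systemd documentation):

```ini
# /etc/systemd/system.conf
[Manager]
# System-global: every process inherits this affinity from init.
# Include CPU 0 explicitly, or nothing will ever run on it.
CPUAffinity=0 1 2 3
```

Run `systemctl daemon-reexec` (or reboot) for the change to take effect.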
Regarding what I’ve said above, and in this context, I wonder: given that the guidelines say “Don’t divert a topic by changing it midstream” (which isn’t my intention), should I have used “Reply as a New Topic”? (That’s a button I don’t see anywhere at the moment; all I see is the Reply button.)
I just read through the RFC and it looks quite well thought out. You may have some issues with procedural macros though, given that they are (by design) able to execute arbitrary code and don’t necessarily need to be deterministic or repeatable (diesel’s `infer_schema!()` comes to mind here).
I’ve done a lot of parser work in the past and find the functional way of parsing token streams into an AST really easy to implement and test. The idea of making `libsyntax2` a normal crate that’s decoupled from the compiler internals is particularly appealing. It should make it a lot easier for people to create alternate backends and codegen tools.