2019 Strategy for Rustc and the RLS

How long does it take to compute the hash of parser.rs, which @matklad mentioned takes only about 20ms to parse? Is skipping whole-file parsing of unchanged files worth it if the time to hash is similar to the time to parse? How close are these times?
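For context, the skip-if-unchanged idea being asked about could look roughly like the sketch below. It is not how rustc or the RLS does it; `ParseCache`, `parse_calls`, and the stand-in parser are hypothetical names used only for illustration, and SHA-1 is an arbitrary choice of hash.

```python
import hashlib
from typing import Callable, Dict, Generic, Tuple, TypeVar

T = TypeVar("T")


class ParseCache(Generic[T]):
    """Hypothetical cache that re-parses a file only when its content hash changes."""

    def __init__(self, parse: Callable[[bytes], T]) -> None:
        self._parse = parse
        # Maps a path to (digest of the contents, cached parse result).
        self._entries: Dict[str, Tuple[bytes, T]] = {}
        self.parse_calls = 0  # counts how often the real parser actually ran

    def get(self, path: str, contents: bytes) -> T:
        digest = hashlib.sha1(contents).digest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            # Same bytes as last time: reuse the earlier result.
            return cached[1]
        self.parse_calls += 1
        result = self._parse(contents)
        self._entries[path] = (digest, result)
        return result


if __name__ == "__main__":
    # Whitespace splitting stands in for a real parser.
    cache = ParseCache(lambda src: src.split())
    cache.get("parser.rs", b"fn main() {}")
    cache.get("parser.rs", b"fn main() {}")
    print(cache.parse_calls)
```

The cache only pays off if hashing is clearly cheaper than parsing, which is exactly what the question is about.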

Running sha512sum (which is a bit overkill) on the file takes less than 10ms. I don’t know of anything better than time for a quick measurement, so to get a more meaningful number I concatenated parser.rs 100 times, giving me around:

  • 90ms with sha512sum
  • 140ms with sha256sum (why slower?)
  • 60ms with sha1sum

That works out to roughly 1ms per copy of the file, so hashing is about 20 times faster than the ~20ms it takes to parse it, just from this quick test.
If we’re talking about the RLS, it can also guarantee that its VFS files have not changed without rehashing them (it already shouldn’t be reading anything from disk).
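A quick way to repeat this kind of measurement without relying on time is sketched below. It uses Python's hashlib rather than the sha*sum tools, so absolute numbers will differ from the ones above; `time_hash`, `compare_hashes`, and the synthetic input are hypothetical stand-ins (the real parser.rs is not used here).

```python
import hashlib
import time


def time_hash(algorithm: str, data: bytes, repeats: int = 5) -> float:
    """Return the best-of-`repeats` wall-clock time, in ms, to hash `data` once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(algorithm, data).digest()
        best = min(best, time.perf_counter() - start)
    return best * 1000.0


def compare_hashes(data: bytes, algorithms=("sha1", "sha256", "sha512")) -> dict:
    """Time each algorithm on the same buffer and return {name: milliseconds}."""
    return {name: time_hash(name, data) for name in algorithms}


if __name__ == "__main__":
    # Synthetic stand-in for a source file, repeated 100 times as in the post.
    one_file = b"fn example() -> u32 { 42 }\n" * 2000
    for name, ms in compare_hashes(one_file * 100).items():
        print(f"{name}: {ms:.1f} ms")
```

Taking the best of several runs reduces noise from caches and scheduling, which matters when the times being compared are only a few milliseconds.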

I realize I left something off of this list. I think MIR-level optimization has the potential to do a lot as well. The goal is to be able to do things like inline small functions so that LLVM doesn't have to. Also, any optimizations we perform on MIR take effect for all monomorphizations.


I’ve been thinking more about this. It feels to me that compile time remains a “core challenge” for Rust. We’ve made a lot of progress but we are not there and I’d like to make this a prime focus.

I propose that we split up this thread into two threads to talk a bit more in detail:

  • IDE:
    • Here we need the ability to rapidly process incremental updates and give back certain specific kinds of information. I’ve been hoping that the incremental infrastructure we’ve built up will serve in that role, but this is not entirely clear.
    • In addition, I feel like we need a longer-term discussion about the relationship between the RLS and rustc: what logic lives where, who will maintain it, etc.
  • Overall compilation time:
    • Quite apart from IDE support, there is lots of work to be done on improving rustc’s overall compilation time even in a traditional “run the compiler” setting.
    • I enumerated some avenues earlier, though there are no doubt more. (e.g., I forgot the idea of using Cranelift as an alternative backend…)

Overall, I think we should aim for bold initiatives that will make a big difference, and not be afraid to plan out big architecture changes. At the same time, we need to structure things so that we don’t have to wait years for the payoff (as we have sometimes done in the past).


Second, the assumption “you need to mimic compiler exactly to be precise” does not hold in practice. Quoting myself from that other thread,

"Not mimicking the compiler exactly" would be a deal-breaker for me.

I was using IntelliJ-Rust. One of the reasons I switched away was that it could not give me exactly the same results as the Cargo output. This was especially true when the version of rustc in the toolchain didn't exactly match the version of rustc their analyzer was mimicking.

Because you're on a 64-bit machine. SHA-512 does 80 rounds per block instead of 64 for SHA-256, but each block is 1024 bits instead of just 512 bits, so it ends up ~1.6x faster on platforms where 64-bit operations are essentially the same speed as 32-bit ones. (This is why SHA-512/256 is a thing, now.)
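As a back-of-the-envelope check of that ratio, take the standard parameters (SHA-256: 512-bit blocks, 64 rounds; SHA-512: 1024-bit blocks, 80 rounds) and assume a round costs about the same in both when 64-bit operations are as cheap as 32-bit ones. The input consumed per round then compares as:

$$
\frac{1024 / 80}{512 / 64} = \frac{12.8\ \text{bits/round}}{8\ \text{bits/round}} = 1.6
$$

Real measurements will drift from this figure depending on the implementation and on any hardware SHA extensions.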


Old post, but I figured someone might stumble upon this and wonder. While the SHA-512 rounds are more expensive, it has twice the block size of SHA-256 (128 vs 64 bytes, built from 64-bit rather than 32-bit words), which averages out to make SHA-256 around 1.5 times as slow as SHA-512 on 64-bit hardware. There's also SHA-512/256, which is truncated SHA-512.

Argh, damnit. I wrote my reply, then forgot to actually send it, and when I opened the tab again it didn’t load the new reply. >.<
