My understanding is that overall Rust builds against a released version of LLVM, likely with a small number of Rust-specific changes. Because LLVM changes tend to break the Rust bindings, LLVM upgrades are a nontrivial undertaking, and thus happen somewhat infrequently. Is this correct so far?
Rust also has asm.js and Wasm targets that use fastcomp, which comes from a fork of LLVM. At first I thought this meant Rust would change LLVM branches when building the emscripten target, but it seems that Rust actually uses one branch that has either merged in or cherry-picked fastcomp. Is this right?
Next, the Wasm LLVM backend seems to have advanced somewhat since January, when Rust’s LLVM appears to have last been significantly updated. Would the Rust team be open to updating the LLVM Wasm backend? If so, would it be better to do it with cherry-picking, merging, subtree merges, or another full LLVM upgrade?
It looks like this is our fork. There was a time when we only carried a few patches, but it looks quite monstrous at the moment.
Rust does not base its fork on a released version of LLVM; it is based on an arbitrary commit from upstream LLVM. LLVM upgrades do break Rust’s bindings, but that is usually not the hardest part of an upgrade. The hardest part is getting all of Rust’s supported platforms to pass CI again. Our MinGW port is especially troublesome, as upstream support has degraded over the years.
It looks like we are currently carrying a merge of the fastcomp fork. When I originally merged fastcomp support, though, we did not literally merge with fastcomp; we just created a minimal patch that added a fastcomp target with fastcomp-compatible IR.
If we try to pull in an updated Wasm backend, I’d prefer to upgrade LLVM from upstream and not cherry-pick more stuff.
This all sounds kind of crazy. Is upstream LLVM really so unstable that it regularly breaks platforms? Or is it that Rust is using LLVM in ways that LLVM wasn’t designed for?
It could be that basing our fork off of arbitrary commits causes us more platform regressions than using an LLVM release would. Presumably the LLVM developers catch some regressions ahead of a release.
Out of curiosity, what all is in the Rust LLVM fork that’s not in regular LLVM? Are these changes still necessary, or would it be feasible to consider moving back to a vanilla LLVM release?
I went ahead and opened https://github.com/rust-lang/rust/issues/42389, since it seems like de-forking LLVM would be beneficial in the long run. Of course, there may still be reasons why the state of affairs is necessary, so the bug seems like a good place to discuss the feasibility and desirability of de-forking LLVM.
FWIW, Rust does work with vanilla LLVM. We use the system LLVM in Fedora, and probably most other distros do too. But we’re only trying to support our own native linux-gnu targets in that case.
I do apply some of the rust-llvm backported fixes to our system LLVM though.