Current rust nightly is using a snapshot of future LLVM 8, current stable rust is using a snapshot of LLVM 7 (while LLVM 7 is now released). AFAIK, in both cases, LLVM is patched for multiple things, some of which can be miscompilation issues.
Parallel to that, rust supports building against a system LLVM, which downstreams (e.g. Linux distros, BSDs, etc.) are very likely to do. The current minimal version of LLVM supported is 5.0. Which means when you get rust from a downstream, you may be using a LLVM that is much older than the one used if you get rust via e.g. rustup. Or newer, but that’d usually be less of a problem (and much less likely).
Now, for, I guess, time reasons, Travis builds that happen when you create PRs against the rust repository are using a system LLVM: building LLVM for every PR would make those builds probably take twice as long as they currently take, if not more.
All would be fine if the last part of the first paragraph was not happening: the bundled LLVM is patched, and in some cases it’s patched for miscompilation problems. The consequence of which is that downstream rust may have some miscompilation problems because it’s using an unpatched system LLVM.
You might think it’s mostly an hypothetical problem and is not causing practical issues. Let me introduce you to PR50882. That was an attempt to make the Box
type aware of custom allocators. It failed miserably in a seemingly completely unrelated way on Travis. Why? As it turns out, after some investigation, because of a miscompilation bug that was fixed in LLVM 7. To add insult to injury, it had been found in older versions of rust, and backported to rust-llvm back when it was based on LLVM 6. It was found early enough that it should have been in a dot-release of LLVM 6, but never was. Even better, it’s a regression that happened some time between LLVM 3.9 and LLVM 5. So in fact, my very first attempt at PR50882 worked fine. Things were left to bitrot, and PR51899 changed the version of LLVM used on Travis and suddenly my problems started. I recently refreshed the patches in a new PR, and unsurprisingly, it still fails for the same reason.
One way out of this current mess would be to bump the system LLVM requirement to version 7. But:
- It’s barely a month old, and it’s barely available downstream.
- As a matter of fact, it’s not even in Ubuntu at all.
- Even less in the version of Ubuntu that’s used for Travis builds (16.04). I did open PR55176 with an tentative patch using apt.llvm.org, but those repositories are shipping the tip of the release branches, which don’t actually provide fixed releases, but daily snapshot of the release branches, so depending on the day you pull the packages, you get a different version, which is not really desirable for CI.
In any case, I’m not aware of an official policy wrt. LLVM, and it’s about time to have one.
I don’t think picking random trunk versions of LLVM is sustainable, but I can understand why it’s being done. But we have to acknowledge the problems it causes, and the fact that we’re not testing those possible setups. In fact, if anything, my problems also highlight that rust-llvm was fixed for miscompilations without a unittest, which would have caught the problem when upgrading system LLVM to version 5.0, instead of hitting some poor random developer trying to just have their stuff done.
I also don’t think the Travis builds should be testing a setup that, from rust-lang.org’s perspective, is not standard. I can understand using system LLVM saves time, but it feels like we should build the in-tree LLVM and cache it instead. It is, IMHO, desirable to test system LLVM as well, but it shouldn’t be the first thing that is tested. And surely, testing only a single version of system LLVM is not really getting us anywhere, as the history of PR50882 showed.
Cc: @alexcrichton