So an update on the Servo situation: bisecting the source of the problem was easy, and only took about an hour of builds.
The main tool was sed -E -i 's/\b(pub )?struct /#[repr(C)]\0/' .cargo/**.rs, on a fresh .cargo (Servo has a separate one because it uses a custom rustc/cargo version pair).
After that, I had to fix some macros (where the sed had edited the macro's input patterns) and I had a working build.
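To illustrate what that rewrite changes, here is a minimal sketch with a hypothetical struct (the names and fields are made up, not taken from any Servo dependency). With the default layout the compiler may reorder fields; with #[repr(C)] they stay in declaration order, which is what C-style casts and FFI code silently rely on. The attribute is written on its own line here for readability, whereas the sed puts it directly in front of the struct keyword.

```rust
use std::mem::{align_of, size_of};

// Hypothetical struct with the default (Rust) layout:
// the compiler is free to reorder the fields.
#[allow(dead_code)]
pub struct Original {
    a: u8,
    b: u32,
    c: u16,
}

// The same struct after the rewrite: fields keep their declaration order,
// with C-style padding between them.
#[allow(dead_code)]
#[repr(C)]
pub struct Rewritten {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    // The default-layout numbers are not guaranteed and may vary by compiler version.
    println!(
        "Original:  size {}, align {}",
        size_of::<Original>(),
        align_of::<Original>()
    );
    // For repr(C): a at offset 0, b at 4, c at 8, plus tail padding.
    println!(
        "Rewritten: size {}, align {}",
        size_of::<Rewritten>(),
        align_of::<Rewritten>()
    );
}
```

Any code that casts a pointer to Original into a pointer to a C struct with the same declared fields only works by accident, which is the kind of bug the blanket repr(C) rewrite masks.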
With full copies of the unchanged and modified .cargo directories, all I had to do was copy sub-directories from either the “bad” (original) or “good” (modified) tree until I was left with tt_os2.rs from the freetype crate.
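The manual copying above is just a binary search, and could be sketched roughly as follows. This assumes exactly one culprit, and the works_with_modified callback is hypothetical: in practice it would copy the listed sub-directories from the modified tree, take everything else from the original tree, then build and test. The crate names other than freetype are placeholders.

```rust
/// Find the single item whose modified version is needed for a working build.
///
/// `works_with_modified` (hypothetical) receives the items taken from the
/// modified tree and reports whether the resulting build behaves correctly.
fn bisect_culprit<'a, F>(items: &[&'a str], mut works_with_modified: F) -> Option<&'a str>
where
    F: FnMut(&[&'a str]) -> bool,
{
    // Sanity check: with everything modified the build must work,
    // otherwise there is nothing to bisect.
    if items.is_empty() || !works_with_modified(items) {
        return None;
    }
    let mut candidates = items;
    while candidates.len() > 1 {
        let (left, right) = candidates.split_at(candidates.len() / 2);
        // Take only the left half from the modified tree. If the build works,
        // the culprit is in that half; otherwise it is in the right half.
        candidates = if works_with_modified(left) { left } else { right };
    }
    Some(candidates[0])
}

fn main() {
    // Placeholder crate list; the closure simulates a build that only works
    // when the freetype sources come from the modified tree.
    let crates = ["euclid", "freetype", "harfbuzz", "libc"];
    let culprit = bisect_culprit(&crates, |modified| modified.contains(&"freetype"));
    println!("culprit: {:?}", culprit);
}
```

The same loop works at any granularity (crates, sub-directories, files), at the cost of one build per step.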
Apparently @nox already found that case 8 hours earlier (but the initial testing didn’t validate the fix) and by the time @nikomatsakis made this post, Servo’s rustup had already succeeded.
In conclusion, I think bugs from this aren’t hard to track down even without compiler support, but if we do add support, it should be designed to make bisecting the cause easier rather than to let people defer fixing the original problems.
While improper_ctypes is useful, @nox found this through some cast-related search, and I’d like to hear more about that; maybe we can automate it. Taint tracking at the MIR level is not out of the question: we have enough parts to make it work, at least as a lint with a semi-decent signal-to-noise ratio, and we’d weed out false positives by hand.
Randomization is a good idea, and we can use it in conjunction with this. The “optimal packing” field reordering works by sorting the fields by alignment, effectively clustering them. We can then shuffle each cluster without wasting memory on padding and enable that by default, with opt-in full randomization for testing, preferably with that filter thing, say something like this:
# Bisect for structs in some_crate based on the first letter of their name.
RUSTFLAGS=-Zinject-struct-attr='some_crate::.*\b[A-M]\w* repr(C)'
# Give fully randomized field order to any non-repr(C) struct.
RUSTFLAGS=-Zinject-struct-attr='.* repr(random_default)'
(This design for the compiler flag could be helpful in other situations as well; at an extreme, it could be combined with an injected plugin that processes the injected attribute to handle more complex logic.)
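To make the per-cluster shuffle described above concrete, here is a rough sketch. It is not how rustc implements layout: the Field type, the xorshift PRNG, and the descending sort order are all assumptions made for illustration. It only shows that shuffling within equal-alignment clusters leaves the total size unchanged.

```rust
/// A field reduced to name, size, and alignment (a simplification).
#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: &'static str,
    size: usize,
    align: usize,
}

/// Tiny deterministic PRNG (xorshift64) so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Sort by descending alignment, then shuffle each equal-alignment cluster.
fn clustered_shuffle(mut fields: Vec<Field>, seed: u64) -> Vec<Field> {
    let mut rng = XorShift(seed | 1); // xorshift state must be non-zero
    fields.sort_by(|a, b| b.align.cmp(&a.align));
    let mut start = 0;
    while start < fields.len() {
        let align = fields[start].align;
        let len = fields[start..].iter().take_while(|f| f.align == align).count();
        let cluster = &mut fields[start..start + len];
        // Fisher-Yates shuffle restricted to this cluster.
        for i in (1..cluster.len()).rev() {
            let j = (rng.next() % (i as u64 + 1)) as usize;
            cluster.swap(i, j);
        }
        start += len;
    }
    fields
}

/// Size of a struct laid out in the given order with C-style padding.
fn layout_size(fields: &[Field]) -> usize {
    let mut offset = 0;
    let mut max_align = 1;
    for f in fields {
        offset = (offset + f.align - 1) / f.align * f.align; // align up
        offset += f.size;
        max_align = max_align.max(f.align);
    }
    (offset + max_align - 1) / max_align * max_align // tail padding
}

fn main() {
    let declared = vec![
        Field { name: "a", size: 1, align: 1 },
        Field { name: "b", size: 4, align: 4 },
        Field { name: "c", size: 2, align: 2 },
        Field { name: "d", size: 1, align: 1 },
        Field { name: "e", size: 4, align: 4 },
        Field { name: "f", size: 2, align: 2 },
    ];
    println!("declaration order: {} bytes", layout_size(&declared));
    for seed in 1..4 {
        let order = clustered_shuffle(declared.clone(), seed);
        let names: Vec<_> = order.iter().map(|f| f.name).collect();
        println!("seed {}: {:?} -> {} bytes", seed, names, layout_size(&order));
    }
}
```

Because every field's size is a multiple of its alignment, any order within a cluster packs without internal padding, so the shuffle costs no memory while still breaking code that depends on a particular field order.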