Rolling out (or unrolling) struct field reorderings

I don’t think we should make repr(Rust) layout guarantees. If you need these guarantees, you should have to opt in. I see the appeal of doing it, but I much prefer a future-proof design.

Going further and optimizing in the way Swift does–fields of a parent struct can be placed in the padding of a child struct, provided the alignment works out right–is a possible future direction that’s come up. I don’t personally want to do it; it makes this refactor look like hello world, as far as I can tell. But that doesn’t mean we should preclude it by locking ourselves into specific layout guarantees by default.

Opting out of layout guarantees is a non-starter because then you’re not optimizing anymore. No one will use it save in places they think it matters, whereas opting in means that we hit all the places, including the ones you thought it mattered in but also including the ones you don’t but it actually did.

1 Like

I’m hoping to get to the pull requests needed to finish this this weekend or next week. I’m still alive, I promise.

Man, I forgot about this. =( I am trying to remember the current status – I think we disabled the reordering both on beta and nightly? In that case, there is a lot less urgency, but I definitely want to see the reorderings get re-enabled! Do let us know if you need any assistance!

We should also ensure that we follow-up with the ideas about randomized layout and giving people better control over layout.

I will need some assistance, yeah. I’m not stuck, just life happened.

It’s currently disabled everywhere, which is why I’m not putting this at the top of my list; there is some urgency because it is possible for bugs to be introduced while it’s off, but the code shouldn’t be affecting anyone adversely.

I’ve started work on this, but am hitting a sticking point; I’m not sure if I should start another thread or not. I’m going to see if I can’t figure this out myself, but it appears that everything to do with -z reorder-fields is going to have to be in libsyntax, with which I’m not at all familiar. Can someone point me at an overview of how we get from the AST to HIR and, specifically, where I need to go to insert ReprLinear?

I wanted to just have lookup_repr_hints add it, but the metadata goes explicitly through the HIR itself, so it needs to be in before that point.

Okay, question time. Do we actually need to load the file that -z reorder-fields=dump would make? @eddyb says not because the traversal is deterministic, but I say yes because you’re potentially binary searching across an entire tree of dependencies and the rustc invocations need to know about each other.

Who should dump the file? I think this has to happen in typeck as we create the structs and/or enums.

I saw you and @eddyb discussing on IRC. I figured that yes, we would load from the file, precisely for the reason that you state – that all of the crates that are compiling in a given cargo build session contributed to the file, and that we need to select some subset of the structs (which might spread across crates).

(But note that I think we are just loading a bunch of strings, we don’t have to convert them back to def-ids. We can just load a bunch of strings, convert the struct def-ids into strings, and check them for membership in the set. So we don’t need a parser for the debug output or anything, but we do have to assume that the user has not modified the program during the binary search. That would seem to be just unfair.)

I saw also that @eddyb had expected to remove this code; I guess I had not, I figured it would be a useful debugging tool that we would keep.

That said, it's awfully painful. I wonder -- if we expect to be doing this sort of bisection on a regular basis -- maybe we should pursue the idea of optimization fuel instead? (Sorry, @camlorn, to distract you.)

The hacky version of the idea is not dissimilar, just a bit more abstract. The file is replaced by an integer. In "dump" mode, the compiler loads the integer and adds to it as it goes. In "check" mode, the compiler subtracts, and stops "optimizing" once it reaches 0. A less hacky version could be aided by cargo.

Oh, and speaking of cargo, I guess that cargo builds run in parallel? Presumably you can disable that somehow though when doing this sort of optimization.

The thing I like about the "fuel" approach is that we could re-use it for other such optimizations in the future (i.e., you have the same fuel infrastructure, but you apply it to different tasks). But probably we could use an expanded variant of the def-path strings (i.e., Optimization(DefPath)) and those don't have the "parallel build" problem.

I'm certainly sorry that this is such a pain. =( I feel like it's an awesome tool for us to have, though -- this kind of thing makes all the difference when you're staring down some kind of crazy crash, no? In my ideal world, we'd have it not only for struct reorderings but for more and more sorts of optimizations (which is why I got to wondering about fuel). I think all in all this would make a good topic of discussion for the "compiler design sprint" that just happens to be going on today. =)

@camlorn

OK, we discussed in the sprint and came to the following design. The key idea is that we want to use per-crate fuel and leave the cross-crate stuff (almost) entirely to a wrapper (which we will not necessarily write immediately).

Basically the compiler will have a flag -Zfuel=crate=N. The idea is that if you are compiling a crate named crate, then you have fuel N, otherwise you have infinite fuel. If the option is not specified, you have infinite fuel. Fuel is stored in the session (or some such place). Each time we do an optimization decision, we decrement current fuel. When counter transitions from 1 to 0, we will print out a note indicating the decision we opted not to do. I envision something like this:

if sess.consider_optimization(|| format!("reordered fields of `{}`", def_id)) {
    /* reorder fields */
} else {
    /* do not reorder fields */
}

This of course does not allow you to check across crates, but you can imagine a wrapper tool (which we should eventually write) cargo-bisect-fuel that will (a) get the list of crates from cargo and (b) recompile many times, setting the fuel to -Zfuel=X=N where X is each of the crates in turn.

One tricky bit is finding the total fuel needed for a given crate: we could either add an option to print it out, or just require the tool to kind of experiment around to find it. I don’t care too much either way. Probably better to add -Zprint-fuel for now, though, and have the compiler emit the total fuel that it used in the current session.

Sound good @camlorn?

Yeah, this seems reasonable. I can probably code most of it. I imagine that -zprint-fuel=crate instead of -zprint-fuel though? Or I guess probably both.

Does the session know what crate we are in?

Where do we do the -zprint-fuel printing?

We might want to put optimization checks behind a macro so that they can use println instead of format. My understanding (via @eddyb) is that the printing macros can avoid allocating. It can also avoid building the string if we’re not compiling with -zfuel.

I don’t mind doing this. The inevitable blog post (and my resume) are becoming very interesting indeed.

Ah, yes, -Zprint-fuel=crate I think.

Argh, I think it doesn't. This is super annoying. The tcx does, you can do tcx.crate_name(LOCAL_CRATE). I've been meaning to refactor that forever.

Well, the idea is you only invoke the closure once per compilation, I don't think the overhead is significant.

You rock.

To be clear, though, I don't really care precisely how we do it.

The fact that it was a closure didn’t occur to me until you pointed it out.

I’ll probably go with what you are suggesting. I don’t like the idea of having to put the macro somewhere unrelated so that it can be seen by the whole compiler. Here’s hoping we get namespacing for them sometime soon.

I’ve got a prototype of optimization fuel building now. I’ve reenabled reordering and plan to refactor tests to pass, then open a PR. This may happen tonight; it may not. My design is as follows, in case someone wants to suggest an improvement:

  • Session gains a fn consider_optimizing<T: Fn()->String>(&self, crate_name: &str, msg: T) -> bool, which uses interior mutability on the session to count fuel down. This function also counts up if it detects we should be printing. This sadly involves something like 5 new fields.

  • TyCtxt gains a fn consider_optimizing<T: Fn()->String>(&self, msg: T) -> bool, which just calls the version in session.

I don’t know where to put the -zprint-fuel code yet. If someone wants to suggest a place, that’d be helpful.

I don’t like this design, but I can’t find a better place to put the actual fields; the session seems like the only thing that’s around for the whole compile cycle.

I applaud the debugging and correctness focus of field reordering, but want to reiterate (as a distribution maintainer) that reproducible builds are critical to security, efficient updates, and process.

The Debian project has a helpful page that reflects what’s important to distribution maintainers.

Regardless of the solution chosen, as long as distribution maintainers have a way that they can (easily) be guaranteed to get deterministically reproducible binaries given the same source input and same compiler binaries, I think most of us will be happy.

Everything implemented at the moment produces the same binary given the same compiler and input. I don’t think anyone is seriously proposing that this change, not without you going out of your way to request it.

Just so that people are in the loop: I’ve got a prototype of optimization fuel but couldn’t test it because I was getting weird things that looked like they were unrelated to my stuff. Then another project that I thought was done with but which I have to give higher priority decided it wasn’t done. Trust me, Rust compiler dev is way more fun.

Anyway, the one bit of good news is that field reordering doesn’t fail any of the tests locally once re-enabled; I did get that far.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.