Rolling out (or unrolling) struct field reorderings

nikomatsakis · December 23, 2016, 7:32pm

The way I would do it is like this:

Add a new kind of repr that is not available externally. Let's call it reorderable.

So long as we have optimization fuel (whether it be opt-in or opt-out), we would set the repr for structs/enums in current crate to be "reorderable".

Then in your code where you look for #[repr(C)] to decide whether to reorder, you would instead look to see that the struct is "repr(reorderable)".

Does that make sense? Is there an obvious flaw with this approach?

nikomatsakis · December 23, 2016, 7:38pm

I am not suggesting we write an RFC. =) Just that we have some precedent. I am not sure when we will need it, but it is easy to imagine that we will want to make further changes which are technically permitted but which lead to strange behavior down in unsafe code.

nikomatsakis · December 23, 2016, 7:38pm

An obvious example: changes to the unsafe code guidelines.

camlorn · December 23, 2016, 7:38pm

As long as cargo rebuilds everything and reprs end up in crate metadata, I don’t think so. The only thing it won’t find are bugs caused by std, presuming that std is compiled in a reordered manner. But then, a lot of std is monomorphized later.

nikomatsakis · December 23, 2016, 7:41pm

Well, I anticipate std will follow the default, whatever it is. Once we switch to "opt out", you won't be able to recompile std without reordering, clearly, but I don't think this is a major source of problems (that is, are people transmuting structs in libstd that are not #[repr(C)]? If so, they're in for a bad time...)

camlorn · December 23, 2016, 7:43pm

It’s fine as long as it’s not too heavyweight. I thought the point was to propose some sort of process for all optimizations, but if we’re not going to do that and it’s only going to apply to some of them, I think part of this needs to be determining which ones it applies to.

It seems to me that changes to the unsafe code guidelines aren’t an optimization but rather a semantic change to the language and would fall under the regular RFC process.

camlorn · December 23, 2016, 7:48pm

Servo might be doing something odd in this regard, see this thread. Though I suppose UnsafeCell is probably a special case we don’t have to worry about, and it is known to work with reordered std.

josh · December 24, 2016, 11:14am

This sounds like it comes close to implementing the proposed repr(linear) (as the negation of repr(reorderable)).

nikomatsakis · December 24, 2016, 11:16am

Indeed, but as an internal detail.

nikomatsakis · December 24, 2016, 11:18am

One flaw: I think the "optimization fuel" idea makes sense within a crate, but it doesn't obviously extend to checking across a whole bunch of crates.

eddyb · December 24, 2016, 2:19pm

That was what my initial reaction was. Across crates you need global state. Still useful to have per-crate if you get a printout of last struct per crate though, just a bit more hectic.

camlorn · December 24, 2016, 4:41pm

Actually, this would be the exact opposite of the proposed repr(linear). If someone wants to write that RFC, however, I can do it at the end of this. It’s kind of trivial, though we do need to refactor how reprs work all together somewhere along the way.

bstrie · December 25, 2016, 4:45am

Tremendous yes to this. We should have this regardless of the eventual outcome of field reordering.

bstrie · December 25, 2016, 4:47am

This sounds like it might be brilliant. To be clear, are you proposing having this be the default for release builds?

bstrie · December 25, 2016, 6:01am

I think we should deliberately leave the timeline unspecified in order to respond to feedback from the community. It's unclear to me what sort of response we can expect to receive.

bstrie · December 25, 2016, 6:04am

Is there an urgent reason why you believe this is worth not waiting for the usual feature cycle? I agree that an RFC would be too heavyweight for this, but it's important to our legitimacy to give people in the community time to register feedback and concerns. Just because this is an optimization rather than a language feature doesn't mean it's immune from needing this time to bake (though I could be swayed if enough sentiment from users thinks this is ready to go immediately).

bstrie · December 25, 2016, 6:11am

I’d like to thank @camlorn again for taking on this substantial (and time-sensitive) task entirely of their own accord!

Reverting the change from beta was the right decision given the potential magnitude of this change. Even though we’ve long since decided that un-repr’d struct layout is unspecified in Rust, it’s our duty as stewards to communicate that, and if we’ve done a poor job of communication then we need time to make up that ground.

I’d also filed an issue about turning on random struct layouts in debug builds by default, found here: https://github.com/rust-lang/rust/issues/38550 .

I agree with many that having struct reordering as default would be the ideal (though of course it remains to be determined if we can feasibly achieve that without breakage), and I think phasing this in very gently will pay dividends. For example, we may want to consider turning this on by default for nightly and beta builds, but leaving it as opt-in in stable for a while until we have confidence in it. And even then we can have it as the default in debug mode first, with the default in stable coming last.

camlorn · December 25, 2016, 5:20pm

There is no urgent reason why it must be on, other than that I really want to be able to talk about it and that I feel obliged to finish it, no matter how long that takes. This project already sort of spun out of control, though admittedly the refactors I did along the way are beneficial. It’s more that I think that it’s a bit silly to make it opt in for one or more release cycles than that I have an actual concrete reason why it should go quickly that’s not me wanting to move on to other things.

I knew from day one of learning Rust that anything without repr(C) would not be safe to use with the FFI and that transmute should be avoided save for such structs. This is communicated very clearly in the book. I believe we already have warnings in place. While I get that we can’t afford to break code, I consider this breakage in the same category as transmute: fair warning has already been given. So I don’t see any need to wait 3 months or more to turn it on, not unless major issues present themselves. I expected the work to remain on nightly until the issues are ironed out, but having it be on nightly and off by default just seems like wasting time. Almost no one is going to opt in.

That said, I will abide by whatever is decided here and finish the work regardless. And I do understand that I’m biased and the least experienced person on this thread when it comes to a project as big and important as Rust.

As for the reverting PR, my goal is not to randomly break major production software. Of course I’m going to fix it as fast as I can when it’s my fault. I wanted my PR to have some time in nightly to weed out problems like this, but it happened to merge just as the beta was being made.

nikomatsakis · December 26, 2016, 11:33am

I was thinking that perhaps the write thing is to have a switch that works like this:

-Z reorder-struct-fields=all: gives all structs a #[repr] of "optimize", if not already specified
-Z reorder-struct-fields=none: gives all structs a #[repr] of "linear", if not already specified
-Z reorder-struct-fields=dump: appends to a file rustc.struct-fields.txt with the serialized def-paths of each struct/enum whose layout is not given by an explicit #[repr] attribute
-Z reorder-struct-fields=N,M: loads def-path strings rustc.struct-fields.txt, taking the subset from N..M. Structs will be be assigned to optimize only if their def-path appears in that list.

This would allow you to enable the optimization =all and then debug by doing one cargo build with =dump and then further ones that slice down the list.

I think the way to think of it is that there is no "normal" or "unspecified" repr, there are just two reprs:

optimize
linear

It probably makes sense to expose "linear" to users, though I imagine that in practice it would also be handy to have something like "prefix(N)", which guarantees only that the first N fields are linear at offset 0, but leaves the others unspecified.

I believe we have a lint for passing to C functions, no? I am not opposed to linting around transmutes, though I'd like the lint to be informed by the actual bug(s) that we find in this process.

That said, I don't think it's necessarily the case that a transmute of one struct to another is illegal: RFC 79 doesn't really specify, but I think it would make sense to say that the layout of two structs whose fields have the same types ought to be identical (even if it is not specified). It definitely seems like a new RFC is in order that can flesh out RFC 79, provide for more explicit #[repr] choices, etc.

est31 · December 26, 2016, 3:47pm

@bstrie:

It also be worthwhile adding a lint that warns if a struct without a #[repr(...)] is used in an unsafe cast, or passed to an extern function.

While it makes sense to lint for transmutes, because this invokes explicitly undefined behaviour, it makes less sense to lint if you pass a pointer to a struct with undefined repr to an extern function, as this is completely legitimate in the "callback" use case, where you pass a function pointer and a data pointer, and the function behind the pointer then gets gets called if something happens, with the data pointer being given to it as well.

Its just inconsistent IMO to say that one should revert the reorder PR because it may cause headaches for unsafe users of rust, but the same time to reject lints, because unsafe users "know what they are doing". Also, if no lint is done at all, nobody will find out about their errors, and after one release cycle nothing is gained, is there? Not everyone reads the blog, and even if they do, should they really have to manually check all of their uses of repr?

@nikomatsakis

I think it would make sense to say that the layout of two structs whose fields have the same types ought to be identical (even if it is not specified).

I don't think this follows from RFC 79 at all. E.g. this rustc issue proposes to randomize layout on debug builds, and seems to not conflict RFC 79, does it? If true randomisation is done though, fields with same types won't have identical layout.

Also, saying "these things are identical, even though its not specified" is enshrining of undefined behaviour, and I think not something that should be promoted in any way, even in unsafe code.

Topic		Replies	Views
[Pre-RFC]: Specify and stabilize drop order language design	23	3045	March 25, 2019
Concerned about Rust 2018 stability	17	4735	March 25, 2019
Contributing to Rustup - help with code structure needed tools and infrastructure	4	1053	March 25, 2019
Removal of all unstable placement features announcements	21	6080	March 25, 2019
[post] Safe pin projections through view types language design	12	1296	June 3, 2022

Rolling out (or unrolling) struct field reorderings

Related topics