Proposal: reboot the Unsafe Code Guidelines team as a Working Group


#1

I’d like to see some progress towards a workable set of Unsafe Code Guidelines. To that end, I have a proposal: I’d like to “reformulate” the Unsafe Code Guidelines effort as a “working group”.

Unlike before, I’d like to drive the process not by starting with discussion, but rather by starting on a draft of an actual guidelines document. The initial focus would be trying to come up with a kind of “table of contents”, basically talking about different questions that arise and trying to categorize them; we would then have people take a shot at drafting each section.

The initial goal would not be to make final decisions. Rather, when we write up a question, we would try to document the constraints at play. We would give examples of code that exists, complications that arise, and so forth. In some cases, where things seem relatively clear, we might be able to make firm statements about what is and what is not allowed, but in other cases we can settle for documenting the range of future possibilities.

Examples of things I would imagine we would cover:

  • We should discuss memory layout of types.
    • We know, for example, that one cannot rely on the layout of repr(Rust) types – except when you can! Let’s see what we can document here.
  • Try to document the range of things unsafe code (and C code) does, and find dubious ones.
    • Example: we know it is UB to kill a thread and deallocate its stack without running destructors; allowing that would break rayon, among others.
    • However, we don’t quite know the conditions under which longjmp would be ok – clearly it is done sometimes in C APIs.
    • I’m also not sure what the limits might be around things like signal handling.
  • And yes, we can dig a bit into the aliasing rules too, though that’s worthy of a separate post (or 200).
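
To make the layout point above concrete, here is a minimal sketch of the “you cannot rely on it, except when you can” situation. The type names `CLayout` and `RustLayout` are made up for illustration, and the expected size for `CLayout` assumes a typical target where `u32` is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// Hypothetical example types, not taken from any real crate.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

// Same fields with the default representation: the compiler may reorder them.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    // repr(C) keeps declaration order and uses C padding rules.
    // Assuming u32 is 4-byte aligned: 1 + 3 (pad) + 4 + 1 + 3 (pad) = 12.
    assert_eq!(size_of::<CLayout>(), 12);
    assert_eq!(align_of::<CLayout>(), align_of::<u32>());

    // With the default representation we only check the lower bound:
    // the struct has to be at least as large as its fields combined.
    assert!(size_of::<RustLayout>() >= 6);

    // "Except when you can": Option<&T> is documented to have the same
    // size as &T, because the null pointer is used for None.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    println!("CLayout: {} bytes", size_of::<CLayout>());
    println!("RustLayout: {} bytes", size_of::<RustLayout>());
}
```

The second printed size is deliberately not asserted, since that is exactly the kind of thing the guidelines would need to pin down or explicitly leave open.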

Procedurally, I’d like to mostly work async of course, but with bi-weekly sync meetings (every 2 weeks). For the meetings, we would look over what’s been done, exchange questions, and try to set some goals for the next couple of weeks. Since I’ll be away for the next week, I propose we do our first meeting sometime after that. =)

Speaking personally, I do not have much time to devote to this, so I’d appreciate any help from others in terms of organization etc! But I want to see progress, and I think that having even infrequent – but regular! – meetings ought to enable us to start getting traction here.

I’d also like to make sure we draw in a wide variety of folks where appropriate, both from industry and academia; I think that having a regular meeting or functioning working group should be helpful.

Thoughts?


#2

Other topics that have come up lately:

  • Unions and how to think about them
    • e.g., can we say anything about the bits inside?
    • can they ever have “niches” for layout purposes?
  • How that interacts with uninitialized memory or “volatile” memory
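
As a small illustration of the union questions above, here is a sketch with a hypothetical `Bits` union and helper `f32_bits`. It only shows the uncontroversial case, where both fields have the same size and every bit pattern is a valid `u32`; the open questions are about everything beyond that (padding, uninitialized bytes, niches).

```rust
use std::mem::size_of;

// Hypothetical union for illustration: two views of the same four bytes.
#[repr(C)]
union Bits {
    f: f32,
    u: u32,
}

// Reinterpret the bits of an f32 as a u32 by writing one field and
// reading the other.
fn f32_bits(x: f32) -> u32 {
    let b = Bits { f: x };
    // Reading a union field is unsafe: the compiler does not track which
    // field was written last. Here all four bytes are initialized and any
    // bit pattern is a valid u32.
    unsafe { b.u }
}

fn main() {
    // Cross-check against the safe standard library conversion.
    assert_eq!(f32_bits(1.0), 1.0f32.to_bits());
    assert_eq!(f32_bits(1.0), 0x3f80_0000);

    // The niche question: whether Option<Bits> needs extra space depends
    // on whether a union can ever expose an invalid bit pattern. The
    // result is printed rather than asserted, since it is not a
    // documented guarantee.
    println!("size_of::<Bits>() = {}", size_of::<Bits>());
    println!("size_of::<Option<Bits>>() = {}", size_of::<Option<Bits>>());
}
```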

#3

cc @RalfJung @asajeffrey, with whom I talked about this at the All Hands – and others too, but I can’t remember everyone. =)


#4

I like it. :) I think we may actually have more than one document to write then. The guidelines document seems to mostly consist of interesting cases in a long list; I’d also be interested in starting a document that actually specifies the behavior of MIR using some kind of abstract machine. I think this should entirely abstract away details like memory layout, which is a point where it would interact closely with the guidelines document.

And it seems like unions should be pretty high up on our agenda. We currently have two PRs more or less stalled on the question “what is the basic bit pattern validity criterion for a union”: one for unsized unions, with the goal of enabling ManuallyDrop for unsized types, and a conflicting PR to add an empty variant to ManuallyDrop. And then unions with Drop types are generally a mess, though that doesn’t look like an unsafe code guidelines problem. However, shouldn’t such a discussion involve some union stakeholders, like people doing FFI? Or do you think we’d just collect the consequences of various design decisions and then make that available to an RFC discussion or so?