Proposal: reboot the Unsafe Code Guidelines team as a Working Group

I’d like to see some progress towards a workable set of Unsafe Code Guidelines. To that end, I have a proposal: I’d like to “reformulate” the Unsafe Code Guidelines effort as a “working group”.

Unlike before, I’d like to drive the process not by starting with discussion, but rather by starting on a draft of an actual guidelines document. The initial focus would be trying to come up with a kind of “table of contents”, basically talking about different questions that arise and trying to categorize them; we would then have people take a shot at drafting each section.

The initial goal would not be to make final decisions. Rather, when we write up a question, we would try to document the constraints at play. We would give examples of code that exists, complications that arise, and so forth. In some cases, where things seem relatively clear, we might be able to make firm statements about what is and what is not allowed, but in other cases we can settle for documenting the range of future possibilities.

Examples of things I would imagine we would cover:

  • We should discuss memory layout of types.
    • We know, for example, that one cannot rely on the layout of a repr(Rust) type – except when you can! Let’s see what we can document here.
  • Try to document the range of things unsafe code (and C code) does, and find dubious ones.
    • Example: we know it is UB to kill a thread and deallocate its stack without running destructors; it would destroy rayon etc.
    • However, we don’t quite know the conditions when longjmp would be ok – clearly it is done sometimes in C APIs.
    • I’m also not sure what the limits might be around things like signal handling.
  • And yes, we can dig a bit into the aliasing rules too, though that’s worthy of a separate post (or 200).
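
To make the layout question concrete, here is a minimal sketch. The struct names are hypothetical and only for illustration. It shows the kind of fact the document could record: with `#[repr(C)]` the fields stay in declaration order with C-style padding, while the default representation leaves the compiler free to reorder them, so its size is printed rather than relied on. The numbers in the comments assume a platform where `u32` is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// Hypothetical type for illustration: C-compatible layout.
#[repr(C)]
#[allow(dead_code)]
struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

// Same fields with the default representation; field order is unspecified.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u32,
    c: u8,
}

/// Byte offset of field `b` inside a `CLayout`, computed from addresses.
fn offset_of_b() -> usize {
    let v = CLayout { a: 0, b: 0, c: 0 };
    let base = &v as *const CLayout as usize;
    let field = &v.b as *const u32 as usize;
    field - base
}

fn main() {
    // repr(C): `a` at offset 0, padding up to the alignment of u32, then `b`,
    // then `c`, then tail padding (4 and 12 where u32 is 4-byte aligned).
    println!("repr(C):  offset of b = {}", offset_of_b());
    println!("repr(C):  size = {}, align = {}", size_of::<CLayout>(), align_of::<CLayout>());
    // Default repr: no order is promised, so this value is informational only.
    println!("default:  size = {}", size_of::<RustLayout>());
}
```

Unsafe code that hard-codes the offset of `b` is only justified for the `repr(C)` version; which guarantees, if any, hold for the default version is exactly what would need writing down.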

Procedurally, I’d like to mostly work async of course, but with bi-weekly sync meetings (every 2 weeks). For the meetings, we would look over what’s been done, exchange questions, and try to set some goals for the next couple of weeks. Since I’ll be away for the next week, I propose we do our first meeting sometime after that. =)

Speaking personally, I do not have much time to devote to this, so I’d appreciate any help from others in terms of organization etc! But I want to see progress, and I think that having even infrequent – but regular! – meetings ought to enable us to start getting traction here.

I’d also like to make sure we draw in a wide variety of folks where appropriate, both from industry and academia; I think that having a regular meeting or functioning working group should be helpful.

Thoughts?


Another topic that has come up lately:

  • Unions and how to think about them
    • e.g., can we say anything about the bits inside
    • can they ever have “niches” for layout purposes
  • How that interacts with uninitialized memory or “volatile” memory
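
A small sketch of the two union questions, using a hypothetical `IntOrFloat` union. The first part reads the bits of one field through another, which is fine here because every bit pattern is a valid `u32`. The second part shows what a "niche" buys for references today; whether a union could ever offer one is the open question, so that size is printed rather than asserted.

```rust
use std::mem::size_of;

// Hypothetical union for illustration only.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

/// Reinterpret the bits of an `f32` as a `u32` by going through the union.
fn float_bits(x: f32) -> u32 {
    let u = IntOrFloat { f: x };
    // Reading a union field is unsafe: the bytes must be valid for the type
    // being read. Any initialized 4 bytes are a valid u32.
    unsafe { u.i }
}

fn main() {
    assert_eq!(float_bits(1.0), 1.0f32.to_bits());

    // References are never null, so Option can use the null pattern as a niche.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());

    // Whether a union may expose a niche is undecided; just observe the size.
    println!("size_of::<Option<IntOrFloat>>() = {}", size_of::<Option<IntOrFloat>>());
}
```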

cc @RalfJung @asajeffrey, with whom I talked about this at the All Hands – and others too, but I can’t remember everyone. =)

I like it. :slight_smile: I think we may actually have more than one document to write then. The guidelines document seems to mostly consist of interesting cases in a long list; I’d also be interested to start writing a document that actually specifies the behavior of MIR using some kind of abstract machine. I think this should entirely abstract away details like memory layout, which is a point where it would interact closely with the guidelines document.

And it seems like unions should be pretty high up on our agenda, we currently have two PRs kind-of stalled on issues around “what is the basic bit pattern validity criterion for a union”: unsized unions, with the goal of enabling ManuallyDrop for unsized types, and the conflicting PR to add an empty variant to ManuallyDrop. And then unions with Drop types are generally a mess, though that doesn’t look like an unsafe code guidelines problem. However, shouldn’t such a discussion involve some union stakeholders, like people doing FFI? Or do you think we’d just collect the consequences of various design decisions and then make that available to an RFC discussion or so?
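
For readers less familiar with why `ManuallyDrop` and unions are tied together, here is a rough sketch with a hypothetical `Slot` union. It is not the code from either PR. The union cannot know which field is live, so it never runs a destructor on its own, and the owner has to move the value out (or drop it) explicitly.

```rust
use std::mem::ManuallyDrop;

// Hypothetical union for illustration: one field owns heap memory.
union Slot {
    text: ManuallyDrop<String>,
    number: u64,
}

/// Move the `String` out of the slot.
///
/// Safety: the caller must guarantee that `text` is the initialized field and
/// must not read `text` again afterwards.
unsafe fn take_text(slot: &mut Slot) -> String {
    unsafe { ManuallyDrop::take(&mut slot.text) }
}

fn main() {
    let mut slot = Slot {
        text: ManuallyDrop::new(String::from("hello")),
    };
    // We initialized `text` just above, so the contract of `take_text` holds.
    let s = unsafe { take_text(&mut slot) };
    assert_eq!(s, "hello");

    // Writing a Copy field is safe; it simply overwrites the stale bytes.
    slot.number = 7;
    assert_eq!(unsafe { slot.number }, 7);
}
```

Which bit patterns a `Slot` is allowed to hold between those steps is the validity question the stalled PRs run into.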

That sounds quite reasonable.

Yes!

I've been pretty busy since getting back this week, shall we try to plan a first meeting for next week perhaps?

Maybe we should also create a gitter channel too. It seems like we've found overall that holding meetings over chat is often more approachable for most folks, and it'd be nice to have a shared, persistent place to talk.

Yeah, sounds like a good idea. Also could we have a GitHub team so people can summon us into issues?

Yeah,

How much experience would be required among working group members? I’ve been building stuff with unsafe Rust for a couple of months but am still figuring things out :slight_smile:

I want to apologize for the radio silence here. It’s been a busy time! I wanted to post a quick update with the current status (there’s a list of work items I want feedback on below):

First off, I’ve created a GitHub team called WG-unsafe-code-guidelines. The main role of this team is so that people can write cc @WG-unsafe-code-guidelines whenever they encounter some kind of thorny situation and a reasonable set of people will be cc’d. Being on this GitHub team does not imply any particular responsibility or authority – just a desire to be informed.

Currently, the team consists of myself and @RalfJung – if you would like to be on it, or you think you know someone who should be on it, just ask. =) Some likely candidates might be @eternaleye, @arielb1, @ubsan…? Obviously this is not a complete list.

Second, I’ve also created a GitHub repository rust-rfcs/unsafe-code-guidelines to serve as the home for this new UCG effort. As I initially proposed in this thread, this repository houses an mdbook called the “reference”. This is meant to organize some of the thornier unsafe code questions that we are looking at and to collect reliable information and advice. It will eventually record the consensus and be a true reference, though I always expect it to be a kind of “living document”.

Finally, I’ve been talking a lot about how to actually organize this effort and get it off the ground. In particular, I’ve talked with @avadacatavra, whom I hope to corral into playing a kind of organizational role. That said, I think that helping to organize and manage this group would be a great way for people to get involved, so if that sounds like something you’d like to help out with, please do let me know!

I believe that the rough plan should be like this:

  • We hold a meeting every N weeks, where N is probably 2 or 4. Probably over Zulip or Discord but maybe over some voice channel instead for higher bandwidth.
  • The topic of that meeting is decided in advance: it should be one of the “subsections” in the reference.
    • We nominate someone to “lead” the meeting. Their job is to do research on the topic and prepare an agenda of complex questions.
    • We create a thread specific to this meeting on internals – or maybe open up an issue on the GH repository?
    • Leader can post details and links into that thread as well as summaries
      • people can provide feedback if anything is missing
  • Goal of the meeting is to produce answers for the major questions
    • Often, especially to start, definitive answers will not be possible.
    • In that case, goal would be to produce a definitive SUMMARY of the key constraints and examples.
  • After meeting, the leader is responsible for:
    • updating the reference with a summary of the major points raised and/or consensus
    • perhaps adding detailed minutes or other notes elsewhere in the repo as needed
  • Rinse and repeat as needed.

Work items:

Here are some things I would like feedback on.

  • Are there topics missing from the draft table of contents in the UCG reference?
  • Are you interested in helping to organize meetings and other events?
  • Are you interested in being added to the WG-unsafe-code-guidelines alias?
  • Does my plan for meetings sound reasonable?
  • Nominations for first meeting topic?
    • I was thinking we could start with a kind of “softball” (ha) and talk about data structure representation. What kinds of things can people rely on?
  • What schedule should we use for meetings?
    • In particular, it might be useful to wait to start meetings until after Edition Preview 2, at least, is done.
  • Any other thoughts?

@vignesh

To be honest, I would like a range of experience levels -- it is very useful sometimes to have the perspective of people who are a bit less "down in the trenches" when it comes to this stuff. I think enthusiasm and a desire to find answers constructively is most important. Sorry for the slow response!!

I suggest asking @alercah due to this very recent post.


I’m happy to join in, especially if there is likely to also be language development (e.g. relaxing requirements inside modules with unsafe, "trusted impl", etc.) coming out of it.


Now that I think about it, I think that since this relates to the Language Reference, I should definitely be at least on top of things.


:wave:


In general, Rust makes few guarantees about memory layout, unless you define your structs as #[repr(rust)].

Was this supposed to say #[repr(C)]?


Should invalid representations (e.g. out of range enums) be on the list too? Or are we happy leaving the rule that they must never ever exist in place?

After some discussion in private, I think that the actual semantics around invalid representations require enough finesse that they ought to be discussed.
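
As a concrete reference point for that discussion, here is a sketch using a hypothetical `Color` enum. Under the current rule, producing a `Color` whose discriminant is not one of the declared values (for example by transmuting an arbitrary `u8`) is undefined behavior, so code that receives raw bytes has to check them before the enum value ever exists.

```rust
// Hypothetical enum for illustration, with an explicit one-byte representation.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// Checked conversion: an out-of-range byte never becomes a `Color`.
fn color_from_u8(raw: u8) -> Option<Color> {
    match raw {
        0 => Some(Color::Red),
        1 => Some(Color::Green),
        2 => Some(Color::Blue),
        // `std::mem::transmute::<u8, Color>(raw)` here would create an
        // invalid value, which the current rule says must never exist.
        _ => None,
    }
}

fn main() {
    assert_eq!(color_from_u8(1), Some(Color::Green));
    assert_eq!(color_from_u8(3), None);
    // Going the other way is always fine.
    assert_eq!(Color::Blue as u8, 2);
}
```

The finesse is in the details: for instance, whether the invalid value is a problem the moment it is created or only once it is used is the sort of thing the document would have to pin down.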

I’m happy to help out in an organizational role – @nikomatsakis and I have had some chats about this WG and I have a lot of thoughts.

I’ve also been working on a writeup of the actix unsafe code usage, which should be done this week. So far, I’ve found it illuminating to look at how people actually think about and use unsafe code currently, and how that might influence this WG.


Sounds very interesting!

I'd be very interested in being added and helping work on unsafe code guidelines.
