It’s no secret that the “unsafe code guidelines” process has been kind of stalled out lately! I’d like to try and get things going. Honestly I think what we need most right now is a certain amount of legwork – or, more specifically, research work.
Before anything else in this message, then, I want to propose that we organize a synchronous chat to try and get organized (either over IRC or as a voice/video chat). Specifically, I want to try to avoid wrangling about details of the rules, and focus on how we can start to make slow but steady progress on the research that lies before us.
As I once proposed in an earlier thread, I think some synchronous meetings may be helpful. To that end, I have a doodle poll with a few possible times next week; please fill it out if you are interested in participating. I don’t feel this has to be limited to people on the unsafe code guidelines strike team necessarily. However, if you do sign up, I view it as a (tentative) commitment to be involved and meet on a regular basis (but not every week or anything like that).
What do I mean by research?
It seems to me that we still have a fair number of basic questions that we haven’t really answered, though we’ve investigated a good deal:
- What kinds of optimizations does the compiler currently do?
- What kinds of unsafe code do people write in practice?
- What kinds of optimizations do we want to enable and what do they require?
This is what we have been trying to do with the rust-memory-model repository, and there is indeed a lot of good information on there. But I think there are some flaws, too:
- The issue format is very disjointed. It’s too hard to go over and read those threads and extract information.
- There is no “greater narrative”; no way to easily get up to speed on some of the crucial questions and tradeoffs.
To some extent, this is the usual problem we always have. We need people to “summarize”. I feel like a good format for the repository might be that issues exist to track open questions, or requests for information, and that we use files committed in the repo to track summaries and collect results. There are existing files that are along these lines; we could maybe build on that format, or find another. Ideally, we can have files for individual questions, and then larger reports that summarize the summaries and highlight the most important questions.
So I see a lot of work to organize here:
- Making sense of the existing content on the rust-memory-model repository
- Investigating real-world code
- Might be better to invest in a tool here, like this Datalog idea that I was floating
- Documenting the cases in which the compiler will currently supply annotations and hints to LLVM and our best guess at what they actually mean
- Documenting some of the common needs of optimizations (I’d probably start by trying to mine various research papers or articles here)
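As one illustration of the kind of case the "annotations and hints to LLVM" item might document, here is a minimal sketch contrasting references with raw pointers. The function names are hypothetical, and the attribute names in the comments (`noalias`, `nonnull`, `dereferenceable`) reflect a general understanding of what rustc may emit for reference arguments, not something verified against a particular compiler version; pinning that down is exactly the research work described above.

```rust
/// `dst` and `src` are references, so safe Rust guarantees they do not
/// overlap. Assumption: rustc may pass that knowledge to LLVM through
/// parameter attributes such as `noalias`, `nonnull`, and `dereferenceable`.
pub fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    // If the optimizer knows `dst` and `src` cannot alias, it does not
    // have to reload `*src` here.
    *dst += *src;
}

/// Raw pointers carry no aliasing promise, so the same body must behave
/// correctly even when both pointers refer to the same location.
///
/// # Safety
/// Both pointers must be valid for reads, and `dst` for writes.
pub unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    unsafe {
        *dst += *src;
        *dst += *src;
    }
}

/// Calls the raw-pointer version with two pointers to the same integer.
pub fn aliased_demo() -> i32 {
    let mut x = 1;
    let p: *mut i32 = &mut x;
    // Both arguments point at `x`: 1 + 1 = 2, then 2 + 2 = 4.
    unsafe { add_twice_raw(p, p as *const i32) };
    x
}
```

With distinct locations both versions agree (starting from 1 and adding 1 twice gives 3), while `aliased_demo` returns 4. An optimization that cached `*src` would only be valid for the reference version, which is the sort of "what does the hint actually mean" question worth writing down.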
Deciding on principles
This is somewhat in contrast with the previous point, but it is probably reasonable to try and put forward and debate some “guiding principles” (though I’d rather do this a bit more async, I suspect, so as to have a broader and more reasoned discussion). I feel like we’ve never really tried to do this. I wouldn’t consider any agreement here to be truly meaningful, but it may be helpful for guiding other work.
For example, I am still pretty enamored of trying to achieve an executable specification for when “undefined behavior” occurs, at least to the extent that we can (this is presuming of course that we want to have the concept of UB, but it seems like that ship has sailed, to some extent, thanks to us using LLVM and so forth). To that end, I’d love it if the examples and things we come up with when doing research can be fully formed Rust programs that execute to produce (or not produce, as the case may be) undefined behavior. As a bonus, once we are making progress towards an actual validator, this would allow us to test the models by running them across the tests.
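To make the "fully formed Rust programs" idea a little more concrete, here is a minimal sketch of what such a collection of examples could look like. Everything in it is hypothetical: the `Case` struct, the `expect_ub` flag, and the two example functions are illustrative names, not an existing format in the rust-memory-model repository. Each case pairs a complete function with the verdict a future validator would be expected to give.

```rust
/// One research example: a runnable function plus the expected verdict.
/// (Hypothetical format, for illustration only.)
pub struct Case {
    pub name: &'static str,
    /// true if a validator should report undefined behavior.
    pub expect_ub: bool,
    pub run: fn() -> i32,
}

/// Expected to be well-defined: the raw pointer is derived from `x`
/// and is not used again after `x` is read directly.
fn defined_write_through_raw_pointer() -> i32 {
    let mut x = 10;
    let p: *mut i32 = &mut x;
    unsafe { *p += 1 };
    x
}

/// Expected to be UB under any reasonable model: reads freed memory.
/// Only a validator should ever execute this function.
fn ub_read_after_free() -> i32 {
    let p: *mut i32 = Box::into_raw(Box::new(5));
    unsafe {
        drop(Box::from_raw(p)); // frees the allocation
        *p // dangling read
    }
}

pub fn cases() -> Vec<Case> {
    vec![
        Case {
            name: "write_through_raw_pointer",
            expect_ub: false,
            run: defined_write_through_raw_pointer,
        },
        Case {
            name: "read_after_free",
            expect_ub: true,
            run: ub_read_after_free,
        },
    ]
}

/// Runs only the cases expected to be defined, so this is safe to call
/// natively; the UB cases are left for a validator to execute.
pub fn run_defined_cases() -> Vec<(&'static str, i32)> {
    cases()
        .into_iter()
        .filter(|c| !c.expect_ub)
        .map(|c| (c.name, (c.run)()))
        .collect()
}
```

Keeping the expected verdict next to each program means a candidate model can be checked mechanically: run every case under the validator and compare what it reports against `expect_ub`.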
But there are some other big questions before us that I would like to reach agreement on:
- Should we distinguish “safe” vs “unsafe” code in the rules in any way?
- Well, I’m not sure what the others are just now, but there are probably more. =)
That’s it.
OK, those are my thoughts. What do people think? Is this the right way to make progress? Wrong-headed?