I've been looking at the Rust memory model lately; I was thinking about some non-trivial data structures (eg linked lists, trees with parent references, etc), and I wondered if there's a body of research on the "Ownership" part of the "Ownership-Borrowing" model, similar to Stacked Borrows.
I've skimmed the Rustonomicon and other resources, and it seems that the general rule is something like:
A variable has ownership on the data it points to until it's moved; ownership must be a tree with no back-edges or DAG-like sharing. Exceptions, if any are managed by fiat in unsafe code as long as they respect aliasing rules (aka stacked borrows).
Has there been some discussion on the subject of nailing down ownership semantics, an how exotic cases should be handled? Some exotic cases include:
The last few are especially interesting to me, because it seems that with async/await and coroutines (and pinning, which has its own guarantees it needs to obey) we can create data structures that have "shapes" that don't really exist in the normal type system.
Most obviously RustBelt: Securing the Foundations of the Rust
Programming Language formalized a "realistic subset" of Rust that includes the basic ownership system and its relationship with unsafe code. This was kind of the big "Rust has formal proofs now!" thing before Stacked Borrows took it to the next level (and more recently/quietly, RustBelt Meets Relaxed Memory). I haven't seen them do anything specifically about the allocators/data structures you mention, though I suspect at a formal level those are all "just" use cases of the aliasing model (afaik they exist in Rust crates today without any existential crises about whether they could ever possibly be sound).
But I don't know much about non-Rust research; someone else will have to fill that in.
I might have to re-read the RustBelt paper, then. Is there some vulgarization work available? The paper itself gets into mathematical jargon (and notations) that I really don't understand.
Also, I'm not sure this is quite what I had in mind. From your description, I assume RustBelt still follows the "ownership is a tree with no backedge" rule, and doesn't provide guidelines for code violating that rule.
(afaik they exist in Rust crates today without any existential crises about whether they could ever possibly be sound).
Yeah, but that's what I meant when I said "by fiat".
For instance, generator stacks violate tree-ness: you can use generators to create what is essentially a DAG of references, by writing safe code, which is manipulated by unsafe code; but that unsafe code doesn't really have formal rules to obey, besides the Pin guarantees (don't ever move that data), and the fact that by design it's can't "observe" the DAG because it can't look inside the generator stack frames.
I'm not really explaining myself well here. My point is, there's a thing here that falls kind of outside habitual Rust rules, that is made safe through extremely restrictive rules that limit how that thing can be used. I'm thinking that it would be really useful to explore that design space, and I wonder how it could be applied to improve ownership rules in safe code.
Ownership in Rust is entirely a type system fiction. If you only look at the language and its operational semantics ("what happens when you run Rust code"), then there is no ownership. The only thing that exists on that level is the "aliasing model" (of which Stacked Borrows is the most worked-out proposal). Another way to say this it to note that Miri knows nothing about ownership.
The "memory model" of Rust (at least in the way I know that term) likewise has nothing to do with ownership. A "memory model" answers the question: when you read from address a, what are the possible values that you can see? With concurrency, that is a non-trivial question. Rust uses the C++ memory model for that. The sequential part of Rust's memory model is unfortunately not specified very much; the only thing I am aware of is my write-up of Rust's "memory interface". This is not a memory model, this just defines more precisely the interface that a memory model "implements". In Rust, I envision this interface to be fairly simple, directly exposing a bunch of bytes for each allocated block. (This is very unlike C/C++ where memory is typed and addressed through logical "paths" besides addresses.) The main complication here is that in Rust/C/C++, bytes are weird. But this also means that the memory model is very simple aside from concurrency: it just stores those bytes, and retrieves them on a read.
So, coming back to your question... I'm afraid I don't understand what you are asking. Ownership in Rust is basically whatever you make it. When you define your own type, you get to define its datastructure invariants, and the ownership represented by that type is part of said invariant. (This is probably the hardest thing to grok here. It is based on "separation logic", an ingenious approach that lets you talk about ownership in the same way that we usually talk about normal mathematical properties.) You need to ensure that the invariant and its ownership are properly maintained by all public functions, of course, but that's about it. Types like Rc/Arc (mentioned above) demonstrate that this mechanism indeed lets you stretch the notion of "ownership" quite far. Unless you are using primitive types such as Box, &mut, &, you can do whatever you want inside your data structure, as long as a safe client cannot cause UB. If you are using primitive types, you have to follow their rules (such as aliasing rules).
Maybe it would help if you had a concrete, small example?
Thanks for the shout-outs.
I should also note that Stacked Borrows and the RustBelt paper are mostly orthogonal. The RustBelt paper is about verifying that unsafe libraries are correct; Stacked Borrows is about verifying that optimizations are correct. So Stacked Borrows is not "RustBelt++", it is an orthogonal research direction.