Idea: Incomplete/Partial Structs (& other algebraic types)

Would it be possible to add to the type system a representation of structs in which some fields are inaccessible for some reason: there is an outstanding reference to the field, its value was moved out, it was never initialized, or similar?

I imagine this would be a sizeable project, touching many parts of the Rust compiler. If this seems like a feasible approach, I can try to find a professor who would be willing to let me work on this for my Master's thesis.

The general idea is that an incomplete struct has the same memory layout as the full structure, but the inaccessible fields are uninitialized memory. Some efficient way will be provided to consume an incomplete struct and make a new full structure using the same memory; hopefully through normal compiler optimizations instead of special functions. It is illegal to construct an incomplete version of a structure that implements Drop.
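To make the layout idea concrete, here is a minimal sketch of how it can be approximated by hand in today's Rust. This is not the proposed feature: `Config`, `ConfigMissingRetries`, `complete`, and `layouts_match` are hypothetical names invented for illustration, the "incomplete" type is written out manually, and `#[repr(C)]` is an assumption added so that both types have a predictable, matching layout.

```rust
use std::mem::{size_of, MaybeUninit};

/// The full structure (hypothetical example type).
#[repr(C)]
pub struct Config {
    pub name: String,
    pub retries: u32,
}

/// Hand-written stand-in for "Config with `retries` not yet initialized".
/// `MaybeUninit<u32>` has the same size and alignment as `u32`, so with
/// `#[repr(C)]` on both types the two layouts match.
#[repr(C)]
pub struct ConfigMissingRetries {
    pub name: String,
    retries: MaybeUninit<u32>,
}

impl ConfigMissingRetries {
    /// Build the incomplete value; `retries` is left uninitialized.
    pub fn new(name: String) -> Self {
        Self { name, retries: MaybeUninit::uninit() }
    }

    /// Consume the incomplete value and produce the full one.
    /// Whether the same memory is reused is left to the optimizer.
    pub fn complete(self, retries: u32) -> Config {
        Config { name: self.name, retries }
    }
}

/// Check that the two hand-written types occupy the same number of bytes.
pub fn layouts_match() -> bool {
    size_of::<Config>() == size_of::<ConfigMissingRetries>()
}
```

The whole thing stays in safe code because the uninitialized field is never read; the idea in the post would presumably let the compiler derive the second type rather than requiring it to be spelled out.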

A reference (mutable or not) to an incomplete structure, however, is statically prevented from accessing any of the inaccessible fields. This allows functions that only need access to some of the struct's fields to operate safely on partially-initialized structs. It can also be used for an API that allows immutable references to one field at the same time as mutations to another.
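For comparison, this is roughly what the last point looks like in current Rust. The borrow checker already accepts a shared borrow of one field alongside a mutable borrow of another when both are taken directly in the same function body; it does not accept this once the accesses go through `&self` / `&mut self` methods, which is the gap a partial reference would fill. `Counter`, `record`, and `demo` are hypothetical names, and passing the fields as separate references is only a stand-in for a partial reference.

```rust
pub struct Counter {
    pub label: String,
    pub hits: u64,
}

/// Stand-in for a function taking a partial reference:
/// shared access to `label`, mutable access to `hits`, nothing else.
pub fn record(label: &str, hits: &mut u64) -> String {
    *hits += 1;
    format!("{}: {}", label, *hits)
}

pub fn demo() -> String {
    let mut c = Counter { label: String::from("home"), hits: 0 };
    let label = &c.label;   // shared borrow of one field
    let hits = &mut c.hits; // mutable borrow of a disjoint field
    record(label, hits)
}
```

The cost of this workaround is that every caller has to name the fields, so the struct's internals leak into the call site.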

One sticking point with the previous splitting-borrow proposals has been their interactions with traits. With this sort of splitting being inherently field-based, they could not be used in traits, which don't interact with fields. With this proposal, it should be possible to define a trait impl for a partial structure. This way, the type designer has the option to make two traits that are disjoint, so that mutable references to the two traits can coexist.

Another discussion item that often comes up is whether borrow-splitting should be an implementation detail of the compiler or an explicit annotation. I am attempting to dodge this question by allowing the type system to describe what people actually want to do (and probably some kind of fully-qualified syntax for it), which leaves the door open for future type-inference solutions to do some of this automatically.


See [Pre-RFC] Partially Initialized Types


It seems like your biggest concern on that thread was leaking private fields in an API. Presumably, any subset of fields that the API designer wanted to use publicly would need to be defined as some kind of an associated type to the struct. That would be desirable anyway, so that adding a field to one of the conceptual roles of an object could be done in one place.

I also had not planned to allow multiple implementations of a single trait on different partial versions of the same structure -- if Drop is allowed at all, there will be only one implementation and it will be a compile error to drop any version of the object that doesn't have all the fields Drop requires initialized. This would be a slight relaxation from the status quo, as today you can't move out of a structure with Drop at all.
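For readers unfamiliar with the status quo being relaxed here, a minimal sketch follows. `Connection` and `into_buffer` are hypothetical names. Today the usual workaround is to wrap the field in `Option` so it can be taken out through `&mut`, even when the destructor never looks at that field.

```rust
pub struct Connection {
    pub name: String,
    // Wrapped in `Option` only so it can be taken out despite the `Drop` impl.
    pub buffer: Option<Vec<u8>>,
}

impl Drop for Connection {
    fn drop(&mut self) {
        // This destructor only needs `name`; it never touches `buffer`.
        self.name.clear();
    }
}

pub fn into_buffer(mut conn: Connection) -> Vec<u8> {
    // With a plain `buffer: Vec<u8>` field, moving `conn.buffer` out here
    // would be rejected with error E0509, because `Connection` implements `Drop`.
    conn.buffer.take().unwrap_or_default()
}
```

Under the relaxation described above, the idea would be that a move out of `buffer` is accepted because the single `Drop` implementation only requires `name` to be initialized.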

Another difference between my thinking and Soni's proposal is that I'm treating the accessible fields as completely fixed. If you own the object, you can consume it and produce another with a different set of initialized fields, but with a reference you can only downgrade to a smaller set of accessible fields, and that doesn't affect the original owned version.

I guess in some ways, I'm really proposing two separate ideas designed to work together: incomplete (partially initialized) types, which are owned objects that may contain uninitialized memory, and partial references, which are references to any object, whether or not it's fully initialized, that restrict which fields are accessible through the reference. The only real relationship between them is that a partially-initialized type can safely produce a partial reference identical to one for a fully-initialized type.

This seems similar to views (playground), but views have a combinatorial explosion. Something built into the compiler could sidestep this a bit, but it will increase build times dramatically if views are used a lot.

This looks good to me.

This seems confusing to me. Just to be clear, you can't move out of references and leave behind uninitialized memory, because we have no way to signal to the pointee that the field was moved out of. So shrinking the view would just shrink what is accessible, for example going from the access set {a, b, c, d} to {a, c}.

I guess you also want disjointness checks so that disjoint views can be used at the same time, similar to disjoint fields.
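A rough sketch of what shrinking and disjointness look like with hand-written views in current Rust, assuming a struct with fields `a` through `d` as in the access-set example. All type and function names (`Data`, `ViewAbcd`, `ViewAc`, `ViewBd`, `view`, `shrink`, `split`, `demo`) are made up for illustration. It also shows where the combinatorial explosion comes from: every access set that gets used needs its own type.

```rust
pub struct Data {
    pub a: i32,
    pub b: i32,
    pub c: i32,
    pub d: i32,
}

/// View with the access set {a, b, c, d}.
pub struct ViewAbcd<'x> {
    pub a: &'x mut i32,
    pub b: &'x mut i32,
    pub c: &'x mut i32,
    pub d: &'x mut i32,
}

/// View with the access set {a, c}.
pub struct ViewAc<'x> {
    pub a: &'x mut i32,
    pub c: &'x mut i32,
}

/// View with the access set {b, d}.
pub struct ViewBd<'x> {
    pub b: &'x mut i32,
    pub d: &'x mut i32,
}

impl Data {
    pub fn view(&mut self) -> ViewAbcd<'_> {
        ViewAbcd { a: &mut self.a, b: &mut self.b, c: &mut self.c, d: &mut self.d }
    }
}

impl<'x> ViewAbcd<'x> {
    /// Shrink {a, b, c, d} to {a, c}. Nothing is moved out of `Data`;
    /// only the set of reachable fields gets smaller.
    pub fn shrink(self) -> ViewAc<'x> {
        ViewAc { a: self.a, c: self.c }
    }

    /// Split into two disjoint views that can be used at the same time.
    pub fn split(self) -> (ViewAc<'x>, ViewBd<'x>) {
        (ViewAc { a: self.a, c: self.c }, ViewBd { b: self.b, d: self.d })
    }
}

pub fn demo() -> [i32; 4] {
    let mut data = Data { a: 1, b: 2, c: 3, d: 4 };
    let (ac, bd) = data.view().split();
    // Both views are live here and mutate disjoint fields.
    *ac.a += 10;
    *bd.b += 20;
    *ac.c += 30;
    *bd.d += 40;
    [data.a, data.b, data.c, data.d]
}
```

Here disjointness is guaranteed by construction in `split` rather than checked by the compiler, which is the part a built-in feature would presumably take over.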

Ok, I think this clears up my last comment.

This looks like it could be fleshed out.


On a side note, do you have any syntax that you want to try out (if only to ease discussion)? I'm pretty sure any syntax for views is going to be verbose, so I wouldn't worry about that for now.


I cannot yet say what I think about this from a "should we do this?" perspective... but here's some proposal-making advice: It would be good if you approached this by starting with some important guide-level code examples and then annotated them with inline comments. This way one can more easily surmise what is actually proposed (and I think your professor would be interested also). Once you have some code examples it should be easier to arrive at some informal rules that might be precursors to some actual typing rules (e.g. for your thesis).


Unsurprisingly, there are lots of new things for me to think about here; it might take me a couple of days to figure out what to make of it all. At the least, I need to think more about:

  • some concrete examples, possible syntax
  • a better story about how to initialize/move out of fields while references are held to other fields
  • mixed mut and shared refs, which never crossed my mind before
  • how to let a fn initialize or move out of a field with mutable partial refs

Presumably, any subset of fields that the API designer wanted to use publicly would need to be defined as some kind of an associated type to the struct.

That sounds like a usability nightmare, whereas the rest is riddled with potential pitfalls.

Did you look at some other initiatives concerning this topic? There have been at least two more ideas brought up previously, pointers-to-fields and (I guess relatedly) partial borrows, along with others already mentioned in this thread.

This topic generated a large amount of discussion as it's massively non-trivial. I wouldn't immediately rush to implementing anything like this just yet.

Thanks for giving me more keywords to search for. I've looked at as many previous discussions as I can find, but there's very little common terminology. There being a large amount of discussion is, in my eyes, a good indicator: people are interested in the topic.

I'm nowhere near the point where I would consider implementing anything, but my thoughts had reached the point where I needed some outside light shed on them. Even this short discussion has clarified my thinking a lot, and my original post no longer reflects my current thinking. I hope to come back with a more refined idea, with more manageable scope, in a few days. I expect that will have many problems as well, and it'll take several rounds of getting feedback and refining my ideas until there's something solid enough to be a real proposal.
