Immovable types and self-referencing structs


#1


Original RFC by Zoxc

I’m writing this because discussion has stagnated a bit, and I think that some things are missing from the RFC (namely interaction with placement-new). The only thing in the original RFC that I really disagree with is MovableCell, which as far as I can tell cannot be made safe.

Use cases

As far as I can see, there are 3 use cases for immovable types:

  • Self-referencing structs (and more specifically, generators!). Can never be moved after construction.
  • Structs whose address is observed by FFI functions (Less strict than self-referencing structs)
    • There is no problem with moving this type of struct until a reference is taken (explained in the original RFC). Self-referencing structs are a special case of this requirement - they are simply referenced on creation.
    • I might neglect this use case in this document, because it has the least strict requirements.
  • Unleakable structs (explanation in its own section)
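To make the first use case concrete, here is a minimal sketch of why a self-referencing struct must not move. The `SelfRef` type and the function name are made up for illustration, and the self-reference is a raw pointer that is only compared, never dereferenced.

```rust
/// A struct that stores a raw pointer to one of its own fields.
/// (Hypothetical type, for illustration only.)
struct SelfRef {
    data: u64,
    ptr: *const u64, // intended to always point at `self.data`
}

/// Returns (self-pointer correct before the move, correct after the move).
fn move_invalidates_self_pointer() -> (bool, bool) {
    let mut s = SelfRef { data: 7, ptr: std::ptr::null() };
    s.ptr = &s.data; // self-reference established while `s` lives on the stack
    let valid_before = std::ptr::eq(s.ptr, &s.data);

    // Moving into a Box copies the bytes to the heap. `ptr` is copied
    // verbatim, so it still holds the old stack address.
    let boxed = Box::new(s);
    let valid_after = std::ptr::eq(boxed.ptr, &boxed.data);

    (valid_before, valid_after)
}
```

After the move the stored pointer no longer matches the address of the field, which is exactly the situation an immovable type would rule out.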

Feature interaction - placement-new

There is heavy interaction between these two features. Every return value and function argument in Rust is moved (this is more relevant to return values). If placement-new moves the value behind the scenes, it needs to be changed.

Immovable types can be un-returnable as a start - just place-able. This means that in order to let an immovable type escape a function, it needs to take a Place parameter. (See [Place on return][#Place on return]).

Details

  • Add a new auto-trait* Move, which is implemented for every struct, except self-referencing ones. It can also be opted out of.
  • A Move trait bound is always implied.
  • References and pointers are always Move.

(A little lacking)

Destruction (self-referencing structs)

Haven’t thought about this much. Drop order matters. Might need an intrinsic in order to allow early-dropping immovable structs (mem::drop moves the value).
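As a point of reference, the standard library can already run a destructor in place: `ManuallyDrop::drop` (and `ptr::drop_in_place`) do not move the value, unlike `mem::drop`. The sketch below is not the intrinsic imagined above; it only shows the existing unsafe API together with the usual drop order of locals. `Noisy` and `drop_order` are illustrative names.

```rust
use std::cell::RefCell;
use std::mem::ManuallyDrop;

/// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let mut early = ManuallyDrop::new(Noisy { name: "early", log: &log });
        let _last = Noisy { name: "last", log: &log };

        // Run the destructor where the value lives, without moving it.
        // Safety: `early` is never used again after this call.
        unsafe { ManuallyDrop::drop(&mut early) };
    } // remaining locals drop in reverse declaration order: last, then first
    log.into_inner()
}
```

The early drop has to be `unsafe` here because nothing stops later code from touching `early` again; a language-level feature would presumably track that statically.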

Ergonomics

Immovable structs can be implemented with bad ergonomics - I think it is better to focus on soundness and worry about ergonomics later. However, there is no reason for me to keep my thoughts about this to myself.

Construction

Currently, there is no way to construct a self-referencing struct. This isn’t as important as it may seem, since the construction can be done unsafely - once the struct is complete, it is safe to use.

TODO: Syntax proposal.

Place on return

Being unable to return immovable structs isn’t very ergonomic. This can happen behind the scenes - functions returning immovable types can be transformed to receive a Place parameter. Call sites must either be a local variable binding ([Stack place][#Stack place]), or a placement-in expression (or is it a statement?).

Supporting features

These are features that could work well in combination with immovable types.

Stack place

A Place that is implicitly created and passed to functions which return immovable types. I’m not sure if this is a good idea, or how hard it is to implement.

ExplicitMove trait

Sometimes you might want to move an immovable struct after it has been constructed.

Functionally like C++’s move constructor. It is implemented for every struct whose members implement ExplicitMove. Unlike in C++, it is never called implicitly.

A little dangerous to add, since people might add this bound everywhere. If Rust could do without immovable structs, it can do without immovable structs that can be moved after construction.

Unleakable structs

We can use this feature to create types which cannot be leaked by safe code (as far as I can tell). If we add a StackOnly: !Move trait, which marks the struct as only place-able in a StackPlace, it cannot be moved from the stack, and its destructor can only be called at/before scope exit (before == with drop intrinsic).
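For context, this is how safe code can leak a value today: `mem::forget` takes the value by move and never runs its destructor. The sketch uses a made-up `Guard` type; a StackOnly type as described above could not be passed to `mem::forget`, since that would be a move off the stack.

```rust
use std::cell::Cell;

/// Sets a flag when its destructor runs. (Illustrative type.)
struct Guard<'a> {
    released: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.released.set(true);
    }
}

/// Normal case: the destructor runs at scope exit.
fn released_after_scope_exit() -> bool {
    let released = Cell::new(false);
    {
        let _guard = Guard { released: &released };
    }
    released.get()
}

/// Leak: `mem::forget` is safe, takes the guard by value,
/// and its destructor is never run.
fn released_after_forget() -> bool {
    let released = Cell::new(false);
    let guard = Guard { released: &released };
    std::mem::forget(guard);
    released.get()
}
```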

Appendix

I don’t know how to properly format this section in markdown, this will do for now.

* I don't know the correct term for this type of thing - I would be glad if someone referred me to an explanation/explained it to me.

#2

Edit: ah sorry, you can’t return them because you want to support ‘returning’ self-referential structs. Disregard the below.

IIRC, according to the design of the RFC you can return them, as long as you haven’t taken a reference to the struct after it was created in the function. This is the point of the section you (I think?) commented on at https://github.com/rust-lang/rfcs/pull/1858#discussion_r161277199

Per above, you can return an immovable type. This is pretty fortunate because placement-new may need some redesign, and making placement-new a dependency of immovable types will possibly lead to slow implementation - the immovable types RFC is a year old tomorrow, whereas placement has had a web of related RFCs in various forms since 2014 (the first being uninit pointers). See also my comment on placement-new, though it’s probably not worth getting into details here.

Edit: I sympathise with wanting to be careful about the interaction with placement-new after rereading your post. But I stand by my caution about waiting for placement new.


#3

Is the intention to also support such structs using generics (i.e., without having to explicitly construct a self-referential type), like owning_ref::OwningHandle? For example, I’d love to be able to do something like:

let stmt = Box::new(mysql::Conn::new(...))
  .into_self_ref(|conn| conn.prepare("...").unwrap());

where in hypothetical syntax:

impl<T> Box<T> {
  fn into_self_ref<F, O>(self, map: F) -> MagicSelfRefBox<T, O>
  where
    F: for<'a> FnOnce(&'a mut T) -> O + 'a {
    // magic goes here
  }
}

and where MagicSelfRefBox: Deref<Target = O> + DerefMut so that the returned stmt can be used as a self-contained prepared statement that owns its own connection.

Obviously, this won’t work just like written above. The for bound won’t work out, and all sorts of other invariants will likely have to be maintained (see for example this open issue on owning_ref). But it would be fantastic to have a convenience function like this!


#4

Unfortunately, this isn’t the case. Whatever lifetime is used to express the self-referentiality of one struct will not be uniquely tied to that specific instance, and can be mimicked by another self-ref struct in the same scope. Their references will be able to intermingle, despite the true lifetimes of the structs ultimately being different.

It’s possible this could be remedied with language support for generative existential lifetimes, but that’s a whole separate feature. Rental currently uses a trick involving HRTB-bounded closures to produce the required existential lifetimes, and that trick is the only reason it is safe to use. Immovable types alone don’t make self-ref structs any safer, either to construct initially or even after construction.
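A rough sketch of the closure idea (this is not Rental’s actual implementation; `Owner`, `with` and `hrtb_demo` are names invented here): because the closure must accept the borrow for every lifetime 'a, while the result type R is chosen outside of that quantifier, the borrowed reference cannot escape the closure.

```rust
/// Owns some data and only lends it out inside a closure.
struct Owner {
    text: String,
}

impl Owner {
    /// `f` must work for *any* lifetime 'a, and `R` is fixed before 'a is
    /// introduced, so `R` cannot contain a reference with lifetime 'a.
    fn with<R>(&self, f: impl for<'a> FnOnce(&'a str) -> R) -> R {
        f(&self.text)
    }
}

fn hrtb_demo() -> usize {
    let owner = Owner { text: String::from("hello") };
    // Fine: the closure returns an owned value derived from the borrow.
    let len = owner.with(|s| s.len());
    // Rejected by the compiler if uncommented, since the result would
    // have to name 'a:
    // let leaked = owner.with(|s| s);
    len
}
```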


#5

You can only return it if it is not already borrowed, which self-referencing structs always are. The ability to move an immovable struct is only useful in the second listed use case (structs whose address is observed by FFI functions).

Also, being able to return immovable types is an ergonomics issue (albeit a major one), while having completely immovable types solves a memory safety problem.

Regarding placement-new, I concede that making these features inter-dependent from the start isn’t the right way to go. I’ll make a more conservative suggestion:

While writing the following proposal, I realized that it can’t work for un-nameable types (like generators) but I’ll keep it here in case someone else has an idea.

Immovable types can only be constructed from literals, and placed either on the stack or at a mutable pointer:

#[immovable]
struct B;

let b = B{}; // ok

fn a(b: *mut B) {
    unsafe {
        *b = B{}; // also ok
    }
}

fn c() -> B {
    B{} // not ok for self-referencing structs
}

This way, a macro can be written to make this a little nicer to use, and any interaction with placement-new is greatly simplified.
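The "place at a mutable pointer" half of this can be approximated in today’s Rust with `MaybeUninit` and raw field writes. The following is a sketch under assumptions, not the proposed feature: there is no `#[immovable]` attribute, `SelfRef`, `init_at` and `build_on_stack` are invented names, and nothing stops the caller from moving the value afterwards.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Illustrative self-referencing struct using a raw pointer.
struct SelfRef {
    data: [u8; 4],
    first: *const u8, // points at data[0]
}

/// Initializes a `SelfRef` directly at `slot`, so the value never moves
/// after its self-pointer has been set.
///
/// Safety: `slot` must be valid for writes and properly aligned.
unsafe fn init_at(slot: *mut SelfRef, data: [u8; 4]) {
    unsafe {
        // Write each field through a raw pointer; no reference to the
        // partially initialized struct is ever created.
        addr_of_mut!((*slot).data).write(data);
        let first = addr_of_mut!((*slot).data) as *const u8;
        addr_of_mut!((*slot).first).write(first);
    }
}

/// Builds the struct in a stack slot and checks the self-pointer.
fn build_on_stack() -> bool {
    let mut slot = MaybeUninit::<SelfRef>::uninit();
    unsafe {
        init_at(slot.as_mut_ptr(), [1, 2, 3, 4]);
        // Safety: every field was written by `init_at`.
        let s = slot.assume_init_ref();
        std::ptr::eq(s.first, s.data.as_ptr())
    }
}
```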

Regarding unsafe construction of self-referencing structs: I talked with @jpernst about this, and now understand that currently, borrow-checked self-referencing structs cannot be simply expressed in Rust (construction is not the only problem) - that’s what rental is for :). I think that non-borrow-checked self-referencing structs can still be useful (using pointers for the self-references).

@jonhoo After my talk with @jpernst, I realized that safe self-referencing structs are a much bigger issue than I thought, and I would like to reduce the scope of this proposal to exclude them (I’m considering changing the title to “Immovable types and minimal self-referencing structs”)

About generators

Since self-referencing structs cannot be simply expressed, I think that an immovable generator’s stack values should just be a ‘bag of bytes’ type-wise, and transmuted on resume. The arguments should be saved in the generated struct as-is. Generators should still be borrow checked before being transformed.

If the movable-until-observed feature is to be accepted, then they would have to somehow manually trigger it. If stack variable types are to be erased from the type, the compiler should do something to align it.
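A hand-written approximation of the ‘bag of bytes’ idea, to show where alignment comes in. This is only a sketch: `Bag`, `store`, `load` and `round_trip` are invented names, the size and alignment are fixed by hand, and a real generator transform would be done by the compiler.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Untyped, over-aligned storage standing in for a generator's saved locals.
#[repr(align(8))]
struct Bag([MaybeUninit<u8>; 16]);

impl Bag {
    fn new() -> Self {
        Bag([MaybeUninit::uninit(); 16])
    }

    /// Stores a typed value into the untyped storage.
    ///
    /// Safety: the bag must not currently hold a value that needs dropping.
    unsafe fn store<T>(&mut self, value: T) {
        // The erased type no longer carries layout information, so size
        // and alignment have to be checked (or guaranteed) separately.
        assert!(size_of::<T>() <= size_of::<Self>());
        assert!(align_of::<T>() <= align_of::<Self>());
        unsafe { (self.0.as_mut_ptr() as *mut T).write(value) }
    }

    /// Reinterprets the bytes as a `T` again ("transmute on resume").
    ///
    /// Safety: a value of exactly type `T` must have been stored before.
    unsafe fn load<T>(&self) -> &T {
        unsafe { &*(self.0.as_ptr() as *const T) }
    }
}

fn round_trip() -> (u32, i64) {
    let mut bag = Bag::new();
    unsafe {
        bag.store((7u32, -3i64));
        *bag.load::<(u32, i64)>()
    }
}
```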


#6

I’ve had this idea which is similar to the ExplicitMove trait but is more restrictive (the RefMove trait must be implemented when the struct contains fields with ref-lifetimes) and is called implicitly. The idea is to apply default move semantics first, but fields which have a ref-lifetime are left in an uninitialized state. They must be initialized by the method specified by the RefMove trait.
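A library-level imitation of that idea, with the fix-up called by hand instead of implicitly. Everything here is an assumption for illustration: the method name `rebind`, the `Cursor` type, and the helper `move_to_heap`. The self-reference is a raw pointer rather than a field with a ref-lifetime.

```rust
/// Hypothetical trait: re-establishes self-references after a bitwise move.
trait RefMove {
    fn rebind(&mut self);
}

/// Illustrative struct whose `start` pointer refers to its own buffer.
struct Cursor {
    buf: [u8; 8],
    start: *const u8, // points at buf[0]
}

impl RefMove for Cursor {
    fn rebind(&mut self) {
        self.start = self.buf.as_ptr();
    }
}

/// Default move first, then let the type repair its self-references.
fn move_to_heap<T: RefMove>(value: T) -> Box<T> {
    let mut boxed = Box::new(value); // plain bitwise move to the heap
    boxed.rebind(); // stale self-pointers are fixed up at the new address
    boxed
}

fn refmove_demo() -> bool {
    let mut c = Cursor { buf: [0; 8], start: std::ptr::null() };
    c.rebind();
    let boxed = move_to_heap(c);
    std::ptr::eq(boxed.start, boxed.buf.as_ptr())
}
```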