In Rust there are currently two ways to make sure that an object A
doesn’t outlive an object B
:
- Add a lifetime parameter to
A
and store a reference toB
in it:struct A<'a> { parent: &'a B }
. - Store a
Rc<B>
or anArc<B>
in A:struct A { parent: Arc<B> }
.
The first method is obviously the best: it is not only faster, but also doesn’t force the user to use an Rc
or an Arc
.
However, this first method has a big drawback: you can’t store the A
and the B
in the same struct, making the A
object unusable in many situations.
This hasn’t really been a problem so far because objects in the standard library which depend on other objects (like A
and B
above) are only temporary objects. Iterator
s, MutexGuard
s, Ref
, etc. are all temporaries and don’t last long.
But if you want a child object that lasts longer but still shouldn’t outlive its parent, you will very quickly face this issue.
Unfortunately this will probably lead library writers to use the second method instead of the first one, which leads to lower performances, even though there shouldn’t be any technical limitation.
Technically
Technically, when you store an object in a Box
, in an Rc
or in an Arc
, the object will stay at the same memory location even if you move its containing struct.
So there is no reason why you shouldn’t be able to write this as long as you don’t modify the object and as long as you destroy child
before container
:
struct Foo {
container: Box<A>,
child: &'lifetime? A,
}
If you never modify a Foo
object, the A
object contained in it will always stay at the same memory location, and the child
member can safely point to the A
.
You can even write this if you also ensure that Foo
is never moved:
struct Foo {
container: A,
child: &'lifetime? A,
}
There are some limitations to writing this kind of code:
- The
container
member of the struct must never be modified. You must not be able to write neitherc.container.mut_something()
norc.container = ...
- If
Foo
can be moved, then the container must be aBox
, anArc
, anRc
, or any other object whose content doesn’t change memory location whenFoo
is moved. -
child
must be destroyed beforecontainer
.
Is there a nice syntax?
The problem is that I didn’t come up with any real idea about how to make this work.
Here are some ideas that I think should help in making this possible:
-
Add a
const
modifier that can be applied to struct fields. Aconst
field can never be modified after initialization. -
Add a special
'self
lifetime that can only be used within structures. For example:struct C { container: const Box<B>, child: &'self A }
orstruct C { container: const Box<B>, child: A<'self> }
. -
Add a new opt-out trait named
Move
which indicate objects that can be moved. It is similar toSend
but much more restrictive. -
Specify clearly in the language the order in which the elements of a struct are created and destroyed. The order of creation should be the order of declaration, and the order of destruction the inverse (just like in C++). This is to make sure that the referencer is destroyed before the referenced.
-
During the initialization of a struct, each expression can access the fields that are constructed before.
But then I’m stuck. What is 'self
exactly? How would rust now what can be assigned to a &'self
? What is the relations between const
and Box
/Arc
/Rc
with 'self
?
My experiment
A quick note to say that I have been experimenting with a syntax extension that allows one to write self-referencing structs: https://github.com/tomaka/rust-self-ref
I’m currently facing some issues related to hygiene (during initialization I can’t manage to reference at the same time the structure itself and the surroundings), but apart from this the extension works.
A code that looks like this:
#[self_referencing]
struct MyStruct {
a: Vec<int>,
b: &'self_ref int
}
Will be turned into:
struct MyStruct {
a: Box<Vec<int>>,
b: Option<&'static int>
}
And this code:
self_ref_new!(MyStruct {
a: vec![0, 1, 2, 3],
b: self_ref._get_a().as_slice().get(0).unwrap()
});
Is turned into:
{
let self_ref = MyStruct {
a: box vec![0, 1, 2, 3],
b: None
};
self_ref.b = Some(self_ref._get_a().as_slice().get(0).unwrap());
self_ref
}
This is obviously very hacky, but is a small proof-of-concept that it should be possible.
Why open this topic?
I’d like to raise awareness that this lack of self-referencing structs will probably become a real issue in the future, when higher-level APIs are created.
I don’t have a solution myself, but my opinion (coming from a C++ background) is that self-referencing structs and associated items are the only two majors features that are lacking in rust right now.