Internal references as a separate type

clone is different from moving and is not guaranteed to be memcpy for non-Copy types. But yeah, no form of modification can occur when moving an object, and I agree it would be a bad idea to try to change that.

But that doesn't mean we can't have types with internal references. We already do, in the form of async fns. The trick is to prohibit moving altogether, which is currently done in a hacky way with Pin.

4 Likes

This may be an ignorant question, but I have long wondered if reference constraints in space (where an object resides) have ever been contemplated, in addition to lifetimes (?). Spitballing to illustrate what I mean, using @ to denote those constraints:

struct Foo {
    data1: Vec<i32>,
    data2: Vec<i32>,
    refd: & @self Vec<i32>,         // reference to something within the struct
    iter: Iter<'self, @self, i32>,  // location parameter in addition to lifetime
    outside: & @!self Vec<i32>,     // negative constraint: reference to outside Vec
}

fn alloca(size: usize) -> & @stack [u8] { ... }

unsafe fn transmute_ref<@loc, 'lt, T, U>(r: & @loc 'lt T) -> & @loc 'lt U { ... }

fn swap<@x, @y, T>(x: &mut @x T,  y: &mut @y T) where @x != @y { ... }

IOW, allowing location constraints to be passed along like lifetimes are currently. Useful pre-defined location constraints other than @self could be @heap, @stack, @tls, ...

This is of course just some vague idea, and I'm not sure if those could transitively be proven in a general case - or whether it'd be all that useful to begin with.

I'm not understanding why a relative pointer would prevent moving, without changing, the containing structure (by mem-coopy). I understand moving out of the struct being an issue, but why wouldn't that be the same as any other shared or exclusive reference in that you can't move something that has references?

If a relative pointer is stored as an offset and materialized on demand - then there's no issue. If the relative pointer is stored as an actual pointer, then moving the object would invalidate the pointer.

Aside: a variation of the idea to have space constraints exists in Chapel language. Chapel is an HPC language that allows to write a program that is executed over multiple localities (e.g. a cluster) and it has a similar syntax to specify the location of an expression. Though I doubt that it fits within Rust's scope.

https://chapel-lang.org/docs/primers/locales.html for more details

1 Like

I really don't think so. A "Relative" pointer would be +/- relative offset from the location in memory of the pointer. Moving the structure that contains the "Relative" pointer would not change the +/- value.

1 Like

Isn't this basically just C++'s pointer-to-member at that point? I wrote a proposal for it a while ago (which I can't find). It seems like we might as well just add "reference-to-field" and build this "self-reference" feature as sugar on top of that.

(Now I'm left wondering if there's a reasonable way to define "reference to enum field" at all...)

I don't see anywhere in this post or in the follow-up comments where it is made clear what problem this would solve. I think it might be more useful to start with a problem statement (with a code example) that states a clear goal of what is trying to be achieved. Then show how this proposal could solve that problem with a code example. Also, explain why no current feature can properly solve this problem. Only then can there be a sensible discussion on implementation.

1 Like

Not sure.. but let me try:

struct Node {..}

struct Edge<'a> {
    start : &'a Node,
    end : &'a Node,
}

struct Graph {
    nodes : Vec<Node>,
    edges : Vec<Edge<'nodes>>,
}

Or

struct Node<'a> {
    prev : Option<&'a Node<'a>>,
    next : Option<&'a Node<'a>>,
}

struct List {
    nodes : Vec<Node<'nodes>>,
}

@comex, are any of these the existential lifetimes you referred to? Couldn't find much talk about them..

P.S. above structures could be easily created when one knows the number of nodes in advance; otherwise they seem a bit painful to construct or extend..

Yeah... the current way I would do that is by storing the indices but that is basically runtime pointers and are annoying.

1 Like

If List is a Vec of Nodes, why would you create a linked list of references rather than simply storing List-array indices in the prev and next fields of each Node?

The only time i've really thought about existential lifetimes is within the context of a generalization of streaming iterator where you can iterate over objects in batches.

Where streaming iterator the values lifetime is limited to the next iteration. In theory at least with existential lifetimes you could have a lifetime which is less than the iterators lifetime, but greater than the Iterator::Item lifetime. Presumably this requires an additional function to get at one of these marker objects, before a sequence of calls to next.

My GAT foo isn't strong enough to have attempted implement it, but it's the only thing that has caused me to think about existential lifetimes before.

I don't see any difference in the example given and the use of indexes into the vector of nodes. I can't see any benefit of this proposal from the example given.

It wouldn't. There's some crosstalk going on here. Basically, @Tom-Phinney suggested that in addition to relative pointers, another viable option was having real pointers which the compiler would relocate on move. This was shot down by others; in Tom's reply, he accepted it wasn't an option, then went on directly to dismissing the need to support internal references at all. I (perhaps overeagerly) interpreted that juxtaposition as implying that relocation-on-move was required to support internal references, so I responded pointing out the alternative of disallowing moves.

Anyway, relative pointers are a valid option that preserve the ability to move; they're just extremely limited, because both the pointer and the pointee must be directly within the same buffer, without any indirection on either side. You can't have e.g. a Vec of relative pointers, or a relative pointer into a Vec (unless the pointer is in the same Vec). That's why I think other approaches to self-references are more promising.

1 Like

I didn't dismiss the need. I've seen a number of cases where a struct header has an array index or field offset to the "next" field of the struct to be processed. However, I've never encountered a case where the lifetime of the pointee differed from the lifetime of the containing structure. My reply (post 36) to post 34 in this thread attempts to make that point: the suggested need for internal references within a struct is satisfied with less space, and probably less cache thrashing, by storing and using indices rather than usize addresses.

Incidentally, that was not my suggestion; it was a reply to post 18 up-thread, where that alternative was suggested.

Well, to the extent I misinterpreted your posts, I apologize. I know relocation wasn't originally your suggestion, but you argued it was possible ("I don't see why the past has to dictate the future") and had some potential benefits ("less impactful to code generation").

Yes, that's the kind of thing I'm envisioning. In fact, you can translate the first example pretty straightforwardly to rental, and what I call 'existential lifetimes' largely boils down to "rental but without the closures".

That said, as you pointed out, there are some fundamental limitations to what this approach can accomplish. In your example:

struct Graph {
    nodes : Vec<Node>,
    edges : Vec<Edge<'nodes>>,
}

This can be safe only if nodes is immutable as long as edges exists. After all, if you remove a node, there's no way to prove to the compiler that there aren't any edges left pointing to it. Even if you add a node, Vec could reallocate and invalidate the node pointers, though this limitation can be avoided if you use a true arena data structure rather than Vec.

So arguably, there's not much benefit to using self-references here, as opposed to just storing array indices, which would let you freely mutate both arrays.

Or instead of storing indices (which carry no type safely with them) some form of type based areana pointer that can "invalidate itself" when what it is pointing to goes out of scope. Sort of like Rc without the double heap allocation (and only one "strong" reference)

Indeed, recently i've created a crate called repository for my own use.

Using it, this example can be written as

struct Graph {
     repo: Repo,
     nodes: Vec<EntityPtr<Node, Graph>>,
     edges: Vec<EntityPtr<Edge, Graph>>,
}

struct Node {..}
struct Edge {
      start: EntityPtr<Node, Graph>,
      end: EntityPtr<Node, Graph>,
}

And it seems without bothering with lifetimes, it can work quite smoothly as well.

EDIT: Fixed the link!

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.