There can be no references simultaneously with a mutable reference. But this is far too restrictive:
There is a frequent middle case like pushing onto a Vec. Any references (even mutable) to earlier parts of the vector would not be affected. Thus they could coexist. Same goes for inserting new keys into a HashMap. Only simultaneous references to the whole structure/container would be incompatible.
I suggest a separate keyword to denote this, like fn push(&grow self, value: T). This would be a guarantee that the method doesn’t alter existing elements or their keys/indeces.
This would apply recursively to all members of a struct. E.g. I need to access data both by index and key. So I have a struct of Vec<(Key, T)> and HashMap<Key, usize>. If a reference to this struct is &grow, that would imply that I may only call &grow methods on the members.
And such a reference could only be taken from a mut or &grow.
Pushing onto a Vec could cause the entire collection to be reallocated, thus invalidating all references to other parts of the slice. Same goes for inserting into a HashMap.
Another complication is that something like Vec::<T>::get(&'a self, ix: usize) -> Option<&'a T> is interpreted by the borrow checker in such a way that the &'a T reference to an item is essentially the same as the borrow &'a self of the whole Vec. So with regards to this statement
Only simultaneous references to the whole structure/container would be incompatible.
there is currently no meaningful distinction between “references to the whole structure/container” and “references only to items inside the structure/container, not to the whole structure/container”, at least as soon as any function boundaries are involved. (Borrow checking works locally and would analyze a call to Vec::getonly based on its signature.)
So one would simultaneously need to introduce some new way to mark Vec::get, too.
At least for something like Vec::push_within_capacity, reallocations could be ignored though. Much more niche use-case, but it’s at least something to discuss without needing to worry about other approaches to tackle the reallocation issue.
(Regarding tackling the reallocation issue, in principle, at least references into the items, behind additional levels of indirections should still be unaffected even for ordinary Vec::push. I mean a reference to a T in a Vec<Box<T>> for example.)
Also for linked lists this would then work. (Even though fixing the links is technically changing the old part of the list, so would maybe need to be done unsafely.)
Interesting data type. Alas it doesn’t allow access other than through a mut iterator. So though it solves the memory no go, it still doesn’t give the desired feature.
The idea with typed-arena is that the alloc method gives back a &mut T to the pushed item, which is supposed to serve as the main way to get (and keep) access to the item. That’s alloc(&'a self, T) -> &'a mut T (notably &'a self is an immutable borrow, so this returned &'a mut T does not conflict with any further immutable borrows of the same arena).
I had read about Box<str> (which still mystifies me a bit, as I had originally misunderstood that I can not have a non-static str.)
I was struggling badly with lifetimes here. If I have a reference in a struct, I must put the same lifetime on both, meaning the struct can only live as long as whatever outer thingy its inside is referencing. But here the reference is from one inner container to another while the struct’s lifetime should not be restricted. I couldn’t figure out how to express that. Something like:
struct X {
a: Vec<Box<str<'a>>>,
b: Vec<&'a str>,
}
Now to get around not being able to reference into my container – on a whim I tried Rc<str>. To my delight it works great, AFAICT totally transparent! And it takes only 16 bytes of overhead, 8 bytes less than String.
That's what's called a "self-referencial struct". Rust doesn't have a useful way to express that with lifetimes. There are a few crates that try to make it possible to create useful self-referencial structs; they do so with unsafe and it is very hard to do soundly. Most or all such crates have at least some niche unsound use cases.
And for the correctness or soundness, please checkout my post:
@steffahn gave some suggestions, inspiring me to apply the design pattern I recently found (I haven't found a good enough name for it) to meet your requirements.
Hey Tenny, thanks for this! It indeed looks like what I was asking for. I haven’t thought about it long enough yet, to really get it.
Two big limitations are the unsafe, which I hardly want in my code, and the within capacity. I totally get that part, but it’s no fit for my initially unknown data size.
While the original post's argument is flawed, I think there is a kernel of truth: sometimes an object can be mutable in different ways, some of which will not invalidate existing references, and it can be annoying when the compiler can't tell which kind of mutability it's looking at. For instance, my_vec[0] = foo; requires mut my_vec but cannot reallocate, and yet the compiler doesn't distinguish it from the kind of mutability that can invalidate the other references (like my_vec.push(bar)).
I'm having trouble coming up with a concrete example where this is annoying, but I know I've run into it before with HashMaps before. I was trying to hold references to the (mutable) HashMap’s values, but couldn't because I was also mutating the HashMap (but not in a way that would invalidate any references). I had to replace the non-Copy values with a usize index into a Vec that stored the actual values I cared about.
It's not clear to me that different "kinds" of mutability (or any language level change) is required here: the sought functionality in a particular case can usually be provided by the library once the right abstraction/API has been identified.
Under the hood it'll (ultimately) be implemented with unsafe code, and it may also be possible to implement such downstream without waiting for the library to provide a new/safe API. Indeed, such downstream implementations are a common way to experiment before solutions are adopted in the standard library.
You may well find that hashbrown (which is the implementation underlying the standard library's HashMap but which contains fuller and lower level APIs) provides APIs upon which a solution to your problem can be built.