impl<T> Clone for Cell<Rc<T>>

So with the recent expansion of Cell to move-based operations, we decided to leave potential further expansions to operations that require running “user code” to future work.


My thought process

I’d been idly musing since then that the right way to think about this is that &Cell is effectively a “third kind of reference” alongside & and &mut. Cell itself is opaque: every operation on Cell<T> is in terms of &Cell (except for those which convert between it and plain T), you can’t really do anything separately with the & and Cell parts of it, they act as a single unit. The core requirement for memory safety is that you can’t simultaneously have a mutable reference to a memory structure together with a reference to something inside of it (because mutation of the structure itself can invalidate the interior reference), and &, &mut, and &Cell each impose a different set of restrictions to ensure this:

  • & says you can have as many outer and inner references as you like, but you can’t mutate.

  • &mut says you can mutate, and you can take inner references, but only one reference can be active at a time.

  • &Cell says you can have any number of outer references and you can mutate, but you can’t take inner references at all.*

So essentially, multiple references, inner references, mutation: choose any two, with the above being the 3 possible answers.

Basically Cell is a fundamental language primitive masquerading as a standard library type.

All of this is a longwinded way of getting around to the point that I think "extensions of Cell" to provide further operations for specific types logically belong in the modules for those types, rather than in std::cell.

For instance, I’d been thinking that you could provide Clone for Rcs and Arcs inside Cells by having methods such as Rc::clone_cell(&Cell<Rc<T>>) -> Rc<T>. Or if we wanted to make that more general, we could have some kind of trait CellClone { fn clone_cell(&Cell<T>) -> T }, which is a variant of Clone where the self pointer type is &Cell.

All of that makes sense but it’s not really interesting or worth posting about.


The actual idea

So today I was randomly thinking about this and wondering if there wasn’t a cleaner way to do it, and I noticed that you could just write “T: CloneCell” as Cell<T>: Clone instead, which is functionally equivalent, and we can provide these impls:

impl<T> Clone for Cell<Rc<T>>
impl<T> Clone for Cell<Arc<T>>

This is equivalent in terms of expressivity to clone_cell from above, because Clone returns an owned Cell, and if you want to access the Rc or Arc directly you can just call get_mut or into_inner.

I think this is quite nice because it’s extremely straightforward, doesn’t involve any actual new API surface so there’s nothing to bikeshed over, and my impression is that being able to do clone()s of Rcs and Arcs inside Cells is like 80% of the Cell pain point, and it solves that. I think it’s also forwards-compatible should we want to generalize it to more types later.

This doesn’t solve the “whole problem” because you still can’t clone, e.g., a Cell<(Rc<T>, Rc<T>)>, not to mention other operations besides Clone, but we’re no closer to solving the general problem (or agreeing on a solution) than we were before, and there’s little cost and significant gain to providing these impls in the meantime, so I think we should just go ahead and do that.

5 Likes

I feel that there still is a qualitative difference between & and &Cell: Given a description of the invariants of T, it is in general not possible to derive the invariants of &T. For example, T and Cell<T> (in their "fully owned" variant) have exactly the same invariants; this is also witnessed by the fact that there are functions to convert between the two, and the functions are no-ops. On the other hand, &T and &Cell<T> are clearly very different types. However, given a description of T, it is actually straight-forward to say what the invariants of Cell<T> and &Cell<T> are.

Of course, interior mutability does extend the power of the language qualitatively, so in some sense Cell is no less fundamental that &. However, this also goes for Mutex and really any other unsafely implemented type that couldn't also be implemented entirely in safe code. It is perfectly conceivable that there are some specific T that could provide extra operations on Mutex<T>.

Fully agreed. :slight_smile: Cell has to be aware though, as this means the other types rely on the internal invariants of Cell, and not just its public interface.

Wow, I didn't realize this. Nice! This relies on us knowing the implementation of Rc::clone, and the fact that it could not possibly use another alias of the &Cell, right?

Hm, I wonder how hard it would be to prove these sound in our model...^^

I'm not sure whether I correctly understand what you mean by "have the same invariants", but if I do, I'm not sure this is enough to show that they do. Real equality (if "same invariants" is an appropriate stand-in for "equality" here!) would mean equality in any context, that is, not just first-order T <-> Cell<T> no-op conversions, but Foo<T> <-> Foo<Cell<T>> given any choice of Foo. But we can't allow that because it'd give us &T <-> &Cell<T> which'd violate the invariants of both.

I think that's just a nitpick, though: it's not clear to me what the broader thrust of your argument is and whether it's parrying the point I had intended to make. (Quite possibly it is and I'm just not understanding it well enough!) Another way of saying what I was trying to say is that instead of Cell, we could have just a CellRef type (or &cell, or &sharedmut, or whatever), with &'a mut T -> CellRef<'a, T> as its only constructor, being Copy and otherwise having an analogous interface to &Cell, and it would be equivalent in terms of expressiveness. EDIT: Bah I misremembered my own argument! :slight_smile: You do still need a Cell type, it just doesn't have any other API besides &Cell<T> -> CellRef<'a, T> (so that has two constructors, not one). Basically the only purpose of Cell itself is as an opaque barrier you can't take internal references into.

Yes, basically the idea is that the implementation of {Rc,Arc}::<T>::clone is (a) independent of T ("parametric") and (b) guaranteed not to traverse some memory cycle and end up mutating the original Cell.

So there hasn’t been much response (my discussion with @RalfJung is interesting but tangential to the main point). Given that this is "just an impl" would this need an RFC, or just a PR?

In what context would an Cell<Rc<T>> be used? It seems a bit similar to ArcCell but !Sync and has a more verbose API - instead of thing.get() you need (I think?) thing.clone().into_inner() which seems like it may be trying too hard to reuse types.

@SimonSapin had a use case in this comment: Extend Cell to non-Copy types by Amanieu · Pull Request #1651 · rust-lang/rfcs · GitHub

Two points. First off I'm not actually sure whether you'd more often want the return type to be Rc<T> or Cell<Rc<T>>, though it's definitely plausible that it's the former. The second is that even if we had a separate convenience function to return Rc<T> directly, given that you can impl Clone using it, it would be kind of strange if we did not also provide that. So given that it seems like we'd want impl Clone either way, we might as well add that first, given that it requires essentially zero design work, unlike the alternatives.

Can anyone say whether or not this is RFC material?

In that project I have a custom MoveCell which could now be replaced with std::cell::Cell. I’d need extension traits for these extra methods, but in the meantime I have:

impl<T> MoveCell<Option<T>> {
    pub fn is_none(&self) -> bool { … }
    pub fn take(&self) -> Option<T> { … }
}

impl<T> MoveCell<Option<Rc<T>>> {
    /// Return `Some` if this `Rc` is the only strong reference count, even if there are weak references.
    pub fn take_if_unique_strong(&self) -> Option<Rc<T>> { … }
}

impl<T> MoveCell<Option<Weak<T>>> {
    pub fn upgrade(&self) -> Option<Rc<T>> { … }
}

impl<T> MoveCell<T> where T: WellBehavedClone {
    pub fn clone_inner(&self) -> T { … }
}

pub unsafe trait WellBehavedClone: Clone {}
unsafe impl<T> WellBehavedClone for Rc<T> {}
unsafe impl<T> WellBehavedClone for Weak<T> {}
unsafe impl<T> WellBehavedClone for Option<T> where T: WellBehavedClone {}

See this explanation for why WellBehavedClone is needed.

Each of these methods uses UnsafeCell::get in an unsafe block.

This clone_inner method doesn’t fit the signature of the Clone trait (since it doesn’t return a Cell). But more generally, this shows that there is a number of operations that could be safely done with something inside Cell without moving it out and then back in (which requires leaving some default value behind in the meantime, and may or may not be optimized away in release mode). But these operations have to be carefully vetted to not break the safety of Cell. I don’t know of a good way to generalize this pattern without many new methods that need to use unsafe in their impl.

If I understand correctly, you also need support for cloning Cell<Option<Rc<T>>, so merely adding the Clone for Cell<Rc<T>> impl I suggest would not be enough?

I guess this idea isn’t really worth pursuing in this form then?

This particular project wouldn’t happen to use Clone for Cell<Rc<T>> (even if we ignore the fact that that returns Cell<Rc<T>>, which seems less useful than returning Rc<T>). Maybe other projects would, I don’t know.

To me it feels like too much of a special case: why Rc<T> but not Arc<T>, or Vec<Rc<T>>, or many other standard library types? Why clone but not Option::is_none, or many other operations? I think that something that goes into std should have a principled mechanism that can work with many types and operations, not a bunch of random one-offs.

Unfortunately I don’t have a suggestion for such mechanism that isn’t a foot-gun. The memory-safety of Cell<T> can be a bit subtle, when you start playing with &*cell.as_ptr().

With Rc there’s not much problem with doing rc.clone().is_none()

rc.clone().is_none()

Is the increment-then-decrement optimized away? Is it with Arc?

The more general concept here seems to be "method that does not, ever, touch (read to or write from) any Cell"; let's say such a method is "Cell-free". Such a method is fine to be called directly on a Cell, without copying anything out first. Rc::clone and Arc::clone are Cell-free... well, kind of. They actually do use Cell internally, but that Cell is never exposed to the user. Still, this shows that the proposed API is subtle; the Cell in Rc is now magically "different" from other Cells in that Cell-freeness is fine with this particular Cell being modified, while OTOH, Rc must not call a method on Cell just because that method is Cell-free. Vec<T>::clone and Option<T>::clone are Cell-free if T::clone is; similar for many other containers. Option::is_none is Cell-free.

Funny enough, there's actually a way Cell-freeness makes sense in my formal model; however, I don't see a way to encode this in the Rust type system -- there doesn't seem to be a way to mark a method as having some property (as opposed to marking types, which can be done with marker traits).

However, the proposal of using rc.clone().is_none() confuses me; how is that useful for Cell<Option<T>>? No Rc involved... I must be misunderstanding something.

A use case for constraint kinds :slight_smile: (that is, if the methods are encapsulated in a trait, we could apply a marker trait to T: Clone instead of a type)

1 Like

This would work, but as you noted it excludes Rc::clone. I think the real criteria for safety of manipulation of data inside Cell is “does not ever touch this particular cell again”, which is much more subtle to express systematically. Perhaps it helps to reason about reference cycles? Since a cycle would be needed to reach that cell again.

Yes, .clone().is_none() would be unsound with an arbitrary T in Cell<Option<T>>, it was only proposed for Cell<Option<Rc<U>>>.

FWIW, one conservative bound we can currently write that (I think!) ensures that no Cells will be accessed is Sync. Interestingly, this is sufficient for a whole bunch of things except for Rc<T> which was the motivation for the idea in the original post! But it’d work for, for example, Option and Vec.

I think you could safely have something like:

fn with_cell_contents<T, F, Z>(this: &Cell<T>, f: F) -> Z
   where F: FnOnce(&T) -> Z + Sync

in other words it is safe to pass a reference to the contents of the Cell to a function which is (because Cell is not Sync) guaranteed not to close over any Cells, and therefore, presumably, cannot have closed over the Cell whose contents you are giving it a reference to, and so cannot invalidate it with a set() or replace().

F: Sync only prevents cells from being captured by a closure, right? There could be other ways to reach the same cell. (Globals, a cycle from &T, …)

Oh, yeah, you’re right that F: Sync doesn’t imply T: Sync which I had for some reason assumed. But that’s easy to fix by just adding T: Sync if it’s necessary. I’m not sure about globals… you aren’t really ‘supposed’ to put Cells in globals anyway because of the possibility of multi-threaded access (i.e. it’s not Sync)? Can you come up with an example which would be “otherwise sound”, but become unsound if you add with_cell_contents?

You can certainly synchronize access to non-Sync globals, e.g. by locking (though in general you'd need a re-entrant lock to be able to produce multiple references in the same thread) or with thread locals (not a "global" in every sense of the word, but close enough to break with_cell_contents).

ISTR something to the effect that recursive mutexes are unsound in Rust as-is, precisely because you can get multiple &mut references out?

But yeah, thread locals seem like a problem :o( dang thread locals, always messing things up.

Oh, right, a re-entrant lock would have to hand out shared references, not mut references (cf. RwLock, but unlike RwLock it would not allow readers from multiple threads and could thus contain non-Sync data). There’s no such thing in std, and it may not be very useful, but it seems pretty obvious sound (and would still break with_cell_contents).

1 Like