Random thought of the day: "magic cell", a Cell-like interface for all data

nikomatsakis · August 24, 2016, 11:22am

So I was thinking a bit about Cell and RefCell. I often find that people are not sure of the best way to use them – particularly RefCell – and the problem is that if you use them wrong, you are likely to get panics and frustration.

Background: my philosophy on cell and ref-cell

At its most simple, my advice is usually as follows: use Cell when the data is Copy; otherwise, only hold the RefCell lock long enough to store some data in or to clone the data out and return it. If the data is too expensive to clone regularly, use an Rc or Arc so it becomes cheap. Further, you should package up all access to those fields in simple accessors so you can easily audit what is being done.
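As a rough sketch of that advice (the `Session` struct and its fields are hypothetical, invented only for illustration), each field gets small accessors, and every RefCell borrow ends before the accessor returns:

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical struct, used only for illustration.
struct Session {
    hits: Cell<u32>,                   // Copy data: a plain Cell is enough
    name: RefCell<String>,             // small Clone data: clone it out
    history: RefCell<Rc<Vec<String>>>, // large data: cloning the Rc is cheap
}

impl Session {
    fn new(name: &str) -> Session {
        Session {
            hits: Cell::new(0),
            name: RefCell::new(name.to_string()),
            history: RefCell::new(Rc::new(Vec::new())),
        }
    }

    fn record_hit(&self) {
        self.hits.set(self.hits.get() + 1);
    }

    fn hits(&self) -> u32 {
        self.hits.get()
    }

    // The borrow lasts only for the clone and is released before returning.
    fn name(&self) -> String {
        self.name.borrow().clone()
    }

    // The mutable borrow lasts only for the assignment.
    fn set_name(&self, name: String) {
        *self.name.borrow_mut() = name;
    }

    // Clones the Rc handle (a reference-count bump), not the Vec itself.
    fn history(&self) -> Rc<Vec<String>> {
        self.history.borrow().clone()
    }

    // Replaces the whole Rc; handles returned earlier keep the old Vec alive.
    fn set_history(&self, history: Vec<String>) {
        *self.history.borrow_mut() = Rc::new(history);
    }
}
```

Because no accessor hands out a Ref or runs caller code while borrowed, all the borrow sites can be audited by reading this one impl block.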

More generally, the idea is to avoid holding the lock and then doing complex things. So e.g. returning a Ref or invoking a closure while holding the lock is to be done with care, because you can’t easily audit what will be done while the lock is held. Likewise, invoking other methods on self is risky, since those methods may evolve to be quite complex and may start trying to use other fields.
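To make the hazard concrete, here is a small sketch (the `Registry` type and its method names are made up for this example). It uses `try_borrow_mut` so the conflict shows up as a `false` return rather than a panic:

```rust
use std::cell::RefCell;

// Hypothetical type for illustration.
struct Registry {
    names: RefCell<Vec<String>>,
}

impl Registry {
    // Risky: the shared borrow is held while caller-supplied code runs.
    fn for_each_locked<F: FnMut(&str)>(&self, mut f: F) {
        let names = self.names.borrow();
        for n in names.iter() {
            f(n.as_str());
        }
    }

    // Safer: clone out under a short borrow, then run the closure
    // with no borrow held.
    fn for_each_snapshot<F: FnMut(&str)>(&self, mut f: F) {
        let snapshot: Vec<String> = self.names.borrow().clone();
        for n in &snapshot {
            f(n.as_str());
        }
    }

    // Returns false instead of panicking when the cell is already borrowed.
    fn try_add(&self, name: &str) -> bool {
        match self.names.try_borrow_mut() {
            Ok(mut names) => {
                names.push(name.to_string());
                true
            }
            Err(_) => false,
        }
    }
}
```

A closure passed to `for_each_locked` that calls `try_add` gets `false` back (with plain `borrow_mut` it would panic), while the same closure passed to `for_each_snapshot` succeeds, at the cost of one clone of the list.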

Note that this effectively models what you can do in most GC-languages, like Java or OCaml, where you can’t take the address of a field, just load from it or store to it. This is no accident, I think, since aliased-mutability is so omnipresent in those languages. (IOW, every field in a Java class is effectively a Cell<Gc<T>>, where Gc<T> is some Copy wrapper type for Gc-managed memory.)

A “magic cell” that only offers get/set

This leads me to an interesting thought. I think that the Cell type, with its get and set APIs, represents a better interface than RefCell. It avoids the danger of holding the lock too long and I think is just generally more intuitive. Unfortunately, it is only applicable to Copy types.

What if we offered a single type, let’s call it MagicCell, that supports the Cell API (i.e., get and set) but works for all Clone data? I was thinking it would use specialization so that, internally, it uses a Cell if the data is Copy and a RefCell otherwise. We can then promote this type as the Preferred Way to handle aliased-mutability (that is, market RefCell as a kind of specialty tool for more advanced scenarios).
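A minimal sketch of the get/set surface, assuming stable Rust without specialization: this version always wraps a RefCell, so it does not include the proposed Copy fast path. The `MagicCell` name comes from the post; the method bodies are only one possible way to write it.

```rust
use std::cell::RefCell;

// Name taken from the post; this is not a standard library type.
// Assumption: no specialization, so a RefCell (and its flag) is always used.
struct MagicCell<T> {
    inner: RefCell<T>,
}

impl<T: Clone> MagicCell<T> {
    fn new(value: T) -> Self {
        MagicCell { inner: RefCell::new(value) }
    }

    // Clone the value out; the borrow ends before this method returns.
    fn get(&self) -> T {
        self.inner.borrow().clone()
    }

    // Move the new value in. The old value is dropped only after the
    // mutable borrow taken by `replace` has ended.
    fn set(&self, value: T) {
        let old = self.inner.replace(value);
        drop(old);
    }
}
```

In this sketch the flag only matters if `T::clone` itself calls `set` on the same cell, in which case `replace` panics; callers can never hold a borrow across their own code.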

What we could even do, which would probably be better, is to just make Cell itself be this “magic” type. IOW, all existing uses of Cell continue to work (and have no space overhead), but you can now also use it for Clone data (at the cost of a flag).

(We might then want to have a variant, ReflessCell, that guarantees no flag, but works as Cell today, except that it should also support a swap method. This could be used if you really want to optimize.)

dan_t · August 24, 2016, 11:48am

Hmm, I can see the convenience of MagicCell, but I’m asking myself when would I use big data - which I somehow connect to Clone types - that’s mutable and gets copied on each mutation?

I can see how code using MagicCell is easier to write, but it will most likely also make it easier to hide potential performance problems.

dan_t · August 24, 2016, 11:50am

And the copying would have to be done twice, for get and set, right?

DanielKeep · August 24, 2016, 12:16pm

This would be my concern. It feels like a serious performance footgun: anyone who can't correctly pick between Cell and RefCell (where we can just say "try Cell, falling back to RefCell if that doesn't work") seems unlikely to correctly pick between MagicCell and RefCell, which requires them to now also consider Clone performance.

What about using specialisation to have a RefCell-like type which, if it's used with a Copy type, also gains get, set, and loses the flag?
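One way to approximate the get/set half of this on stable Rust is an extension trait, sketched below. The `CopyAccess` trait is hypothetical (std's RefCell has no such methods), and this version keeps the borrow flag; dropping the flag for Copy types would need specialization or a separate type.

```rust
use std::cell::RefCell;

// Hypothetical extension trait; not part of std.
trait CopyAccess<T> {
    fn get(&self) -> T;
    fn set(&self, value: T);
}

// Only RefCells holding Copy data gain the Cell-style methods.
impl<T: Copy> CopyAccess<T> for RefCell<T> {
    // Copy the value out under a borrow that ends immediately.
    fn get(&self) -> T {
        *self.borrow()
    }

    // Overwrite the value under a mutable borrow that ends immediately.
    fn set(&self, value: T) {
        *self.borrow_mut() = value;
    }
}
```

With the trait in scope, a `RefCell<i32>` accepts both `c.set(c.get() + 1)` and the usual `borrow_mut()`, while a `RefCell<String>` only offers the borrowing API.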

dan_t · August 24, 2016, 12:56pm

For performance-sensitive folks, it’s really great marketing to be able to tell them there’s no implicit expensive copying done in Rust: only cheap moves are implicit, and expensive copies are always explicit with clone.

nikomatsakis · August 24, 2016, 1:21pm

Just on calls to get. Basically a call to get() would be defined as copy the data out. I can see the concern about the footgun, but then I think it has to be balanced against the (very real) footgun represented by the more general interface of RefCell. (Moreover, I think this footgun, to a large extent, already exists – after all, if your data implements Clone, then it’s easy enough to invoke it without giving it a lot of thought.)

In any case, it seems clear that the thing to do is to prototype the idea in a crates.io package first, regardless.

nikomatsakis · August 24, 2016, 1:22pm

Note that nothing implicit would be happening here. There would always be an explicit call to trigger the clone.

dan_t · August 24, 2016, 1:52pm

Yes, you’re right, the get call would be implicit and the set could just move into MagicCell. Perhaps having clone be part of the get method name in some way might be a good idea.

Yes, you can always call clone without much thinking, but being able to design your data structures with less thinking seems somehow more dubious and, in the long run, more harmful.

eddyb · August 24, 2016, 5:55pm

Only marginally related, https://github.com/rust-lang/rfcs/issues/1106 has a description of a “magical” analysis that statically prevents what would be a panic in RefCell::borrow_mut, but without any flag overhead.

nrc · August 24, 2016, 11:14pm

Perhaps it doesn’t replace RefCell, but complements it, i.e., you have Cell for Copy data, MagicCell (I would suggest CloneCell) for Clone data (and which is optimised as you suggest for Copy data), and RefCell for data which is not Clone. I.e., we add another point on the spectrum for Cell things?

Alternatively just do what you describe for Cell if it is back compat, and then Cell is the copying/cloning thing and RefCell is the ref’ing thing. I.e., a better Cell for Clone data, rather than a Cell-like interface for all data.

nrc · August 25, 2016, 12:53am

cc https://github.com/rust-lang/rfcs/pull/1651

Manishearth · August 25, 2016, 9:55am

FWIW, the ‘mitochondria’ crate exists and used to have CloneCell.

We could implement it there if the rfc isn’t accepted.

arielb1 · August 25, 2016, 10:57am

BTW, I am quite sure that <Rc<T>>::clone is pure enough that a CloneCell<Rc<T>> does not need a flag - I used to think of a RcCell<T> (but to emulate Java, we also need a RefCell<Rc<Option<T>>>).
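For illustration, a flag-free cell for Rc can also be written in entirely safe code by swapping in a different technique: keep the Rc in a `Cell<Option<Rc<T>>>`, take it out, clone it, and put it back. This is not the design described above, only a sketch of the same outcome, and the `RcCell` definition here is hypothetical.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type; a safe stand-in for the flag-free idea.
struct RcCell<T> {
    slot: Cell<Option<Rc<T>>>,
}

impl<T> RcCell<T> {
    fn new(value: Rc<T>) -> Self {
        RcCell { slot: Cell::new(Some(value)) }
    }

    fn get(&self) -> Rc<T> {
        // The slot is empty only between `take` and `set` below.
        // Rc::clone runs no user code, so nothing can observe the gap.
        let rc = self.slot.take().expect("slot is only empty inside get");
        let copy = Rc::clone(&rc);
        self.slot.set(Some(rc));
        copy
    }

    fn set(&self, value: Rc<T>) {
        // Cell::set swaps the new value in before dropping the old one.
        self.slot.set(Some(value));
    }
}
```

Since `Option<Rc<T>>` uses the null-pointer niche, this cell is the same size as a bare `Rc<T>`, so no separate flag is stored.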

nikomatsakis · August 25, 2016, 10:56pm

I only called it "magic cell" for clarity -- my preference would indeed be to "upgrade" cell in place and just recommend it as the most readily understood model (RefCell being more powerful, but also easier to get wrong).

I think what you're saying is that we could use specialization to avoid the flags even for some non-copy types? If so, that seems true.

rkjnsn · September 15, 2016, 11:51pm

Would it make sense to have an unsafe trait so types could opt in to not needing a flag if they meet the requirements?
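A sketch of what such an opt-in might look like, with every name hypothetical (`PureClone`, `FlaglessCell`) and the safety contract being my assumption about "the requirements", namely that `clone` never touches the cell holding the value:

```rust
use std::cell::UnsafeCell;
use std::rc::Rc;

/// Hypothetical marker trait.
/// Safety: `clone` must not access the cell that holds the value being cloned.
unsafe trait PureClone: Clone {}

// Rc::clone only bumps a reference count and runs no user code.
unsafe impl<T> PureClone for Rc<T> {}

// Hypothetical cell with no borrow flag.
struct FlaglessCell<T> {
    value: UnsafeCell<T>,
}

impl<T: PureClone> FlaglessCell<T> {
    fn new(value: T) -> Self {
        FlaglessCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: UnsafeCell is !Sync, so only this thread can reach the
        // value, and the PureClone contract rules out a re-entrant `set`
        // while this shared reference is live.
        let current: &T = unsafe { &*self.value.get() };
        current.clone()
    }

    fn set(&self, value: T) {
        // SAFETY: no reference into the cell is live here, because the one
        // created in `get` ends when `clone` returns.
        let old = unsafe { std::ptr::replace(self.value.get(), value) };
        // The old value is dropped after the new one is in place, so a
        // destructor that re-enters the cell sees the new value.
        drop(old);
    }
}
```

The soundness of this sketch rests entirely on implementors honoring the contract, which is why the trait is `unsafe` to implement.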

