The biggest reason I want to get rid of a lot of the RefCells is for clarity (though performance is a nice bonus). For the most part, the RefCells aren’t necessary anyway–there are usually just a couple of contexts to pass around and nearly all the borrows are extremely short (presumably due to fears of ICEs). That is, I think most of the disadvantages you are referring to (threading structures through, etc.) are already effectively happening, it’s just not being reflected in the type signatures.
The reason I submitted these patches was because I was having trouble figuring out what was going on due to the pervasive RefCell
usage. For someone who hasn’t worked with the codebase much, like myself, it is really hard to tell what it’s safe to modify. Even if the total number of RefCell-related ICEs these days is low, I suspect the price of that is that many parts of the compiler have to operate in very granular phases, (with mutability during each phase restricted to a couple of variables). While having distinct phases is certainly not a bad thing, it feels to me like it’s very difficult to change the phases because everything has to fit just-so. Again, I haven’t worked with the code as much as you, so I could be wrong about this, but my impression is that this has resulted in many “logical” passes having their code scattered all over the compiler, since they have to do a bit of work in each context (before the variables become sacrosanct).
The other problem I have is that RefCell
ends up being insidiously viral–people end up using it solely because the way the code is written, mutable lifetimes can’t possibly work. By adopting conservative lifetimes (&mut
), it becomes a lot easier to refactor, as well as to see whether something is actually being used. One example is the Block
arena in trans
; it turned out we didn’t need the fcx
in the blocks, and as eddyb pointed out we may not even need the arena at all. In typeck
a variable in Inherited
that wasn’t even in use. There are a lot of other little examples of things like that.
I am not familiar enough with the compiler internals to address the “right datastructure” question definitively; however, my experience has been that a combination of Cell
s and arenas are really effective for representing self-referential datastructures, which it seems like is what you’re referring to (though I wasn’t familiar with the word ivar
). However, I don’t think it’s enough by itself. For rustc, I suspect at any time you can divide a context into twoish parts: one with clear ownership and mutable collections where you are building up data (often temporary data), and a longer-lived context of mostly-immutable data that is pointer spaghetti. That is a pattern that seems to come up pretty often for me in other Rust projects, and it seems correct conceptually as well.
I agree with many of the issues people are raising (re: ugliness) but I also think it is, for the most part, correctable. A lot of the ugliness is because I am not trying to remove every RefCell in our application at once, which is why we have to carry around several. From experience, though, if you just pass around a mutable context from the getgo (particularly if you are writing methods on impl
blocks on the context), a lot of these problems go away.
The remaining issues are, IMO, not really fundamental problems with the approach; they’re indicative of missing syntax (that I’ve wanted before). Specifically: Rust doesn’t provide any convenient sugar for temporary groups of references. The behavior you want is for the mutable references in the group to be reborrowed on each call, each with a distinct lifetime, and not have to write any of the outer lifetimes out. You get basically the behavior you want if you consistently pass the references in as separate parameters, but it’s verbose (and also kind of poor as an abstraction). impl
seems like it should be the least verbose option, but does make it really annoying to split up the groups. If you bundle up the references directly into a structure, you have to include explicit lifetime parameters, and it will move out every time you call a function (assuming one of them is &mut
); if you try to address that by passing &mut
references, and you have references with different lifetimes (if one of them is immutable this is likely), you have to keep deconstructing and reconstructing the group every time you want to pass to a new function. Such sugar would be useful for a lot more projects than just rustc and (I think) would address most of the ugliness concerns; but I might be being overoptimistic about its effectiveness.