I am reminded of a conversation that @scalexm and I were having. We were discussing how to represent types and things, and the differences between rustc and chalk. It occurs to me that we might avoid the question if we adopted the “univariant” scheme I describe below, in which case everything is just a “variable” – but having written that out, I think the downsides are large enough it’s probably best to be avoided for now.
In Rustc, right now we have three distinct region variants, all of which might conceptually be considered variables:
-
ReLateBound – bound within some enclosing type or term
- e.g., the
'x in for<'x> T: Trait<'x> or for<'x> fn(&'x u32)
- of course if you are iterating over terms you might encounter it, particularly when you “skip” through a
Binder<T> type to the T within
-
ReSkolemized – a universally quantified thing bound in the “environment”, essentially (inference context, more specifically)
- this is used when doing the ‘leak check’, but is also the subject of discussion
-
ReVar – an existentially quantified thing bound in the environment (again, inference context, more specifically)
In Chalk, we also have some distinctions, but fewer. We represent the equivalent of ReVar and ReLateBound using the same variant – “variable”, but we use debruijn indices to track the difference. That is, if you have an inference variable with index 0, then when you enter a for<'x> binder, it would be referenced with index 1:
T: Foo<'?0, for<'x> fn(&'?0 u32, &'?1 u32)>
^^^ ^^^
these are both references to inference variable zero
In Chalk, we still distinguish the “universal” case, which we represent as a kind of !1 where 1 is a universe index, but one could imagine not doing that. Instead, we could represent everything with just a debruijn index. When that debruijn index exceeds the binders in the term, it refers into the surrounding inference context, and that context would track whether it was universally bound (“skolemized”) or existentially bound (“inference variable”).
I find this appealingly simple. It has some efficiency gains (perhaps minor):
Notably, if you have a fully canonicalized term, it is essentially a “mini inference context”. We can create an inference context around it without doing any substitution of any kind (chalk is also able to do this).
But I also see some concrete downsides:
First off, you wouldn’t be able to use match and a variant to determine which of the three cases you are dealing with; you’d just get an index and you’d have to go look into the inference context to figure it out.
Second, right now we keep “things that refer into the inference context” interned and allocated in a separate memory arena, and we use the variants to figure out where things belong. The scheme breaks down if ?0 is “sometimes” bound within a global term and sometimes the surrounding infcx.