Recent change to make exhaustiveness and uninhabited types play nicer together

I’ve argued (sort-of) for !(5) in that RFC. Or at least !(5) when you have invalid data exposed and pattern-matchable through an enclosing piece of data that you’re matching on.

We added such a feature in OCaml in the last version, precisely for the purpose of better handling empty types appearing deeply nested (otherwise there is a termination issue, of how deep you do your case-analysis to test for inhabitation, and a practical issue of performance of exhaustivity checking). I think this is a very good construction because it both guides the implementation (of the exhaustiveness checker) and expresses programmer's intent ("here is the reason why this case is dead").

.

Regarding the larger discussion: it is very common in programming languages that support modularity to have a notion of "abstract type", whose definition is not known. The client cannot build a value of such a type, but that does not mean that it is not inhabited (using an empty sum type for this is a very bad idea). Once you start reasoning on type equalities (for GADTs for example), the same phenomenon occurs, there are equalities that you can prove are true (from available definitions), equalities that you can prove are false (from available definitions), and equalities that you neither prove or disprove as they involve unknown/abstract types. In general it is important to carefully distinguish "I cannot prove X" and "I can prove not(X)": the closed-world assumption that lets you confuse the two notions is anti-modular.

One of the good things with the ML module systems is that they force the language design to be honest with respect to abstract type: modules have an interface that is separated from the implementation. You need abstract types in the language as soon as you allow encapsulation, as they describe the interface of private types. Rust does not have an interface language, so it does not ask itself the question of "what is the interface of a type whose definition is private?" -- the only natural answer would be "an abstract type". If you add the extern type Foo; syntax to the language, it will be useful not only to think of types defined through the FFI, but also to think of types from other crates that are visible but whose definition are private.

7 Likes

Can't overstate how important this is to support very abstract approaches.

If one looks at the similarities between Scala's underscore and Rust's, it feels like Rust should be able to source some additional flexibility from there:

Existential types

def foo(l: List[Option[_]]) = ...

Higher kinded type parameters

case class A[K[_],T](a: K[T])
1 Like

Could you elaborate here? There's no arbitrary recursion limits anywhere in the inhabitedness-checking code but I don't see how it could go into an infinite loop. We also cache the result of the inhabitedness check but there hasn't been any benchmarks comparing it with being disabled.

I did not participate in the performance measurements, but Jacques Garrigue reported his number in the relevant bugtracker entry, PR#6437. The costs seemed to be non-neglectible, see for example (Exh is for exhaustivity checking, while Unus for analysis of unused cases):

Here are a few more times, for make library on another machine. The number indicates the "exploding depth", i.e. the number of nested GADTs one may explode for analysis. Exh(2)+Unus(2): 77.0s Exh(1)+Unus(1): 6.55s Exh(1)+Unus(0): 6.35s Exh(0)+Unus(0): 5.95s Exh(1): 6.19s Exh(0): 5.75s

Re. non-termination: a naive implementation of inhabitation will say that if the type is a tuple/product type, then it is inhabited if all components are inhabited, and if is a sum/enumeration type, then it is inhabited if at least one component is inhabited. If you have a recursive type, repeatedly applying this strategy can loop infinitely. For a regular recursive type (type nat = Z | S of nat), you can detect that you are looping (a subgoal is the same as the original goal), and terminate there (deciding inhabitedness or not depending on whether your language allows recursive values). But if the recursive type is parametrized and its definition instantiates its parameter in an non-regular way (say type 'a fulltree = Leaf of 'a | Node of ('a * 'a) fulltree), then there is no clear termination strategy. Your language design might forbid that (those types also make monomorphization difficult), so maybe the performance aspect will speak more to you.

3 Likes

Ahh… yes it’s possible to make it go into an infinite loop then. In fact, that example just turned up at least three bugs:

This type should be uninhabited but isn’t: FullTree<!> where

enum FullTree<'a, T: 'a> {
    Leaf(T),
    Node(&'a FullTree<'a, (T, T)>),
}

I’m not even sure why this doesn’t go into an infinite loop, it seems like the inhabitedness here should be (niavely) undecidable.

This type gives a weird, spurious error message:

struct Foo<'a, T: 'a> {
    node: &'a Foo<'a, (T, T)>,
}
error[E0392]: parameter `T` is never used
 --> smeg.rs:3:18
  |
3 | struct Foo<'a, T: 'a> {
  |                ^ unused type parameter
  |
  = help: consider removing `T` or using a marker such as `std::marker::PhantomData`

error: aborting due to previous error

And this type causes rustc to stack overflow (in the uninhabitedness check):

struct Foo<'a, T: 'a> {
    _ph: PhantomData<T>,
    node: &'a Foo<'a, (T, T)>,
}

Damn.

Edit: The first bug isn’t really a problem and the second one is actually expected behaviour. I’ve filed a report for the third one and put it on my list of things to fix.

2 Likes

The idea of refutation clauses in that situation is that you do not unroll type definitions indefinitely, or at all: a pattern can be omitted only when its head constructor is “obviously” impossible in this context (if it is a GADT constructor that assumes an equality that is inconsistent in the current typing environment; empty enums would qualify as well, or !). If a pattern is impossible for a reason that is only visible in depth (for example it is of a tuple type T * U where T and U are “obviously” impossible), then the user has to write a refutation clause (_, _) -> . to explicitate the fact that this pattern is impossible. We asks the user to “prove” that the rest of the patterns are uninhabited, so they tell us exactly how deep we need to unroll the definitions.

(A nice side-effect of this feature is that while, in general, a non-exhaustive pattern matching emits a warning rather than an error, a refutation clause for which the type-checker cannot ensure that the pattern is indeed empty raises a hard error. So we can now use _ -> . after all patterns as a way to assert that the pattern-matching is exhaustive, and get a hard error if the datatype definitions change in a way that breaks this assertion.)

1 Like

Doesn't Rust already deal with "abstract types" in the form of type parameters in generic code?

Yes, type parameters are also abstract types, but there are use cases for abstract types (such as exporting a type to other crates without exporting its definition) that are not well-served by parametrization. This subtlety is, for example, at the key of module system design in the ML community: in principle universal and existential quantification suffice to express type abstraction, but in practice modular programming require code structures that they do not serve very well. (See also the discussions in the Scala community about “type members” versus “type parameters” and the fairly different code structures they make comfortable.)

If crate A exports an abstract type T that is used by crate B, to emulate the same pattern using a type parameter you would have to have B be parametrized over any signature of A using a type parameter T, and instantiate this parameter with A. To my knowledge the crate language does not allow such kind of crate-level parametrization and instantiation. (And even if it did, it would be very inconvenient to use in practice.)

(To exercise these type-checking situations using type parameters instead of a new construction for abstract types, you need a language construction that takes parameter, and can internally define new types and build new expressions. Is it possible to define a type locally to a parametrized function, or a struct or trait definition?)

1 Like

I'm not entirely sure what &! being matchck-uninhabited means, but &! seems to be a violation of the requirements of a & in safe code, if ! represents something that doesn't exist. Am I missing something?

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.