On type systems and nature of the top type

Recently, an off-topic discussion about types sprouted in comments to Allow loops to return values other than () #961. That's obviously not the place for such discussion, and I'm ashamed to have been perpetuating it, but I also want to respond to the latest reply to me, as a matter of principle. As such, I'm continuing the conversation here.

In reply to canndrew:

The top type is the type which can hold values of any other type. () can do this, but you can't access the information once you convert into it.

Emphasis mine. This is obviously false. Consider Drop types. "Assigning" a value of drop type into () will observably destroy the value and invoke all associated callbacks. If () was a top type, you'd expect any value assigned to it to be dropped when variable goes out of scope.

In languages with a structural type system, the top type is the empty structure.

You are thinking of Golang-style interfaces. In Golang, empty interface{} is a top type, because it doesn't constrain what the value can do. Rust doesn't have any analog to this, except maybe in unconstrained generics.

I applaud you for taking the initiative.

For those who didn’t see that topic, the rationale is that this would allow Option<!> to be implicitly cast to (), which would conveniently (if rather obscurely) solve another problem.

Of course, that doesn’t need () to be a “top type”, and is not a solution many people liked.

I think you are not doing justice to the complex network of claims and counter-claims that has grown around what was supposed to be a minor counterpoint.

In distilled form: Someone proposed that they think anything should implicitly coerce to (), someone else noted this would make () a de facto top type, I said that such a thing would be mathematically unsound (actually, I used the word “horrid” in place of “unsound”), and ever since then, there’s back and forth with some people claiming () IS a top type (while not advocating the initial coercing idea), and others (including me) failing to explain that it’s not true.

Obligatory note: The above description is heavily colored by my assumption that I’m right. :stuck_out_tongue:

2 Likes

On the thread, it is pointed out that () is, actually, a top type, but that we don’t actually want coercions because that’d be bad for ergonomics. The assertions are separate.

This sounds like an argument over definitions, which is the worst kind of argument.

I suggest removing the word “top” from discussion; I think chances are that everybody already agrees on all accounts that are actually meaningful.

1 Like

There’s still the people who claim let x: () = 5; should be legal code…

I thought “top type” meant “the type that is a supertype of all other types” and vice versa for bottom types. I also thought the only sub/super-typing going on in Rust was via traits, and there simply is no sub/super relationship between any two non-trait-object concrete types. So…clearly I’m waaaay off somehow. What is the sub/supertyping relation everyone’s talking about here in which () has any significance at all, much less some form of top-ness?

(tl;dr, what @ExpHP said)

1 Like

So...clearly I'm waaaay off somehow. What is the sub/supertyping relation everyone's talking about here in which () has any significance at all, much less some form of top-ness?

I believe this comes from the general notion of "initial" and "terminal" objects in category theory (which are a concept that appears in many places all over mathematics)

Without getting too abstract, these concepts correspond to, for instance:

  • False and True in the category of propositions. (where the objects are propositions, and arrows are "implies" statements)
  • Minimum and maximum in the category of a preorder. (where the objects belong to a set, and the arrows are "a <= b" relations)
  • Empty sums like data Void and empty products like () in a category such as Hask, where the objects are Haskell types and the arrows are Haskell functions.

While I know almost nothing of type theory, it seems to me that people are conflating "bottom" and "top" with the terminal/initial objects of different categories:

  • @ubsan is arguing from the category where the objects are Rust types and the arrows are functions
  • @Ixrec and @le-jzr argue from the category where the objects are Rust types and the arrows are subtyping relations.

tl;dr: I don't think () is significant at all with regards to subtyping/supertyping specifically. But it is significant in a different manner.

4 Likes

So, asking from perspective of someone with zero intuition for such category, what kind of properties would you expect from the initial and terminal objects in terms of language design?

The terminal object () has the property that, for any type T, there is a unique function of type fn(T) -> ().

What do I mean by "unique?" Well, we will declare functions to be the same if they return equal outputs for equal inputs. So when I say there is a unique function with that signature, I mean that all functions with that signature return equal outputs for equal inputs.[1]

Now suppose that:

fn f(x: i32) -> i32;
fn g(x: i32) -> ();
fn h(x: ()) -> &'static str;

The fact that () is a terminal object allows us to reason about these functions in various ways simply by looking at the types:

  • We can argue that |x| g(f(x)) is the same as |x| g(x), because both have type fn(i32) -> (), and there is a unique function of that type.
  • This lets us further argue that |x| h(g(x)) is the same as |x| h(g(f(x))), by substitution.

The initial object ! has similar properties, but in the opposite direction; for any T, there is a unique function of type fn(!) -> T. (effectively, every such function is just |x| match x { })


[1] Of course, this only works for pure, total functions; I am basing this category off of Hask.

2 Likes

I apologize, that example is terrible. It really comes down to this:

Any computation that produces () can be replaced with (), and any computation that accepts x:! can be replaced with match x { }.

These facts follow straightforwardly from the “unique function” requirements, which are the true definition of these concepts. Notice that other unitlike and voidlike types meet this criteria as well (e.g. struct MyTerminal; and enum MyInitial { })

Sounds distinctly unhelpful in a language with side effects.

Certainly! In a language like Rust, I think these sort of concepts are more useful for helping recognize certain common problems in language design, so that we can potentially apply existing solutions from other languages (after taking into account the trickier bits like side-effects). This sort of recognition is what I think helps drives changes like canndrew's efforts to make ! a type, and ticki's proposals for generic constants.

Notice that even in Haskell, this categorical description is incomplete, because Haskell is not total:

foo :: Int -> Void        -- no such function exists in the category Hask...
foo = \n -> error "oops"  -- ...yet we can define one in Haskell.

But that hardly stops anyone from using concepts derived from Hask to make cool solutions to tough design problems.


Addendum: I think I get where you're going with this, though. Yes, I would agree the categorical properties of () are especially tenuous for Rust (much moreso than totalness or the properties of !).

people who claim let x: () = 5; should be legal

... but let x: () = <expression> is how you get the compiler to tell you what the heck type it thinks <expression> has! That won't work anymore if it becomes legal.

(I'm serious, but I'm also kidding, because maybe there should be a less cryptic way to do that, instead. #[print_type] <expression> for instance.)

2 Likes

IMO it is absolutely essential that the conceptual treatment of language features expresses what the language actually is like. I.e. saying "() is X" needs to have consequences on how code behaves, otherwise such assertion is meaningless.

If we say that () may conceptually be viewed as containing arbitrary values which we just cannot unwrap (as per canndrew’s claim), that’s in direct opposition to the reality of () being zero-sized, single value type in practice. The above conceptual claim has no grounding in consequences.

Such divorce between concept and reality then easily makes formal reasoning based on those concepts completely unsound. Case in point: applying theories that consider only pure functions, as in above post, could lead one to conclude that any expression typed () can be optimized away. The concepts resemble what’s in Rust, but there’s enough difference that the conclusions are invalid.

1 Like

FWIW I feel somewhat uneasy even with having ! in the language, but I think we made that bed when we decided you could "return !" to indicate non-termination (before it was a real type), and that it was inevitable that we would have to see that decision through to making ! first class.

FWIW, having ! is one of those things in Rust that makes me think “man, I wish I had that” in other languages. Like the other day I wrote this in C#:

catch (AggregateException ex) {
    ExceptionDispatchInfo.Capture(ex.InnerException).Throw();
    throw; // unreachable, but the compiler doesn't know that
}

Unit tests end up with similar flow-sensitive analysis problems, and they end up being patched with tool-specific attributes. Having ! in the type system is awesome :slight_smile:

For 'static types, i think it’s actually Box or * mut () that works more like a top type… e.g. hold values of any other type.

On the other hand () is actually a concrete type and it doesn’t help a lot to regard it as the top type, but it helps to see it as a “empty tuple” when working with tuples.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.