Pre-RFC: Type system seeing numbers as trait implementors

I would like to put together an RFC in the near future about encapsulating the integer set within the type system. Those familiar with the typenum crate can probably skim through the next part.

The current standardised way of working with numbers within the type system is to use const generics. This is highly valuable when you need a value that is defined at compile time but available for use at runtime. Case in point: when using arrays, you may need to be generic with respect to size, yet monomorphised to a specific length: const LEN: usize.
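As a minimal sketch of the array case just described (the function names `sum` and `len_of` are made up for illustration), a const generic parameter is generic at the definition but fixed to one length per instantiation:

```rust
/// Sums an array whose length is a compile-time constant.
/// `LEN` is generic here, but each call is monomorphised to one specific length.
fn sum<const LEN: usize>(values: [u32; LEN]) -> u32 {
    values.iter().sum()
}

/// Returns the length, showing that `LEN` is also usable as an ordinary runtime value.
fn len_of<const LEN: usize>(_values: &[u32; LEN]) -> usize {
    LEN
}
```

Calling `sum([1, 2, 3])` instantiates `sum::<3>`, and `len_of(&[0u32; 4])` returns the length as a plain `usize`.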

This language feature has its limitations, and the many nightly features that address various aspects of its limitations act as one way of illustrating this.

Here is my current draft thesis statement:

If you abandon the notion of an integer being a primitive value, and replace that thinking with an integer being a marker type that implements traits, it opens the door to exciting possibilities.
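To make the idea concrete, here is a minimal std-only sketch of integers as marker types. It uses a simple Peano encoding rather than typenum's binary one, and every name in it (`Zero`, `Succ`, `Nat`, `AddNat`) is hypothetical:

```rust
use std::marker::PhantomData;

/// Zero, as a type with no runtime data.
struct Zero;
/// The successor of `N`, i.e. `N + 1`.
struct Succ<N>(PhantomData<N>);

/// Implemented by every type-level natural number in this sketch.
trait Nat {
    /// The value, for when a machine representation is actually needed.
    const VALUE: usize;
}
impl Nat for Zero {
    const VALUE: usize = 0;
}
impl<N: Nat> Nat for Succ<N> {
    const VALUE: usize = N::VALUE + 1;
}

/// Type-level addition: `<A as AddNat<B>>::Output` is the type for `A + B`.
trait AddNat<Rhs> {
    type Output: Nat;
}
/// 0 + rhs = rhs
impl<Rhs: Nat> AddNat<Rhs> for Zero {
    type Output = Rhs;
}
/// (n + 1) + rhs = (n + rhs) + 1
impl<N, Rhs> AddNat<Rhs> for Succ<N>
where
    N: AddNat<Rhs>,
{
    type Output = Succ<<N as AddNat<Rhs>>::Output>;
}

type One = Succ<Zero>;
type Two = Succ<One>;
/// Computed entirely by the trait solver.
type Three = <One as AddNat<Two>>::Output;
```

Here `<Three as Nat>::VALUE` evaluates to 3, but nothing in the addition itself refers to usize or any other primitive.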

For example, consider a scenario where you want to constrain a method in terms of numbers, such that A < 6 and A > B, and B > 2, but you also want to multiply them and divide by 8, rounding up. This may appear contrived, but it comes from actual needs. Thinking of the needed math in terms of values, it might look like this:

fn x<A: < 6 && > B, B: >2>() ->  (A * B + 7) / 8 {
    todo!()
}

The solution to this, at present, is to use the typenum crate, where it would look a little like this:

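use core::ops::{Add, Div, Mul};
use typenum::{op, IsGreater, IsLess, True, Unsigned, U2, U6, U7, U8};

fn x<A, B>() -> op!((A * B + U7) / U8)
where
    // B < A < 6, and A * B must be defined
    A: Unsigned + IsGreater<B, Output = True> + Mul<B> + IsLess<U6, Output = True>,
    // B > 2
    B: Unsigned + IsGreater<U2, Output = True>,
    // each intermediate result must support the next operation
    op!(A * B): Add<U7>,
    op!(A * B + U7): Div<U8>,
{
    todo!()
}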

To wit:

  • A can be any type, but it must be a natural number that implements the < 6 trait such that its output is true
  • A must also implement the > B trait with the output being true
  • it also must implement the * B trait, and that trait's associated type Output must also implement the + 7 trait
  • the + 7 impl must have an output which itself impls the / 8 trait
  • the return type is the output of that / 8 trait
  • B must implement the natural number trait
  • B must also implement the > 2 trait with the output being true
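The "comparison as a trait whose output is true" pattern in the list above can be sketched without any dependency. This is a simplified Peano-style stand-in, not typenum's actual implementation; `IsLess`, `True`, and `False` here are local hypothetical definitions that only mirror the names used above, and `below_three` is an invented example function:

```rust
use std::marker::PhantomData;

struct Zero;
struct Succ<N>(PhantomData<N>);

/// Type-level booleans.
struct True;
struct False;

trait Nat {
    const VALUE: usize;
}
impl Nat for Zero {
    const VALUE: usize = 0;
}
impl<N: Nat> Nat for Succ<N> {
    const VALUE: usize = N::VALUE + 1;
}

/// `<A as IsLess<B>>::Output` is `True` exactly when A < B.
trait IsLess<Rhs> {
    type Output;
}
/// 0 < 0 is false.
impl IsLess<Zero> for Zero {
    type Output = False;
}
/// 0 < m + 1 is true.
impl<M> IsLess<Succ<M>> for Zero {
    type Output = True;
}
/// n + 1 < 0 is false.
impl<N> IsLess<Zero> for Succ<N> {
    type Output = False;
}
/// n + 1 < m + 1 exactly when n < m.
impl<N, M> IsLess<Succ<M>> for Succ<N>
where
    N: IsLess<M>,
{
    type Output = <N as IsLess<M>>::Output;
}

type Two = Succ<Succ<Zero>>;
type Three = Succ<Two>;

/// Only callable with type-level numbers strictly below three.
fn below_three<N>() -> usize
where
    N: Nat + IsLess<Three, Output = True>,
{
    N::VALUE
}
```

`below_three::<Two>()` compiles and returns 2, while `below_three::<Three>()` is rejected at compile time because the comparison's Output is `False`.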

This is all done without having to think of machine representation (usize or any of the uNs). It works since stable 1.37.0, and looks like it would work with the first stable 2018 edition release (1.31.0) but for 2 compiler errors: Self constructors (issue #51994), and passing an extern crate without --extern (issue #53130).

Why not just use typenum as a dependency?

This crate is particularly novel, in that its sole focus is to implement a language design feature. It has more in common with GATs, const generics and compile time utilities that work with types (such as the orphan rule checker), than it does with code that lives in the nursery. I believe that in order to fully realise the potential value that this approach can bring to the language, the compiler itself would need to see numbers from this "a set of types that implements traits" perspective directly.

  • using typenum::U7 when the constraint is < 6 spits out an unintelligible error. Tools exist to address this, but solutions should ideally be made upstream.
  • though an extremely clever implementation of functional programming type theory, the inherent nature of it puts a lot of weight on the compiler. Integrating it into the type system would allow the source of this burden to be bypassed.
  • bridging between types and const generics is non-trivial in many respects. Typenum often eases this bridging, and with nightly features, this accounts for many use cases. Numbers as types baked into the language is likely to make the level of complexity purely a function of the complexity of the solution, and be eased, rather than exacerbated, by the language and its standard tooling.

Before I publish a formal RFC, I would like to hear out what people have to say on this. I'm particularly interested in hearing from those that have some involvement in the work on const generics, GATs and other initiatives that endeavour to improve the accessibility and utility of cool type wizardry.

I can't be 100% certain, but this code snippet looks like any grammar recognizing it could be fairly ambiguous, due to the overloaded use of < and > both as delimiters within which generic variables and bounds can be introduced, and as comparison operators.

Such language-level ambiguity has already caused issues within the Rust language, and is in fact the origin of the turbofish operator syntax (i.e. it's why we write .collect::<Vec<_>>() rather than .collect<Vec<_>>()).
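For readers less familiar with the turbofish, a small illustration (the function `squares` is made up for this example):

```rust
/// Collects the squares of 1..=n, naming the target type with the turbofish.
fn squares(n: u32) -> Vec<u32> {
    // `::<Vec<_>>` tells the parser that `<` opens a generic argument list
    // rather than starting a less-than comparison on `collect`.
    (1..=n).map(|i| i * i).collect::<Vec<_>>()
}
```

Without the leading `::`, the expression `.collect<Vec<_>>()` would begin parsing as a comparison against `collect`, which is the same kind of ambiguity the bound syntax above would face.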

100% agree. I only intended that as an illustrative example. I'm not so certain about what the ideal syntax would look like. The typenum way of doing it works, but it's clunky. I think that with types-as-numbers becoming part of the type-system internals, it would be much easier to find a non-clunky, Just Works, ergonomic idiom.

There's no specific technical benefit to using type generics over const generics; they end up with the same functionality and limitations either way. The reason typenum is more powerful than stable const generics is not because the approach is superior, but because what stable const generics allow has so far been conservative.

Using one potential syntax, your example function could be written:

struct Const<const N: usize>;
fn x<const A: usize, const B: usize>() -> Const<{(A * B + 7) / 8}>
where
    const { A > B },
    const { A < 6 },
    const { B > 2 },
{
    todo!()
}

In my mind, this has the semantics that the caller must uphold as a trait bound the three comparison bounds, but the WF bound on {(A * B + 7) / 8} is deferred to monomorphization.

The most subtly complex thing involved here is that people would prefer for A > B and B > 2 together to imply that A > 2. This is the one advantage of reusing trait syntax; it's less of a stretch to tell users that they must spell out each individual required bound syntactically. It's my expectation that eventual stabilization of feature(generic_const_exprs) and feature(associated_const_equality) will require exact syntactic equivalence, but this isn't a simple pitch to make.

But while I expect this is the only practical way to let generic const manipulation work (without making compilation reliant on an unspecified SMT solver), I also expect that for it to be practically useful will require some form of “static if” to allow callers to shift pre-mono bounds to post-mono assertions when they believe the bounds are already implied.
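This is not the "static if" described above, but the post-monomorphization half of the idea can already be approximated on stable Rust with an associated const assertion. The following is a sketch only; the `Bounds` helper is hypothetical, and the function returns the computed value rather than a `Const<...>` type, since the latter needs nightly features:

```rust
/// Hypothetical helper carrying the three comparisons as a post-monomorphization check.
struct Bounds<const A: usize, const B: usize>;

impl<const A: usize, const B: usize> Bounds<A, B> {
    /// Evaluated once per concrete (A, B); a failed assert becomes a compile error.
    const CHECK: () = assert!(A > B && A < 6 && B > 2, "need 2 < B < A < 6");
}

/// Value-level counterpart of the example: returns (A * B + 7) / 8.
fn x<const A: usize, const B: usize>() -> usize {
    // Referencing the constant forces its evaluation for this instantiation.
    let () = Bounds::<A, B>::CHECK;
    (A * B + 7) / 8
}
```

`x::<5, 3>()` compiles and returns 2, whereas `x::<7, 3>()` fails to build. The trade-off is the one discussed above: the error surfaces only when a concrete instantiation is compiled, not as a bound the caller sees in the signature.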

I strongly disagree with the notion that types and const generics are inherently the same, that they do not have different limitations, and that there is no technical benefit. If that were the case, typenum would have been made entirely redundant the day const generics got stabilised, and typenum as a dependency would be a code smell that a project is unmaintained.

  • types do not need machine representation. u32 and usize are different types, but machine representation at compile time is of no consequence until you explicitly require one, such as with array lengths, or if you want 2's complement roll-over when doing compile-time math.
  • I am able to express, on stable Rust from version 1.37.0, not only everything currently being worked on to mitigate the limits of const generics, but also, I expect, many things that are not feasibly solvable without unfortunate hacks.
  • I have run into lots of friction integrating different interfaces that use const generics of different unsigned types. My solution was to fork the dependencies and use typenum to erase machine representation. I was then able to solve the problem without headache, friction, or limitation.

With your example, I am forced to use only usize. As soon as I need to use it in a context where I'm depending on an interface that uses u64, it's going to be a headache to make work. If I do need an explicit word-size value, e.g. usize, U7::USIZE will provide it.

To reiterate, the reason const generics' application is limited is not a fundamental limitation, it's entirely because the initial stabilization is deliberately minimal. The entirety of typenum's functionality is possible to do with const generics on nightly today, with the singular exception that typenum uses arbitrary precision integers.

So in effect, what you want isn't that numbers should be types, it's for the language to provide arbitrary precision compile time only integers the way typenum does.

It does not fundamentally matter whether they are held by generic types or generic consts; the two are isomorphic, and, modulo implementation limitations, you can trivially map between the two. For typenum, it's typenum::generic_const_mappings::U to go from const usize to type and typenum::marker_traits::Unsigned::USIZE to go from type to const usize.
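The shape of that two-way mapping can be sketched in std-only Rust. This is an illustration under simplifying assumptions, not typenum's code: it uses a Peano encoding, hand-writes the const-to-type impls for only three values, and the names `Unsigned`, `Const`, `ToUInt`, and `U` are local stand-ins defined in the block itself:

```rust
use std::marker::PhantomData;

struct Zero;
struct Succ<N>(PhantomData<N>);

/// Type -> const: each type-level number exposes its value as associated consts.
trait Unsigned {
    const USIZE: usize;
    const U64: u64;
}
impl Unsigned for Zero {
    const USIZE: usize = 0;
    const U64: u64 = 0;
}
impl<N: Unsigned> Unsigned for Succ<N> {
    const USIZE: usize = N::USIZE + 1;
    const U64: u64 = N::U64 + 1;
}

/// Const -> type: a wrapper around a const value, mapped to a type per value.
struct Const<const N: usize>;
trait ToUInt {
    type Output: Unsigned;
}
// In this sketch only 0, 1 and 2 are mapped; a real library would generate these.
impl ToUInt for Const<0> {
    type Output = Zero;
}
impl ToUInt for Const<1> {
    type Output = Succ<Zero>;
}
impl ToUInt for Const<2> {
    type Output = Succ<Succ<Zero>>;
}

/// Convenience alias: `U<2>` names the type-level number two.
type U<const N: usize> = <Const<N> as ToUInt>::Output;
```

Going from const to type and back, `<U<2> as Unsigned>::USIZE` is 2, and the same type yields a u64 through `U64` without any cast at the use site.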

Conversions between different fixed-size integer types are equally possible, and it's only a matter of exposing them in a sustainable way. as casts currently require feature(generic_const_exprs), but this isn't strictly necessary, since as casts between integer types are total.
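"Total" here means that an as cast between integer types is defined for every input and can never fail. A short illustration with concrete (non-generic) values, which works on stable; the function and constant names are invented for the example:

```rust
/// Narrowing keeps only the low bits, so every u64 maps to some u8.
const fn narrow(n: u64) -> u8 {
    n as u8
}

/// Signed to unsigned of the same width reinterprets the two's complement bits.
const fn reinterpret(n: i32) -> u32 {
    n as u32
}

/// Casts are usable in const context when the operand is not a generic parameter.
const SEVEN_AS_U64: u64 = 7usize as u64;
```

For instance, `narrow(300)` is 44 (300 mod 256) and `reinterpret(-1)` is `u32::MAX`.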


Secondarily, it's likely that you assume that taking an approach more similar to typenum would be available on stable sooner than with generic const exprs, given that typenum already works on stable. But, if you want any of the diagnostic improvements, then you run headfirst into the exact complications which are preventing stabilizing more powerful const capabilities.

If we assume that stabilization will happen at the exact same instant either way, would you still prefer a marker type solution over arbitrary precision generic const N: iNN parameters? Given that the two are isomorphic, I think that stabilization call is a fair one to make. And in that case, I'd prefer incrementally improving fixed sized const generics to reach that point over waiting to use the same functionality but spelled differently.


TL;DR: given that const generics are limited, the way forward for the language is to incrementally remove those limitations. That it's possible for typenum to solve needs today on stable is awesome, but this says nothing about which forward-looking solution is best for the language.

Rust is entirely fine with delegating functionality to outside of std. Anything stable in std is frozen forever and the API can never be changed in a way that breaks API stability. The solution to this isn't putting interim solutions into std and deprecating them later when a better solution is available; it's making usage of non-std libraries easier and communicating that this is fine and expected.

#[diagnostic::on_unimplemented] is available now; it's likely typenum can use it to improve diagnostics some. But in cases where the errors are still opaque due to how typenum models integers, the language adopting the same approach will result in exactly the same problems.
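As a rough idea of what that could look like, here is a sketch using #[diagnostic::on_unimplemented] (stable since Rust 1.78) on a hypothetical marker trait. `LessThanSix`, `N5`, `N7`, and `takes_small` are invented for illustration and are not typenum items:

```rust
/// Hypothetical marker trait standing in for a "less than six" bound.
#[diagnostic::on_unimplemented(
    message = "`{Self}` is not a type-level number below six",
    label = "this number must be less than 6",
    note = "only `N5` implements `LessThanSix` in this sketch"
)]
trait LessThanSix {
    const VALUE: usize;
}

struct N5;
struct N7;

impl LessThanSix for N5 {
    const VALUE: usize = 5;
}

/// Accepts only types that satisfy the bound.
fn takes_small<N: LessThanSix>() -> usize {
    N::VALUE
}
```

`takes_small::<N5>()` returns 5. Writing `takes_small::<N7>()` is a compile error, and the compiler reports the custom message, label, and note instead of the default "trait bound not satisfied" wording. This helps only when the failing bound is one the library author annotated directly, which matches the caveat above.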

If you want to help drive things forward, get on Zulip and ask around the generic const exprs working group what needs to be done. I suspect you'll find that most of it is impl work that is the same whether packaged into type generics or const generics.

Ideas for changes to the type system should always be discussed with the types team first (Zulip, t-types stream), but the response to this will likely be that the team does not have the capacity to implement or design this feature right now.
