[Pre-RFC] Clamped types

Introduction

Currently, NonZero* integer types are implemented magically inside the compiler, and have the invariant that a zeroed bit pattern is an invalid representation. This enables several optimizations, such as giving Option<T> the same size as T, because the None variant is delegated to the zeroed bit pattern (this is also true for references, via the null pointer optimization). From now on, I will refer to these types, whose values are a subset of the possible values of a larger type, as “clamped”.
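
To make the existing behaviour concrete, here is a minimal sketch using only the standard library. The function names `option_sizes` and `from_raw` are made up for illustration; the sizes assume a target where u32 is 4 bytes.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Returns the sizes in bytes of `u32`, `Option<u32>` and `Option<NonZeroU32>`.
pub fn option_sizes() -> (usize, usize, usize) {
    (
        size_of::<u32>(),
        size_of::<Option<u32>>(),
        size_of::<Option<NonZeroU32>>(),
    )
}

/// Zero is not a valid `NonZeroU32`, so constructing one from 0 yields `None`.
pub fn from_raw(raw: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(raw)
}
```

Option<u32> needs extra space for its discriminant, while Option<NonZeroU32> reuses the forbidden zero pattern for None and stays the size of a plain u32.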

However, there is currently no support for user-defined clamped types, and this Pre-RFC aims to tackle the issue. This feature would go a long way towards enabling more compile-time checks, which are currently gated by the const_exprs feature, and would essentially allow us to promote enum variants to types (an often requested feature) by clamping to a single variant. Custom-clamped numerical types are a feature request that I've personally heard quite frequently from people who work in embedded systems, as it would allow them to safely remove index checks and keep debug assertions on (these otherwise eat the little performance they have to work with).

Body

There are essentially two ways of clamping a type: selecting a few elements, or selecting all but a few of them. The former will be referred to as an inclusive clamp, and the latter as an exclusive one. Any type can be clamped, but there has to be a fundamental distinction between enums, unions, and primitives:

  • Enums may be clamped to a subset of their variants.
  • Like enums, unions may also be clamped to a subset of their fields. Clamping them might look useless at first, since there is no way to check which field is active, either statically or dynamically. However, clamping a union statically disallows access to the excluded fields, which adds a lot of safety to such a footgun-prone feature.
  • Primitives, in the case of numbers, may be clamped to entire ranges via the same range sugar we use today. However, in the case of the char and bool primitives, they may be clamped to a set of specific values.
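
As a rough approximation of a range-clamped number in today's Rust, one can wrap the primitive in a newtype with const generic bounds. This is only a sketch: `ClampedU8`, `Index4` and `lookup` are hypothetical names, not part of the proposal, and the check happens at runtime rather than in the type system.

```rust
/// Hypothetical stand-in for a `u8` clamped to the inclusive range `MIN..=MAX`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ClampedU8<const MIN: u8, const MAX: u8>(u8);

impl<const MIN: u8, const MAX: u8> ClampedU8<MIN, MAX> {
    /// Checks the range at runtime and returns `None` for values outside it.
    pub fn new(value: u8) -> Option<Self> {
        if (MIN..=MAX).contains(&value) {
            Some(Self(value))
        } else {
            None
        }
    }

    /// Returns the wrapped value, which is always within `MIN..=MAX`.
    pub fn get(self) -> u8 {
        self.0
    }
}

/// An index type for a 4-element table (assumed example).
pub type Index4 = ClampedU8<0, 3>;

/// Looks up an entry; the index invariant means this access cannot panic.
pub fn lookup(table: &[u32; 4], index: Index4) -> u32 {
    table[usize::from(index.get())]
}
```

The compiler cannot see the newtype's invariant, so it may still emit a bounds check in `lookup`. A built-in clamped type would make the range visible to the compiler, which is the benefit the embedded use case is after.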

I don't know whether clamping associated types in bounds is feasible in terms of development time, but I think we'd ideally like to support that at some point.

Type system and type layout

To the type system, a clamped type C of T is different from its supertype T, but their layout is the same (that is, it is valid to transmute or pointer-cast between them, provided the value is a valid representation of the target type). However, the layout of any type that contains C is allowed to differ from the layout it would have with T in its place. For example, if T := usize and C := 1usize.. (all of usize but 0), even though transmuting between T and C is sound for valid values, a function that accepts C will not accept T (more on that in the following section), and while Option<T>'s layout will be two words, Option<C>'s will be just one. If possible, we would like to pick the first (or lowest, or any other) invalid value of a type for these wrapped layout optimizations, not just 0 (I am not aware of whether that is the case right now or whether it is feasible, though).
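
On the question of invalid values other than 0, a small sketch of what can be observed today. The function names are made up for illustration. The bool and char sizes are what current rustc produces, not something the language documents as a guarantee, so treat them as an observation.

```rust
use std::mem::size_of;
use std::num::NonZeroUsize;

/// Sizes in bytes of `Option<usize>` and `Option<NonZeroUsize>`.
pub fn option_word_sizes() -> (usize, usize) {
    (size_of::<Option<usize>>(), size_of::<Option<NonZeroUsize>>())
}

/// Sizes in bytes of `Option<bool>` and `Option<char>`.
/// Zero is a valid `bool` (false) and a valid `char` ('\0'), so if these match
/// the sizes of `bool` and `char`, `None` must live in a non-zero invalid pattern.
pub fn nonzero_niche_sizes() -> (usize, usize) {
    (size_of::<Option<bool>>(), size_of::<Option<char>>())
}
```

With current compilers the first function returns two words and one word, and the second returns 1 and 4, which suggests that layout optimizations already use invalid values other than 0 where they exist.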

Subtyping semantics

Any clamped type C of T is a subtype of T, such that C may coerce to T while T may not coerce to C. Any clamped type C of T that is a subset of any other clamped type D of T (i.e., all possible values in C are also in D) may coerce to D. A clamped type may be converted to another clamped type of the same supertype with which it has no subset relationship, but only at runtime: a fallible let clause specifies (or infers) the target type, and the compiler injects glue to check whether the value is an element of the target type. Even though such a conversion is not cost-free at runtime, it is in line with Rust's zero-cost abstractions, as there is no other way to perform such a conversion safely. It would be convenient if there were From<C> for T and TryFrom<T> for C implementations where C is a subtype of T (clamped type or otherwise).
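
The standard library already has this pair of conversions for the NonZero* types, which gives a feel for how the proposed ones would behave. A minimal sketch, with NonZeroU8 playing the role of C and u8 the role of T; the function names are invented for illustration.

```rust
use std::convert::TryFrom;
use std::num::NonZeroU8;

/// Subset to superset: always succeeds, via `From<NonZeroU8> for u8`.
pub fn widen(c: NonZeroU8) -> u8 {
    u8::from(c)
}

/// Superset to subset: needs a runtime check, via `TryFrom<u8> for NonZeroU8`.
pub fn narrow(t: u8) -> Option<NonZeroU8> {
    NonZeroU8::try_from(t).ok()
}

/// A fallible `let` in today's syntax: `let ... else` handles the failed check.
/// Returns 100 divided by `t`, or 0 when `t` is zero.
pub fn hundred_divided_by(t: u8) -> u8 {
    let Ok(c) = NonZeroU8::try_from(t) else {
        return 0;
    };
    100 / c.get()
}
```

The let-else form is probably the closest existing analogue to the fallible let clause described above, except that the check is spelled out by the user instead of being injected by the compiler.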

When operating with clamped types in patterns, only the variants that are a valid representation of said types may be specified. For example, an enum of three variants {A, B, C} that is clamped to {A, B} must not have a match arm for C, as such a variant does not exist. However, either the unclamped or the clamped type's variant may be specified in patterns, so that the type does not have to be clamped when specifying the variant in every match arm, which in the worst case would greatly increase the amount of written variants (and they're equal anyway); i.e., a value c of a clamped enum C of T can be matched the following way: match c { T::A => {}, _ => {} } where T::A is a valid variant of T and is in C.
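
For comparison, this is roughly what has to be written by hand today to get an enum restricted to a subset of variants. It is a sketch only: `AOrB` is a hypothetical stand-in for "T clamped to {A, B}", and unlike the proposal it is a separate type with its own variant paths and hand-written conversions.

```rust
use std::convert::TryFrom;

/// The unclamped enum with three variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum T {
    A,
    B,
    C,
}

/// Hand-written stand-in for `T` clamped to `{A, B}`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AOrB {
    A,
    B,
}

impl From<AOrB> for T {
    /// Subset to superset: always succeeds.
    fn from(value: AOrB) -> T {
        match value {
            AOrB::A => T::A,
            AOrB::B => T::B,
        }
    }
}

impl TryFrom<T> for AOrB {
    type Error = T;

    /// Superset to subset: fails for the excluded variant and hands it back.
    fn try_from(value: T) -> Result<AOrB, T> {
        match value {
            T::A => Ok(AOrB::A),
            T::B => Ok(AOrB::B),
            T::C => Err(T::C),
        }
    }
}

/// Exhaustive without an arm for `C`, because `AOrB` has no such variant.
pub fn describe(value: AOrB) -> &'static str {
    match value {
        AOrB::A => "a",
        AOrB::B => "b",
    }
}
```

With clamped types, the second enum and both impls would disappear, and the arms in `describe` could be written as T::A and T::B directly.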

Interactions with specialization, ft. floats

If we ever stabilize specialization, which remains to be seen, we should strive for clamped types being specializable over their non-clamped counterparts. In my mind, if we can ship such a thing in std, this would go a long way towards sparing floats the pain of not implementing Eq, as we would be able to implement it for f32::<!Nan> and f64::<!Nan>, and solve this problem.
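
Here is a sketch of why excluding NaN is enough for Eq, written as a plain newtype in today's Rust. `NotNan` and `sorted` are hypothetical names for illustration; the proposal would express the same invariant as f64::<!Nan> without a wrapper.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for `f64::<!Nan>`: an `f64` that is never NaN.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NotNan(f64);

impl NotNan {
    /// Rejects NaN at runtime; every other `f64` value is accepted.
    pub fn new(value: f64) -> Option<Self> {
        if value.is_nan() {
            None
        } else {
            Some(Self(value))
        }
    }

    /// Returns the wrapped value.
    pub fn get(self) -> f64 {
        self.0
    }
}

// NaN is the only `f64` value that is not equal to itself, so with NaN
// excluded, `==` is reflexive and `Eq` can be implemented.
impl Eq for NotNan {}

impl Ord for NotNan {
    fn cmp(&self, other: &Self) -> Ordering {
        // `partial_cmp` on `f64` only returns `None` when NaN is involved.
        self.0
            .partial_cmp(&other.0)
            .expect("NotNan never holds NaN")
    }
}

impl PartialOrd for NotNan {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Sorts the values with `Ord`, or returns `None` if any of them is NaN.
pub fn sorted(values: &[f64]) -> Option<Vec<f64>> {
    let mut checked: Vec<NotNan> = values
        .iter()
        .map(|&v| NotNan::new(v))
        .collect::<Option<Vec<_>>>()?;
    checked.sort();
    Some(checked.into_iter().map(NotNan::get).collect())
}
```

Because `NotNan` implements Ord, it can be used with `sort`, as a BTreeMap key, and in other places where a bare f64 is rejected.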

Syntax

I honestly don't care as long as it does not break with our current “be boring” way of doing it. In my opinion, it is pretty obvious to use ! to specify an exclusive (“all but”) clamping, but for everything else, two syntaxes come to my mind:

  • $clamp in $type, like {!{Nan, -1.0..1.0}, 0.5} in f64 as the set that excludes NaN, and all values from -1.0 to 1.0 except for 0.5.
  • $type::<$clamp>, like f64::<{!{Nan, -1.0..1.0}, 0.5}>, expressing the same as above. I prefer this one, because of the turbofish and the supertype going first.

On enums, we should be able to skip the enum name in the clamping expression.

Conclusions

Even though this is a rather complex feature, my uninformed guess is that in comparison with other recent ones we've shipped, this should not be the hardest to implement; even though we definitely have better things to prioritize, this is, in my honest opinion, worth adding. Not only would it be helpful in a variety of the environments that Rust targets (like embedded), but it would also allow us to have enum variant types and Eq for floats.

Are you aware of the ongoing discussion regarding pattern types?

I'm not, at least. Are they happening here on IRLO? (Zulip is so arcane and hard to follow.) And is there a relevant RFC or pre-RFC anywhere? Links would be much appreciated.

Nope. I'll try to find it.

Well there are at least these big threads: [1] [2] [3] [4], plus links within.

And the current experiment is Implement pattern types in the type system by oli-obk · Pull Request #120131 · rust-lang/rust · GitHub

It seems I've reinvented the wheel. I'll try to contact the people involved to see if they can find something useful from my post.

More recent discussion: @joboet's pre-RFC and associated Zulip discussion

I assume you meant From<C> for T

Indeed, I always get it wrong, thanks!

Is there somewhere a list of topics that get requested / pre-RFC-ed again and again, with similar lists of old threads where those proposals were discussed before?

That may include the list of "please do not request yet again", never-to-be-implemented features I have seen somewhere (but cannot easily find).

Ideally such a list should be somewhere on the RFCs site. (Update: filed an issue about this).

https://lang-team.rust-lang.org/frequently-requested-changes.html
