Introduction
Currently, the NonZero* integer types are implemented magically inside the compiler, and have the invariant that a zeroed bit pattern is an invalid representation. This enables several optimizations, such as representing Option::<T>::Some(T) transparently, since the None variant is delegated to the zeroed bit pattern (this is also true for references, via the null pointer optimization). From now on, I will refer to types that are a subset of the possible values of a larger type as “clamped”.
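To make the optimization concrete, here is a small sketch using only what std offers today. The function names are mine, picked for illustration.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Returns the sizes of `u32`, `Option<NonZeroU32>` and `Option<u32>`, in that order.
fn niche_sizes() -> (usize, usize, usize) {
    (
        size_of::<u32>(),
        // `None` reuses the forbidden all-zero pattern, so no extra tag is needed.
        size_of::<Option<NonZeroU32>>(),
        // Every bit pattern of `u32` is valid, so a separate tag has to be stored.
        size_of::<Option<u32>>(),
    )
}

/// References get the same treatment: `None` is represented by the null pointer.
fn reference_sizes() -> (usize, usize) {
    (size_of::<&u8>(), size_of::<Option<&u8>>())
}
```

The first two sizes come out equal, while Option<u32> needs additional space for its discriminant.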
However, there is currently no support for user-defined clamped types, and this Pre-RFC aims to tackle the issue. This feature would go a long way towards enabling more compile-time checks, which are currently gated by the const_exprs feature, and would essentially allow us to promote enum variants to types (an often requested feature) by clamping to a single variant. Custom-clamped numerical types are a feature request that I've personally heard quite frequently from people who work in embedded systems, as it would allow them to safely remove index checks while keeping debug assertions on (checks that would otherwise eat into the little performance they have to work with).
Body
There are essentially two ways of clamping a type: selecting a few elements, or selecting all but a few of them. The former will be referred to as an inclusive clamp, and the latter as an exclusive one. Any type can be clamped, but there has to be a fundamental distinction between enums, unions, and primitives:
- Enums may be clamped to a subset of their variants.
- Like enums, unions may also be clamped to a subset of their fields. Clamping them might look useless at first, since there is no way to check them statically or dynamically per se, but clamping a union statically disallows access to the excluded fields, which adds a lot of safety to such a footgun-prone feature.
- Primitives, in the case of numbers, may be clamped to entire ranges via the same range sugar we use today. However, in the case of the char and bool primitives, they may be clamped to a set of specific values.
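For comparison, this is roughly what an inclusive clamp of a number has to look like today. It is a minimal sketch: Digit, NAMES and name are hypothetical names of my own, standing in for a u8 clamped to 0..=9.

```rust
/// Hypothetical stand-in for `u8` clamped to `0..=9`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Digit(u8);

impl Digit {
    /// The only way to build a `Digit`; the range check happens once, here.
    fn new(value: u8) -> Option<Digit> {
        if value <= 9 {
            Some(Digit(value))
        } else {
            None
        }
    }

    fn get(self) -> u8 {
        self.0
    }
}

const NAMES: [&str; 10] = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
];

/// The invariant guarantees the index is in bounds, but the compiler cannot
/// see it, so this still goes through an ordinary checked index.
fn name(digit: Digit) -> &'static str {
    NAMES[usize::from(digit.get())]
}
```

The invariant lives only in the constructor and in the programmer's head. With a built-in clamp, the range would be part of the type, and the bounds check in name could be dropped without resorting to unsafe.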
I don't have the insight to tell whether clamping associated types in bounds is feasible in terms of development time, but I think we'd ideally like to support that at some point.
Type system and type layout
To the type system, a clamped type C of T is different from its supertype T, but their layout is the same (that is, it is valid to transmute or pointer-cast between them, provided the value is a valid representation of the target type). However, the layout of any type which contains C is allowed to differ from the layout of the same type containing T. For example, if T := usize and C := 1usize.. (all of usize but 0), even though transmuting between T and C is sound, a function that accepts C will not accept T (more on that in the following section), and while Option<T>'s layout will be two words, Option<C>'s will be just one. If possible, we would like to pick the first/lowest/whatever invalid value of a type for these wrapped layout optimizations, not just 0 (I am not aware if that is the case right now or if it is feasible, though).
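The usize example can already be observed with NonZeroUsize playing the role of C. As far as I can tell, current rustc also uses invalid values other than zero as niches, for instance for bool and char, although not all of these sizes are documented guarantees. A sketch, with function names of my own:

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroUsize;

/// Sizes of `Option<usize>` and `Option<NonZeroUsize>`, in bytes.
fn option_sizes() -> (usize, usize) {
    (
        size_of::<Option<usize>>(),
        size_of::<Option<NonZeroUsize>>(),
    )
}

/// Sizes of `Option<bool>` and `Option<char>`: here `None` is stored in a
/// bit pattern that is invalid for the payload but is not zero.
fn other_niches() -> (usize, usize) {
    (size_of::<Option<bool>>(), size_of::<Option<char>>())
}

/// Going from the clamped type to its supertype never needs a check.
fn widen(value: NonZeroUsize) -> usize {
    // SAFETY: `NonZeroUsize` has the same layout as `usize`, and every
    // non-zero value is also a valid `usize`.
    unsafe { transmute::<NonZeroUsize, usize>(value) }
}
```

In real code widen would simply call value.get(); the transmute is only there to mirror the layout claim above.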
Subtyping semantics
Any clamped type C of T is a subtype of T, such that C may coerce to T while T may not coerce to C. Any clamped type C of T that is a subset of any other clamped type D of T (i.e., all possible values in C are also in D) may coerce to D. Clamped types that are to be converted to another clamped type (of the same supertype) with which they have no subset relationship may do so at runtime with a fallible let clause that specifies (or infers) the target type, and the compiler will inject glue to check whether the value is an element of the target type. Even though such a conversion is not cost-free at runtime, it follows Rust's zero-cost abstractions, as there is no other way to perform such a conversion safely. It would be convenient if there were From<C> for T and TryFrom<T> for C implementations where C is a subtype of T (clamped type or otherwise).
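These conversion rules can be approximated by hand today. The sketch below assumes two hypothetical newtypes, Percent (u8 clamped to 0..=100) and Small (u8 clamped to 0..=10, a subset of Percent); the compiler-injected glue becomes an explicit TryFrom implementation.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for `u8` clamped to `0..=100`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Percent(u8);

/// Hypothetical stand-in for `u8` clamped to `0..=10`, a subset of `Percent`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Small(u8);

// Clamped type to supertype: always succeeds.
impl From<Percent> for u8 {
    fn from(value: Percent) -> u8 {
        value.0
    }
}

// Subset to superset: always succeeds.
impl From<Small> for Percent {
    fn from(value: Small) -> Percent {
        Percent(value.0)
    }
}

// Supertype to clamped type: needs a runtime check.
impl TryFrom<u8> for Percent {
    type Error = u8;

    fn try_from(value: u8) -> Result<Percent, u8> {
        if value <= 100 {
            Ok(Percent(value))
        } else {
            Err(value)
        }
    }
}

// Superset to subset: needs a runtime check as well.
impl TryFrom<Percent> for Small {
    type Error = Percent;

    fn try_from(value: Percent) -> Result<Small, Percent> {
        if value.0 <= 10 {
            Ok(Small(value.0))
        } else {
            Err(value)
        }
    }
}

/// The "fallible let" from the text, spelled with today's `if let`.
fn narrow(value: Percent) -> Option<Small> {
    if let Ok(small) = Small::try_from(value) {
        Some(small)
    } else {
        None
    }
}
```

With built-in clamped types, the two From implementations would be plain coercions and the range checks would be generated from the types themselves.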
When operating with clamped types in patterns, only the variants that are a valid representation of said types may be specified. For example, an enum of three variants {A, B, C} that is clamped to {A, B} must not have a match case for C, as such a variant does not exist. However, either the unclamped or the clamped type's variant may be specified in patterns, so as not to have to clamp the type when specifying the variant in every match case, which in the worst case would exponentially increase the number of written variants (and they're equal anyway); i.e., a clamped enum C of T can be matched the following way: match C { T::A => {}, _ => {} } where T::A is a valid variant of T and is in C.
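To show what the pattern rule would buy us, here is how the {A, B} clamp has to be emulated today. AOrB and describe are hypothetical names of my own; the enum is called T to match the text.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum T {
    A,
    B,
    C,
}

/// Hypothetical stand-in for `T` clamped to `{A, B}`.
/// Invariant: never holds `T::C`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct AOrB(T);

impl AOrB {
    fn new(value: T) -> Option<AOrB> {
        match value {
            T::C => None,
            other => Some(AOrB(other)),
        }
    }
}

fn describe(value: AOrB) -> &'static str {
    match value.0 {
        T::A => "a",
        T::B => "b",
        // Today this arm is mandatory even though it can never run.
        // Under the proposal it would be rejected instead.
        T::C => unreachable!("AOrB never holds T::C"),
    }
}
```

The arms still name the variants through the unclamped T, which is the same convenience the paragraph above asks for.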
Interactions with specialization, ft. floats
If we ever stabilize specialization, which remains to be seen, we should strive for clamped types being specializable over their non-clamped counterparts. In my mind, if we can ship such a thing in std, this would go a long way towards sparing floats the pain of not implementing Eq, as we would be able to implement it for f32::<!Nan> and f64::<!Nan>, and solve this problem.
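The float case can be sketched with a hand-written wrapper. NotNan and sorted are hypothetical names of my own, standing in for f64::<!Nan>; once NaN is excluded, Eq and Ord can be implemented soundly.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for `f64::<!Nan>`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NotNan(f64);

impl NotNan {
    fn new(value: f64) -> Option<NotNan> {
        if value.is_nan() {
            None
        } else {
            Some(NotNan(value))
        }
    }
}

// Sound only because the constructor rejects NaN, the one value with x != x.
impl Eq for NotNan {}

impl PartialOrd for NotNan {
    fn partial_cmp(&self, other: &NotNan) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for NotNan {
    fn cmp(&self, other: &NotNan) -> Ordering {
        // Comparing two floats only fails when one of them is NaN.
        self.0
            .partial_cmp(&other.0)
            .expect("NotNan never holds NaN")
    }
}

/// Sorts the input, or returns `None` if it contains a NaN.
fn sorted(values: &[f64]) -> Option<Vec<f64>> {
    let mut checked: Vec<NotNan> = values
        .iter()
        .map(|&value| NotNan::new(value))
        .collect::<Option<Vec<NotNan>>>()?;
    // `sort` requires `Ord`, which plain `f64` does not provide.
    checked.sort();
    Some(checked.into_iter().map(|value| value.0).collect())
}
```

With clamped types in the language, the wrapper and its constructor would disappear, and std could provide the Eq and Ord implementations directly.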
Syntax
I honestly don't care as long as it does not break with our current “be boring” way of doing it. In my opinion, it is pretty obvious to use ! to specify an exclusive (“all but”) clamping, but for everything else, two syntaxes come to my mind:
- $clamp in $type, like {!{Nan, -1.0..1.0}, 0.5} in f64 as the set that excludes NaN, and all values from -1.0 to 1.0 except for 0.5.
- $type::<$clamp>, like f64::<{!{Nan, -1.0..1.0}, 0.5}>, expressing the same as above. I prefer this one, because of the turbofish and the supertype going first.
On enums, we should be able to skip the enum name in the clamping expression.
Conclusions
Even though this is a rather complex feature, my uninformed guess is that, in comparison with other recent ones we've shipped, this should not be the hardest to implement. We definitely have better things to prioritize, but this is, in my honest opinion, worth adding. Not only would it be helpful in a variety of the environments that Rust targets (like embedded), but it would also allow us to have enum variant types and Eq for floats.