Pre-RFC: Generic integers (uint<N> and int<N>)

Since Rust has ZSTs, I would be very surprised if `uint<0>` wasn’t just the obvious 0-bit type.

While I agree that `int<0>` is weird, since it doesn’t have the sign bit its name implies it must, there are other properties that can give it a meaning, like the fact that `transmute::<uint<N>, int<N>>(0)` is still zero for all `N`, so `int<0>` is also a ZST representing `0`. Similarly, `int<N>::default()` is `0` for all `N`, so it should be for `N == 0` too.
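To make the ZST argument concrete, here is a minimal sketch. `Uint0` and `Int0` are hypothetical stand-ins invented for illustration (the proposed `uint<0>` and `int<0>` do not exist); the only thing the sketch relies on is that a unit struct occupies zero bytes.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `uint<0>`: a unit struct has no bits to store.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Uint0;

/// Hypothetical stand-in for `int<0>`: also zero bits, so no sign bit either.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Int0;

impl Uint0 {
    /// The only representable value is zero.
    pub fn value(self) -> u128 {
        0
    }

    /// Mirrors the bit-preserving cast between `uint<N>` and `int<N>`:
    /// with zero bits there is nothing to reinterpret.
    pub fn to_signed(self) -> Int0 {
        Int0
    }
}

impl Int0 {
    /// The only representable value is zero.
    pub fn value(self) -> i128 {
        0
    }
}

/// Sizes in bytes of the two stand-in types.
pub fn sizes() -> (usize, usize) {
    (size_of::<Uint0>(), size_of::<Int0>())
}
```

Under these assumptions, `Int0::default().value()` is `0` and both types have size zero, which matches the behaviour argued for above.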

5 Likes

Right; I considered ZSTs as well, but I didn’t find a rationale for those. Yours is a good one.

1 Like

I agree with your reasoning, @scottmcm. Technically, `int<N>` doesn’t break the invariant that it lies between the range $[-2^{n-1}, 2^{n-1}-1]$, as zero is the only integer in the range [-½, ½]. I didn’t actually think of it this way until just now.

2 Likes

Continuing the discussion about monomorphization errors, I think that the RFC should have a plan for `N > 128`, such that before the feature is stabilized, arbitrary values of `const N: usize` are supported.

Would you mind elaborating on what you think this plan should contain? I started writing something, but mostly all that I could think of were implementation details and guarantees about `size_of::<uint<N>>()`.

I’m not so much interested in the technical details right now (and I’m hardly an expert in this field…). What I’m looking for is a commitment to support `N > 128`, that’s all.

PS: If you can think of implementation details, those are nice to jot down.

Excellent point! I’d written it out with ⌈⌉, but that looked terrible, so deleted it.

nit: You mean $[-2^{n-1}, 2^{n-1})$, since the formula you have actually generates [-½,-½].

soapbox: Once again, half-open ranges are superior to trying to futz about with ±1
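The half-open formulation can be checked without fractions by doubling everything: $-2^{n-1} \le v < 2^{n-1}$ holds exactly when $-2^n \le 2v < 2^n$. The sketch below uses a hypothetical helper, `fits_signed`, which is not part of any proposal, to show that for `n = 0` the only value that fits is `0`.

```rust
/// Returns true if `v` lies in the half-open range [-2^(n-1), 2^(n-1)).
///
/// Both sides are doubled so that n = 0 needs no fractions:
///   -2^(n-1) <= v < 2^(n-1)   <=>   -2^n <= 2v < 2^n
/// This sketch only handles widths up to 64 bits.
pub fn fits_signed(n: u32, v: i64) -> bool {
    assert!(n <= 64, "sketch only handles widths up to 64 bits");
    let bound: i128 = 1i128 << n; // 2^n, computed in a wider type
    let doubled: i128 = 2 * (v as i128); // 2v cannot overflow in i128
    -bound <= doubled && doubled < bound
}
```

With this helper, `fits_signed(0, 0)` is true while `fits_signed(0, -1)` and `fits_signed(0, 1)` are false, and `fits_signed(8, v)` accepts exactly -128 through 127.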

3 Likes

An idea for related sugar is to also support supplying the range of values that is required from the integer instead of supplying the number of bits you require and leave it to the compiler to map this to an appropriate underlying N-bit representation. In some cases, this would allow design intent to be expressed more directly.

(This idea is inspired by Ada which supports fairly robust integer subrange functionality. For example, an integer with a range of 10…300 which the compiler then maps to a concrete representation, does runtime range checks on, etc., etc.)
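As a rough illustration of the Ada-style idea, here is a library-level sketch using const generics. `Ranged` is a hypothetical type made up for this example; it always stores an `i64` and only performs the runtime range check, so it does not show the compiler picking a compact representation.

```rust
/// Hypothetical ranged integer: holds a value known to satisfy LO <= v <= HI.
/// Storage is always an `i64` here; a real feature could choose a narrower one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ranged<const LO: i64, const HI: i64>(i64);

impl<const LO: i64, const HI: i64> Ranged<LO, HI> {
    /// Runtime range check, returning `None` when `v` is out of range.
    pub fn new(v: i64) -> Option<Self> {
        if LO <= v && v <= HI {
            Some(Self(v))
        } else {
            None
        }
    }

    /// Returns the wrapped value.
    pub fn get(self) -> i64 {
        self.0
    }
}
```

For the 10 to 300 example, `Ranged::<10, 300>::new(42)` yields `Some(..)`, while `Ranged::<10, 300>::new(301)` yields `None`.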

Although it does qualitatively feel different from the intentions of the pre-RFC, there is some overlap especially in terms of the idea of mapping the logical uint type to a concrete underlying hardware type.

This. It kind of seems wasteful to add a core language feature which would be redundant with another, more generic (no pun intended) feature. A library-defined `Bits<const expression>` would definitely be useful, and a nice example of a use case for `const` generics. AFAICT no use case mentioned so far would require that these types be primitives/builtins.

+1. Again, such a request seems to stem from the usual “but it’s so easy in other languages!” fallacy. `usize` not being a mere relabelling of `u32` or `u64` or `u(sizeof(pointer))` is not an accidental pain point or a design error; it’s a deliberate decision which forces programmers to think about whether they need an exact- and constant-sized integer, or a pointer-sized, variable-width (across platforms) integer.

(Incidentally, the same argument seems to come up almost constantly and every single time the discussion is shifted to a feature in Rust that exposes a problem and forces its users to think about it, instead of silently doing the wrong thing. It’s been cited in the case of error handling, `from()`/`into()` conversions, and who knows what other features. But it’s still not a good idea, for the same reasons, to give up correctness for marginal convenience.)

5 Likes

One issue is that making `Bits<32>` be the same type as `u32` wouldn’t really work in a library. In theory you could do it by making `Bits` a type alias for an associated type projection – `type Bits<const N: usize> = <N as GetBitsType>::Type` – but then impls like `impl<const N: usize> Foo for Bits<N>` would be disallowed by coherence rules. More practically, it would have to be a separate type.
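A sketch of the alias-through-projection approach, under some assumptions: since a const parameter cannot itself implement a trait, the projection goes through a hypothetical marker type `Width<N>`, and only three widths are wired up. `GetBitsType` is the name used above; `Width` is invented for illustration.

```rust
/// Maps a marker type to the primitive integer of that width.
pub trait GetBitsType {
    type Type;
}

/// Hypothetical marker carrying the bit width as a const parameter.
pub struct Width<const N: usize>;

impl GetBitsType for Width<8> {
    type Type = u8;
}
impl GetBitsType for Width<16> {
    type Type = u16;
}
impl GetBitsType for Width<32> {
    type Type = u32;
}

/// `Bits<32>` is literally `u32`, not a wrapper around it.
pub type Bits<const N: usize> = <Width<N> as GetBitsType>::Type;

/// Accepts a `Bits<32>`; callers can pass a plain `u32`.
pub fn double(x: Bits<32>) -> Bits<32> {
    x * 2
}
```

With this, `double(21u32)` type-checks because the alias resolves to `u32`. A blanket `impl<const N: usize> Foo for Bits<N>` is not shown because, as described above, the compiler rejects it: `N` appears only inside a projection and is therefore unconstrained.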

Yeah, I think I’d prefer true range types as well – though perhaps there should be both. In theory, there could be range and bit-width syntaxes for the same group of types: `range<0, 127>` could be the same type as `uint<7>`. However, offhand, I’d expect adding two `range<0, 127>` values to produce a `range<0, 254>`, while to be consistent with the existing builtin integer types, adding two `uint<7>`s would have to yield another `uint<7>`. An alternative would be to support both `range<A, B>` and `uint<N>` as entirely separate groups of types, but that seems like a confusing proliferation of integer types.
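The difference between the two addition semantics can be sketched at the value level. Both functions below are hypothetical: `add_uint7_wrapping` assumes wrapping behaviour for `uint<7>` (the builtin integers also offer checked and panicking variants), and `range_add_bounds` only computes what the bounds of a widened result would be.

```rust
/// Hypothetical `uint<7> + uint<7> -> uint<7>`, modelled on a `u8` and
/// assuming wrapping semantics: the result is reduced modulo 2^7.
pub fn add_uint7_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b) & 0x7F
}

/// Bounds of the sum of two inclusive ranges: adding a value from
/// `a.0..=a.1` to one from `b.0..=b.1` lands in `(a.0 + b.0)..=(a.1 + b.1)`.
pub const fn range_add_bounds(a: (i64, i64), b: (i64, i64)) -> (i64, i64) {
    (a.0 + b.0, a.1 + b.1)
}
```

So `add_uint7_wrapping(127, 127)` stays within 7 bits by discarding the carry, whereas `range_add_bounds((0, 127), (0, 127))` gives `(0, 254)`, a range that no longer corresponds to `uint<7>`.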

2 Likes

Sure, although I’m pretty sure that’s a feature, not a bug.

I find the Motivation of the RFC not ‘motivational’ enough. It seems the main motivation is interop with C bitfield structs, which is fair; however, bitfields are somewhat niche even in C, and it is not clear whether the RFC is intended to solve just that problem or more, or whether it is the right solution.

Additionally, the C `MipsInstruction` example itself is questionable, since C struct bitfield layout is implementation-dependent, i.e. in standard portable C you should not use bitfields to decode/encode a MIPS instruction.

Finally, as @comex pointed out, with const generics custom-width integers would be best implemented in a library.

Sorry for so much criticism.

2 Likes

I personally think that both arbitrary-bitwidth integers and ranged integers are things that should belong in a library, not as language primitives.

I do not see anything about these types that prevents them from being implemented as a library (after Rust supports const generics) and requires them to be language primitives.

I believe that the language should be kept minimal and things should only be added to `std` (even more so for the language itself) if and only if they really genuinely provide big advantages over a library-based solution.

Why not wait for const generics to land, and then experiment with writing libraries to work with arbitrary-bitwidth integers, ranged types, arbitrary-dimension arrays, fixnums, etc… before writing any RFC to try to add any of this to the language? I think it is important to first get a clear idea of any limitations/drawbacks of a library-based solution.

5 Likes

I haven’t been able to read most of the comments, but I wanted to add: has everybody seen the `ux` crate?

This doesn’t have compile-time parametricity, but it IS a relevant precedent in the ecosystem.

2 Likes

Const generics merely allocate syntax for such types, which in fact is used by this very proposal. The meat of this feature is restricting the range of valid bit patterns for a given type, which is an ABI decision ultimately made by the compiler. Unless Rust provides a generic facility allowing user code to manually specify all valid bit patterns for a given type, restricted-range types will require dedicated support from the compiler, and types with dedicated compiler support belong in `core` (and `std`, but preferably `core`).

So excuse me, but this suggestion seems pretty nonsensical.

I think the suggestion isn’t nonsensical, but rather that an RFC should propose a general layout control mechanism so `uint<N>` can be a library addition rather than “builtin types”.

See the `NonZeroU32` discussions for speculation about such a thing. As a strawman, `#[repr(transparent)] struct BasePlaneChar(u16); unsafe impl RestrictedRange for BasePlaneChar { const VALID: RangeInclusive<u128> = 0xE000 ..= 0xD7FF; }`.
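For context on why layout restrictions need compiler support, here is a small sketch using only the existing standard library. `NonZeroByConvention` is a hypothetical newtype invented for this example: its "never zero" rule is only a convention, so the compiler cannot use the zero bit pattern as a niche the way it does for `NonZeroU32`.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical newtype whose "never zero" rule exists only in documentation.
/// The compiler sees a plain `u32` with every bit pattern valid.
#[allow(dead_code)]
pub struct NonZeroByConvention(u32);

/// Sizes of `Option<T>` for the compiler-known and convention-only types.
pub fn option_sizes() -> (usize, usize) {
    (
        size_of::<Option<NonZeroU32>>(),
        size_of::<Option<NonZeroByConvention>>(),
    )
}
```

`Option<NonZeroU32>` is the same size as `u32` because the forbidden zero pattern encodes `None`, while `Option<NonZeroByConvention>` has to be larger. A general mechanism like the strawman would let user-defined types opt in to the first behaviour.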

Yes, this looks like the sort of thing I had in mind. But notice how you used a trait in your strawman syntax: that trait will have to be declared as existing somewhere, and the most natural place for it is `core`. So something is going to be added to the standard library anyway, unless it’s done entirely with attributes.

(Given that this is a strawman, let me refute it: it doesn’t seem general enough to cover multiple ranges, or alignment bits in pointers, and the range you did write is written backwards. Which is confusing, even if I can assume it to be valid by imagining circular topology for ranges.)

Either way, merely saying ‘const generics’ without further elaboration doesn’t really answer the question of how to address this use case. Which is my real point here.

1 Like

Re `#[repr(bitfields)]`, I’ve got a draft RFC for a much more general feature, applicable to individual fields: I named it `#[compact]`. Might post it soon.

Or, in fact, bitfields in C. Sure, they are not as flexible as this proposal, but they do serve as (part of the) motivation for this feature, and therefore count as precedent.

(Is that related to https://github.com/rust-lang/rfcs/issues/311 ?)

1 Like

Strange minds think alike, I guess.

2 Likes

This is almost exactly what I was envisioning for `repr(bitfields)`, so, well done.