Pre-RFC: Generic integers v2

What about 52-bit (unsigned) integers with AVX support?

There are many operations on 52-bit integers (such as _mm_madd52hi_avx_epu64).

Good point. Unless I'm mistaken, they come from the fact that they are needed for floating-point computation (a 64-bit double has a 52-bit mantissa/fraction) and, due to SIMD, already exist multiple times.

Which makes it more difficult to recommend which one to use. Perhaps there should be (as someone mentioned above) a section in the documentation that talks about all the "interesting" sizes: powers of 2, 52-bit, 7-bit (for things that can be in options/results), with some suggestions (including a notice on hardware/ISA differences that may affect this), perhaps some information about what the compiler does to those non-power-of-two sizes on different architectures (with the note that it may change in future versions).


Slightly different topic: what do you think about having (in addition to uN / int<N>, types with a compile-time-known size) a DST that can hold an arbitrarily sized number (within the limits of LLVM/hardware/memory, of course)? This would be conceptually what [T] is to [T; N] and would allow use cases where you don't know the size you need at compile time, at the cost of allowing fewer optimizations.

I think that would be useful enough to add (there certainly are real-world use cases); it might not result in too much overhead if done together with int<N>, and it does match other types/concepts that already exist. The downside could be that it encourages people to use more dynamically-sized types, but that same argument goes for [T].

As an example: if you want to write a calculator, or perhaps some analysis where you don't know the desired accuracy in advance, you currently have to make your own number type, guess at compile time what you might need, or over-provision and take a really large number.

Don't get me wrong, I'm not proposing having all numbers be infinite in size, just that it might be worth considering them as a kind of int<infinite> or int<var> in addition (probably a terrible name suggestion, especially since that cannot be done with const generics - var_sized_int?), with the length either stored next to the pointer or in the first field as a usize.

Presumably LLVM (or any other backend) should be able to optimise cases like these well. That is, assuming that they actually are faster than just treating them as 64-bit integers.

The main issue with this is that, unless we offer an owned bigint inside the standard library (which I consider disjoint from this RFC), there's not really a whole lot that can be useful about a bigint slice on its own. Sure, you can do simple unary operations like count_zeros, compare it with other integers, and maybe even do an in-place negation if you wanna be fancy, but a majority of operations would require allocation and so it's probably not going to be a very intuitive API. There are also some unstable methods that exist to make slice representations a bit easier to use.
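To make the "simple in-place operations" point concrete, here's a hypothetical sketch (the function name and limb layout are assumptions, not part of the RFC) of an allocation-free two's-complement negation over a little-endian limb slice — the kind of thing a bigint slice could offer without an allocator:

```rust
// Hypothetical sketch: in-place two's-complement negation of a number stored
// as little-endian u64 limbs. Invert every limb, then add 1, carrying upward.
fn negate_in_place(limbs: &mut [u64]) {
    let mut carry = true; // the "+1" of two's complement
    for limb in limbs.iter_mut() {
        let (sum, overflow) = (!*limb).overflowing_add(carry as u64);
        *limb = sum;
        carry = overflow;
    }
}

fn main() {
    let mut n = [5u64, 0]; // little-endian limbs for the value 5
    negate_in_place(&mut n);
    // 2^128 - 5 in limbs: low limb is u64::MAX - 4, high limb is u64::MAX
    assert_eq!(n, [u64::MAX - 4, u64::MAX]);
    println!("ok");
}
```

Anything beyond this kind of unary or comparison operation (addition, multiplication, …) generally needs somewhere to put a possibly larger result, which is where the allocation problem comes in.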

I like this idea, but there are questions we'd have to answer first, and again, we could still add this later once this implementation exists. Only issue is naming the type, which we've also been discussing anyway.


This seems to me like a much worse type system than what we have today.

If assignment introduces an implicit narrowing conversion, then inlining or uninlining variables would no longer be semantically a no-op, as it would add or remove an extra conversion step.

The lack of any such implicit conversions on assignment is a big benefit Rust has today over C.


The main reason why u128 has a larger alignment requirement is entirely because of the C ABI, and I think we should try to match that where possible. I agree that it makes sense to go with usize as an upper bound for alignment, but that ship has mostly sailed and I don't think that changing the semantics for other types between the gaps would be a good idea. But this discussion isn't over, and the C folks are also debating over their current representation, so, I'm mostly leaning toward just going with the final decision they make.

That said…

I've been thinking about this, and am leaning more toward keeping the alignment constraints but tightening up on the size. The main issue that comes to mind is that, in some cases, this unaligns the internal bits relative to endianness and would require masking out any fields that overlap with them before performing operations.

However, while you can up the size and alignment with wrapper types, you can't decrease them easily, which is why I'm leaning toward this being a decent idea.

Wait, is this saying that u24 would have size 3 align 4? That's fundamentally not a thing that the Rust type system allows.


I'm unsure what you mean here. If u<65> is ⌈65/8⌉ = 9 bytes, it has to have alignment 1, as that's the only power-of-two factor of 9.

My first stab at the alignment would be to say that the alignment would be alignof(u<N>) = min(alignof(usize), sizeof(u<N>).prev_power_of_two()). (Lots of other possibilities, though.) This is wrong; see later posts.

Notice I never said implicit in my post. Something like x = (a + 2 * b + c)/4 would work with nothing, if all the variables involved had the same range, because the basic value range analysis would get it right.
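To make that range reasoning concrete, here's a sketch using today's types (the 0..=255 bounds are assumed for illustration): the widened intermediate a + 2*b + c is at most 1020, and dividing by 4 brings it back into 0..=255, which is exactly what a basic value range analysis would conclude:

```rust
// Sketch: with a, b, c all in 0..=255, the intermediate sum is at most
// 255 + 510 + 255 = 1020, and dividing by 4 yields at most 255, so the
// result provably fits back into the original range with no conversion.
fn weighted_average(a: u8, b: u8, c: u8) -> u8 {
    ((a as u16 + 2 * b as u16 + c as u16) / 4) as u8
}

fn main() {
    assert_eq!(weighted_average(255, 255, 255), 255);
    assert_eq!(weighted_average(10, 20, 30), 20);
    println!("ok");
}
```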

Then it'd be a discussion on how to do it for things that do end up with different types. Maybe it's a = (a + 1).checked_into().unwrap(); I don't know. After all, that ugliness would be all the more reason that you should be using a Range for such things instead of writing it yourself.

But we should stop designing this in clarfonthey's thread, since it's off-topic.


Right, I seem to have completely forgotten how alignment works. This is what happens when I try to think these things up on my phone.

I guess the biggest thing would be that, if we were to do the size-tightening changes, then we'd have to make the C ABI aliases newtype wrappers instead of actual aliases. Which I guess is fine, and we can give them variable alignment with compiler magic if needed. You only need to send them across FFI boundaries, after all.

That gives the wrong value: e.g. for uint<72>, which is 9 bytes, it gives an alignment of 8 on x86-64, which is wrong since the size 9 is not divisible by the alignment 8.

The proper way to compute the maximum alignment you can have without padding bytes is gcd(some_max_align, sizeof(the_type)), where gcd is the greatest common divisor and some_max_align must be a power of two.

You're right; complete brain fart on my part :person_facepalming:

And because alignment is a power-of-two there's a nice simple way to say that:

std::cmp::min(1 << size_in_bytes.trailing_zeros(), align_of::<u128>())

(And I checked my work this time, https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c866b0382dcaa1a15d19d5b85a7e41cc.)
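Wrapped up as a function, the corrected rule looks like this (capping at align_of::<u128>() is one possible choice, per the discussion above; since the cap is a power of two, this is the same as the gcd formulation):

```rust
use std::cmp::min;
use std::mem::align_of;

// Largest power of two dividing the size, capped at u128's alignment.
// 1 << trailing_zeros(size) is the largest power-of-two divisor of size,
// so this equals gcd(align_of::<u128>(), size) for a power-of-two cap.
fn align_for_size(size_in_bytes: usize) -> usize {
    debug_assert!(size_in_bytes > 0);
    min(1 << size_in_bytes.trailing_zeros(), align_of::<u128>())
}

fn main() {
    assert_eq!(align_for_size(9), 1);  // uint<72>: 9 bytes, align 1
    assert_eq!(align_for_size(8), 8);  // uint<64>
    assert_eq!(align_for_size(12), 4); // uint<96>: 12 bytes, align 4
    println!("ok");
}
```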


Did another pass of editing and added in some stuff that people were concerned about:

  • Explained how niche optimisation works for Option<uint<7>>::None
  • Explained why bool is separate from uint<1>
  • Made ABI/size/alignment explicitly unspecified for now, and in its own section
  • Explicitly link generic_const_exprs feature as one of the barriers to implementing certain methods
  • Elaborate a bit more on why bounds-based integers aren't necessarily the best choice (IMHO)
  • Mention unsized integer types as an extension
  • Mention never-overflowing operations (e.g. uint<8> + uint<8> = uint<9>) as an extension
  • Mention generic floats
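As an illustration of the never-overflowing idea using today's types (a sketch of the concept, not the proposed API): the result type is widened by one bit so the sum always fits, the same way uint<8> + uint<8> would produce uint<9>.

```rust
// Widening addition: u8 + u8 always fits in u16, since 255 + 255 = 510,
// analogous to uint<8> + uint<8> = uint<9> (510 < 2^9 = 512).
fn widening_add(a: u8, b: u8) -> u16 {
    a as u16 + b as u16
}

fn main() {
    assert_eq!(widening_add(255, 255), 510);
    println!("ok");
}
```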

Based upon everything I've seen, I think that the RFC is mostly ready, although my biggest concern is still with the ABI/ranged integer stuff.

To me, ranged integers seem extremely prone to what I call "inference creep": users will never be satisfied with their ranges not being inferred precisely based upon what they're doing in the code, which is why I feel like leveraging pattern types, if they exist, would be a good compromise. I also brought up the "what type are the integers in the range" argument here: while uint<N> can rely on N always being a usize, int<A..=B> explicitly requires A and B to be int<A..=B>, or at least int<A..=A> and int<B..=B>, and it's really unclear how we'd do this. Keeping them relegated to patterns (where types don't matter, and the patterns are just patterns) would probably help with this.

But again, I don't want to just rely on what I think is the right decision here, and would appreciate more feedback.

(also re: N: usize, I think that it's fine that even if we generalise uint to something that can include usize, to just make usize special and thus be fine to include cyclically in its definition. the issue with int<A..=B> is specifically that the type of A and B have to be effectively infinitely large or just a special BoundsOnlyIntegerNoUsePls type, and I'm not sure how we'd do that)


Much as I want ranged integers, I think that making general integers that work exactly like u32 and friends -- no semantic changes at all! -- is super-valuable, and hope to see that regardless of any discussion about ranged integers.

Letting u32 be u<32> is super useful, and conveniently means we can immediately shut down a whole bunch of bike sheds (even in places where I wish u32 worked differently).


I have already been using ranged integers via a crate I created. I've also played around with more advanced situations on nightly; inference was a non-issue in my experience. This problem is absolutely solvable. Portraying pattern types as sufficient to encompass the use cases of ranged integers is misleading at best, as it cannot handle trivial things like arithmetic.

Like it or not, ranged integers may plausibly exist in the future. There are a number of people who have experimented in this area, including at least one team member. There should be consideration given to the naming for this reason. That you use int in your example for ranged integers speaks to the confusion that will unquestionably occur if both generic integers and ranged integers are accepted.

Surely it isn't too much to bikeshed a single identifier for future compatibility?

This is an argument that generic integers are needed, not that ranged integers are not. No one in this thread that has advocated for ranged integers has also suggested that generic integers are not worthwhile, merely that ranged integers should be taken into account.

aside

As an aside, I have thought about this exact issue previously. My personal thinking is that, as a compiler built-in, any integer type should be able to be passed, including mixing and matching.

I'll be honest, I'm a bit confused by what you're trying to advocate for. I assume that the crate you're referencing is deranged, in which case, the ranged integers would still be bound by the generic integers here.

My main source of confusion is the fact that all of the existing ranged integer proposals are int<A..=B>, which completely omit bit information. If this were, say, int<N, A..=B>, that would be fine, under the assumption that A and B were limited to be int<N>. This isn't the kind of proposal I've seen, however.

If we were to, for example, add int<N, A..=B>, then the naming is a non-issue; we can easily have a second argument that defaults to the full range possible for the given bit size. But this isn't the kind of proposal I've seen, and it's one that wants entirely generic range-based integers which figure out the bit size, or are somehow independent from them. That's why I'm confused.


Also rereading the aside now: the main issue with having the ranges be a special compiler integer type is that this doesn't let you use them in generic code, since you still have to be able to name their type. I would assume that this is a requirement for any potential implementation; is there a case where it wouldn't be?


Slightly different syntax but the same concept. I like your suggestion of int<N, A..=B> more because it's easier to read: int<32, 5..=10>. But I think it's more difficult to implement. You may need to:

  • allow std::ops::RangeInclusive in a const generic: <const RANGE: RangeInclusive<i32>> OR
  • use the TraitBounds trait here and figure out when they are equivalent OR
  • modify the compiler/language in such a way that this range isn't seen as a generic parameter.

In addition to that: In the first two cases this parameter would need to have a default value (so you don't have to specify a range), which also depends on the minimum and maximum of N.

One thing to consider is that this might add overhead to code generation, as the following types would all be different and thus their Addition, Subtraction, ... trait implementations would be (unless there is special handling in the compiler for that): int<32>, int<32, 5..=10>, int<32, 20..=25>, int<32, 5..=25>
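That combinatorial point can be seen with today's const generics (a sketch; the Ranged name is made up): every (MIN, MAX) pair is a separate type, so every trait implementation would be monomorphised separately for each range used in the program.

```rust
// Hypothetical stand-in for int<32, MIN..=MAX>: each (MIN, MAX) pair is a
// distinct type, so any impls on it are instantiated once per range used.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ranged<const MIN: i32, const MAX: i32>(i32);

impl<const MIN: i32, const MAX: i32> Ranged<MIN, MAX> {
    // Fallible constructor: only values inside MIN..=MAX are representable.
    fn new(value: i32) -> Option<Self> {
        (MIN..=MAX).contains(&value).then(|| Self(value))
    }
}

fn main() {
    let a = Ranged::<5, 10>::new(7);   // one instantiation
    let b = Ranged::<20, 25>::new(22); // a second, unrelated instantiation
    assert!(a.is_some() && b.is_some());
    assert!(Ranged::<5, 10>::new(11).is_none());
    println!("ok");
}
```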


The following (about range bounds) is far out of scope for this proposal, but I think it's worth considering for the long-term (and in my opinion a better place for bounded integers, as the bounds do/should not have an effect on the memory representation). I'm pretty sure something similar to this was already proposed somewhere else:

I also liked kornel's suggestion of having it "outside" of the type/generics, which would require language changes as well, but may be more flexible in the end, allowing further restrictions to what a value could be (for example that it is always even):

type TypeAlias = int<32> in 5..=10;
fn do_something_1(value: int<32> in 5..=10) {}
fn do_something_2(value: TypeAlias) {}

Just an example of what could be possible, which might be really hard with generics.

// Here `divisible_by_two` indicates the allowed values. 
type Even = int<32> in divisible_by_two;
// A function may not be a good solution to this, as there
// would have to be a way to indicate that one identifier is more
// strict than another to allow "casting" to a more restricted range:
// `int<32> in 5..=10` -> `int<32> in 5..=8`.
fn divisible_by_two(value: int<32>) -> bool {
    value % 2 == 0
}
// A more restrictive limiter
fn divisible_by_ten(value: int<32>) -> bool {
    value % 10 == 0
}
// We still need the ability to tell the compiler when two types are
// compatible. To show the concept I'm using the `From`/`Into` trait,
// I don't know if we want this to require an explicit `.into()`.
// Since there is no change in the memory representation this trait
// wouldn't even need a function.
impl From<int<32> in divisible_by_ten> for int<32> in divisible_by_two {
    /**/
}

Granted, something like this can already be done using wrapper types, but the same goes for the basic range-bounded integers suggested in earlier comments.

With bounded integers there will almost certainly be people who want finer-grained restrictions, even when they have to tell the compiler when two restrictions are "compatible". In theory a static analysis tool could even try checking the correctness of the compatibility statements, allowing arbitrary flexibility in how precise those restrictions should be without having to infer/prove that in the compiler. The compiler/language would just provide the means to do so, and ideally a way to "forward" those restriction guarantees over function calls that don't have them.


This is x86-specific, only applies to FMA operations, and loads packed 64-bit values, which means that it wouldn't be possible to use this on an array of 52-bit values without some expensive shifting and shuffling to unpack the values first.

Yes. I mean: should we use u/i<bits, align> instead of u/i<bits>?

For example, i<32> would be i<32, 32>, while i<32, 8> would use an alignment of 1 byte, and i<4, 4> would allow using only part of a byte.

In this case, the type would actually be i<52, 64>, with an 8-byte alignment.

Why not just use i64 and truncate the result, then?

This instruction also expects one of the operands to be a 64-bit integer, so it's not a "52-bit-only FMA". Furthermore, this only works for truncating multiplication.
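As a rough scalar model of what these instructions compute per SIMD lane (a sketch; the real intrinsics are vectorised, and the hi variant adds the upper 52 bits of the 104-bit product instead):

```rust
const MASK52: u64 = (1 << 52) - 1;

// Scalar model of the "lo" form: add the low 52 bits of the 52x52-bit
// product of a and b into a 64-bit accumulator.
fn madd52lo(acc: u64, a: u64, b: u64) -> u64 {
    acc.wrapping_add((a & MASK52).wrapping_mul(b & MASK52) & MASK52)
}

fn main() {
    assert_eq!(madd52lo(1, 3, 4), 13);
    // Truncation in action: (2^51 + 1) * 2 = 2^52 + 2, low 52 bits are 2.
    let big = (1u64 << 51) + 1;
    assert_eq!(madd52lo(0, big, 2), 2);
    println!("ok");
}
```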

And even then, this only applies to 52-bit integers; what about i53? i51? What about non-AVX-512 CPUs? The fact that there is one ISA that supports one type of operation on one uncommon integer size doesn't really help.

IMO SIMD performance should not really be the focus of this pre-RFC, because (again, in my opinion) manipulating arrays of arbitrary-width integers will always come at a cost.

As a quick demo, here's LLVM generating exactly the same code for i47 and i64: https://llvm.godbolt.org/z/Tbde7ov67.
