What's a "zero-bit integer"?

Eh, I still don't see how you could represent a zero-bit integer. It has zero bits, and therefore occupies no space at all. I don't think the argument of "well, ()/unit and struct foo; exist, so we should have uint<0>/int<0>" applies here, since () indicates "I don't return anything", and struct foo; takes up actual space even though it doesn't have any fields, and is therefore capable of being represented in some form (though what form that is is up to the compiler). So unless someone has discovered some magic way of creating a zero-bit integer, and LLVM supports it (and, according to the LLVM language reference manual, it doesn't), I think the entire discussion of zero-bit integers is irrelevant.

This is factually incorrect. Zero-sized structures don't exist in C or C++ (because objects must have unique addresses), but the struct really is zero-sized in Rust.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=52482029237e8cfc594431a640f80554

struct Foo;

fn main() {
    dbg!(std::mem::size_of::<Foo>()); // 0
}
[src/main.rs:4] std::mem::size_of::<Foo>() = 0
8 Likes

That's a ZST, same as (), so it doesn't take up actual space. (Rust is different from C++, where sizeof(T) is always at least one. Though even there things like the empty-base-class optimization exist meaning that zero-field structs sometimes take zero space.)

rustc could support zero-bit integers the same way it does the other zero-sized types: by just not lowering them to anything in LLVM, since no operations on them ever do anything.
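For what it's worth, Rust already treats aggregates built from ZSTs exactly this way, which is easy to check (reusing the Foo from above):

```rust
use std::mem::size_of;

struct Foo; // zero-sized, as shown above

fn main() {
    assert_eq!(size_of::<Foo>(), 0);
    assert_eq!(size_of::<()>(), 0);
    // Even a large array of ZSTs occupies no space:
    assert_eq!(size_of::<[(); 1000]>(), 0);
    // And a ZST field adds nothing to a struct's layout:
    assert_eq!(size_of::<(u32, ())>(), size_of::<u32>());
}
```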

It's possible we don't want zero-bit integers, but that wouldn't be because of implementation problems. Over-128-bit integers are much more of an implementation complexity than 0-bit integers.

10 Likes

I would imagine that the main reason to support 0-bit integers, would be if it meant that code written generically over integer size, did not need to special case size zero. I have no idea whether that would come up.

13 Likes

Macro generated code perhaps?

I suspect the opposite is true in practice -- having zero bit integers feels like you might need to special case 0 to avoid a divide-by-zero.

Maybe zero-bit integers should be uninhabited, like an enum with zero variants?

Definitely not. An N-bit integer represents 2^N possible values, so a 0-bit integer represents 1 possible value -- not the zero possible values that a zero-variant enum represents.

Equivalently, enum Zero { JustZero } also has size 0.
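That equivalence is easy to check:

```rust
use std::mem::size_of;

enum Zero { JustZero }

fn main() {
    // The single-variant enum carries no information, so it's a ZST --
    // just like (), and just like the one-valued u0 would be.
    assert_eq!(size_of::<Zero>(), 0);
    // An N-bit integer has 2^N values; for N = 0 that's exactly one.
    assert_eq!(1u32 << 0, 1);
}
```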

(Uninhabited types conceptually have a size of -∞ bits/bytes. union and struct sizes work as ⊕ and ⊗ respectively in a max-plus semiring.)

That's nothing new, though. It's why offset_from panics for ZSTs, for example.
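For example, pointer::offset_from returns the distance in units of T -- it divides the byte distance by size_of::<T>() -- so its documented behavior is to panic for a ZST rather than divide by zero:

```rust
use std::panic;

fn main() {
    let a = [(), ()];
    let p0: *const () = &a[0];
    let p1: *const () = &a[1];
    // offset_from panics when size_of::<T>() == 0, per its documentation.
    let result = panic::catch_unwind(|| unsafe { p1.offset_from(p0) });
    assert!(result.is_err());
}
```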

Basically, ZST integers make sense for the same reason that 0-length arrays make sense -- not because you're specifically going to use them all over the place, but because they're handy for generic/macro usage.
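A small illustration of that generic-usage point, using arrays: code written over a const-generic length needs no special case for zero, because the empty case just falls out.

```rust
// Generic over the array length, including N = 0 -- no special case needed.
fn sum<const N: usize>(xs: [u32; N]) -> u32 {
    xs.iter().sum()
}

fn main() {
    assert_eq!(sum([1, 2, 3]), 6);
    assert_eq!(sum::<0>([]), 0); // the empty case works for free
}
```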

--

EDIT: Thanks for rearranging, @notriddle!

17 Likes

You always have to watch out for divide-by-zero for all integer types. I don't think zero-bit integers change that in any way so it shouldn't be any more (or less) complex.
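A minimal illustration: the guard is against a zero divisor *value*, which exists at every width, so generic code already has to handle it.

```rust
fn main() {
    // checked_div guards against a zero value, at any integer width.
    assert_eq!(10u32.checked_div(0), None);
    assert_eq!(10u32.checked_div(3), Some(3));
}
```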

1 Like

I meant in size computations, but I think I retract my statement now that I think about it more, and realize that this is an integer modulo 2^0, rather than an integer modulo 0 (which would be an inherent division by zero).

1 Like

A u0 is perfectly suited for indexing an array of size 1.

i0 is a little bit unusual because there is no sign bit.

However, the sign bit is supposed to be the bit number n-1 = -1. Since bit number -1 is implicitly 0 for integers, it follows that the only value of i0 is 0 rather than -1.
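That reading can be sketched with the usual two's complement decoding rule (twos_complement here is a made-up helper name): the value of an n-bit pattern x is x - 2^n when bit n-1 is set, and for n = 0 the sign correction simply never fires.

```rust
// Decode an n-bit two's complement pattern stored in a wider type.
fn twos_complement(x: u64, n: u32) -> i64 {
    assert!(n < 64);
    // Sign bit is bit n-1; for n = 0 there is no such bit, so it's 0.
    let sign_set = n > 0 && (x >> (n - 1)) & 1 == 1;
    x as i64 - if sign_set { 1i64 << n } else { 0 }
}

fn main() {
    assert_eq!(twos_complement(0xFF, 8), -1); // i8: 0b1111_1111 == -1
    assert_eq!(twos_complement(0x7F, 8), 127);
    assert_eq!(twos_complement(0, 0), 0); // i0: the sole value is 0
}
```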

There are lots of arguments for why i0 should be zero. But IMO the most compelling argument is ergonomics and the principle of least surprise.

As a developer, I'd be really annoyed if Rust went out of its way to support i0 but then made it -1. If I was trying to support generic integers I'd have to add more special cases for i0 if it couldn't represent zero. Zero is extremely useful (both practically and formally).

Whether i0 is zero or -1 is an interesting philosophical question in the similar vein as other philosophical questions like "do numbers exist?" (please don't try to answer that question here). But I think that when it comes to being practical (not philosophical), "i0 is zero" is the only defensible position.

11 Likes

I mean, I strongly agree that if we DID commit to having i0 be a type, then zero would be the only reasonable choice for its singular value. My point wasn't "maybe we should have it be -1", but "maybe the fact that the generalization needs to be special-cased, means we shouldn't have i0 be a type in the first place". (On the other hand, u0 makes perfect sense – I would hope for u0 to be supported, for the reasons other people have stated, similar to zero-length arrays.)

It doesn't really have to be special-cased:

Two's complement is naturally defined as:

let x: uN;
x as iN = x - (x << 1 & 1 << N)

which works for N=0 giving 0u0 as i0 = 0.
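The same formula checked at concrete widths (as_signed is just an illustrative helper, evaluated in i64 so the shifts don't overflow):

```rust
// x as iN = x - (x << 1 & 1 << N), computed in a wider type.
fn as_signed(x: i64, n: u32) -> i64 {
    x - ((x << 1) & (1i64 << n))
}

fn main() {
    assert_eq!(as_signed(0xFF, 8), -1);  // 0b1111_1111u8 as i8 == -1
    assert_eq!(as_signed(0x7F, 8), 127); // sign bit clear: unchanged
    assert_eq!(as_signed(0, 0), 0);      // N = 0: 0u0 as i0 == 0
}
```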

2 Likes

Sorry, "needs to be special-cased" wasn't precise. More precisely, it needs to have an additional constraint added to the generalization. (In your definition, the constraint is expressed in the form of the << operator arbitrarily filling the low bits with zeros rather than ones; mathematically, one could just as easily write a definition that results in the singular value of i0 being -1.)

1 Like

If << worked by filling in low bits with 1s, then my 1 << N wouldn't work for N=32 either.

Filling with 0s is not arbitrary, because 0b101 = 0b101.0, not 0b101.1.

Yes, there are good reasons that that's the conventional default, but it would be plenty valid to also have a complementary "<< but filling with 1s" operator – both of them are inverted by >>, and they express the complementary opposites of the minimum and maximum values that return to the same value when applying an equal-sized >>. (Note that 0b101.1111 rounds off to 0b101 under the conventional rounding).
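Sketched as a plain function (shl_fill_ones is a made-up name; no such operator exists in Rust), the hypothetical fill-with-1s shift is indeed undone by an equal-sized >>, just like the ordinary one:

```rust
// Hypothetical "<< but filling low bits with 1s", written as a function.
fn shl_fill_ones(x: u32, n: u32) -> u32 {
    (x << n) | ((1u32 << n) - 1) // low n bits become 1 instead of 0
}

fn main() {
    assert_eq!(shl_fill_ones(0b101, 1), 0b1011);
    assert_eq!(shl_fill_ones(0b101, 1) >> 1, 0b101); // >> inverts it
    assert_eq!(0b101 << 1 >> 1, 0b101); // same for the ordinary shift
}
```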

If << filled with 1s by default, your expression could be written as (x << 1 & ~(0 << N)), which doesn't seem significantly more complicated – and I wouldn't be surprised if there was a shorter way to write it that I'm just not thinking of at the moment.

Shifting left shifts the binary point. It multiplies by 2. There is no rounding there. Shifting right requires rounding, shifting left requires no rounding.

The filling in with 1s is a strange idea. I have never heard of anybody proposing that before. There is no symmetry with filling in with 0s. It requires not just shifting the binary point, but also flipping a bit. It's "2x + 1" rather than just multiplying by 2.

This is already well established in all CPUs and programming languages, as opposed to u0 and i0 which are sort of a new idea, so not sure why we're talking about changing how shift left works.

1 Like

Another way of stating my overall point is: whatever the value of i0 is, there will be generalizations that hold for all signed integers with a sign bit, which do not hold for i0, creating possible gotchas when you generalize over signed integer sizes.

Now, I can also think of a solid counterargument to that – which is, there are SOME generalizations that don't hold for i1 either, because unlike larger integers, i1 isn't able to express the multiplicative identity. And if someone has a specific function to implement that doesn't work for i0, ideally, it should be possible to use const generics to implement it only for sizes greater than zero (and similarly, when needed, for sizes greater than one, etc.)
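Such a constraint can be approximated today with a const assertion (Int and BITS are hypothetical names here), which fails to compile only for the instantiations that actually use the restricted operation:

```rust
// Hypothetical generic integer; only operations that need a sign bit
// are restricted to BITS > 0.
struct Int<const BITS: usize>;

impl<const BITS: usize> Int<BITS> {
    const SIGN_BIT: usize = {
        assert!(BITS > 0, "this operation needs at least one bit");
        BITS - 1
    };
}

fn main() {
    assert_eq!(Int::<8>::SIGN_BIT, 7);
    // Int::<0>::SIGN_BIT would be a compile-time error -- exactly the
    // "implement it only for sizes greater than zero" idea.
}
```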

3 Likes

I am definitely not talking about changing how << works.