What's a "zero-bit integer"?

Eh, I still don't see how you could represent a zero-bit integer. It has zero bits, and therefore occupies no space at all. I don't think the argument of "well, ()/unit and struct foo; exist, so we should have uint<0>/int<0>" applies here, since () indicates "I don't return anything", and struct foo; takes up actual space even though it doesn't have any fields, and is therefore capable of being represented in some form (though what form that is is up to the compiler). So unless someone has discovered some magic way of creating a zero-bit integer, and LLVM supports it (and, according to the LLVM language reference manual, it doesn't), I think the entire discussion of zero-bit integers is irrelevant.

This is factually incorrect. Zero-sized structures don't exist in C or C++ (because objects must have unique addresses), but the struct really is zero-sized in Rust.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=52482029237e8cfc594431a640f80554

struct Foo;

fn main() {
    dbg!(std::mem::size_of::<Foo>()); // 0
}
[src/main.rs:4] std::mem::size_of::<Foo>() = 0
8 Likes

That's a ZST, same as (), so it doesn't take up actual space. (Rust is different from C++, where sizeof(T) is always at least one. Though even there things like the empty-base-class optimization exist meaning that zero-field structs sometimes take zero space.)

rustc could support zero-bit integers the same way it does the other zero-sized types: by just not lowering them to anything in LLVM, since no operations on them ever do anything.
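For what it's worth, Rust already treats aggregates built from ZSTs exactly this way, which is easy to check (reusing the Foo from above):

```rust
use std::mem::size_of;

struct Foo; // zero-sized, as shown above

fn main() {
    assert_eq!(size_of::<Foo>(), 0);
    assert_eq!(size_of::<()>(), 0);
    // Even a large array of ZSTs occupies no space:
    assert_eq!(size_of::<[(); 1000]>(), 0);
    // And a ZST field adds nothing to a struct's layout:
    assert_eq!(size_of::<(u32, ())>(), size_of::<u32>());
}
```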

It's possible we don't want zero-bit integers, but that wouldn't be because of implementation problems. Over-128-bit integers are much more of an implementation complexity than 0-bit integers.

10 Likes

I would imagine that the main reason to support 0-bit integers, would be if it meant that code written generically over integer size, did not need to special case size zero. I have no idea whether that would come up.

13 Likes

Macro generated code perhaps?

I suspect the opposite is true in practice -- having zero bit integers feels like you might need to special case 0 to avoid a divide-by-zero.

Maybe zero-bit integers should be uninhabited, like an enum with zero variants?

Definitely not. An N-bit integer represents 2^N possible values, so a 0-bit integer represents 1 possible value -- not the zero possible values that a zero-variant enum represents.

Equivalently, enum Zero { JustZero } also has size 0.
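That equivalence is easy to check:

```rust
use std::mem::size_of;

enum Zero { JustZero }

fn main() {
    // The single-variant enum carries no information, so it's a ZST --
    // just like (), and just like the one-valued u0 would be.
    assert_eq!(size_of::<Zero>(), 0);
    // An N-bit integer has 2^N values; for N = 0 that's exactly one.
    assert_eq!(1u32 << 0, 1);
}
```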

(Uninhabited types conceptually have a size of -∞ bits/bytes. union and struct sizes work as ⊕ and ⊗ respectively in a max-plus semiring.)

That's nothing new, though. It's why offset_from panics for ZSTs, for example.
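For example, pointer::offset_from returns the distance in units of T -- it divides the byte distance by size_of::<T>() -- so its documented behavior is to panic for a ZST rather than divide by zero:

```rust
use std::panic;

fn main() {
    let a = [(), ()];
    let p0: *const () = &a[0];
    let p1: *const () = &a[1];
    // offset_from panics when size_of::<T>() == 0, per its documentation.
    let result = panic::catch_unwind(|| unsafe { p1.offset_from(p0) });
    assert!(result.is_err());
}
```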

Basically, ZST integers make sense for the same reason that 0-length arrays make sense -- not because you're specifically going to use them all over the place, but because they're handy for generic/macro usage.
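A small illustration of that generic-usage point, using arrays: code written over a const-generic length needs no special case for zero, because the empty case just falls out.

```rust
// Generic over the array length, including N = 0 -- no special case needed.
fn sum<const N: usize>(xs: [u32; N]) -> u32 {
    xs.iter().sum()
}

fn main() {
    assert_eq!(sum([1, 2, 3]), 6);
    assert_eq!(sum::<0>([]), 0); // the empty case works for free
}
```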

--

EDIT: Thanks for rearranging, @notriddle!

17 Likes

You always have to watch out for divide-by-zero for all integer types. I don't think zero-bit integers change that in any way so it shouldn't be any more (or less) complex.
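A minimal illustration: the guard is against a zero divisor *value*, which exists at every width, so generic code already has to handle it.

```rust
fn main() {
    // checked_div guards against a zero value, at any integer width.
    assert_eq!(10u32.checked_div(0), None);
    assert_eq!(10u32.checked_div(3), Some(3));
}
```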

1 Like

I meant in size computations, but I think I retract my statement now that I think about it more, and realize that this is an integer modulo 2^0, rather than an integer modulo 0 (which would be an inherent division by zero).

1 Like

A u0 is perfectly suited for indexing an array of size 1.

i0 is a little bit unusual because there is no sign bit.

However, the sign bit is supposed to be the bit number n-1 = -1. Since bit number -1 is implicitly 0 for integers, it follows that the only value of i0 is 0 rather than -1.
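That reading can be sketched with the usual two's complement decoding rule (twos_complement here is a made-up helper name): the value of an n-bit pattern x is x - 2^n when bit n-1 is set, and for n = 0 the sign correction simply never fires.

```rust
// Decode an n-bit two's complement pattern stored in a wider type.
fn twos_complement(x: u64, n: u32) -> i64 {
    assert!(n < 64);
    // Sign bit is bit n-1; for n = 0 there is no such bit, so it's 0.
    let sign_set = n > 0 && (x >> (n - 1)) & 1 == 1;
    x as i64 - if sign_set { 1i64 << n } else { 0 }
}

fn main() {
    assert_eq!(twos_complement(0xFF, 8), -1); // i8: 0b1111_1111 == -1
    assert_eq!(twos_complement(0x7F, 8), 127);
    assert_eq!(twos_complement(0, 0), 0); // i0: the sole value is 0
}
```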

There are lots of arguments for why i0 should be zero. But IMO the most compelling argument is ergonomics and the principle of least surprise.

As a developer, I'd be really annoyed if Rust went out of its way to support i0 but then made it -1. If I was trying to support generic integers I'd have to add more special cases for i0 if it couldn't represent zero. Zero is extremely useful (both practically and formally).

Whether i0 is zero or -1 is an interesting philosophical question in the similar vein as other philosophical questions like "do numbers exist?" (please don't try to answer that question here). But I think that when it comes to being practical (not philosophical), "i0 is zero" is the only defensible position.

11 Likes

I mean, I strongly agree that if we DID commit to having i0 be a type, then zero would be the only reasonable choice for its singular value. My point wasn't "maybe we should have it be -1", but "maybe the fact that the generalization needs to be special-cased, means we shouldn't have i0 be a type in the first place". (On the other hand, u0 makes perfect sense – I would hope for u0 to be supported, for the reasons other people have stated, similar to zero-length arrays.)

It doesn't really have to be special-cased:

Two's complement is naturally defined as:

let x: uN;
x as iN = x - (x << 1 & 1 << N)

which works for N=0 giving 0u0 as i0 = 0.
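The same formula checked at concrete widths (as_signed is just an illustrative helper, evaluated in i64 so the shifts don't overflow):

```rust
// x as iN = x - (x << 1 & 1 << N), computed in a wider type.
fn as_signed(x: i64, n: u32) -> i64 {
    x - ((x << 1) & (1i64 << n))
}

fn main() {
    assert_eq!(as_signed(0xFF, 8), -1);  // 0b1111_1111u8 as i8 == -1
    assert_eq!(as_signed(0x7F, 8), 127); // sign bit clear: unchanged
    assert_eq!(as_signed(0, 0), 0);      // N = 0: 0u0 as i0 == 0
}
```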

2 Likes

Sorry, "needs to be special-cased" wasn't precise. More precisely, it needs to have an additional constraint added to the generalization. (In your definition, the constraint is expressed in the form of the << operator arbitrarily filling the low bits with zeros rather than ones; mathematically, one could just as easily write a definition that results in the singular value of i0 being -1.)

1 Like

If << worked by filling in low bits with 1s, then my 1 << N wouldn't work for N=32 either.

Filling with 0s is not arbitrary, because 0b101 = 0b101.0, not 0b101.1.

Yes, there are good reasons that that's the conventional default, but it would be plenty valid to also have a complementary "<< but filling with 1s" operator – both of them are inverted by >>, and they express the complementary opposites of the minimum and maximum values that return to the same value when applying an equal-sized >>. (Note that 0b101.1111 rounds off to 0b101 under the conventional rounding).
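Sketched as a plain function (shl_fill_ones is a made-up name; no such operator exists in Rust), the hypothetical fill-with-1s shift is indeed undone by an equal-sized >>, just like the ordinary one:

```rust
// Hypothetical "<< but filling low bits with 1s", written as a function.
fn shl_fill_ones(x: u32, n: u32) -> u32 {
    (x << n) | ((1u32 << n) - 1) // low n bits become 1 instead of 0
}

fn main() {
    assert_eq!(shl_fill_ones(0b101, 1), 0b1011);
    assert_eq!(shl_fill_ones(0b101, 1) >> 1, 0b101); // >> inverts it
    assert_eq!(0b101 << 1 >> 1, 0b101); // same for the ordinary shift
}
```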

If << filled with 1s by default, your expression could be written as (x << 1 & ~(0 << N)), which doesn't seem significantly more complicated – and I wouldn't be surprised if there was a shorter way to write it that I'm just not thinking of at the moment.

Shifting left shifts the binary point. It multiplies by 2. There is no rounding there. Shifting right requires rounding, shifting left requires no rounding.

The filling in with 1s is a strange idea. I have never heard of anybody proposing that before. There is no symmetry with filling in with 0s. It requires not just shifting the binary point, but also flipping a bit. It's "2x + 1" rather than just multiplying by 2.

This is already well established in all CPUs and programming languages, as opposed to u0 and i0 which are sort of a new idea, so not sure why we're talking about changing how shift left works.

1 Like

Another way of stating my overall point is: whatever the value of i0 is, there will be generalizations that hold for all signed integers with a sign bit, which do not hold for i0, creating possible gotchas when you generalize over signed integer sizes.

Now, I can also think of a solid counterargument to that – which is, there are SOME generalizations that don't hold for i1 either, because unlike larger integers, i1 isn't able to express the multiplicative identity. And if someone has a specific function to implement that doesn't work for i0, ideally, it should be possible to use const generics to implement it only for sizes greater than zero (and similarly, when needed, for sizes greater than one, etc.)
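Such a constraint can be approximated today with a const assertion (Int and BITS are hypothetical names here), which fails to compile only for the instantiations that actually use the restricted operation:

```rust
// Hypothetical generic integer; only operations that need a sign bit
// are restricted to BITS > 0.
struct Int<const BITS: usize>;

impl<const BITS: usize> Int<BITS> {
    const SIGN_BIT: usize = {
        assert!(BITS > 0, "this operation needs at least one bit");
        BITS - 1
    };
}

fn main() {
    assert_eq!(Int::<8>::SIGN_BIT, 7);
    // Int::<0>::SIGN_BIT would be a compile-time error -- exactly the
    // "implement it only for sizes greater than zero" idea.
}
```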

3 Likes

I am definitely not talking about changing how << works.