Pre-(Pre-)RFC: niche types

(by the way, this is my first time submitting anything to Rust)

!!!!

This Pre-(Pre-)RFC is pretty much a moot point: its only real uses are covered by existing types, and C-like bitpacking does not work with Rust's compiler model.

!!!!

Edit 1: conservatively assume n8s are uninhabited

Feature Name: niche_types

Start Date: sometime in the future

RFC PR/Rust Issue: N/A

Summary:

Adds a primitive, one-byte type with an undefined memory representation.

Motivation:

While talking on the Rust Community Discord, the need for bitpacked structs came up. Creating niches would be a necessary part of that. Niches are bit patterns within a struct or enum which can be repurposed by an enclosing enum or, in the future, an enclosing struct. While this Pre-(Pre-)RFC does not propose bitpacking, it provides a utility to partially control Rust's niches.
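
For readers unfamiliar with niches, here is a minimal sketch of what current Rust already does with them, using only standard-library types. The helper names `option_overhead` and `overheads` are made up for illustration, and the sizes for `bool` reflect what current rustc does rather than a documented guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Extra bytes that wrapping `T` in `Option` costs (hypothetical helper).
pub fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}

/// Overheads for `NonZeroU8`, `bool`, and `u8`, in that order.
pub fn overheads() -> [usize; 3] {
    [
        // 0 is never a valid NonZeroU8, so `None` can be stored as 0.
        option_overhead::<NonZeroU8>(),
        // Only 0 and 1 are valid bools, so `None` can use another bit pattern.
        option_overhead::<bool>(),
        // Every bit pattern is a valid u8, so a separate tag byte is needed.
        option_overhead::<u8>(),
    ]
}
```

The first two types leave bit patterns unused, so the enclosing `Option` costs nothing extra; `u8` has no spare patterns and pays for a tag.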

Guide-level explanation:

This Pre-(Pre-)RFC adds one type, n8 (name not final), which has no defined memory representation but takes up one byte of space. The actual byte stored depends on the context: an Option<n8> stores whether it is Some or None in the n8's byte, meaning you CANNOT rely on its actual value. This means transmuting it to or from a u8, including within pointers, references, and arrays, is always Undefined Behavior.

Reference-level explanation:

See above. (TODO)

Drawbacks:

An n8 is most likely uninhabited, and therefore unusable, until the future ideas described below are implemented.

Rationale and alternatives

Some implementations of binary formats require precise control of not only the layout of data, but also how Rust can reason about it. A u8 could be wrapped to obtain a similar developer experience, but it would not have an n8's performance benefits. In that way, an n8 can be seen as an optimization hint.

Prior Art

None, as Rust, to the best of my knowledge, has invented structure niches.

Unresolved questions

  • Would an n8 actually be constructible? What rules would need to be in place to construct one, or are they uninhabited? If n8s turn out to be uninhabited, would it make sense to add the type to current Rust?

TODO, suggest some

Future possibilities

For this to be useful, an ABI guarantee would be needed to allow certain structures containing it to be inhabited.

This is a stepping stone towards bitpacked and bit-level structs, which need padding (which can be created using normal types) and niches within them.

I believe you can already do this by using a #[repr(u8)] enum with a single unit variant.

You can even rely on its value being a specific byte.

I'm not sure I see how this type relates to bitpacking, do you have some concrete examples?

I believe you can already do this by using a #[repr(u8)] enum with a single unit variant.

You can even rely on its value being a specific byte.

The idea of niche types is to have 256 niches (instead of the enum's 255) and to not be able to rely on the byte's value.

I'm not sure I see how this type relates to bitpacking, do you have some concrete examples?

Suppose #[repr(bitpacked)] packs bits and that Rust allowed arbitrary bitwidth integers. You could then use that proposal alongside this one to have a #[repr(bitpacked)] struct Padded{ first: u1, /*NOT part of this idea*/ rest: n7, /*Also NOT part of this idea*/ }. Then, a Padded struct would have 128 niches, and thus could be composed in a #[repr(bitpacked)] struct Composed{ padded: Padded, data: u7, }, which would be only one byte in size.
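
To make the intended one-byte layout concrete, here is a rough sketch that packs the same two fields by hand in today's Rust. Everything here is an assumption for illustration: `#[repr(bitpacked)]`, `u1`, `u7`, and `n7` do not exist, the bit positions are my own choice, and this newtype only emulates what the hypothetical `Composed` would do automatically.

```rust
/// Hand-packed stand-in for the hypothetical `Composed` struct:
/// bit 7 holds `first` (the `u1`), bits 0..=6 hold `data` (the `u7`).
/// The bit assignment is an assumption made for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Composed(u8);

impl Composed {
    /// Mask selecting the seven low bits used for `data`.
    const DATA_MASK: u8 = 0b0111_1111;

    /// Packs both fields into one byte.
    /// Returns `None` if `data` does not fit in 7 bits.
    pub fn new(first: bool, data: u8) -> Option<Self> {
        if data > Self::DATA_MASK {
            return None;
        }
        Some(Composed(((first as u8) << 7) | data))
    }

    /// Reads back the single-bit field.
    pub fn first(self) -> bool {
        self.0 >> 7 == 1
    }

    /// Reads back the seven-bit field.
    pub fn data(self) -> u8 {
        self.0 & Self::DATA_MASK
    }
}
```

The difference from the proposal is that here the packing is manual and invisible to the compiler, whereas the idea above is for the compiler to place `data` into `Padded`'s spare bits on its own.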

This Pre-(Pre-)RFC would lay the groundwork for those structure niches. A second proposal would make them actually work (not just for large enum packing), and a third would propose bitpacked structs, similar to how I described them. An advisor in the Discord server told me to split the idea up so that each part can progress independently.

What do you mean by niche? Could you update the OP with a definition?

You should update the pitch as that really doesn't come across:

I would consider "a byte with one value" to have 255 niches.

So, 0 possible values but not uninhabited.

In a pure type-theoretic context, yes. However, in Rust, with the concept of undefined memory representation, it would be more like a unit type which spans but does not use one byte of memory.

So, 0 possible values but not uninhabited.

No, one possible value which is undefined in memory. That does raise the issue of how an n8 would be constructed (and may even prove that n8s would be uninhabited, rendering this idea moot), as its bit pattern requirements depend on its context.

I'm not sure what you mean by undefined memory representation. Are you referring to uninitialized memory? That can take any bit pattern, so it would leave you with no niches at all.

If I understand gkgoat1 correctly, the idea is that n8 is a "padding type" with one type-theoretical value "undefined" that can take any bit-pattern, except you can't rely on the value because the compiler is allowed to take arbitrary values as niches for Option<n8> etc.

In contrast, something like

#[repr(u8)]
enum ZeroByte {
    ZeroByte = 0
}

would only have an all-zero byte as a valid representation, but provides every other value as a niche.
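
A quick way to check this is to measure the sizes, roughly as sketched below. The function names are mine, and the `Option` sizes are what current rustc produces in my understanding rather than something the language reference promises.

```rust
use std::mem::size_of;

#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZeroByte {
    ZeroByte = 0,
}

/// Sizes of `ZeroByte`, `Option<ZeroByte>`, and a doubly nested `Option`.
/// If the spare bit patterns are used as niches, all three stay at one byte.
pub fn zero_byte_sizes() -> [usize; 3] {
    [
        size_of::<ZeroByte>(),
        size_of::<Option<ZeroByte>>(),
        size_of::<Option<Option<ZeroByte>>>(),
    ]
}

/// The single variant's byte value, which `#[repr(u8)]` pins to the
/// declared discriminant.
pub fn zero_byte_value() -> u8 {
    ZeroByte::ZeroByte as u8
}
```

Unlike the proposed n8, the value here can be relied on: casting the variant always yields the declared discriminant.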

I guess the idea is to have a semantically single value, e.g. something like

pub struct Niche(/* exposition only */ u8);

pub const NICHE: Niche = Niche(/* unspecified */);

impl Default for Niche { … }

where NICHE is the only value of Niche that exists from the user point of view. The number of bit patterns is immaterial. But the internal value is made available for the compiler to make use of. The value is not uninitialized, just unspecified.
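
For comparison, here is what that shape looks like if you write it out in today's Rust. This is only a sketch: the stored byte `0` is an arbitrary choice of mine (the snippet above leaves it unspecified), and `niche_sizes` is a made-up helper.

```rust
use std::mem::size_of;

/// Wrapper whose byte is hidden from users (the field is private).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Niche(u8);

/// The only value users can name. The 0 is an assumption for this sketch;
/// the idea above deliberately leaves the stored byte unspecified.
pub const NICHE: Niche = Niche(0);

impl Default for Niche {
    fn default() -> Self {
        NICHE
    }
}

/// Sizes of `Niche` and `Option<Niche>`.
pub fn niche_sizes() -> (usize, usize) {
    (size_of::<Niche>(), size_of::<Option<Niche>>())
}
```

Because the field is a plain `u8`, the compiler has to treat every bit pattern as a possible value, so `Option<Niche>` needs its own tag byte. That gap is what the proposed type is trying to close.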

The problem with just making a type with 256 niches is that niches of a type are bit-patterns that are definitely not values of the type, so in order to be able to have a value of the type, at least one bit-pattern needs to be one that is a value of the type. So, a 1-byte type with 256 niches is necessarily an uninhabited type because all 256 possible-memory-contents are “not me”.
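
The counting argument can be written down as a tiny sketch. Both function names are hypothetical, and the model only covers one-byte types.

```rust
/// Number of niches in a one-byte type that has `valid_patterns` valid bit
/// patterns. Returns `None` if the count cannot fit in one byte.
pub fn niches_in_byte(valid_patterns: u16) -> Option<u16> {
    256u16.checked_sub(valid_patterns)
}

/// A type is inhabited only if at least one bit pattern is a valid value.
pub fn is_inhabited(valid_patterns: u16) -> bool {
    valid_patterns > 0
}
```

Under this model `bool` (2 valid patterns) has 254 niches, a single-variant `#[repr(u8)]` enum has 255, and `u8` has none. Asking for 256 niches forces `valid_patterns` to 0, which is exactly the uninhabited case.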

The goal of this proposal, as I understand it, is to be able to express adding niche space to a containing struct, but you can’t do this with only the current niche system; you’d need the compiler to do something different, more like “use this space freely for the containing enum’s discriminant”, which isn’t making a concrete “is this type / is not this type” distinction, but one up to the containing enum. (And I think that breaks the rule that blindly copying size_of::<T>() bytes into a &mut T is OK, because you’d start overwriting those spaces. Niches work because if the value is a T then you know you can overwrite those spaces.)
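
Here is a minimal sketch of the copying rule mentioned in the parenthetical, using `NonZeroU8` as the payload. The helper names are invented for illustration.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;
use std::ptr;

/// Overwrites `*slot` by blindly copying `size_of::<T>()` bytes from `src`.
pub fn overwrite_bytes<T: Copy>(slot: &mut T, src: &T) {
    // SAFETY: both references are valid for `size_of::<T>()` bytes, a `&mut`
    // and a `&` cannot overlap, and `src` holds a valid `T`.
    unsafe {
        ptr::copy_nonoverlapping(
            (src as *const T).cast::<u8>(),
            (slot as *mut T).cast::<u8>(),
            size_of::<T>(),
        );
    }
}

/// Overwrites the payload of a `Some` through `&mut` and returns what the
/// enclosing `Option` holds afterwards.
pub fn overwrite_inside_option() -> Option<u8> {
    let mut opt = NonZeroU8::new(7);
    let replacement = NonZeroU8::new(200)?;
    if let Some(inner) = opt.as_mut() {
        overwrite_bytes(inner, &replacement);
    }
    opt.map(|n| n.get())
}
```

This works today because `None` lives in a bit pattern (0) that no valid `NonZeroU8` can have, so overwriting the payload bytes with a valid value can never flip the `Option` to `None`. If the enclosing enum were free to keep its discriminant in bytes that the payload does not own, a copy like this could clobber it.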

That's not possible. One key principle of uninitialized memory in Rust (and elsewhere) is that it's always okay to initialize uninit memory with some arbitrary value. It follows that every type that allows a given byte to be uninit, must allow it to have any value.

So 256 niche values is not possible.
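
`MaybeUninit<u8>` is the existing type that permits an uninit byte, so it makes a reasonable illustration. This is a sketch with made-up function names; the `Option` size is what current rustc gives, in my understanding.

```rust
use std::mem::{size_of, MaybeUninit};

/// Initializes an uninitialized byte with an arbitrary value and reads it back.
pub fn init_with(value: u8) -> u8 {
    let mut slot = MaybeUninit::<u8>::uninit();
    slot.write(value);
    // SAFETY: `slot` was initialized by the `write` call above.
    unsafe { slot.assume_init() }
}

/// Sizes of `MaybeUninit<u8>` and `Option<MaybeUninit<u8>>`.
pub fn maybe_uninit_sizes() -> (usize, usize) {
    (
        size_of::<MaybeUninit<u8>>(),
        size_of::<Option<MaybeUninit<u8>>>(),
    )
}
```

Any of the 256 values may be written into the slot, so none of them is free to act as a niche, and the enclosing `Option` has to add a tag byte.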