Apologies for further derailing — this thread was separated from another discussion by a moderator — but what’s the etymology for “niche” as used by Rust? Is it a term used outside of the Rust community (e.g., general programming language theory)?

# The term "niche" when referring to disallowed bit patterns and reusing them in option-like enums

The etymology is "@eddyb had to pick a name when implementing this". It's a Rusty term. I do not know of another language having a name for the same concept.

Let's seriously decide on something that's closer to being self-explanatory, like "invalid values" and maybe "invalid (value) reuse"?

As a native speaker of American English and the responsible editor for over 15.000 pages of IEC computer communications standards, and as a once mathematician, I prefer "niche".

Computer representations of information are all bit strings, usually of fixed length for a given elementary item of information. With analogy to other basic Rust types, we can denote an N-bit representation bit string as `bN`

. Various semantic mapping functions are used to map from the `bN`

representation **domain** to various semantic **ranges** such as `uN`

, `iN`

and `fpN`

.

The semantic mapping from a `bN`

representation to `uN`

is unipartite (i.e., one-part).

The semantic mapping from a `bN`

representation to `iN`

is usually unipartite, but it is possible to map the distinguished bit string `0b10…0`

, usually representing -2^{N-1}, to a distinguished value, such as `OVERFLOW`

, in which case the mapping would be bipartite (i.e., two part). In that case we could say that a *niche* has been carved out of the traditional two's-complement mapping to create a representation for that `OVERFLOW`

value.

Similarly the Rust `Option<T>`

type, when applied to a `NonZeroN<U>`

base type, reserves the representation domain `bN`

value `0b0…0`

for `None`

while the other representation `bN`

values are used to represent the `NonZeroN<U>`

base type. This is another bipartite semantic mapping.

The semantic mapping from a `bN`

representation to IEEE floating-point `fpN`

is five-part: normalized values, denormalized values, ±∞, quiet NaNs, and signaling NaNs. In this case the niche is not just for a singleton representation-domain value like the previous `OVERFLOW`

or `None`

, but entire subranges of the representation domain for the denormalized values and the two classes of NaNs.

@eddyb Please note that, by the above analysis, none of the values in any representation domain `bN`

are "invalid values"; that designation is logically incorrect. Rather, the representation **domain** has been partitioned into a multi-part semantic mapping to disjoint **ranges** of semantic concepts. To come back to use of the word "niche", I find it to be an appropriate shorthand way to refer to the existence of such multi-partite mappings, and in particular to the portions of the representation **domain** that have been carved out to map to the secondary (in some sense) semantic ranges.

@Tom-Phinney Some of what you are saying reminds be of what I wrote up recently in this PR, though our terminology is not entirely aligned. I think your "semantic mapping function" corresponds to my "value representation relation". A difference is that I define *one* semantic domain of values and then have *partial* relations for each type whereas you are carving out the semantic domain covered by each type separately.

I chose a relation as opposed to a function because sometimes you start with a representation and want a semantic value, and sometimes you need to go the other way. I wanted to avoid making one of the two directions the "primary" one, the way a function does. A relation is also strictly more general, e.g. I am not convinced that for unions in C, the mapping from representations to semantic values actually *is* functional. I sure hope we can make it functional in Rust, but I think it is still useful to have the terminology to talk about the more general case as well.

Also I think the relation is somewhat easier to define (see my PR for a few examples); using a function would IMO only make sense if we define it in a way that actually *computes* the semantic value from the concrete representation. In contrast, I am stating the relation in a more propositional, less computational style, making it shorter and easier to read and understand. But of course then one has to be careful not to accidentally relation one representation to two values.

*If* the relation is functional (in the representation -> value direction), it could equivalently be viewed as a partial function from representations to semantic values, and that's your semantic mapping function. And then if you consider the subset of the domain of *all* values that can actually be hit by this function, that's your semantic range for a particular type.

I am not familiar with the "unipartite"/"bipartite"/"N-part" terminology you are using. I know about bipartite graphs, but I don't think "unipartite graphs" are a thing and anyway you are talking about a function or relation, not a graph.

I agree though that "invalid value" is a bad choice, but for a different reason: "value" refers to the "high-level" view whereas niches are inherently about the "low-level" view -- so, in the context of my PR, a better name would be "invalid byte lists", aka byte lists that are not related to any value, or "invalid representations" if one wants to be less concrete.

I used the term "mapping" in a mathematical sense, because the projection from the representation **domain** to the semantic **range**, or potentially set of disjoint semantic **ranges**, is neither **1-to-1** nor **onto**. This is particularly apparent when considering mappings between `bN`

and the disjoint semantic range of `fpN`

, which I view as the real number field ℝ augmented by +∞, -∞, and the sets of quiet NANs and signaling NaNs. The mapping is not **1-to-1** because there are two `bN`

representations for zero, +0 (`0b00…0`

) and -0 (`0b10…0`

). It is not **onto** because there are an infinite number of values in the real number field ℝ that have no exact representation in `bN`

.

I consider the semantic ranges **disjoint** because the sets of quiet NaNs and signaling NaNs and the values +∞ and -∞ are not part of the real number field ℝ. While the values +∞ and -∞ can be considered to be "extensions" of that field that are not, in some sense, "disjoint", that logic cannot be applied to the sets of quiet NaNs and signaling NaNs. The fact that `fpN`

in Rust does not satisfy the trait `PartialOrd`

, because NaNs cannot be compared in an ordinal way, clearly demonstrates their disjointness from the real number field ℝ even when it is augmented by ±∞.

I feel that it is necessary to reach some degree of common understanding here before bikeshedding specific terminology such as **relation**. Also please note that I used the term "function" only once, to assist in the explanation of my "semantic mapping" concept. Had I known that word would derail understanding I would have found a different word.

Okay, so it is neither surjective nor injective. (I have never heard the term "onto" before, probably because I learned all of this in German and surjective is more commonly used in research. And WTF, why does "1-to-1" mean "injective"? I was sure it would mean "bijective".)

But, that still makes it a *function* in the mathematical sense (for every one input there is only one possible output). That also seems to be the most common way to refer to this in English, at least Wikipedia talks about "(non-)injective *functions*" and "(non-)surjective *functions*", not "*mappings*".

So, if I may rephrase, what you are saying is that the semantic range is in some cases actually defined as the disjoint union of several pieces. Sure, but that's a detail of how *this particular* semantic range is defined; the general notion of a semantic range need not be concerned with that.

And there is not even a unique way to talk about the "partitions" of a "semantic range", e.g. the integers can be seen as "one thing" or as a disjoint union of three sets (strictly positive numbers, 0, strictly negative numbers). I see this as a meaningless distinction, what we care about is that there's a set of all integers and we can map lists of bytes into it; we do not care about the internal structure of that set.

And I fail to see how this relates to niches.

It seems I was unclear, I was not confused by the "function" part at all. I deal with functions all the time (though usually in type theory when I formalize my proofs in Coq ), but I rarely encounter the term "mapping" as a noun (I *do* speak about a function "mapping" things from somewhere to somewhere; the verb is fine, just the noun is weird for me). So just to be clear, I mean this notion of function, which is actually defined as a particular kind of relation.

So let me give another shot at saying what I was already trying to say last time.

Where our approaches differ is in how we view the connection between values and their representation:

- I view it primary as a
*relation*answering the question "are this semantic value and this concrete representation related". So I assume to be given both of them and then I give a yes/no answer. - If I understood correctly, you view it primarily as a
*function*answering the question "given a concrete representation, tell me the value it represents". Implicitly, your approach assumes there is a*unique*such value (otherwise it wouldn't be a function). And I suppose you also (implicitly) meant the function to be*partial*because not all representations represent a value (such as`0b11111111`

when thinking of`bool`

).

The information content is the same in both cases (at least when the value-representation-connection *can* be cast as a function; non-functional relations are conceivable here and they would not fit your view).

I believe that we are in agreement here. I wrote my initial post above in between afternoon concerts at a music festival, so did so without much attention to nomenclature. I don't actually care what terminology we end up choosing to describe these partitioned mappings. I avoided the term "function" (or at least tried to) because in computers functions are one-way, whereas mappings can be two-way. Since most readers of this forum are presumably not mathematicians, I hoped my avoidance of "function" would avoid confusion.

My post was triggered in large part by eddyb's statement about "invalid values". I thought that clarity was needed for this topic, and that in my view the issue was one of recognizing that "niches" resulted from partitioning both the domain and the corresponding range of the underlying mathematical "function".

**injective** simply requires that distinct elements in the domain map to distinct elements of the range.
**bijective** requires that there be a reverse mapping from the range to the domain and that both be **injective**. For the integers the mapping `f(x) = x * 2`

is injective but not bijective, because the range does not include the odd integers even though the domain includes all the integers.

I think "invalid value" is a bit too broad, because (for example) `42u32`

is a perfectly valid `u32`

but potentially an "invalid" value in certain contexts. It would be nice if the term conveyed this is more about bit representations rather than high-level invariants.

How about "bit-restricted"? So you have bit-restricted types and bit-restricted values. And the compiler/optimizer can exploit these bit-restrictions.

It's longer than "niche" and "invalid" but I think it's more self explanatory.

I still don't get this angle. A partition of a set (and many other things) is sorting all its elements into disjoint subsets *that cover the whole set*. But for the purposes we use the term "niche" for in Rust, the important bit is not that you can partition the elements which are there, but that some elements "aren't there", i.e. there is a "hole" in the set of representations -- bit strings which don't correspond to any mathematical value.

In your first post, you give the example of (I'll specialize for concretness) `Option<NonZeroU32>`

, partitioning the set of length-32 bit strings into `00...00`

(mapped to `None`

) and all the rest (mapped to `Some(n)`

for various `n`

). But niches, or whatever we want to call them, exist and are of interest no matter how or whether they are used elsewhere.

`NonZeroU32`

has, on its own, `00...00`

available as a niche (because it is not a valid bit string for that type). This is true even if nothing in the whole universe ever uses that niche -- for a more realistic example, it's quite unlikely anyone has ever compiled a Rust program that exhausts the entire niche of `char`

, which encompasses over four billion distinct bit strings.

Furthermore, it would be wrong to call `00...00`

a niche for the type `Option<NonZeroU32>`

. That type has no niches, every length-32 bit string represents a value of that type. When a previously invalid bit pattern is consumed for layout optimization, it *ceases to be a niche/invalid* (in the context of the type where it is used that way).

Your above points demonstrate why I tried to present this issue as one of partitioning the representation space `bN`

into a disjoint mapping – a multi-entry `match`

statement or `enum`

, if you wish. Perhaps others can explain/explore this subject better than I can.

I agree with this -- the niche is a property of the type *before* it has been wrapped by `Option`

/whatever to fill that niche. This fits well with the classic definition of niche as a recess in a wall for ornamental objects. Useful synonyms in this sense might be "gap" or "hole", though less colorful.

`0000…0`

and `1000…0`

are not two bit representations of the same value; +0 and −0 are actually distinct floating-point values, even though they compare equal. There are API-surface-level operations you can perform on them that expose this difference, like (+1) ÷ (−0) = −∞. (If you insist, you can consider floating-point numbers to form a bug-eyed line.)

The graph of the successor/predecessor relation provides you with a notion of connectedness that makes the set of all integers 'a single piece', but the set of non-zero integers 'two pieces', even though the set itself is discrete. A topology on a set (like one on ℝ ∪ { −∞, +∞ }) can likewise create a meaningful distinction between connected and non-connected subsets. And let's not forget coarse geometry, which can generalise connectedness as defined on graphs.

However, I agree that talking about the topology of bit representations is a distraction that only serves to perpetuate the confusion between bitstrings and values they represent. It's better to just consider them arbitrary, opaque labels with no structure other than identity. Besides, the successor relation of integers is not the only one or even the most natural one that might be imposed on bitstrings. One might just as well consider the Hamming-distance graph (edge between any two bitstrings of Hamming distance 1), or interpret a bitstring of 2*n* bits as a pair of *n*-bit integers and consider the adjacency relation on a 2D coordinate grid; both of which keep the set 'connected' after removing the all-zero bitstring.

Also, sometimes the abstract semantic space has no topology to speak of either. Consider this relation: `00`

↦ , `01`

↦ , `10`

↦ , `11`

↦ . Is it a 'unipartite' or a 'quadripartite' mapping? Would it be if I shuffled the suits?

I see. With my type theory hat on I would say functions are one-way even in mathematics, but relations can be viewed both ways... but we are digressing.

This is the point I am still not getting. Or maybe it's just terminology again. The way I view it, your function will simply not be defined on the bit pattern `0b11111111`

for type `bool`

-- so the niche would be characterized as being *outside* the domain of the function. (This is the same point @rkruppe made.) And I don't think the function's range has anything to do with it.

The maximal possible niche consists of those bit patterns that do not represent a "semantic value". The niche used by the compiler is some approximation of that (since exactly representing the maximal possible niche can be tricky). Usually the latter is what we refer to as "niche", and the former as "invalid values" or "values not satisfying the validity invariant".

Oh, so maybe *that* is the confusion here? @Tom-Phinney did you consider `0b00000000`

a niche of `Option<NonZeroU8>`

? That would be wrong, this is *not* what niche means. `0b00000000`

*is* a valid bitstring for that type. That type has no niche.

This is related to our ongoing struggle of finding good names for the "safety invariant" and the "validity invariant", as I called them in this blog post.

We could go deep into topology here but I don't see how that has any connection to Rust any more. The successor/predecessor relation is a rather arbitrary choice here. (So basically, I agree with what you are saying later.)

What is "them" here, the bit representations or the values or both?

I meant the former, but sometimes it's the right choice for values themselves as well (like in the card suits example). There's a reason `Ord`

is not an auto trait.

They are distinct in the wierd space defined by the `fpN`

encodings, but not in the extended real number field ℝ ∪ { −∞, +∞ } that they are supposed to represent.

This is why I used the term "multi-partite". If we consider my "mapping" as a Rust `match`

expression on the value of the `bN`

representation, your "function" corresponds to exactly one arm of the `match`

while my "mapping" is for the entire multi-arm `match`

, where the **domain** of the `match`

spans the entire range of possible values of the `bN`

representation.

`0b00000000`

is the only niche in the representation of `NonZeroU8`

. `Option<NonZeroU8>`

exploits that niche to provide an encoding for `None`

. Thus the representation of `Option<NonZeroU8>`

has no niches.

I'm a bit late to this party, but it's not clear to my why "niche" is a big deal? It provides good motivation for why we care about such representations (i.e., we want to stick data into an unused bit) and is, ultimately, specialized vocabulary that only the small minority that writes unsafe code will ultimately care about.

It's vocabulary, similar to the sense that "definition" and "declaration" are frequently used interchangeably, but are distinct terms in C++ standardsese. Of all the Rust vocabulary out there, I argue that this is a case where it *does not* matter.

This is why I used the term "multi-partite". If we consider my "mapping" as a Rust

`match`

expression on the value of the`bN`

representation, your "function" corresponds to exactly one arm of the`match`

while my "mapping" is for the entire multi-arm`match`

, where thedomainof the`match`

spans the entire range of possible values of the`bN`

representation.

But there *is* no value represented by `0b11111111`

as `bool`

. So what does the mapping output for that input?

With analogy to IEEE Floating Point, I classify all such representation values as "NotAValue" (`NaV`

). For `NonZeroU8`

the `b8`

representation `0b00000000`

is `NaV`

. For `bool`

all `b8`

representation values except `0b00000000`

and `0b00000001`

are `NaV`

. In my view, a representation value that is `NaV`

implies instant UB.

An `enum`

that incorporates the base type in one variant should be able to use one or more of those representation values that are `NaV`

in the base type as niches to provide representations for disjoint variants. In that case the enum representation could be the same size as the representation of the base type; it simply would have fewer representation values that are `NaV`

than the base type has. `Option`

is the most common `enum`

type in Rust that exploits such a niche to achieve same-size representation.

I'm a bit late to this party, but it's not clear to my why "niche" is a big deal? It provides good motivation for why we care about such representations (i.e., we want to stick data into an unused bit) and is, ultimately, specialized vocabulary that only the small minority that writes unsafe code will ultimately care about.

Not necessarily, you'd also want to use this to explain to beginner-to-intermediate users why e.g. `Option<&T>`

is a zero-cost abstraction.