Equality of ZST values

In theory, given a ZST, are two values of that ZST always equal? I would tend to say yes, because otherwise, it makes implementations such as Vec potentially wrong (there again, in a very theoretical framework).

But is there any use case in the wild of someone using ZSTs where it is not acceptable to consider all elements of that type to be equal?

Trying to formulate it slightly differently: Is there a sound constant T::VALUE which exists for ZST types (i.e. iff T::IS_ZST is true), and could I theoretically implement it in Rust as a compiler primitive?

Those aren’t quite the same question, since a ZST can implement Eq but not expose any constructors (to limit the circumstances where it’s constructed), and it can also provide constructors but not implement Eq/PartialEq. The representation of a ZST value at run time is completely absent, so in a sense the literal answer to your question is std::mem::transmute(()), but I’m not sure that’s what you actually wanted.

What are you planning to do with this information?

Yes I think answering that question would answer my question, or would even be stronger than my question (because it'd be saying all ZST representations are equal): is this function sound?

pub fn value<T>() -> T {
   if T::IS_ZST {
      unsafe { std::mem::transmute(()) }
   else {
      panic!("not a ZST")
   }
}

The problem is arising when attempting formal verification of std's implementation of Vec, and would require some background into techniques that exist today for that. Basically, it comes down to: if I successfully push a value into a vector, and then pop that value from the same vector, is it guaranteed to be the same value? I want to think that any sane specification of vec would ensure that, but it would not be ensured if there is not a unique value for a ZST.

PS: I'm using "value" in the exact same sense as that used in the UCGs or MiniRust

2 Likes

So the construction thing really is relevant, because ZSTs are sometimes used as “permissions” based on their generic parameters. Equal representation (trivially) does not automatically make Copy and Default safe because of how the value is used. Here’s an example:

You shouldn’t be able to make a GhostToken value just because you know the type.

7 Likes

Right, so the transmute here is unsound because of that. Although as I said, the question about the transmute is stronger than what I really want, because the value can be different between each type, and GhostCell<T> can have a different "unique value" from GhostCell<U>

But it does raise the question: if I push then pop a GhostCell from a vector, can I still use it? Supposedly, yes! And because nothing is actually written and read from memory, it does mean that all GhostCell (instantiated with the same generic arguments) should be "equal", since the value "created" by pop is still a valid value of this type

2 Likes

I would say they are indistinguishable, or that their bit representations are equal.

A PartialEq impl that always returns false would be valid, you get that for f32::NAN after all.

I don't think I am talking about partialEq which is a function that has an implementation which can be independent from the value itself.

I am talking about the ability to effectively substitute any value of a ZST with another value of the same ZST and asserting that this operation is ALWAYS sound and correct, and does not change the behaviour of the program, for all ZST, and all programs surrounding the operation

It may be better to use the term "matches" instead, as in match. Two ZSTs of the same type will always match identically.

What kind of operation are you thinking of that would not be trivially correct (nor "library UB" as mentioned above/below)? Obviously if you push a value on to a Vec<Zst> and then pop a Zst off again that is fine (you can do it in entirely safe code, after all). But are they "the same" value? How would you know? If you have two references to the same ZST you can compare them for pointer equality (in safe code) and therefore have some sense of equality that says they are the same or different. But does that make them "another value"?

If you aren't talking about (Partial)Eq, then the question seems ill-defined.

No.

Most trivially, ! is also a ZST.

More interestingly, see https://github.com/rust-lang/rust/pull/95385/files#diff-02049689ae3bb7ac98af3418ad8fdca219bfa1227cb6cad31afc72bdbae30e27R681-R717 for a bunch of text I wrote in the past about this.

Basically, inhabited ZSTs have no validity (language) invariant -- all possible values containing no bytes are vacuously valid -- but they can have a safety (library) invariant that makes conjuring them from nothing unsound.

11 Likes

Are you essentially asking whether a caller of foo below can reliably determine if x and y are swapped? I think the answer is "no," because ZSTs have exactly one value, making their bit representations indistinguishable.

extern "Rust" { fn rand_bool() -> bool; }

pub fn foo(mut x: ZST, mut y: ZST) -> (ZST, ZST) {
    if unsafe { rand_bool() } {
        std::mem::swap(&mut x, &mut y);
    }
    (x, y)
}
3 Likes

The following code is safe for any T

struct ZstFactory<T> {
    _lock: PhantomData<T>,
}

impl<T: 'static> ZstFactory<T> {
    fn new(t: T) -> Self {
        mem::forget(t);

        Self {
            _lock: PhantomData,
        }
    }
    fn get(self) -> T {
        unsafe {
            assert!(mem::size_of::<T>() == 0);

            ptr::read(mem::align_of::<T>() as *const T)
        }
    }
    fn get_copy(&self) -> T where T: Copy {
        unsafe {
            assert!(mem::size_of::<T>() == 0);

            ptr::read(mem::align_of::<T>() as *const T)
        }
    }
}

struct Foo;

fn use_foo(_: Foo) { .. }

fn a() {
    let f = Foo;

    use_foo(f);
}

fn b() {
    let f = Foo;

    let factory = ZstFactory::new(f);

    let new_f = factory.get()

    use_foo(new_f);
}

The functions a and b will behave identically regardless what other invariants Foo has or what use_foo does. In fn b the usage of ZstFactory doesn't have any effect on the programs behavior.

2 Likes

Phrasing it another way, because ZST values are zero bytes, their value representation is indistinguishable. There is only one possible sequence of zero bytes. While abstract bytes are more than just physical bytes, an abstract value is just the sequence of abstract bytes (as far as the language cares; library APIs can attach non-representational semantics to types they define).

The only way to distinguish two ZST values thus is by their address. If you have unpinned access to a value (i.e. you have T by value, you have &mut T, and/or T: Unpin), then that value is promised to be address insensitive (although it could become address sensitive after being pinned).

The thing that Vec or similar types need to be cognizant of w.r.t. ZSTs is to avoid conjuring them out of thin air. But it's perfectly valid to stash ZST values in and retrieve them from thin air. And the act of stashing/retrieving the value can be entirely implicit, since ZST reads/writes are no-ops (though it wouldn't be best practice).

7 Likes

Great! That is basically what I had in mind in the first place, but I have been surprised in the past by interesting use cases so I thought I'd ask.

For more context, in my case, the goal is to prove safety invariants, and I am indeed only interested in knowing if I can conjure a ZST out of thin air if I only care about its validity invariant, i.e. it's the burden of my proof to manipulate the safety invariant.

Thank you everyone!

Not disagreeing with anything stated, but I find that the can in your statement is carrying some weight. Particularly i'm considering the case of NonNull<()>, pulled out of thin air. Such as this one in the standard lib https://github.com/rust-lang/rust/blob/7adc89b69b941afceadcf8609dd6b2999353e550/library/std/src/io/error/repr_bitpacked.rs#L65-L68

While your statement makes perfect sense to me, the only way I can comport these two cases is subverting the can by noticing that () in particular is ambiently constructable negating it's ability for a safety invariant.

So, my feeling is that there is some precondition which allows a ZST to be conjured from thin air, IS_ZST is not it, but it holds for some ZSTs without safety invariants like ()?

Another thing one could argue that a reference to a ZST is not itself a ZST, I don't think that has legs since () is copy. Anyhow we should make sure the rules don't preclude this NonNull case on what appears to me to be the other half of the dichotomy...

I don't see how ptr::NonNull<()> is relevant, at all. You can cast any integer to a (usually dangling) pointer, and the pointee type of a pointer doesn't matter until you dereference it (or offset in units of pointee size).

.... But pointers to ZST aren't zero-sized, they're pointer-sized? What are you trying to get at here?

Also, ptr::NonNull is a pointer, not a reference. Conjuring a reference to sized type at an arbitrary address is UB[1], but a pointer isn't; ptr::NonNull::<String>::dangling is even a safe function.


  1. exact rules of exactly when reference dereferencability is asserted subject to change ↩︎

1 Like

Sorry about mixing terminology, my bad and it is especially relevant in this case. My recollection about the unsafety with regards to construction and dereferencing was backwards (For some reason I thought construction was unsafe, but deref was safe, and you are correct it is the other way around.) Thanks for correcting me.

The PR prompted me to finally make an ACP for my conjure_zst function:

5 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.