Equality of ZST values

giltho · November 6, 2023, 3:58pm

In theory, given a ZST, are two values of that ZST always equal? I would tend to say yes, because otherwise, it makes implementations such as Vec potentially wrong (there again, in a very theoretical framework).

But is there any use case in the wild of someone using ZSTs where it is not acceptable to consider all elements of that type to be equal?

Trying to formulate it slightly differently: Is there a sound constant T::VALUE which exists for ZST types (i.e. iff T::IS_ZST is true), and could I theoretically implement it in Rust as a compiler primitive?

jrose · November 6, 2023, 4:07pm

Those aren’t quite the same question, since a ZST can implement Eq but not expose any constructors (to limit the circumstances where it’s constructed), and it can also provide constructors but not implement Eq/PartialEq. The representation of a ZST value at run time is completely absent, so in a sense the literal answer to your question is std::mem::transmute(()), but I’m not sure that’s what you actually wanted.

What are you planning to do with this information?

giltho · November 6, 2023, 4:31pm

Yes I think answering that question would answer my question, or would even be stronger than my question (because it'd be saying all ZST representations are equal): is this function sound?

pub fn value<T>() -> T {
   if T::IS_ZST {
      unsafe { std::mem::transmute(()) }
   else {
      panic!("not a ZST")
   }
}

The problem is arising when attempting formal verification of std's implementation of Vec, and would require some background into techniques that exist today for that. Basically, it comes down to: if I successfully push a value into a vector, and then pop that value from the same vector, is it guaranteed to be the same value? I want to think that any sane specification of vec would ensure that, but it would not be ensured if there is not a unique value for a ZST.

PS: I'm using "value" in the exact same sense as that used in the UCGs or MiniRust

jrose · November 6, 2023, 4:47pm

So the construction thing really is relevant, because ZSTs are sometimes used as “permissions” based on their generic parameters. Equal representation (trivially) does not automatically make Copy and Default safe because of how the value is used. Here’s an example:

You shouldn’t be able to make a GhostToken value just because you know the type.

giltho · November 6, 2023, 4:57pm

Right, so the transmute here is unsound because of that. Although as I said, the question about the transmute is stronger than what I really want, because the value can be different between each type, and GhostCell<T> can have a different "unique value" from GhostCell<U>

But it does raise the question: if I push then pop a GhostCell from a vector, can I still use it? Supposedly, yes! And because nothing is actually written and read from memory, it does mean that all GhostCell (instantiated with the same generic arguments) should be "equal", since the value "created" by pop is still a valid value of this type

SkiFire13 · November 6, 2023, 5:37pm

I would say they are indistinguishable, or that their bit representations are equal.

the8472 · November 6, 2023, 6:01pm

A PartialEq impl that always returns false would be valid, you get that for f32::NAN after all.

giltho · November 6, 2023, 6:07pm

I don't think I am talking about partialEq which is a function that has an implementation which can be independent from the value itself.

I am talking about the ability to effectively substitute any value of a ZST with another value of the same ZST and asserting that this operation is ALWAYS sound and correct, and does not change the behaviour of the program, for all ZST, and all programs surrounding the operation

pitaj · November 6, 2023, 6:33pm

It may be better to use the term "matches" instead, as in match. Two ZSTs of the same type will always match identically.

trentj · November 6, 2023, 7:18pm

What kind of operation are you thinking of that would not be trivially correct (nor "library UB" as mentioned above/below)? Obviously if you push a value on to a Vec<Zst> and then pop a Zst off again that is fine (you can do it in entirely safe code, after all). But are they "the same" value? How would you know? If you have two references to the same ZST you can compare them for pointer equality (in safe code) and therefore have some sense of equality that says they are the same or different. But does that make them "another value"?

If you aren't talking about (Partial)Eq, then the question seems ill-defined.

scottmcm · November 6, 2023, 7:31pm

No.

Most trivially, ! is also a ZST.

More interestingly, see https://github.com/rust-lang/rust/pull/95385/files#diff-02049689ae3bb7ac98af3418ad8fdca219bfa1227cb6cad31afc72bdbae30e27R681-R717 for a bunch of text I wrote in the past about this.

Basically, inhabited ZSTs have no validity (language) invariant -- all possible values containing no bytes are vacuously valid -- but they can have a safety (library) invariant that makes conjuring them from nothing unsound.

ryanavella · November 6, 2023, 7:31pm

Are you essentially asking whether a caller of foo below can reliably determine if x and y are swapped? I think the answer is "no," because ZSTs have exactly one value, making their bit representations indistinguishable.

extern "Rust" { fn rand_bool() -> bool; }

pub fn foo(mut x: ZST, mut y: ZST) -> (ZST, ZST) {
    if unsafe { rand_bool() } {
        std::mem::swap(&mut x, &mut y);
    }
    (x, y)
}

Mokuz · November 6, 2023, 7:35pm

The following code is safe for any T

struct ZstFactory<T> {
    _lock: PhantomData<T>,
}

impl<T: 'static> ZstFactory<T> {
    fn new(t: T) -> Self {
        mem::forget(t);

        Self {
            _lock: PhantomData,
        }
    }
    fn get(self) -> T {
        unsafe {
            assert!(mem::size_of::<T>() == 0);

            ptr::read(mem::align_of::<T>() as *const T)
        }
    }
    fn get_copy(&self) -> T where T: Copy {
        unsafe {
            assert!(mem::size_of::<T>() == 0);

            ptr::read(mem::align_of::<T>() as *const T)
        }
    }
}

struct Foo;

fn use_foo(_: Foo) { .. }

fn a() {
    let f = Foo;

    use_foo(f);
}

fn b() {
    let f = Foo;

    let factory = ZstFactory::new(f);

    let new_f = factory.get()

    use_foo(new_f);
}

The functions a and b will behave identically regardless what other invariants Foo has or what use_foo does. In fn b the usage of ZstFactory doesn't have any effect on the programs behavior.

CAD97 · November 7, 2023, 12:33am

Phrasing it another way, because ZST values are zero bytes, their value representation is indistinguishable. There is only one possible sequence of zero bytes. While abstract bytes are more than just physical bytes, an abstract value is just the sequence of abstract bytes (as far as the language cares; library APIs can attach non-representational semantics to types they define).

The only way to distinguish two ZST values thus is by their address. If you have unpinned access to a value (i.e. you have T by value, you have &mut T, and/or T: Unpin), then that value is promised to be address insensitive (although it could become address sensitive after being pinned).

The thing that Vec or similar types need to be cognizant of w.r.t. ZSTs is to avoid conjuring them out of thin air. But it's perfectly valid to stash ZST values in and retrieve them from thin air. And the act of stashing/retrieving the value can be entirely implicit, since ZST reads/writes are no-ops (though it wouldn't be best practice).

giltho · November 7, 2023, 2:12pm

Great! That is basically what I had in mind in the first place, but I have been surprised in the past by interesting use cases so I thought I'd ask.

For more context, in my case, the goal is to prove safety invariants, and I am indeed only interested in knowing if I can conjure a ZST out of thin air if I only care about its validity invariant, i.e. it's the burden of my proof to manipulate the safety invariant.

Thank you everyone!

ratmice · November 7, 2023, 11:34pm

Not disagreeing with anything stated, but I find that the can in your statement is carrying some weight. Particularly i'm considering the case of NonNull<()>, pulled out of thin air. Such as this one in the standard lib https://github.com/rust-lang/rust/blob/7adc89b69b941afceadcf8609dd6b2999353e550/library/std/src/io/error/repr_bitpacked.rs#L65-L68

While your statement makes perfect sense to me, the only way I can comport these two cases is subverting the can by noticing that () in particular is ambiently constructable negating it's ability for a safety invariant.

So, my feeling is that there is some precondition which allows a ZST to be conjured from thin air, IS_ZST is not it, but it holds for some ZSTs without safety invariants like ()?

Another thing one could argue that a reference to a ZST is not itself a ZST, I don't think that has legs since () is copy. Anyhow we should make sure the rules don't preclude this NonNull case on what appears to me to be the other half of the dichotomy...

CAD97 · November 7, 2023, 11:58pm

I don't see how ptr::NonNull<()> is relevant, at all. You can cast any integer to a (usually dangling) pointer, and the pointee type of a pointer doesn't matter until you dereference it (or offset in units of pointee size).

.... But pointers to ZST aren't zero-sized, they're pointer-sized? What are you trying to get at here?

Also, ptr::NonNull is a pointer, not a reference. Conjuring a reference to sized type at an arbitrary address is UB^[1], but a pointer isn't; ptr::NonNull::<String>::dangling is even a safe function.

exact rules of exactly when reference dereferencability is asserted subject to change ↩︎

ratmice · November 8, 2023, 12:18am

Sorry about mixing terminology, my bad and it is especially relevant in this case. My recollection about the unsafety with regards to construction and dereferencing was backwards (For some reason I thought construction was unsafe, but deref was safe, and you are correct it is the other way around.) Thanks for correcting me.

scottmcm · November 8, 2023, 1:26am

The PR prompted me to finally make an ACP for my conjure_zst function:

github.com/rust-lang/libs-team

Add an `unsafe fn` for creating a ZST

opened 07:05PM - 07 Nov 23 UTC

scottmcm

T-libs-api api-change-proposal

# Proposal ## Problem statement People are often unsure about what the rul…es are for creating a ZST. See, for example, <https://internals.rust-lang.org/t/equality-of-zst-values/19815/3?u=scottmcm>, where the person made an unsound function. (Thankfully they made the thread to ask about it, rather than just assuming it was fine!) There's lots of ways that you could make a ZST: `mem::uninitialized::<Z>()`, `mem::zeroed::<Z>()`, `MaybeUninit::<Z>::uninit().assume_init()`, `*NonNull::<Z>::dangling().as_ptr()`, `mem::transmute_unchecked::<_, Z>(())`, etc. It would be nice to have one obvious API to point people at that would also serve as a place to put documentation about the safety and validity invariants of ZSTs, and thus the necessary preconditions for calling the method. ## Motivating examples or use cases `unsafe` code often has different paths for ZSTs where it wants to make items, such as https://github.com/rust-lang/rust/blob/9bd71afb90c2a6e0348cdd4a2b10a3bf39908f19/library/alloc/src/vec/into_iter.rs#L196-L197 This can show up in const generic code making arrays, too, where in an `if N == 0 { return []; }` doesn't work, so a "make a ZST" API could be nice to use there too: https://github.com/rust-lang/rust/blob/4654a910018cf0447db1edb2a46a0cae5f7dff8e/library/core/src/array/mod.rs#L801-L804 ## Solution sketch ```rust mod mem { pub unsafe fn conjure_zst<T>() -> T; } ``` The implementation of which would use (directly or indirectly) things like <https://doc.rust-lang.org/nightly/std/intrinsics/fn.assert_inhabited.html> to give good error messages for hitting `conjure_zst::<!>()`. And it would probably have a runtime trap (`ud2`) for it not actually being a ZST, since that's checkable at monomorphization time, and thus would be cheap in practice. The docs would have text like https://github.com/rust-lang/rust/pull/95385/files#diff-02049689ae3bb7ac98af3418ad8fdca219bfa1227cb6cad31afc72bdbae30e27R681-R717 ## Alternatives - Enforce ZST-ness with a post-mono error, like a `const { assert!(T::IS_ZST) }`. I didn't pick that one because that would keep it from being used in `if T::IS_ZST { ... }` blocks, like the array example. - Do nothing, since there's already lots of ways to make ZSTs. - Also add reference versions to get `&'static mut Z` and such. ## Links and related work A few more threads I found: https://internals.rust-lang.org/t/is-synthesizing-zero-sized-values-safe/11506?u=scottmcm https://internals.rust-lang.org/t/static-mutable-references-to-zsts/18330?u=scottmcm ## What happens now? This issue contains an API change proposal (or ACP) and is part of the libs-api team [feature lifecycle]. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months. [feature lifecycle]: https://std-dev-guide.rust-lang.org/development/feature-lifecycle.html ## Possible responses The libs team may respond in various different ways. First, the team will consider the *problem* (this doesn't require any concrete solution or alternatives to have been proposed): - We think this problem seems worth solving, and the standard library might be the right place to solve it. - We think that this probably doesn't belong in the standard library. Second, if there's a concrete solution: - We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.) - We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

system · February 6, 2024, 1:27am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Idea: Reference equality trait language design	5	552	April 19, 2023
[Solved] Trait object vtables and the Any trait Unsafe Code Guidelines	3	1239	May 2, 2019
[Idea] Refining const generic types language design	7	833	October 28, 2020
[Feature request] is_bitwise_copyable<T>() libs	20	1045	September 25, 2022
Sharing for a lifetime language design	18	3236	March 25, 2019

Equality of ZST values

Related Topics