How do exclusive references to zero-sized types work?

This is mostly for my own curiosity, but how are exclusive references to zero-sized types handled?

The following program

struct S;

fn main() {
    let a: &S = &S;
    let b: &S = &S;
    let c: &S = &S;
    eprintln!("{:?}", a as *const S);
    eprintln!("{:?}", b as *const S);
    eprintln!("{:?}", c as *const S);
    eprintln!();
    let a: &mut S = &mut S;
    let b: &mut S = &mut S;
    let c: &mut S = &mut S;
    eprintln!("{:?}", a as *mut S);
    eprintln!("{:?}", b as *mut S);
    eprintln!("{:?}", c as *mut S);
}

prints

0x560c8b7acd40
0x560c8b7acd40
0x560c8b7acd40

0x7ffd1d09a4b0
0x7ffd1d09a4b8
0x7ffd1d09a4c0

How does the compiler determine the "random" address to use for the reference? And how does it ensure that each &mut is distinct?

Running in release mode on the playground gives

0x556a5d0f6030
0x556a5d0f6030
0x556a5d0f6030

0x7ffe9a6ffa68
0x7ffe9a6ffa68
0x7ffe9a6ffa68

I believe the 0x5… addresses point into the .rodata section of the binary, because the S value behind each & is promoted to static storage. The 0x7… addresses point into the stack, because promotion doesn't occur for &mut S. For some reason, in debug mode LLVM allocates 8 bytes on the stack for each value, maybe to improve debuggability, while in release mode it coalesces them all into a single stack allocation (maybe of 0 bytes; I would have to inspect the assembly to verify).
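To illustrate the promotion part, here is a minimal sketch. The function name `promoted` is made up for the example. The shared borrow can be returned as `&'static S` because the constant expression `S` is promoted to static storage, while the exclusive version (left commented out) is, as far as I know, rejected because `&mut S` borrows a temporary.

```rust
struct S;

// Compiles: `&S` borrows a constant expression, so the value is promoted
// to static storage and the reference can live for `'static`.
fn promoted() -> &'static S {
    &S
}

// The exclusive counterpart is not promoted. Uncommenting this is expected
// to fail, because `&mut S` would borrow a temporary owned by the function:
//
// fn not_promoted() -> &'static mut S {
//     &mut S
// }

fn main() {
    let r: &'static S = promoted();
    // The exact address is an implementation detail and may vary per build.
    println!("{:p}", r);
}
```

This matches the observation that all three & addresses agree while the &mut addresses point at stack slots.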

Wow, so the following program prints different things depending on --release:

struct S;
fn main() {
    let a = &mut S;
    let b = &mut S;
    println!("{}", std::ptr::eq(a, b))
}

That is ... interesting.

There's no requirement that the addresses be distinct, only that the memory they refer to must not overlap. References to ZSTs refer to no memory.

It's common when implementing memory allocation in Rust to "allocate" a ZST by using the type's alignment as its address, which is guaranteed to be a well-aligned, non-null value.
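A small sketch of that convention, using `std::ptr::NonNull::dangling()`, which is documented to return a dangling but well-aligned pointer. The `Aligned` type and the `dangling_addr` helper are made up for the example. The code only relies on the address being non-null and aligned; whether it equals the alignment exactly is an implementation detail.

```rust
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

struct S;

// Hypothetical ZST with a larger alignment, for comparison.
#[repr(align(16))]
struct Aligned;

/// Address of a dangling, well-aligned pointer for `T`.
fn dangling_addr<T>() -> usize {
    NonNull::<T>::dangling().as_ptr() as usize
}

fn main() {
    assert_eq!(size_of::<S>(), 0);
    assert_eq!(size_of::<Aligned>(), 0);

    println!("S       (align {:>2}): {:#x}", align_of::<S>(), dangling_addr::<S>());
    println!("Aligned (align {:>2}): {:#x}", align_of::<Aligned>(), dangling_addr::<Aligned>());

    // Boxing a ZST does not need any real memory either; the printed address
    // is whatever placeholder the implementation picks.
    let boxed = Box::new(S);
    println!("Box<S>: {:p}", boxed);
}
```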

I suspect they're just empty stack allocations, interleaved with the allocations for the reference variables themselves. You could print &a as *const _ etc. to confirm. Then in release, those references are probably just in registers, never on the stack at all -- but printing their address may disrupt that.
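A sketch of that check, printing both where each reference points and where the reference variable itself is stored. The helper `addresses` is made up for the example, and the printed values will differ between runs and between debug and release builds, so no output is shown.

```rust
struct S;

/// Returns (address the reference points to, address of the reference variable).
fn addresses(r: &&mut S) -> (usize, usize) {
    let pointee = &**r as *const S as usize;
    let variable = r as *const &mut S as usize;
    (pointee, variable)
}

fn main() {
    let a: &mut S = &mut S;
    let b: &mut S = &mut S;
    let c: &mut S = &mut S;

    for (name, r) in [("a", &a), ("b", &b), ("c", &c)] {
        let (pointee, variable) = addresses(r);
        println!("{name}: points to {pointee:#x}, stored at {variable:#x}");
    }
}
```

Taking the address of the variables forces them into memory, so this can itself change what the optimizer does in release mode.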

Since a read/write through a pointer to a ZST is not an actual read/write, there can't ever be any aliasing problems, I think. No writes, no aliasing problem.
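This can even be shown in safe code. In the sketch below (the function name is made up for the example), two exclusive references to distinct elements of an array of ZSTs are alive at the same time and compare equal by address, because every element sits at offset `i * 0` from the start of the array.

```rust
struct S;

/// Borrows two distinct array elements exclusively and reports whether
/// the two references have the same address.
fn disjoint_borrows_share_address() -> bool {
    let mut arr = [S, S];
    // The slice pattern yields two disjoint `&mut S` borrows.
    let [a, b] = &mut arr;
    let same = std::ptr::eq(&*a, &*b);
    // Both references are still usable; these "writes" touch zero bytes.
    *a = S;
    *b = S;
    same
}

fn main() {
    println!("{}", disjoint_borrows_share_address()); // prints "true"
}
```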

One thing I’ve been wondering a few times: Couldn’t the type &T itself become zero-sized when T is zero-sized? This would completely avoid the situation of having unused “random” (garbage) pointer values being passed around and stored.

Discussed and rejected in https://github.com/rust-lang/rfcs/pull/2040
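For reference, a quick check of the current behavior: the ZST itself has size 0, but references to it are still pointer-sized, and `Option<&S>` is the same size as `&S` because null serves as the `None` value.

```rust
use std::mem::size_of;

struct S;

fn main() {
    println!("size_of::<S>()          = {}", size_of::<S>());
    println!("size_of::<&S>()         = {}", size_of::<&S>());
    println!("size_of::<&mut S>()     = {}", size_of::<&mut S>());
    println!("size_of::<Option<&S>>() = {}", size_of::<Option<&S>>());
}
```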
