Discussion on offset_of!(..)


#1

Lifting out discussion from https://github.com/rust-lang/rfcs/pull/2421 on offset_of!.


Pre-Pre-RFC: Field offsets
#2

My proposed library-only implementation (to live in libcore):

macro_rules! offset_of {
    ($Struct:path, $field:ident) => ({
        // Using a separate function to minimize unhygienic hazards
        // (e.g. unsafety of #[repr(packed)] field borrows).
        // Uncomment `const` when `const fn`s can juggle pointers.
        /*const*/ fn offset() -> usize {
            let u = $crate::mem::MaybeUninit::<$Struct>::uninit();
            // Use pattern-matching to avoid accidentally going through Deref.
            let &$Struct { $field: ref f, .. } = unsafe { &*u.as_ptr() };
            let o = (f as *const _ as usize).wrapping_sub(&u as *const _ as usize);
            // Triple check that we are within `u` still.
            assert!((0..=$crate::mem::size_of_val(&u)).contains(&o));
            o
        }
        offset()
    })
}

Also to quote myself on implementing offset_of! like asm!:

I dislike having macros that expand to builtins that cannot be written with other syntax

(I really hope one day we can solve the asm issue - const generics + intrinsics is probably the way to go)


#3

Besides offset_of!, there should also be an offset_of_val! for DST structs e.g.

struct X<T: ?Sized>(u8, T);

let a: &X<dyn Debug> = &X(1, 2u16);
let b: &X<dyn Debug> = &X(3, 4u32);

assert_eq!(offset_of_val!(a, 1), 2);
assert_eq!(offset_of_val!(b, 1), 4);
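
For anyone who wants to see those numbers come out of real code, here is a minimal sketch written as a plain function instead of a macro. The helper name `field1_offset` is hypothetical, and `#[repr(C)]` is an assumption added here so that the field order, and therefore the expected offsets, are guaranteed and not incidental.

```rust
use std::fmt::Debug;

// Assumption for this sketch: `#[repr(C)]` pins the field order.
#[repr(C)]
struct X<T: ?Sized>(u8, T);

// Hypothetical stand-in for `offset_of_val!(x, 1)`.
fn field1_offset<T: ?Sized>(x: &X<T>) -> usize {
    // Casting to `*const u8` drops the vtable/length metadata and keeps the address.
    let base = x as *const X<T> as *const u8 as usize;
    let field = &x.1 as *const T as *const u8 as usize;
    field - base
}

fn main() {
    let a: &X<dyn Debug> = &X(1, 2u16);
    let b: &X<dyn Debug> = &X(3, 4u32);

    // The offset depends on the alignment of the erased type,
    // so it can only be computed from a value, not from the type alone.
    assert_eq!(field1_offset(a), 2);
    assert_eq!(field1_offset(b), 4);
}
```

The two references have the same static type, yet the offsets differ, which is why a value-based `offset_of_val!` is needed next to the type-based `offset_of!`.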

#4

offset_of_val doesn’t need unsafe code, but without knowing the struct’s name it (sadly) can’t use pattern-matching to bypass Deref for field access:

macro_rules! offset_of_val {
    // `tt` so that tuple indices such as `1` are accepted as well as named fields.
    ($s:expr, $field:tt) => ({
        let s: &_ = $s;
        let o = (&s.$field as *const _ as usize).wrapping_sub(s as *const _ as usize);
        // Triple check that we are within `*s` still.
        assert!((0..=$crate::mem::size_of_val(s)).contains(&o));
        o
    })
}

If we require specifying the struct name, we can recover the pattern-matching:

macro_rules! offset_of_val {
    ($s:expr, $Struct:path.$field:ident) => ({
        let s: &$Struct = $s;
        // Use pattern-matching to avoid accidentally going through Deref.
        let &$Struct { $field: ref f, .. } = s;
        let o = (f as *const _ as usize).wrapping_sub(s as *const _ as usize);
        // Triple check that we are within `*s` still.
        assert!((0..=$crate::mem::size_of_val(s)).contains(&o));
        o
    })
}

Then we can rewrite offset_of! to rely on offset_of_val!:

macro_rules! offset_of {
    ($Struct:path.$field:ident) => ({
        // Using a separate function to minimize unhygienic hazards
        // (e.g. unsafety of #[repr(packed)] field borrows).
        // Uncomment `const` when `const fn`s can juggle pointers.
        /*const*/ fn offset() -> usize {
            let u = $crate::mem::MaybeUninit::<$Struct>::uninit();
            offset_of_val!(unsafe { &*u.as_ptr() }, $Struct.$field)
        }
        offset()
    })
}

#5

Shameless plug: https://crates.io/crates/memoffset

The deref protection isn’t something I thought of, I’ll have to add it.


#6

I might be wrong, but I still feel there aren’t a lot of solid use cases for offset_of!. Every use case I can think of uses offset_of! as a poor man’s “field reference” (think C++'s pointer-to-member operators).

I’d like to propose something different. I’m thinking about some magic macro called “make_field_references_set”, which takes a type T and one or more field names 1, foo1, foo2, and generates a special type. To use it, you call its inherent method with a &T or &mut T to get &T.foo1, &T.foo2, etc., returned as a tuple.

This way it not only solves the original field reference problem, it also solves the disjoint borrowing problem.
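
To make the idea concrete, here is a hand-written sketch of what such a macro might generate for one struct. Everything in it is hypothetical: the `Player` struct, the `NameAndScore` type and the `get` / `get_mut` method names are illustrations only, not a proposed API.

```rust
struct Player {
    name: String,
    score: u32,
    lives: u8,
}

// Hypothetical stand-in for the type that
// `make_field_references_set!(Player, name, score)` might generate.
struct NameAndScore;

impl NameAndScore {
    // Shared access to the selected fields.
    fn get<'a>(&self, p: &'a Player) -> (&'a String, &'a u32) {
        (&p.name, &p.score)
    }

    // Mutable access: the borrow checker accepts this because the
    // two fields are disjoint places inside `*p`.
    fn get_mut<'a>(&self, p: &'a mut Player) -> (&'a mut String, &'a mut u32) {
        (&mut p.name, &mut p.score)
    }
}

fn main() {
    let mut p = Player { name: String::from("a"), score: 0, lives: 3 };

    // Both fields are borrowed mutably at the same time.
    let (name, score) = NameAndScore.get_mut(&mut p);
    name.push('!');
    *score += 10;

    assert_eq!(p.name, "a!");
    assert_eq!(p.score, 10);
    assert_eq!(p.lives, 3);
}
```

No unsafe code is needed in this form, since the field projections are ordinary borrows that the compiler can check.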


#7

You seem to be using mem::uninitialized, which is UB unless you can prove the struct does not contain uninhabited data.
I think it’s slightly safer to use integer addresses, because it will never try to create an uninhabited value, but that’s still problematic (cc @RalfJung).

AFAIK there is no way of doing offset_of soundly on stable Rust, because non-Copy unions aren’t stable yet.

Also, you’re missing the assert which means it’s easily misusable (e.g. through the lack of Deref protection).
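
A small sketch of that Deref hazard, for readers who have not run into it. The `Inner` / `Wrapper` types and the function names are made up for the illustration; the point is only that plain `.field` access can silently go through `Deref` and produce an address outside the struct.

```rust
use std::mem::size_of;
use std::ops::Deref;

struct Inner {
    x: u32,
}

// `Wrapper` has no field `x` of its own, but derefs to `Inner`.
struct Wrapper {
    inner: Box<Inner>,
}

impl Deref for Wrapper {
    type Target = Inner;
    fn deref(&self) -> &Inner {
        &self.inner
    }
}

// What an unprotected macro would effectively expand to.
fn naive_offset_of_x(w: &Wrapper) -> usize {
    // `w.x` auto-derefs to `w.inner.x`, which lives in the Box's heap allocation.
    (&w.x as *const u32 as usize).wrapping_sub(w as *const Wrapper as usize)
}

// The bounds check that catches the mistake.
fn in_bounds(o: usize) -> bool {
    o < size_of::<Wrapper>()
}

fn main() {
    let w = Wrapper { inner: Box::new(Inner { x: 7 }) };
    let o = naive_offset_of_x(&w);
    // The "offset" points outside `w`, so it is not an offset at all.
    assert!(!in_bounds(o));
}
```

With the pattern-matching form (`let &Wrapper { x: ref f, .. } = w;`) the same mistake is rejected at compile time, because `Wrapper` itself has no field named `x`.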


#8

If you have two raw pointers that are guaranteed to point to the same object/allocation, I think casting to integers first or doing the difference directly is equivalent. However, for the foreseeable future, miri will only support directly subtracting the pointers, so that’s probably preferred if you want to use this in a const context.
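
A minimal sketch comparing the two ways of taking the difference. It assumes `std::ptr::addr_of!`, which was added to the standard library after this discussion, to project the field without creating a reference to uninitialized memory; the `Pair` struct and the function name are hypothetical.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

// `#[repr(C)]` is an assumption of this sketch, used to make the layout predictable.
#[repr(C)]
struct Pair {
    a: u8,
    b: u32,
}

// Returns the offset of `b` computed via integers and via pointer subtraction.
fn offset_of_b_both_ways() -> (usize, isize) {
    let u = MaybeUninit::<Pair>::uninit();
    let base: *const Pair = u.as_ptr();

    // SAFETY: `base` points into the live allocation of `u`, and
    // `addr_of!` only computes an address without reading the field.
    let field: *const u32 = unsafe { addr_of!((*base).b) };

    // Variant 1: cast both pointers to integers, then subtract.
    let via_int = (field as usize) - (base as usize);

    // Variant 2: subtract the pointers directly, measured in bytes.
    // SAFETY: both pointers are derived from the same allocation `u`.
    let via_ptr = unsafe { (field as *const u8).offset_from(base as *const u8) };

    (via_int, via_ptr)
}

fn main() {
    let (via_int, via_ptr) = offset_of_b_both_ways();
    assert_eq!(via_int, 4);
    assert_eq!(via_ptr, 4);
}
```

Both variants agree here because the two pointers come from the same allocation, which is the precondition stated above.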


#9

I meant starting with 0 (or, rather, NonNull::dangling()) instead of a reference to a stack slot, projecting a field reference out of that, and subtracting the resulting addresses.


#10

Is it desired for offset_of to be a compiler built-in, maybe?


#11

Effectively, a pointer-to-member mechanism, similar to C++? That does seem useful, though separate from the need for offset_of. But if we had a pointer-to-member mechanism, then we could use that as part of offset_of, by providing a way to convert a pointer-to-member into a numeric offset.
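
As a rough sketch of that conversion, a pointer-to-member can be modelled as a projection from a struct pointer to a field pointer. None of this is an existing or proposed API: the `Member` type, its methods, the `Header` struct and `header_len` are all hypothetical, and `std::ptr::addr_of!` is assumed to be available.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

// Hypothetical pointer-to-member: maps a pointer to `T` to a pointer to one of its fields.
struct Member<T, U> {
    project: unsafe fn(*const T) -> *const U,
}

impl<T, U> Member<T, U> {
    // Unsafe because the caller promises `project` is a plain, in-bounds field projection.
    unsafe fn new(project: unsafe fn(*const T) -> *const U) -> Self {
        Member { project }
    }

    // Use as a field reference.
    fn get<'a>(&self, base: &'a T) -> &'a U {
        // SAFETY: `base` is a valid reference, so the projected field is valid too.
        unsafe { &*(self.project)(base) }
    }

    // Convert into a numeric offset.
    fn to_offset(&self) -> usize {
        let u = MaybeUninit::<T>::uninit();
        let base = u.as_ptr();
        // SAFETY: `base` points into the live allocation of `u`; nothing is read.
        let field = unsafe { (self.project)(base) };
        (field as usize) - (base as usize)
    }
}

// `#[repr(C)]` is an assumption of this sketch, to make the offset predictable.
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

unsafe fn header_len(p: *const Header) -> *const u32 {
    unsafe { addr_of!((*p).len) }
}

fn main() {
    let m: Member<Header, u32> = unsafe { Member::new(header_len) };
    let h = Header { tag: 1, len: 99 };
    assert_eq!(*m.get(&h), 99);
    assert_eq!(m.to_offset(), 4);
}
```

In this shape offset_of falls out as a small method on the pointer-to-member value, while the field-reference use case does not need the number at all.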