Offset_of_val!

I'd like to propose adding a new offset_of_val! macro, which is exactly like offset_of! except that the type argument is replaced by a value, and it operates on the type of said value.

That is to say, if we had a typeof operator in Rust, it would be equivalent to this:

macro_rules! offset_of_val {
    ($value:expr, $($fields:expr)+ $(,)?) => {
        offset_of!(typeof($value), $($fields)+)
    };
}

Problem statement

We discovered that the kernel's dma_read! macro is unsound. The macro lets you write code like this:

#[repr(C)]
#[derive(IntoBytes, FromBytes)]
struct MyStruct {
    field_1: u32,
}

// Dma<MyStruct> is basically an array of MyStruct values
// in volatile memory
let dma_alloc: Dma<MyStruct> = ...;
let x = dma_read!(dma_alloc[7].field_1);

and the macro then expands into essentially a volatile read of DMA memory. The normal API of Dma only lets you read the entire MyStruct value in one big read, but we might only want to read a single field in it. For this purpose we use a macro that figures out what offset (and size) [7].field_1 corresponds to in the MyStruct array.

The macro is defined like this:

macro_rules! dma_read {
    ($dma:expr, $idx: expr, $($field:tt)*) => {{
        (|| -> ::core::result::Result<_, $crate::error::Error> {
            let item = $crate::dma::CoherentAllocation::item_from_index(&$dma, $idx)?;
            // SAFETY: `item_from_index` ensures that `item` is always a valid pointer and can be
            // dereferenced. The compiler also further validates the expression on whether `field`
            // is a member of `item` when expanded by the macro.
            unsafe {
                let ptr_field = ::core::ptr::addr_of!((*item) $($field)*);
                ::core::result::Result::Ok(
                    $crate::dma::CoherentAllocation::field_read(&$dma, ptr_field)
                )
            }
        })()
    }};
    ($dma:ident [ $idx:expr ] $($field:tt)* ) => {
        $crate::dma_read!($dma, $idx, $($field)*)
    };
    ($($dma:ident).* [ $idx:expr ] $($field:tt)* ) => {
        $crate::dma_read!($($dma).*, $idx, $($field)*)
    };
}

The problem is that the expansion contains the expression addr_of!((*item).field_1). This is normally well-defined, but the user could have defined MyStruct maliciously:

#[repr(C)]
#[derive(IntoBytes, FromBytes)]
struct MyStruct {
    field_2: u32,
}

struct SomeOtherType {
    field_1: u32,
}

impl Deref for MyStruct {
    type Target = SomeOtherType;
    fn deref(&self) -> &SomeOtherType {
        todo!()
    }
}

With this type, using dma_read! with Dma<MyStruct> results in UB because addr_of!((*item).field_1) notices that although MyStruct does not have a field called field_1, it derefs to a type that does. Therefore, it's equivalent to this:

  1. Create a &MyStruct from the *item expression. (dereferencing a raw pointer)
  2. Call MyStruct::deref()
  3. Invoke addr_of! to get a raw pointer to field_1 of the thing returned by deref().

And this code already invokes UB in step 1 because creating a reference to volatile memory is not legal. Note that step 1 is an unsafe operation because it's a deref of a raw pointer, but it's inside of addr_of! which also requires an unsafe block, so this passes.
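To make the hidden Deref call concrete, here is a minimal sketch that runs on ordinary (non-volatile) memory, so it is not itself UB. The static ELSEWHERE and the helper names project_field_1 and points_outside_item are made up for illustration and are not part of the kernel code. The sketch shows that the projected pointer ends up outside the MyStruct it was supposedly taken from.

```rust
use core::ops::Deref;
use core::ptr::addr_of;

pub struct SomeOtherType {
    pub field_1: u32,
}

// Hypothetical stand-in for whatever `deref()` returns; it lives outside any `MyStruct`.
pub static ELSEWHERE: SomeOtherType = SomeOtherType { field_1: 0xdead_beef };

#[repr(C)]
pub struct MyStruct {
    pub field_2: u32,
}

impl Deref for MyStruct {
    type Target = SomeOtherType;
    fn deref(&self) -> &SomeOtherType {
        &ELSEWHERE
    }
}

/// Mirrors the projection that `dma_read!` performs on `item`.
// Newer compilers may lint the hidden reference; allow that for this demonstration.
#[allow(unknown_lints, dangerous_implicit_autorefs)]
pub fn project_field_1(item: *const MyStruct) -> *const u32 {
    // SAFETY: callers in this sketch pass a pointer to a live, ordinary `MyStruct`,
    // so the implicit `&MyStruct` created for `deref()` is valid here. With DMA
    // memory that same implicit reference is the problem.
    unsafe { addr_of!((*item).field_1) }
}

/// Returns true if the "field pointer" actually points into `ELSEWHERE`.
pub fn points_outside_item() -> bool {
    let item = MyStruct { field_2: 1 };
    let projected = project_field_1(&item);
    projected == addr_of!(ELSEWHERE.field_1)
}
```

The field access compiles without any mention of Deref at the call site, which is why the macro cannot notice it.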

Now, one way to solve the above problem is to redefine dma_read! so you call it as dma_read!(MyStruct, dma_alloc[7].field_1). This allows you to replace addr_of! with offset_of!, which triggers a compilation error if deref is involved. Unfortunately, it's quite inconvenient to specify the type every time you invoke dma_read!.
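As a rough sketch of that explicit-type approach (written as a plain function rather than a macro, with a made-up two-field MyStruct and the hypothetical helper names read_field_1 and demo), the field pointer can be built from offset_of! and pointer arithmetic alone:

```rust
use core::mem::offset_of;

#[repr(C)]
pub struct MyStruct {
    pub field_0: u16,
    pub field_1: u32,
}

/// Reads `field_1` of element `idx` with a volatile read.
///
/// # Safety
/// `base` must point to at least `idx + 1` readable, initialized `MyStruct` values.
pub unsafe fn read_field_1(base: *const MyStruct, idx: usize) -> u32 {
    // `offset_of!` only looks at fields of the named type, so a `Deref` impl
    // cannot redirect it; a missing field is a compile error instead.
    let offset = offset_of!(MyStruct, field_1);
    // SAFETY: the caller guarantees element `idx` is in bounds and readable.
    // No reference to the element is ever created.
    unsafe { base.add(idx).byte_add(offset).cast::<u32>().read_volatile() }
}

/// Usage on ordinary memory: reads `field_1` of the third element.
pub fn demo() -> u32 {
    let items = [
        MyStruct { field_0: 0, field_1: 10 },
        MyStruct { field_0: 1, field_1: 20 },
        MyStruct { field_0: 2, field_1: 30 },
    ];
    // SAFETY: `items` has three initialized elements and index 2 is in bounds.
    unsafe { read_field_1(items.as_ptr(), 2) }
}
```

The cost is visible in the signature: the type has to be spelled out wherever the offset is computed.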

Deref ambiguity hack

Technically there is a way to work around the above. You can use this trick to verify that the type does not implement Deref. This results in a pretty complex macro, and probably hurts compile times because of a bunch of weird logic just to check for this, but it does work. (Note you have to be careful if you want to support a.field.field because not only do you have to check typeof(a): !Deref, but also typeof(a.field): !Deref.)
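For readers who have not seen it, one common form of such a check is the inference-ambiguity trick sketched below. I am assuming this resembles the trick referred to above; the names AmbiguousIfDeref, assert_not_deref, Plain and check_plain are invented for this sketch.

```rust
use core::ops::Deref;

/// Implemented once for every type, and a second time for types that are `Deref`.
pub trait AmbiguousIfDeref<Marker> {}
impl<T: ?Sized> AmbiguousIfDeref<()> for T {}
impl<T: ?Sized + Deref> AmbiguousIfDeref<u8> for T {}

/// Compiles only when `Marker` can be inferred uniquely, which is the case
/// exactly when `T` does not implement `Deref`. Takes a raw pointer so that
/// no reference to the pointee is needed.
pub fn assert_not_deref<T: ?Sized + AmbiguousIfDeref<Marker>, Marker>(_: *const T) {}

pub struct Plain {
    pub field_1: u32,
}

pub fn check_plain() -> bool {
    let value = Plain { field_1: 0 };
    // `Plain` is not `Deref`, so only the `()` impl applies and this compiles.
    assert_not_deref(&value as *const Plain);
    // By contrast, a `*const Box<u32>` would be rejected with
    // "type annotations needed", because both impls apply to `Box<u32>`.
    true
}
```

A macro would have to emit one such check per path segment, which is where the complexity mentioned above comes from.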

How offset_of_val! solves this

By emitting an invocation of offset_of_val!(*item, field_1), the compiler can compute the offset for us, and the compiler can be implemented in such a way that deref is not a problem (as is already the case for the usual offset_of!).

Other problems solved by this

The macro could be used by crates that wish to perform an addr_of! operation using pointer::wrapping_add semantics instead of pointer::add semantics.
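A small sketch of what that looks like today with offset_of!, where the type must be named explicitly. The Header type and the helper names are hypothetical; with the proposed macro the offset could presumably be written as offset_of_val!(*header, len) instead.

```rust
use core::mem::offset_of;
use core::ptr::addr_of;

#[repr(C)]
pub struct Header {
    pub tag: u8,
    pub len: u32,
}

/// Projects a `Header` pointer to its `len` field. This is a safe function:
/// `wrapping_byte_add` does not require the pointer to be in bounds or non-null.
pub fn len_ptr(header: *const Header) -> *const u32 {
    header.wrapping_byte_add(offset_of!(Header, len)).cast::<u32>()
}

/// On a live value the result agrees with the usual `addr_of!` projection.
pub fn matches_addr_of() -> bool {
    let h = Header { tag: 1, len: 2 };
    len_ptr(&h) == addr_of!(h.len)
}

/// Projecting a null pointer is fine as long as the result is never dereferenced.
pub fn null_projection_addr() -> usize {
    len_ptr(core::ptr::null()) as usize
}
```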

Type of first argument

I propose that the first argument should be a place expression, and that it should operate on the type of that place. So to operate on field_1 of a *mut Foo, you need offset_of_val!(*ptr, field_1). I think this is the most general way to define it without needing to create a reference.

Naming

This naming is analogous to other methods also defined in core::mem.

  • size_of / size_of_val
  • align_of / align_of_val
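For comparison, this is how the existing pairs behave. Note that size_of_val and align_of_val take a reference, whereas the proposed macro would take a place expression precisely so that no reference has to exist. The function names sizes_agree and slice_size are only for illustration.

```rust
use core::mem::{align_of, align_of_val, size_of, size_of_val};

/// The `_val` variants report the same numbers as the type-based ones.
pub fn sizes_agree() -> bool {
    let x = 7u64;
    size_of_val(&x) == size_of::<u64>() && align_of_val(&x) == align_of::<u64>()
}

/// The `_val` variants also work for unsized values, here a slice of three `u16`.
pub fn slice_size() -> usize {
    let s: &[u16] = &[1, 2, 3];
    size_of_val(s)
}
```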

Alice

16 Likes

imo it would be better to just add typeof to the language, this is far from the only macro where you need the type of some expression.

15 Likes

Works for me!

2 Likes

I'd kinda just want to have this operator: §3.2.1 Path Offset Syntax - Rust's Unsafe Pointer Types Need An Overhaul - Faultlore which should solve your use case, too, like offset_of_val!(expr, field1.field2.field3) would simply be the offset from the pointer p1 = &raw const expr to the pointer p2 = p1~field1~field2~field3.

9 Likes

@steffahn Okay but that also seems like a much harder sell!

1 Like

I'm not sure about that. Both typeof and ~ would be new operators, but typeof feels like it would step on the toes of reflection work.

I mean a harder sell than offset_of_val!. I can understand that typeof may be tricky.

2 Likes

Seems different. I'd expect the offset operator to run in-bounds including a needed unsafe block for pointers. Maybe stricter than the simple pointer arithmetic, considering fields at offset 0 allow null-pointers if executed via an equivalent byte_offset call ever since we carved out the "If the computed offset is non-zero, " exception for its predicate; but maybe we don't want that especially for a non-ZST field where its past-the-end pointer should also be unsafely asserted to be in-bounds.

Iirc offset_of_val was also contentious because it wasn't certain if in-bounds should be asserted or not.

I mean, I guess for the syntax part of an operator spelled ~, yes, that’s a harder sell. But I think it could start out as some kind of macro, too.

Indeed I’m not sure whether it should be unsafe, either. IMO it would be a kinda really nice thing if it wasn’t an unsafe operation to project a struct pointer to a field pointer this way actually. This would probably mean wrapping behavior in the general case. Unfortunately, I’m not really sure at all off the top of my head exactly what degree of negative performance implications this may or may not have.

Rust used to have typeof implemented in the parser, but it was removed because it wasn't implemented in the type checker and was getting annoying to maintain. I don't know what implications that has for the chances of it being added back again.

2 Likes

That runs into an interesting problem in both cases. There is no option except field privacy to control access (which would imply the ability to safely create references to the interior of a struct) even though these macros only care about the layout. Several transparent wrappers (relevant for this in particular: MaybeUninit and UnsafeCell) however can for this reason only return raw pointers to their interior memory through special methods, which is incompatible with the notion of pure paths or indeed pattern syntax (my personal wish item for this) as a solution to any of this. Should field traversal be able to project &UnsafeCell<Struct> onto &UnsafeCell<Field> and compute offsets, since that wrapper defines and preserves the mode of access to the memory? But also maybe that wouldn't be desirable. We'd want traversal and offset to be an Applicable family over pointer types &_, which overlaps with the family &UnsafeCell<_> so we can't have both.

Part of the reason I went for offset_of_val! is that it sidesteps questions about whether the operation should be unsafe or use wrapping pointer offsets.

6 Likes

IIRC typeof in C++ is quite complicated because it doesn't actually run the expression you put inside of it. Does someone remember more about why that is tricky? Is that a problem we could avoid in Rust with a different design?

offset_of_val! would likely have the same problems, and overall seems less appealing because it is more special-cased.

I wonder if a good alternative would be to have an of_val constructor for Type (can be defined by users already) and then use reflection to check for the presence of a field?

I can imagine many more problems with typeof than I can with offset_of_val!. For example, perhaps it introduces a lot of problems with type inference? Or perhaps it gives a way to name types that should not be name-able. Etc.

An offset_of_val! macro has none of those challenges. You can already write <expr>.field today, so it's hard to imagine that looking up said field is difficult or causes type inference problems.

1 Like

The only C++ trickiness I can recall is that the expression, despite not running, still causes all necessary instantiations to happen which may run all forms of constexpr in now newly required items (and probably more). That alone can influence practically anything though. Just remember you can build a compile-time counter like there. Combining those properties yields practical nonsense.

Other than the rough outline of the problem I don't remember more concretely what rules unevaluated expressions specifically follow. It's probably not helpful to attempt a precise explanation of the result in that example because GCC and Clang disagree.

Edit: also decltype(x) works differently if x is a variable rather than an expression, so decltype(x) and decltype((x)) are quite different. But that's just a little odd.

1 Like

I agree with this. It feels like a better path is through something like bikeshed_type_info_of_val and reflecting upon information about fields it provides.

Okay -- yeah Rust does not allow compile-time "state" (that would be fundamentally incompatible with our query system anyway) so I guess we are good here.

I suppose one tricky detail with an offset_of_val! is potentially being able to compute the offset of unsized fields that offset_of! currently recommends using addr_of! for instead:

// To obtain the offset of a !Sized field, examine a concrete value
// instead of using offset_of!.
let value: Struct<Align4> = Struct { a: 1, b: Align4(2) };
let ref_unsized: &Struct<dyn Debug> = &value;
let offset_of_b = unsafe {
    (&raw const ref_unsized.b).byte_offset_from_unsigned(ref_unsized)
};
assert_eq!(offset_of_b, 4);

could be

let value: Struct<Align4> = Struct { a: 1, b: Align4(2) };
let ref_unsized: &Struct<dyn Debug> = &value;
let offset_of_b = offset_of_val!(*ref_unsized, b);
assert_eq!(offset_of_b, 4);
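For anyone who wants to try the first (pointer-based) variant, here is a self-contained sketch. The definitions of Struct and Align4 are my assumption of what the documentation example uses, and I use byte_offset_from (signed result) instead of byte_offset_from_unsigned so that it also builds on somewhat older toolchains.

```rust
use core::fmt::Debug;
use core::ptr::addr_of;

// Assumed definitions: a generic struct whose last field may be unsized,
// and a 4-byte-aligned payload type.
#[repr(C)]
pub struct Struct<T: ?Sized> {
    pub a: u8,
    pub b: T,
}

#[derive(Debug)]
#[repr(C, align(4))]
pub struct Align4(pub u32);

/// Computes the offset of the unsized field `b` by examining a concrete value.
pub fn offset_of_unsized_b() -> isize {
    let value: Struct<Align4> = Struct { a: 1, b: Align4(2) };
    let ref_unsized: &Struct<dyn Debug> = &value;
    let base = ref_unsized as *const Struct<dyn Debug>;
    // SAFETY: both pointers are derived from the same live value, so the
    // byte distance between them is well-defined.
    unsafe { addr_of!(ref_unsized.b).byte_offset_from(base) }
}
```

The offset depends on the dynamic alignment of the dyn Debug value, which is why a value (or at least its pointer metadata) is needed at all.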

None, really. It was implemented badly and essentially useless, that's why it was removed. If it's an actual feature and implemented properly none of those problems apply.

1 Like

IMO not running the code inside of typeof seems perfectly normal and expected. IMO typeof being able to be used on unnameable types is a feature, not a bug. We can easily avoid the decltype((x)) != decltype(x) issue that C++ has because Rust doesn't have C++ references that implicitly disappear when used.

2 Likes