[Pre-RFC] Add a new offset_of macro to core::mem

You're right, there isn't anything special about them. I just don't like to have both magic words and magic macros (and magic functions, come to think of it). It makes it easier for me to think about things if all the magic is in the keywords.

I already said in no uncertain terms that I'm not making claims that it will matter in the end. I do however want to make sure we're all on the same page about what exactly this implementation strategy entails.

Users can and do sometimes look at what macros expand to, using tools such as the third-party command cargo expand. If they do that with code involving offset_of! implemented in the way you propose, they will see a peculiarity that other macros won't show (except asm! and global_asm!): the macro will appear not to be expanded at all. Other built-in macros such as include! or println! will reveal their secrets, but this one won't and can't.

I want to stress once again that if you want to argue this difference doesn't matter, I won't object. But please be aware that it exists and take it into account.

I wish Rust started moving towards metaprogramming, and offsetof could be a first step. I.e., offsetof could be a const fn like:

const fn offset_of<T>(field: &str) -> Option<usize>;

It would be a compiler intrinsic.

Monomorphization, and thus full determination of types, is not complete until the end of MIR, just before code generation by LLVM or some other back end. Would you implement the metaprogramming as a multi-pass process, with the compiler executing its phases through MIR before recurring to the much earlier point where macros can modify the AST? What criteria would assure eventual convergence?

Metaprogramming is a broad term. Generating new types on the fly would require a multi-pass compiler, but implementing, for example, a function that serializes an object to JSON (which currently can only be done with proc macros) would not. The latter is also metaprogramming.

I'd love to see a full-featured compile-time reflection API based on const fn. As a one-off, though, it's probably not worth it...

See also

I started a proof-of-concept implementation of this (implemented as a “magic” macro that is an AST node), but I got stuck at the HIR->MIR transition. I’d still love to implement this but might need some assistance/guidance on the HIR/MIR part, if anyone is willing.

Once I have a proof-of-concept I plan to submit the RFC for proper review. I just need to figure out how to finish the proof-of-concept… if we had generic pointers to fields (or something like them) I think this could be implemented as a real macro (const-friendly too).

Also see this issue:

This is one of the crates posted in the OP. It is unsound in 3 different ways. Also, rustc depends on this crate (through crossbeam).

So, the current situation is that it’s not possible to write a strictly correct version of this as a user (due to this RFC not making progress), but while it would be possible to write a “mostly correct” version (@Amanieu posted one above but be aware that the comment there is wrong: references to uninitialized data are not fine) that’s not what people implementing this actually do.
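For illustration, here is a minimal sketch of an offset computation that never creates a reference to uninitialized data. It assumes a toolchain that provides std::ptr::addr_of!; the macro name raw_offset_of! and the Packet struct are hypothetical and not taken from any crate mentioned in this thread.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

/// Hypothetical example type; any struct with named fields would do.
#[repr(C)]
pub struct Packet {
    pub tag: u8,
    pub len: u32,
    pub checksum: u16,
}

/// Hypothetical macro: computes a field offset from raw pointers only.
/// Caveat: if `$ty` implements `Deref`, a misspelled field name could be
/// resolved through auto-deref; a hardened version would guard against that.
macro_rules! raw_offset_of {
    ($ty:ty, $field:ident) => {{
        // Uninitialized storage is fine here because nothing is ever read.
        let uninit = MaybeUninit::<$ty>::uninit();
        let base: *const $ty = uninit.as_ptr();
        // SAFETY: `base` points into a live allocation large enough for
        // `$ty`, and `addr_of!` only computes the field's address without
        // reading it or creating a reference.
        let field = unsafe { addr_of!((*base).$field) };
        (field as *const u8 as usize) - (base as *const u8 as usize)
    }};
}
```

With these definitions, raw_offset_of!(Packet, len) evaluates to the alignment of u32, since the preceding field is a single byte. This version is a runtime expression; whether it can be used in a const context depends on the toolchain.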

Pretty sad, overall. :frowning: I think a built-in macro though should not do anything that we couldn’t later replace by a library macro once some form of said RFC lands. We need that RFC anyway for other stuff such as initializing a struct field-by-field.

FYI it is technically possible to implement a sound offset_of! macro in pure Rust for repr(C) types, e.g.,

#[OffsetOf]
#[repr(C)] 
struct S {
   field0: Field0,
   field1: Field1,
   ...
}

let field0_offset = offset_of!(S, field0);

Where OffsetOf expands to, e.g.,

impl $offsetsof_crate::OffsetOfTrait for S { type Offsets = __S_OFFSETS; }
#[doc(hidden)]
pub struct __S_OFFSETS; 
#[doc(hidden)]
impl __S_OFFSETS {
    pub const field0: usize = ...;
    pub const field1: usize = Self::field0 + ...;
    ...
}

and offset_of!(S, field0) then expands to:

let field0 =  <S as offset_of::OffsetOfTrait>::Offsets::field0;

Here, #[OffsetOf] just needs to compute the field offsets according to the C spec, using the const fn size_of, align_of, etc. to compute the offset of each field, according to the offsets of the previous fields.
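To make that concrete, here is a rough sketch of what the generated constants could look like for one struct. The field types (u8, u32, u16), the helper align_up, and the name SOffsets are assumptions chosen for illustration; a real derive would emit the same pattern using each field's declared type.

```rust
use std::mem::{align_of, size_of};

/// Rounds `offset` up to the next multiple of `align`.
/// `align` must be a power of two, which `align_of` always returns.
pub const fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

#[repr(C)]
pub struct S {
    pub field0: u8,
    pub field1: u32,
    pub field2: u16,
}

/// What a hypothetical `#[OffsetOf]` expansion might generate for `S`.
pub struct SOffsets;

#[allow(non_upper_case_globals)]
impl SOffsets {
    /// The first field of a repr(C) struct always sits at offset 0.
    pub const field0: usize = 0;
    /// End of the previous field, rounded up to this field's alignment.
    pub const field1: usize =
        align_up(Self::field0 + size_of::<u8>(), align_of::<u32>());
    pub const field2: usize =
        align_up(Self::field1 + size_of::<u32>(), align_of::<u16>());
}
```

Each constant depends only on the previous one plus size_of and align_of, so everything is evaluated at compile time and no instance of S is ever needed.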

This is obviously horrible, but depending on what the requirements of crossbeam are, it shouldn’t be too hard to implement. Sounds like a fun 1-hour project anyways.

Can confirm. I’ve done this. It works okay. Watch out for ZSTs, which have align 1 but zero size. Unsized types don’t work for obvious reasons and will result in some unfriendly error messages (but they’re clear enough to figure out what’s wrong).

Is this in a crate? :slight_smile:

Watch out for ZSTs, which have align 1 but zero size.

Not always, e.g., [u16; 0] has align 2. If you have a repr(C) struct with ZSTs, each ZST might affect the offsets of the subsequent fields, e.g., if they introduce padding due to their alignment requirement. Consider:

#[repr(C)]
struct S {
  x: u8,
  y: [u16; 0], // introduces 1 padding byte
  z: u8,
}

Here both S::y and S::z are at offset = 2.
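A quick way to check this is to measure the field addresses on a live value. The sketch below assumes a target where u16 has alignment 2; the helper function name is made up for the example.

```rust
#[repr(C)]
pub struct S {
    pub x: u8,
    pub y: [u16; 0], // zero-sized, but still aligned like u16
    pub z: u8,
}

/// Returns the byte offsets of `y` and `z` inside an `S`,
/// measured from the addresses of the fields of an actual value.
pub fn offsets_of_y_and_z() -> (usize, usize) {
    let s = S { x: 0, y: [], z: 0 };
    let base = &s as *const S as usize;
    let y = &s.y as *const [u16; 0] as usize - base;
    let z = &s.z as *const u8 as usize - base;
    (y, z)
}
```

Under that assumption both offsets come out as 2, and the struct as a whole is padded to 4 bytes because its alignment is that of its most-aligned field.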

Also, while C has (**) this behavior (and clang and gcc implement it), C++ implements zero-sized types in subtly different ways. E.g. if the example above was interfacing with C++, S::y would be correct at offset 2, but it would need to have a size of at least 1 (*), such that the correct offset for S::z would be 3, and not 2.

(*) And for C++ this is not even always the case: there are both tricks (Empty Base Class optimization) and attributes ([[no_unique_address]]) that allow giving such a type different layouts.

(**) Technically, C does not have ZSTs; it's a language extension.

Yes, but I didn't try to generalize it, and I ended up replacing it with an UB hack* that allowed me to support repr(Rust) structures and was still const-eval friendly. I've avoided linking to the exact code because I don't want RalfJung/others to close my UB loophole just yet...

Excellent point, and a good complication to keep in mind.

*I know, I know. I intend to eliminate all UB from my crate eventually. That's half the reason I wrote this pre-RFC in the first place.

I've looked up what the Rust reference has to say about this and found this section. I think while the description is fine, the pseudo code is faulty. Here it is with annotations:

struct.alignment = struct.fields().map(|field| field.alignment).max();

let current_offset = 0;

for field in struct.fields_in_declaration_order() {
    // Increase the current offset so that it's a multiple of the alignment
    // of this field. For the first field, this will always be zero.
    // The skipped bytes are called padding bytes.
    // [SIC!] This line does not achieve at all what the comment specifies.
    // Consider `current_offset = 1`, `field.alignment = 2` for example.
    current_offset += field.alignment % current_offset;

    struct[field].offset = current_offset;

    current_offset += field.size;
}

// [SIC!] Neither does this. This doubles the remainder modulo the 
// structure alignment, it doesn't align it.
struct.size = current_offset + current_offset % struct.alignment;

I'd find it very convenient if the standard library at least had formalized the correct implementation of this, and indeed alloc_layout_extra seems to have a similar goal.
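For comparison, here is one way the two flagged lines could be written so that they do what the comments describe. This is only a sketch of the rounding logic, not the reference's or the standard library's code; Field, align_up, and repr_c_layout are names invented for the example.

```rust
/// Size and alignment of one field, in bytes. `align` must be nonzero.
#[derive(Clone, Copy)]
pub struct Field {
    pub size: usize,
    pub align: usize,
}

/// Rounds `offset` up to the next multiple of `align`.
pub fn align_up(offset: usize, align: usize) -> usize {
    let rem = offset % align;
    if rem == 0 {
        offset
    } else {
        offset + (align - rem)
    }
}

/// Lays out `fields` in declaration order, C style.
/// Returns (offset of each field, total struct size, struct alignment).
pub fn repr_c_layout(fields: &[Field]) -> (Vec<usize>, usize, usize) {
    // The struct is as aligned as its most-aligned field (1 if empty).
    let struct_align = fields.iter().map(|f| f.align).max().unwrap_or(1);

    let mut offsets = Vec::with_capacity(fields.len());
    let mut current = 0;
    for f in fields {
        // Skip padding bytes so the field starts at a multiple of its alignment.
        current = align_up(current, f.align);
        offsets.push(current);
        current += f.size;
    }

    // Trailing padding: the total size is a multiple of the struct alignment.
    (offsets, align_up(current, struct_align), struct_align)
}
```

With current_offset = 1 and field.alignment = 2, the case called out in the annotation, align_up yields 2. The buggy line would instead compute 1 + 2 % 1 = 1 and leave the field misaligned.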

Note that what the Rust reference says about this doesn't really matter at all. Since the types are repr(C), and we do guarantee that they match the layout of the equivalent C struct, what matters is what the C standard says about how that struct must be laid out (e.g. C18 6.7.2.1). An implementation solving this problem should follow that.

I really don't know why the reference says anything beyond "repr(C) structs are laid out according to the C rules of the target platform", linking the C standard, and maybe with an exception about how ZSTs are handled in some common platforms. That's what the UCGs do.

The C reference is the normative part, but having a version of that that is actually readable seems like a good idea IMO.

Now I am actively worried: are you saying yours is worse than the memoffset crate (Gilnaa/memoffset on GitHub)? :wink:

IMO it would be a good idea to at least centralize on one hacky way to implement offset_of despite it being UB. The memoffset crate seems to be a good candidate for that -- the maintainer is responsive to our concerns and suggestions.

So, until offset_of is in libstd, I think it would make sense to encourage people to use memoffset instead of their own home-grown solutions. Is anyone up for actively searching through Rust code bases out there, finding instances of an offset_of macro, and suggesting they use this crate instead?

Can’t we just provide a core::intrinsics::offset_of::<T>(field_name: &str) -> usize intrinsic?

The compiler always knows the offsets, so that should work for all types (repr(Rust), repr(C), etc.) and be reliable. It doesn’t need to be a macro, but a macro can be implemented on top.
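For readers on a recent toolchain: assuming Rust 1.77 or later, the standard library provides a built-in std::mem::offset_of! macro. It takes a field name rather than a string, so it is not the intrinsic proposed above, but it shows the "compiler knows the offsets" idea working for both representations. The struct names and constants below are invented for the example.

```rust
use std::mem::offset_of;

/// Default repr(Rust): the compiler is free to reorder the fields.
pub struct Plain {
    pub a: u8,
    pub b: u64,
}

/// repr(C): fields are laid out in declaration order.
#[repr(C)]
pub struct Wire {
    pub a: u8,
    pub b: u64,
}

// Both are evaluated at compile time, with no instance of either struct.
pub const PLAIN_B: usize = offset_of!(Plain, b);
pub const WIRE_B: usize = offset_of!(Wire, b);
```

For Wire, the offset of b is the alignment of u64, because a occupies only the first byte. For Plain, the value is whatever layout the compiler picked, so code should not assume a particular number.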