[Pre-RFC] Add a new offset_of macro to core::mem


Note that accepting this RFC is enough to provide an alternative to field initialization in MaybeUninit, and that RFC has just been proposed for merging.

So, almost certainly, a more ergonomic way to initialize fields (i.e., without offset_of) will exist, independent of where the discussion around references ends up.

That said, offset_of is certainly still useful, and I agree it should be in libcore. If you ignore the Deref problem, I think it can even be written as a library (with the above RFC accepted) already – but not with code that would be accepted in const context.

Well now that you summoned me into this thread, you know what’s going to happen. :stuck_out_tongue:

format_args! does not expand to anything, though – and that file contains some other examples.


This is just an internal implementation quirk caused by predating the proc macro API. Anyone who cares enough to put in the time could rewrite format_args! as a plain old proc macro (maybe losing some nice-to-have diagnostics). Furthermore, even today you can run cargo expand and see the regular token tree that it expands to, because it is expanded as a tokens -> tokens transform, just the way it’s hooked into macro expansion is magical.


I want to increase awareness of intrusive collections, which pretty much require an offset_of macro to work correctly. I come from the C-world, and intrusive collections are a core part of my toolbox. They are also something I want to see more of in rust as they make lots of stuff much easier to do. Just my USD$0.02.
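To make the dependency concrete, here is a minimal sketch of the C-style `container_of` step that an intrusive list needs. It assumes the `core::mem::offset_of!` macro that was stabilized long after this thread (Rust 1.77); `Link`, `Task`, and `task_from_link` are hypothetical names used only for illustration.

```rust
use core::mem::offset_of;

/// Hypothetical intrusive list link, embedded directly in the element.
pub struct Link {
    pub next: *const Link,
}

/// Hypothetical element type that carries its own link.
pub struct Task {
    pub id: u32,
    pub link: Link,
}

/// Recover the containing `Task` from a pointer to its embedded `link`
/// (the classic C `container_of`).
///
/// # Safety
/// `link` must point to the `link` field of a live `Task`, and it must have
/// been derived from a pointer to the whole `Task` (not from `&task.link`),
/// so that stepping backwards stays inside the same allocation.
pub unsafe fn task_from_link(link: *const Link) -> *const Task {
    // Step back from the field to the start of the struct.
    unsafe { link.cast::<u8>().sub(offset_of!(Task, link)).cast::<Task>() }
}
```

Without a way to get the field offset, the backwards step in `task_from_link` has nothing to subtract, which is why intrusive collections tend to reinvent the macro.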


Actually format_args! does expand to normal Rust code (the implementation is in libsyntax_ext). A better example would be asm! which expands into a custom AST node that can’t be represented with normal Rust code.


For the record, I have nothing against supporting offset_of!(Struct, inner.inner_field), offset_of!(Struct, inner.inner_field[3]), offset_of!(Struct, field[3]), or even offset_of!([u8; 5], [3]) (though I agree the syntax of that last one is questionable and not necessarily intuitive, and I agree that for now it should not be included in the RFC).

Thanks for pointing out GCC’s prior art of allowing them! I wasn’t aware of that, and it definitely makes me feel like I can include this in the RFC without feeling like it’ll inevitably lead to an endless bikeshed.


That’s… a good question I overlooked. I was originally thinking it would have the full definition of type available (and so could evaluate to a usize literal), but I realize now that macro expansion happens way too early for the macro evaluator to have the full definition of type. Here are a couple alternatives (neither of which are ideal, and neither of which I expect to be the final accepted solution; I’m just trying to get the idea-ball rolling):

Ignoring Deref

If we can ignore Deref for now, this could be implemented with RFC 2582 like so:

macro_rules! offset_of {
    ($ty:ty, $field:ident $(,)?) => ({
        let null = 0usize as *const $ty;
        $crate::mem::transmute::<_, usize>(&(*null).$field as *const _)
    });
}

This should be const-eval friendly too, even on current rust. It also is compatible with sub-fields (e.g., offset_of!(Struct, inner.inner_field)). The only downsides I can see are:

  • It doesn’t avoid going through Deref. Avoiding Deref might require new special syntax.
  • It doesn’t work if &field results in a fat pointer. That could be fixed by doing proper pointer subtraction instead of transmuting (I know someone’s going to remind me that “pointers aren’t just integers”, which I’m well aware of). I should edit the post to fix that but it’s time to do my $dayjob.
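The pointer-subtraction variant mentioned in the second bullet could look roughly like the sketch below. It relies on `MaybeUninit::uninit` and `core::ptr::addr_of!`, both of which were stabilized after this thread, and it still does nothing about the Deref problem. The macro name `offset_of_sketch` and the `Inner`/`Outer` structs are hypothetical.

```rust
/// Hypothetical test types; `repr(C)` makes the layout predictable.
#[repr(C)]
pub struct Inner {
    pub x: u8,
    pub y: u16,
}

#[repr(C)]
pub struct Outer {
    pub a: u8,
    pub b: u32,
    pub inner: Inner,
}

/// Hypothetical name, chosen to avoid clashing with `core::mem::offset_of!`.
macro_rules! offset_of_sketch {
    ($ty:ty, $($field:tt)+) => {{
        // Uninitialized storage is a real allocation, so the field
        // projection below stays in bounds (unlike a null base pointer).
        let uninit = ::core::mem::MaybeUninit::<$ty>::uninit();
        let base: *const $ty = uninit.as_ptr();
        // `addr_of!` computes the field address without creating a reference.
        // Caveat: this does not stop auto-Deref from kicking in.
        let field = unsafe { ::core::ptr::addr_of!((*base).$($field)+) };
        // Subtract addresses instead of transmuting the field pointer; the
        // cast to `*const u8` also discards any fat-pointer metadata.
        (field as *const u8 as usize) - (base as *const u8 as usize)
    }};
}
```

With these definitions, `offset_of_sketch!(Outer, inner.y)` walks through the sub-field path, since the field argument is matched as a token sequence rather than a single identifier.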

Crazy per-type traits

If a special trait was automatically defined by the compiler for each type (where each type gets its own trait), then it could be:

macro_rules! offset_of {
    ($ty:ty, $field:ident $(,)?) => ({
        <$ty as CrazyBuiltInTraitCustomMadeFor<$ty>>::$field
    });
}

That is, the trait CrazyBuiltInTraitCustomMadeFor is automatically defined by the compiler for each type, and it contains associated consts that share the name of the struct’s fields, where each const is the offset of the field within the type. I haven’t thought this through much, so there might be some major complications I’m overlooking (e.g., I’m not sure how this would work with sub-fields, nor how it should work with tuples which have integers for field names).

It could also be an actual type that has a custom impl for each type (so instead of doing <$ty as CrazyBuiltInTraitCustomMadeFor<$ty>>::$field in the macro, it would be CrazyBuiltInType<$ty>::$field).
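As a rough illustration of the per-type trait idea, the sketch below writes by hand what the compiler would hypothetically generate for one struct. Nothing like this exists in the language; `OffsetsOfPair`, `Pair`, and `offset_of_via_trait` are made-up names, and the hard-coded offsets assume `repr(C)` on a target where `u32` is 4-byte aligned.

```rust
/// Hypothetical stand-in for the compiler-generated per-type trait.
/// The associated consts deliberately share the names of the fields.
#[allow(non_upper_case_globals)]
pub trait OffsetsOfPair {
    const first: usize;
    const second: usize;
}

#[repr(C)]
pub struct Pair {
    pub first: u8,
    pub second: u32,
}

// Written by hand here; in the proposal the compiler would emit this impl.
#[allow(non_upper_case_globals)]
impl OffsetsOfPair for Pair {
    const first: usize = 0;
    // Assumes 4-byte alignment for `u32`, so 3 padding bytes follow `first`.
    const second: usize = 4;
}

/// Unlike the real proposal, this macro hard-codes the one trait it knows
/// about, because user code cannot name "the trait for `$ty`" generically.
macro_rules! offset_of_via_trait {
    ($ty:ty, $field:ident $(,)?) => {
        <$ty as OffsetsOfPair>::$field
    };
}
```

The sketch also shows where the idea strains: tuple fields such as `0` are not valid const names, and a sub-field path has no obvious associated-const spelling.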

Anyway, I’ll have to give the expansion of the macro more thought. Thanks for bringing that up.


This is UB: accessing a field asserts that the old and new pointer (computing the offset) are in-bounds of the same allocation. Your pointer is not in-bounds of any allocation.

There are some tricks to avoid Deref, like here and here.


asm! is not a macro; it is a macro-like syntactic construct with its own AST node.

Re: offset_of!, I strongly believe that handing out actual offsets is a Bad Idea; I think what we want is exactly T::*U (ptr-to-member) and the accompanying operator ->* from C++, though obviously with different syntax. Maybe a one-way usize cast might be ok, but I suspect that most uses of offsets never need to witness the internal value, whatever that might be.


That was even a keyword until recently… https://github.com/rust-lang/rfcs/pull/2421


Yes, asm! is magic rather than a macro and IMO that’s one of the many problems preventing its stabilization.

I appreciate that this may offer some additional type safety for many use cases, but pointers-to-member are also a significantly larger feature that’s significantly more difficult to design, so strategically I do not think it’s a good trade-off to make offsetof dependent on it, even assuming pointers-to-members will ever be added to Rust (which seems quite uncertain). The ability to get the offset of a field at all is fundamental to a bunch of systems software, and people are already missing it in practice and badly emulating it. We should get a good solution into their hands quickly, rather than escalate to a more perfect solution.

Plus, some or all of the type safety can also be achieved in library code (struct Offset<Base, Field>(usize, PhantomData<Base>, PhantomData<Field>); with an unsafe constructor wrapped by a safe macro and safe functions &Base -> &Field, &mut Base -> &mut Field, etc.)
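A minimal sketch of that library-level wrapper follows. It uses a single `PhantomData<(Base, Field)>` instead of two markers and leans on `core::mem::offset_of!` (stable since Rust 1.77, so well after this thread) inside the safe macro; `with_witness`, `project`, `project_mut`, `typed_offset`, and `Header` are hypothetical names.

```rust
use core::marker::PhantomData;

/// Typed byte offset of a `Field` inside a `Base`.
pub struct Offset<Base, Field> {
    bytes: usize,
    _marker: PhantomData<(Base, Field)>,
}

impl<Base, Field> Offset<Base, Field> {
    /// # Safety
    /// `bytes` must be the offset of a field of type `Field` within `Base`.
    pub unsafe fn new(bytes: usize) -> Self {
        Offset { bytes, _marker: PhantomData }
    }

    /// Same contract as `new`; the unused fn pointer only drives type
    /// inference so that `Field` matches the field's declared type.
    pub unsafe fn with_witness(_: fn(*const Base) -> *const Field, bytes: usize) -> Self {
        unsafe { Self::new(bytes) }
    }

    /// Safe projection `&Base -> &Field`.
    pub fn project<'a>(&self, base: &'a Base) -> &'a Field {
        let base_ptr = (base as *const Base).cast::<u8>();
        // Sound only because the constructor promised a real field offset.
        unsafe { &*base_ptr.add(self.bytes).cast::<Field>() }
    }

    /// Safe projection `&mut Base -> &mut Field`.
    pub fn project_mut<'a>(&self, base: &'a mut Base) -> &'a mut Field {
        let base_ptr = (base as *mut Base).cast::<u8>();
        unsafe { &mut *base_ptr.add(self.bytes).cast::<Field>() }
    }
}

/// Safe macro wrapping the unsafe constructor.
macro_rules! typed_offset {
    ($base:ty, $field:ident) => {{
        // The non-capturing closure pins `Field` to the field's real type.
        let witness: fn(*const $base) -> *const _ =
            |p: *const $base| unsafe { ::core::ptr::addr_of!((*p).$field) };
        unsafe { Offset::with_witness(witness, ::core::mem::offset_of!($base, $field)) }
    }};
}

/// Hypothetical example type.
#[repr(C)]
pub struct Header {
    pub tag: u8,
    pub len: u32,
}
```

Here `typed_offset!(Header, len)` has type `Offset<Header, u32>`, so projecting through it with the wrong base or field type is rejected at compile time while the raw `usize` stays private.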


Hmm, I must have overestimated the guarantees of your RFC (2582). I assumed that &(*null).$field as *const _ would not be seen as a field access (as far as UB is concerned), and instead would be seen as a single atomic expression computing a pointer. If that’s not the case, then it would have to use MaybeUninit and pointer subtraction (which would be necessary anyways to support unsized fields).

Yes, but that syntax isn’t compatible with tuples or unions. I could drop tuple and union support from this RFC, but I was hoping to find a way to include them. Additionally, they create a reference to uninitialized memory (even with applying RFC 2582).


How objectionable would it be if offset_of! was also a macro-like syntactic construct with its own AST node?

Ultimately I can’t think of a good way to implement offset_of! that doesn’t rely on something at least as hacky. Here are all the ways I/others have mentioned here (the code in each bullet point is meant to be the body of the macro, with $Struct being the type and $field being the field):

  • The following doesn’t prevent you from going through auto-Deref:
    let uninit = std::mem::MaybeUninit::<$Struct>::uninitialized();
    let field = unsafe { &(*uninit.as_ptr()).$field as *const _ };
    let offset = (field as *const _ as usize).wrapping_sub(&uninit as *const _ as usize);
    This requires inventing some new syntax or mechanism to stop auto-Deref.
  • The following prevents auto-Deref, but it’s not compatible with tuples or unions, and it cannot support field.sub_field offsets:
    let uninit = std::mem::MaybeUninit::<$Struct>::uninitialized();
    let &$Struct { $field: ref field, .. } = unsafe { &*uninit.as_ptr() };
    let offset = (field as *const _ as usize).wrapping_sub(&uninit as *const _ as usize);
    It also creates a reference to uninitialized memory, which is one of the things I’m trying to avoid since it’s still an open question whether it’s well-defined behavior to do so.
  • Using an intrinsic that takes the field parameter as a string could work:
    let offset = internal_offset_of_intrinsic::<$Struct>(stringify!($field));
    (We could also split the field parameter by subfield if we want to preserve span information; e.g., field.subfield would become ("field", "subfield"), with all of the pieces passed to the intrinsic.) The intrinsic would be #[doc(hidden)] so users don’t call it directly (with a note that doing so is UB).
  • Using an auto-generated per-type trait (or type), as I previously mentioned. This seems like a lot of work, and I don’t really like it.
  • Make offset_of! a macro-like syntactic construct with its own AST node. This feels closest to the intrinsic idea, but without the hack of stringifying the fields.
  • Make a new keyword or syntax for the offset (i.e. revive the offsetof keyword that was killed). This might also need a new AST node (or not). I don’t really want to introduce new user-level syntax for this feature. Rust has had a lot of syntax churn over the past year, and I think that’s been reflected in some of the Rust 2019 posts that advocate slowing down (in addition to other factors).

While I don’t like the idea of having a macro that’s not really a macro, it’s starting to look appealing…


It is a single atomic expression computing a pointer. But it uses getelementptr inbounds for this computation, meaning the computation is UB if it is not within the bounds of the same object. That’s just how computing a pointer for field access works in LLVM. It helps a lot with alias analysis.


Extremely. As @rkruppe mentioned, this is a huge obstacle for inline and global assembly. What you want is for core::offset_of! to be a compiler-evaluated macro, like the file!, line!, and column! macros. These are declared in libcore/macros.rs, but their bodies are ignored by the compiler. It is unclear whether this can be evaluated as a proc macro.

@rkruppe has a point that a library (hopefully, libcore) can abstract over numerical offsets, which is Not Wrong (though I’ll mention that it means things like Offset<(u32, u32), u32> can’t be byte-sized, which might not be a huge loss in the end).


We wouldn’t have this problem if the syntax were offset_of!(Foo, .FIELD). I’m not necessarily endorsing it, but it’s possible.

To offer a point of reference, in GCC and Clang offsetof(type, field) expands to… __builtin_offsetof(type, field), where __builtin_offsetof is a keyword that user code can invoke directly if it really wants. And it’s done (I believe) for pretty much the same reason that @RalfJung mentioned; i.e. the ‘traditional’ implementation of ((size_t)&((type *)0)->field) being UB.

I don’t think we should feel guilty about implementing this feature by adding yet another intrinsic to the language. And I think macro syntax fits it quite well, actually; its use cases are rare enough that it may be not worth the churn of reserving a keyword.

(I’m not distinguishing ‘macro-like syntactic constructs’ from ‘compiler-evaluated macros’ here. As far as user code is concerned, it’s a distinction without a difference.)


There is a small difference, in how it interacts with name resolution (as mentioned in the await thread).

That said, we can make a real compiler-evaluated-macro that evaluates to a surface-syntax-unexposed construct, eliminating that issue.


By the way, the reason offsetof is no longer a keyword, from RFC 2421:

If we are not using a keyword for anything in the language, and we are sure that we have no intention of using the keyword in the future, then it is permissible to unreserve a keyword and it is motivated.


Rationale for sizeof, alignof, and offsetof

We already have std::mem::size_of and similar which are const fns or can be. In the case of offsetof, we would instead use a macro offset_of!.

In other words, there’s already an accepted RFC saying that offset_of! would be a better choice: deciding we need a keyword now would basically be ‘changing our mind’. Not that that’s necessarily bad. :man_shrugging:


Note that the keyword unreservation was done under the impression that it can be a completely ordinary macro, rather than a compiler built-in with macro-like syntax. As @mjbshaw pointed out at the start, the implementation assumed there is incorrect if one also wants to be able to get offsets of fields in tuples and unions.


I don’t recall actually having such an impression when writing that RFC; I know eddyb noted their dislike for macros expanding to built-ins. Personally I don’t have any problem with that in this case.


Fair enough, but at least two people in that discussion did have that concern (me and them) and only didn’t raise it further because an apparently-sufficient library solution was found – and quite a few other people cared enough about that to debate the merits of that implementation. I think it’s clear that RFC PR would have gone very differently if that implementation hadn’t been posted or if we had realized it’s insufficient.