- Feature Name: offset_of
- Start Date: 2019-01-13
- RFC PR:
- Rust Issue:
Summary
Add a new macro named offset_of to core::mem that computes the offset of a type’s field, similar to C’s offsetof.
Motivation
The offset of a field is a regular need in FFI programming. Some examples include:
- Computing the offset of an Objective-C ivar.
- Setting the offset of a vector in a vertex buffer (e.g., SceneKit).
- Interacting with an argument parser to store parsed arguments (e.g., FFmpeg’s AVOption).
- Initializing an individual field of an uninitialized object.
It’s an open question whether merely creating a reference to uninitialized memory is undefined behavior. Several crates (like memoffset) define their own offset_of macro that relies on references to uninitialized objects. While this question remains open, it is unknown whether these crates exhibit well-defined behavior. Providing an offset_of macro in Rust’s core library would provide a “blessed” way to compute the offset of a field that users could rely on both now and in the future. If it’s ever decided that a reference to uninitialized memory is undefined behavior, the core library’s offset_of macro will still have well-defined behavior (unlike all user-level implementations).
Providing an offset_of macro in the core library would assist FFI developers in writing correct code.
Guide-level explanation
The offset_of macro takes a type and a field name and expands to a constant expression of type usize: the offset, in bytes, of that field from the start of the type. Some examples:
// You can use a regular struct with fields:
struct Struct {
foo: String,
bar: Vec<u32>,
}
static STRUCT_FOO_OFFSET: usize = offset_of!(Struct, foo);
static STRUCT_BAR_OFFSET: usize = offset_of!(Struct, bar);
// You can use a tuple struct:
struct TupleStruct(u8, u16, u32, u64);
const TUPLE_STRUCT_U8_OFFSET: usize = offset_of!(TupleStruct, 0);
const TUPLE_STRUCT_U16_OFFSET: usize = offset_of!(TupleStruct, 1);
const TUPLE_STRUCT_U32_OFFSET: usize = offset_of!(TupleStruct, 2);
const TUPLE_STRUCT_U64_OFFSET: usize = offset_of!(TupleStruct, 3);
// You can use a tuple:
const TUPLE_CHAR_OFFSET: usize = offset_of!((char, bool), 0);
const TUPLE_BOOL_OFFSET: usize = offset_of!((char, bool), 1);
// You can use a union (but all offsets will be zero):
union Union {
foo: f32,
bar: f64,
}
static UNION_FOO_OFFSET: usize = offset_of!(Union, foo);
static UNION_BAR_OFFSET: usize = offset_of!(Union, bar);
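The examples above leave the actual offset values to the compiler, because the default Rust layout is unspecified. As a minimal sketch of what concrete values look like, the following uses a hypothetical Vertex type (not part of this RFC) marked #[repr(C)] so that the field order is fixed. It assumes a toolchain on which core::mem::offset_of! is available in the form proposed here.

```rust
use core::mem::offset_of;

// Hypothetical vertex layout, used only for illustration.
// #[repr(C)] fixes the field order, so the offsets below are predictable.
#[repr(C)]
struct Vertex {
    position: [f32; 3], // bytes 0..12
    normal: [f32; 3],   // bytes 12..24
    uv: [f32; 2],       // bytes 24..32
}

// The macro expands to a constant expression, so it can initialize consts.
const POSITION_OFFSET: usize = offset_of!(Vertex, position);
const NORMAL_OFFSET: usize = offset_of!(Vertex, normal);
const UV_OFFSET: usize = offset_of!(Vertex, uv);

// It can also appear where a constant is required, such as an array length.
static BYTES_BEFORE_UV: [u8; UV_OFFSET] = [0; UV_OFFSET];
```

Values like these are what a vertex-buffer API expects when it asks for the byte offset of each attribute within a vertex.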
An enum cannot be used with offset_of since enums do not have accessible fields.
The offset_of macro respects a type’s and field’s visibility:
mod inner_mod {
pub struct InnerStruct {
private_field: usize,
}
struct PrivateStruct {
pub field: usize,
}
}
// ERROR: field `private_field` of struct `inner_mod::InnerStruct` is private
// const BAD_EXAMPLE_0: usize = offset_of!(inner_mod::InnerStruct, private_field);
// ERROR: struct `PrivateStruct` is private
// const BAD_EXAMPLE_1: usize = offset_of!(inner_mod::PrivateStruct, field);
You also cannot use offset_of to compute the offset of a field’s field (though a future RFC may alter that):
struct Inner {
inner_field: bool,
}
struct Struct {
inner: Inner,
}
// ERROR: expected one of `,` or `)`, found `.`
// const BAD_EXAMPLE_2: usize = offset_of!(Struct, inner.inner_field);
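Even without nested-field syntax, the offset of a field’s field can be obtained by adding two single-level offsets, since the inner struct is stored inline in the outer one. The sketch below uses hypothetical Inner and Outer types (with an extra field each, and #[repr(C)] so the numbers are predictable) and assumes core::mem::offset_of! is available on the toolchain.

```rust
use core::mem::offset_of;

// Hypothetical types for illustration; #[repr(C)] keeps the layout predictable.
#[repr(C)]
struct Inner {
    padding_word: u32, // bytes 0..4 of Inner
    inner_field: bool, // byte 4 of Inner
}

#[repr(C)]
struct Outer {
    header: u64,  // bytes 0..8 of Outer
    inner: Inner, // starts at byte 8 of Outer
}

// Offset of `inner.inner_field` from the start of Outer,
// composed from two single-level offsets.
const INNER_FIELD_OFFSET: usize =
    offset_of!(Outer, inner) + offset_of!(Inner, inner_field);
```

The sum is still a constant expression, so it can be used wherever a single offset_of! call could.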
Reference-level explanation
In core::mem:
macro_rules! offset_of {
($ty:ty, $field:ident $(,)?) => ({ /* compiler built-in */ });
}
The internal implementation of the offset_of macro is generally equivalent to @eddyb’s offset_of (notably, the built-in avoids going through Deref), but differs in that:
- The compiler built-in is guaranteed to be safe.
- The compiler built-in is const-eval safe.
- The compiler built-in supports tuples.
- The compiler built-in supports unions.
Drawbacks
This increases the surface area of the core library (albeit in a minor way).
Rationale and alternatives
As touched on in the Motivation section, this is a regularly needed tool for FFI developers. It is used commonly enough that its presence in the core library would, I think, be warranted. Additionally, the implementation in core could be “blessed” in ways the user-level implementations cannot.
User-level implementations of offset_of exist and are presently viable alternatives, but it is unclear whether they exhibit well-defined behavior. They also cannot (with a single macro) both avoid going through Deref and support all of structs, tuples, and unions (since pattern matching is slightly different between them).
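To make the trade-off concrete, here is a minimal sketch of what a user-level macro might look like. It is not the implementation of any crate named in this RFC: the macro name user_offset_of is hypothetical, and it assumes core::ptr::addr_of! is available so that no reference to the uninitialized value is ever created. It handles named and tuple fields with one rule, but it does not guard against Deref, which illustrates the limitation described above.

```rust
use core::mem::MaybeUninit;
use core::ptr::addr_of;

// Hypothetical user-level macro, for illustration only.
// It accepts named fields and tuple indices through a single `tt` fragment,
// but it does NOT stop `(*base).$field` from resolving through Deref.
macro_rules! user_offset_of {
    ($ty:ty, $field:tt) => {{
        // Uninitialized storage that is never read.
        let storage = MaybeUninit::<$ty>::uninit();
        let base: *const $ty = storage.as_ptr();
        // addr_of! computes the field address without creating a reference.
        #[allow(unused_unsafe)]
        let field = unsafe { addr_of!((*base).$field) };
        (field as usize) - (base as usize)
    }};
}

// Hypothetical types with a fixed layout so the results are predictable.
#[repr(C)]
struct Pair {
    small: u8,
    large: u32,
}

#[repr(C)]
struct TuplePair(u16, u64);
```

Unlike the proposed built-in, this expands to a block with local variables and pointer-to-integer casts, so it is not usable in const items.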
The syntax of offset_of is debatable (e.g., offset_of!(Type.field)), but I recommend we follow the historical form of C’s offsetof since Rust FFI developers are likely familiar with it and there aren’t any huge advantages to alternative syntax forms.
The naming is also debatable (e.g., offset_of vs offsetof). Again, I recommend playing off of C’s offsetof (so I will eschew anything crazy like byte_position_of), but separating the words with an underscore feels more idiomatic for Rust (given align_of, size_of, etc.). Using a name similar to offsetof makes searching the internet for the term slightly easier, and using a name with an underscore gives Rust a slight differentiation from C in search results.
Prior art
- C’s offsetof.
- @eddyb’s offset_of.
- memoffset crate.
- intrusive_collections crate.
- field_offset crate.
- Discussion of offset_of and MaybeUninit.
- My own crate which implements its own offset_of macro, but I won’t link to it because it intentionally invokes undefined behavior and I don’t want @RalfJung to close my loophole until I can implement this in Rust’s core library.
Unresolved questions
- Should offset_of work with arrays (e.g., offset_of!([u8; 5], [3]))?
- Should offset_of work with a field’s field (e.g., offset_of!(Struct, inner.inner_field))?
I’m inclined to say no to these right now. A future RFC could always expand offset_of to support these (which should be backwards compatible).
Future possibilities
The offset_of macro will likely be used in FFI-related code that also uses MaybeUninit in order to compute offsets to fields for initialization. Until the reference-to-uninitialized-memory issue is sorted out (and depending on the conclusion of that issue), offset_of may be the only safe way to initialize individual fields in an uninitialized object. Thus, depending on future discussions, offset_of might be a prerequisite to writing correct Rust in certain FFI-related code.
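As a rough sketch of that pattern, the function below initializes an object field by field through raw pointers derived from offsets, without ever forming a reference to uninitialized memory. The Header type and init_header function are hypothetical, and the code assumes core::mem::offset_of! is available on the toolchain.

```rust
use core::mem::{offset_of, MaybeUninit};

// Hypothetical FFI-style record, used only for illustration.
#[repr(C)]
struct Header {
    tag: u8,
    length: u32,
}

// Builds a Header by writing each field at its computed offset.
fn init_header(tag: u8, length: u32) -> Header {
    let mut slot = MaybeUninit::<Header>::uninit();
    // Byte pointer to the start of the (still uninitialized) object.
    let base = slot.as_mut_ptr().cast::<u8>();
    unsafe {
        // Each write goes through a raw pointer; no `&mut` to
        // uninitialized memory is created at any point.
        base.add(offset_of!(Header, tag)).cast::<u8>().write(tag);
        base.add(offset_of!(Header, length)).cast::<u32>().write(length);
        // Every field has been written, so the value is fully initialized.
        slot.assume_init()
    }
}
```

Calling init_header(7, 1024) yields a Header whose tag is 7 and whose length is 1024. The field pointers are properly aligned because each offset respects the field’s alignment relative to the aligned base.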