Byte-wise fat pointer arithmetic

I have a potentially unsized type Record defined as follows:

#[repr(C)]
struct Record<H, T: ?Sized> {
    header: H,
    data: T
}

These are allocated (boxed) using concrete (i.e. sized) types, but only pointers to the data field are ever handed out to users. By the time a record needs to be de-allocated, the handed out pointer may or may not have lost it's concrete type information, i.e. *mut T may be a fat pointer.

Given that Record is #[repr(C)] and has only two members, it is fairly ease to calculate the offset from the (unsized) data field to the header aka the top of the struct:

impl<H, T: ?sized> Record<H, T> {
    unsafe fn header_from_data(data: *mut T) -> *mut H {
        // align_of_val should really just accept a pointer but alas...
        let data_align = mem::align_of_val(&*data);
        // if data is a fat pointer, the cast to u8 turns it into a thin one
        (data as *mut u8).sub(Self::data_offset(data_align)).cast()
    }

    const fn data_offset(data_align: usize) -> usize {
        // matches the layout algorithm used by `rustc` for C-like structs
        let offset = mem::size_of::<H>();
        offset + offset.wrapping_neg() % data_align
    }
}

However, in order to de-allocate a Record again, it would be necessary to cast the *mut H into a *mut Record<T, H>, but this is not possible, since the latter is potentially a fat pointer. However, the required metadata (or vtable pointer) is already present in the (maybe fat) *mut T pointer. Unfortunately, all pointer arithmetic operations defined on pointer types like add/sub or offset only work with count differentials instead of bytes, and the usual approach of casting to a pointer to a type with size 1 like u8 isn't applicable here, since this cast would discard the vtable and make the pointer thin again. Interestingly, I can't even use the unstable TraitObject type here, since the passed *mut T is not necessarily a fat pointer.

I ended up writing the following function:

unsafe fn record_from_data(data: *mut T) -> *mut Self {
    // pointer is cast to a pointer to an unsized `Record<H, dyn ..>`, but
    // in fact still points at the record's `data` field
    let mut ptr = data as *mut Self;
    // header is the "correct" thin pointer, since the header field is at
    // offset 0
    let header = Self::header_from_data(data) as *mut ();
    {
        // create a pointer to the fat pointer's "data" half
        // this is the dangerous part, since the layout of fat pointers to
        // trait objects is not actually guaranteed in any way, but is
        // currently implemented this way
        let data_ptr = &mut ptr as *mut *mut Self as *mut *mut ();
        *data_ptr = header;
    }

    ptr
}

Since the layout of fat pointers is technically not specified and may change, this solution is of course potentially brittle and perhaps even UB, although I am not sure, since it is in fact how it is currently (internally) implemented and I somehow don't think the pointer part of a fat pointer will ever not be the first field of a fat pointer.

I am fairly sure there is nothing in the language currently that would solve my problem in a way that doesn't make assumptions about the layout of fat pointers, so I was wondering how the language or standard library could be amended to solve this issue. One solution would be to add functions that allow byte-wise pointer arithmetic on pointers to types of arbitrary sizes. Another could be function like this:

impl<T: ?Sized> *mut T {
    pub fn as_thin_ptr(&mut self) -> &mut *mut () { /* ... */}
}

Although is probably too specialized with not enough general use to warrant adoption into the standard library.

4 Likes

That is an embarrassingly clever way to handle this.

INTERNALLY to the standard library, where it is allowed to use knowledge of the compiler's laying out of fat pointers, an identical problem is solved for Arc/Rc.

It's done the same way, roughly:

unsafe fn set_data_ptr<T: ?Sized, U>(mut ptr: *mut T, data: *mut U) -> *mut T {
    ptr::write(&mut ptr as *mut _ as *mut *mut u8, data as *mut u8);
    ptr
}

(Reminder: you as a library author aren't permitted to do this.) <*mut T>::as_thin_ptr would allow doing the exact same thing, but without any reliance on compiler-internal details:

fn set_data_ptr<T: ?Sized, U>(mut ptr: *mut T, data: *mut U) -> *mut T {
    *ptr.as_thin_ptr() = data as *mut _;
    ptr
}

Whether the meaning of such is clear is left to the reader to decide.

.... But, I think adding offsetting methods that work bytewise rather than elementwise is probably the "better" std addition here. It allows saying what you mean, rather than how you accomplish it. Most of the uses of set_data_ptr are of the form

set_data_ptr(ptr as *mut Rc<T>, (ptr as *mut u8).offset(offset)

anyway, so this is just a roundabout way of getting [wrapping_]offset_bytes.

1 Like

I am afraid adding byte-wise methods for pointer arithmetic to the standard library will be too controversial, given that offset, offset_from, wrapping_offset, wrapping_offset_from, add, wrapping_add, sub, wrapping_sub (times two for both *const T and *mut T) would all require byte-wise counterparts if this were to be consistent. Since casting to a u8 and using the already existing method achieves the desired purpose for the vast majority of cases it also seems like overkill.

I think proposing to add a single function like as_thin_ptr or [get|set]_data_ptr which covers this specific use-case (i.e. byte-wise pointer arithmetic for fat pointers) has a higher chance of succeeding. I am not too familiar with the accepted process of contributing to the standard library, so is this something that warrants writing an RFC or would a plain PR be considered sufficient?

As I understand it, you're basically trying to implement a container_of-style operation here. So why not have that as a primitive instead?

What kind of primitive do you have in mind? This kind of operation requires pointer arithmetic, which as I have laid out, does not work for fat pointers.

The operation is: given a structure type T, member m of type U and a pointer p to U, compute a pointer q to T such that the pointer to (*q).m is p. This operation is known as container_of in the Linux kernel and implemented basically as

#define container_of(T, m, p) \
    ((T *)((char *)(p) - offsetof(T, m)))

It seems that's pretty much what you're trying to accomplish above. If this operation were exposed as a primitive in its own right, I think it may be a little bit safer to use than fully general pointer arithmetic. And given that it has an existing use case, it may be even easier to argue for than your proposal.

2 Likes

What do you mean by "primitive"? Certainly this doesn't need to be a language feature given that it's demonstrated here that it's easily solvable in the library…

Well, it isn’t, actually. As @CAD97 said, the standard library can get away with this only because it is allowed to exploit internal implementation details of the compiler and is updated in tandem when the latter change. For all other code, this is UB. (For comparison: on i386, TCC implements va_start using pointer arithmetic beyond the last parameter, just like you would expect from a naïve, non-optimizing compiler. But per the C standard, it’s still UB to do the same in your own code.)

I’d imagine having container_of! as a construct with the surface syntax of a macro; whether the language implementation provides it as a opaque built-in or some otherwise-UB forbidden black magic trickery is an implementation detail that user code should not concern itself with.

2 Likes

Such a container_of! macro could probably be just as well be implemented by some 3rd party library, which is generally the preferred way from what I can tell. This would, however, first require the &raw operator to become stable, which is what currently blocks the offset_of! macro in the memoffset crate to be implemented without UB. But the standard library would also have to provide some means to do pointer arithmetic on fat pointers, or otherwise any implementation would be restricted to sized types only. I think it makes sense if the library would only provide the most basic building blocks for such a rather extraordinary use case.

I am working on a PR that would add the following functions to *mut T and *const T:

impl<T: ?Sized> *mut T {
    pub fn set_ptr<F: FnOnce(*mut ()) -> *mut())>(mut self, f: F) -> Self {
        let thin = ptr as *mut ();
        let ptr = &mut ptr as *mut *mut T as *mut *mut ();
        unsafe { *(&mut *ptr) = f(thin) };
        ptr
    }
}

I am not entirely sure whether the closure should take and return a *mut () or a *mut u8. The former is less opinionated, but the latter is likely more fitting for the majority of use-cases (which are probably fairly few themselves) for this function.

I think, same as offset_of by the way, introducing such operation as a very specialized primitive has very high costs attached to it. The pointer operation is 'just' a stabilization of a very small part of a core library feature (namely: there exists some part of pointers that is bit-compatible with a pointer-sized field, the FnOnce approach exposes even less). In comparison, the operation as a primitive goes through MIR, interacts with type layout, with field privacy, paths, needs magic to be an unsafe-to-use macro, has unknown consequences with future possibilities for unsized types, and doesn't generalize in any way outside that use case. I don't think introducing such a primitives to make some code slightly safer to use is worth the trade-off, when safe abstractions around other unsafe library features might yield very similar improvements.

What counts as a ‘basic building block’ is relative. container_of is not particularly esoteric: it’s a ‘basic building block’ of OOP idioms in the Linux kernel. I believe it is also sufficient to implement (an equivalent of) downcasting with multiple inheritance in C++. What you call a ‘basic building block’, I may instead call a ‘private implementation detail’: imagine an alternative implementation of Rust that inserts extra padding between structure fields in order to catch buffer overflow errors. Unsafe code written in terms of container_of and offset_of could be used unchanged with such an implementation. Unsafe code that manually performs arithmetic on sizes of fields would have to be re-written. (You can imagine I also compile C code in this mode, so #[repr(C)] will not save you.)

I am also puzzled as to why you are so dismissive of your own motivating use case: what you are doing would be easily accomplished by container_of on data to get a pointer to Record<T, H> and immediately taking the address of the header field.

Yes. But if the compiler does not provide this operation, it is unsafe user code that is going to have to make assumptions about ABI and do the work of reasoning about all those things to ensure what it does is not UB. All that complexity is still going to be carried by something, and the compiler is in a much better position to do it.

With manual pointer arithmetic, user code has to assume a particular ABI for the type, and pray the compiler doesn’t change ever it. With container_of as primitive, the only thing user code has to worry about is whether the pointer indeed points to a field in this struct in the first place.

Sure, leaving out offset_of and container_of may allow you to do less work now, but it is going to constrain you in the long term.

2 Likes

If you are writing an OS kernel, a low-level memory allocator or a wayland compositor, then you would be correct, a primitive like container_of! is not esoteric at all, but for the vast majority of use-cases it definitely is. In fact, I'm having trouble coming with any other use case, at all.

I do agree that primitives like container_of and offset_of should be used in such a scenario, I just don't think these have to be provided by the core or standard library, given that they can also be implemented as regular libraries (with a few yet outstanding stabilizations).

In fact, offset_of would likely be implemented by creating an MaybeUninit instance of the desired type, taking a pointer to the desired field using &raw and calculating the difference between that pointer and the top of the struct. I don't think this would at all be affected by any fundamental changes to how Rust layouts structs. For unsized types, things would likely have to work differently and I suppose you could only make such calculations at runtime.

Overall, I believe that providing the necessary low level operations which absolutely can not be implemented outside the core/standard library and leaving the rest to user crates is more in tune with the philosophy of the Rust standard library and hence more likely to be accepted and stabilized in a finite timeframe.

1 Like

I did not say there shouldn't be such a macro or method in core. Only that it shouldn't be a language primitive. Users reading the code should be able to parse it, and understand how it works and upholds Rust's guarantee, not have it hidden behind an opaque compiler binary and without a chance to learn anything about Rust's object model.

And to be perfectly honest, the fact that you mention this in the same paragraph as 'alternative compiler' and Linux kernel makes me somewhat worried. This is more or less what, to my understanding, Rust is trying to avoid with separating the language, core and std all from each other. While having a Rust kernel module would be awesome I really never want to see a day where I must use a GNUrustc to compile it.

You need a container-of operation for intrusive linked lists, which have all sorts of uses, even if they're most popular in kernels.

But I agree that container-of doesn't make sense as a primitive. Certainly not for thin pointers, where it can be built on top of offset-of, something which is already in heavy demand, not just for this but for other use cases as well. So if anything, offset-of would make a better primitive; but as you said, we don't even need that, since offset-of can be built on top of the upcoming raw pointer syntax.

In theory container-of could be a primitive that's only necessary for fat pointers. But the only reason to do so would be if it left us more room to change fat pointer behavior in the future. Does it? No – at least not compared to a byte-offset method. A byte-offset method does leave a bit more room compared to as_thin_ptr, and I think it would be preferable for that reason.

On the other hand, what I'd like even more would be a more generic API to split an arbitrary pointer into a thin pointer plus 'something else', which could then be recombined. Basically what was proposed in this RFC from 2018, which is technically still open. But that would also be a lot more work, due to the interaction with the type system.

Then that is exactly why you should support having offset_of in the language. So that user code can be written at the appropriate level of abstraction, without being tied to the ABI idiosyncrasies of a particular compiler for a particular platform.

That implementation would be terribly inefficient, because creating a MaybeUninit requires you to allocate memory. Which in this case you wouldn’t actually use, so either you accept the unnecessary cost or pray that the optimiser elides the allocation. But fine, at least it would be correct. In fact, me quote myself on what I really meant by ‘primitive’, because some seem to have missed it:

It may be a black-box built-in, it may be a dangerous forbidden technique, or it may be inefficient pointer arithmetic on a dummy struct: it doesn’t actually matter. Understanding something doesn’t have to mean translating it into assembly in your head every time. It’s kind of impossible already anyway.

But even if it’s technically possible to do in defined-behaviour user code, saying we don’t need offset_of if we have pointer arithmetic is like saying we don’t need while loops because we already have if and goto. You may say Rust has no goto, so this analogy is not particularly fitting – but not having goto is actually very much the point. By providing ‘higher-level’ constructs that are equally expressive for the common case, the language can get away with not providing the ‘lower-level’ construct at all, and even having features that are incompatible with it.

1 Like

It will certainly be elided in release mode. But it should also be possible to const-evaluate it and thereby guarantee nothing happens at runtime even in debug mode. (memoffset already has experimental support for const-evaluating, though it requires some feature flags and I don't know whether it currently uses const evaluation if you write it in a non-const context.)

Also, I think &raw ought to allow simply using it on a null pointer, which avoids the need for a dummy allocation. (In other words, it should just add the offset without asserting anything about the pointer's validity.) Miri currently doesn't allow that, but AFAIK the exact semantics for &raw are not finalized and it might end up being allowed.

Rust is not going to not provide &raw. At most, offset_of could exist as a built-in feature alongside &raw. It probably should eventually, for the sake of better error messages and type checking.

3 Likes

Thinking this through further, I've come to the realization that offset_of and container_of need to be fundamentally different operations when unsized types are involved. For sized types, they can be easily evaluated at compile time, this is not possible for unsized types where it is necessary to retrieve size and alignment information from a concrete fat pointer, at runtime. So there would be need for different primitives like offset_of_unsized and container_of_unsized with different syntax for calling these primitives, although the latter could perhaps still be expressed through the same primitive/macro/whatever with different implementations for sized/unsized types.