Int2ptr and runtime provenance models

I'd like to split this discussion off from [Pre-RFC] usize is not size_t. The current assertion is that integers cannot carry provenance, because integer provenance is incompatible with certain optimizations. However, this prevents Rust from being implemented atop a runtime provenance model where provenance cannot be restored, as such a model requires preserving provenance in at least the integer type involved in the int2ptr cast (currently usize in Rust, though it may become a uptr or uintptr type) in order for the inverse operation to work correctly.

I do ask the following, though, of @RalfJung. Is there a program that is accepted (well-formed and has defined behaviour) in a world where Rust has provenance in integers, but that is not accepted under the current definition? If not, is there another reason that implementing the former as if it were the latter (the current definition) would be unacceptable? This would permit such models, as well as optimizers that have solved the substitution problem, to retain provenance, while allowing optimizers that are broken by it to function properly, merely by ignoring the fact that such provenance is preserved at the language level.

CC: @jrtc27.

Part of the argument is that this problem isn't solvable.

In a deliberately exaggerated example, consider that a program takes a pointer, converts it to an integer, and then displays it to the user somehow. The user then types in the address, the program reads that, parses the number, and converts it back to pointer.

Is the resulting pointer even potentially valid? No matter how much work you put in to maintain provenance through arbitrary arithmetic transformation and piecewise load/stores, it's certainly lost when it round-trips outside the machine into the physical analog world.
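
To make the exaggerated example concrete, here is a minimal sketch of that round trip. The helper names `address_to_text` and `text_to_pointer` are made up for illustration. The sketch deliberately stops short of dereferencing the rebuilt pointer, since whether that is allowed is exactly the open question.

```rust
/// Render a pointer's address as text, as if displaying it to the user.
/// (Hypothetical helper, not from the thread.)
fn address_to_text<T>(p: *const T) -> String {
    format!("{:#x}", p as usize)
}

/// Parse an address typed back in by the user and cast it to a pointer.
/// Returns None if the text is not a hexadecimal address with a 0x prefix.
fn text_to_pointer<T>(text: &str) -> Option<*const T> {
    let digits = text.trim().strip_prefix("0x")?;
    let addr = usize::from_str_radix(digits, 16).ok()?;
    Some(addr as *const T)
}

fn main() {
    let x: i32 = 42;
    let original = &x as *const i32;

    // ptr2int, then out of the machine as text.
    let shown = address_to_text(original);
    println!("{}", shown);

    // Back in as text, then int2ptr.
    let rebuilt: *const i32 = text_to_pointer(&shown).expect("well-formed address");

    // The address survives the trip; what provenance `rebuilt` has is the question.
    assert_eq!(rebuilt as usize, original as usize);
}
```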

Moving towards more reasonable torture cases, consider storing every bit of a pointer scattered throughout memory in different objects. If you bitwise insert bits of your sneaky pointer into another pointer, is provenance tracked at bitwise precision?

As an actually somewhat reasonable example, consider an IPC scheme that uses addresses as tokens representing objects over the process border. Host process takes the address of an object in memory, turns it into an integer, and passes it to the satellite process as an opaque integral token. Later, the satellite process passes back a token, which the host process casts to a pointer. Is it a valid pointer to dereference, if it's the same value?
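
A rough sketch of that token pattern, with the satellite process replaced by a plain function that echoes the token back. All names here (`export_token`, `redeem_token`, `satellite_echo`) are hypothetical. In today's Rust implementations this runs as expected, but whether a given provenance model blesses the final cast is the point under discussion.

```rust
/// Host side: hand out the address of a heap object as an opaque token.
fn export_token(obj: Box<String>) -> u64 {
    // ptr2int: only the address leaves the host.
    Box::into_raw(obj) as usize as u64
}

/// Host side: turn a returned token back into ownership of the object.
///
/// Safety: `token` must come from `export_token` and must be redeemed
/// at most once.
unsafe fn redeem_token(token: u64) -> Box<String> {
    // int2ptr: the pointer is rebuilt from nothing but the integer value.
    unsafe { Box::from_raw(token as usize as *mut String) }
}

/// Stand-in for the satellite process, which treats the token as opaque.
fn satellite_echo(token: u64) -> u64 {
    token
}

fn main() {
    let token = export_token(Box::new(String::from("host object")));
    let returned = satellite_echo(token);
    let obj = unsafe { redeem_token(returned) };
    assert_eq!(*obj, "host object");
}
```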

These are things that people do forms of, and expect to work, in current C. Pointers are "just integers," after all, aren't they? (No, which is the discussion.)


As it stands today, forcing pointers to carry provenance strictly reduces the set of valid optimizations on a program. There is no practical benefit to the language user; fewer programs are valid, and the programs that are valid are slower. So for the time being, having a slightly ugly edge case of "released provenance" for pointers that go through ptr2int and int2ptr seems the most practical option.

In a systems-level language, you can't avoid ptr2int for two main reasons (both of which you can argue are cursed):

  • interfacing with C and/or OS APIs which mix the two, and
  • chunkwise copying of memory necessarily copies pointers as a chunk, rather than as a byte.
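
As a sketch of the second point, here is a generic byte-level copy that has no idea whether the bytes it moves belong to a pointer. `Record` and `copy_as_bytes` are illustrative names only. The copy uses `ptr::copy_nonoverlapping`, which is documented as an untyped copy, so the reference inside `Record` arrives usable on the other side.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Illustrative type that mixes plain data with a pointer.
#[derive(Clone, Copy)]
struct Record {
    id: u32,
    name: &'static str,
}

/// Copy any `Copy` value as a run of raw bytes, the way a memcpy-style
/// routine would, without looking at the types inside it.
fn copy_as_bytes<T: Copy>(src: &T) -> T {
    let mut dst = MaybeUninit::<T>::uninit();
    unsafe {
        ptr::copy_nonoverlapping(
            src as *const T as *const u8,
            dst.as_mut_ptr() as *mut u8,
            size_of::<T>(),
        );
        // Every byte of `src`, pointer bytes included, is now in `dst`.
        dst.assume_init()
    }
}

fn main() {
    let original = Record { id: 7, name: "seven" };
    let copy = copy_as_bytes(&original);
    assert_eq!(copy.id, 7);
    assert_eq!(copy.name, "seven");
}
```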

It's fine to consider what a language would look like without this escape hatch for pointer provenance, but I don't think any current system can truly manifest it. (Though maybe CHERI will force it to be manifested? Who knows! I'm excited to find out.)

As noted, there is no reason why implementations are forced to carry provenance if the language defines it that way. If they lose meaningful optimizations, then they can simply use the current definition, which I am arguing is a strict superset. The issue with guaranteeing the latter is that implementations cannot implement the former, which may be necessary if the runtime provenance model simply cannot handle the no-provenance requirement (perhaps deliberately, as is the case, as far as the arguments I see go, for CHERI).

fn main() {
    let a: i32 = 1; let b: i32 = 2;
    let a_plus_1 = (&a as *const i32).wrapping_add(1) as usize;
    if a_plus_1 == (&b as *const i32 as usize) {
        println!("{}", unsafe { *(a_plus_1 as *const i32) });
    }
}

Assume a happens to be right before b on the stack. If integers have provenance then this is UB; if they don't then it's well-defined.

So no, having provenance is not a strict superset.

I was saying that the current definition is a strict superset, and thus a valid implementation of the provenance definition.

This particular example is especially amusing; in the current provenance proposal for C, casting a_plus_1 back up to a pointer produces a value that has provenance of either a or b; the choice is up to the programmer, so long as they are consistent. In dereferencing it, the programmer has asserted it isn't a one-past-the-end pointer, and thus its provenance is fixed to b forevermore.

(Your example is useful, but this is nonetheless an important corner case within the corner case.)

I don't believe there is a "current" definition, beyond whatever the latest proposal @RalfJung has written down is. Rust just does the unsound "haha this kind of exists" all the other compilers do (mostly due to inheriting it from LLVM). I don't think there is a status quo to measure against.

Yeah, that's what I meant by "current definition". I should have clarified it.

Ah. In other words, seen end-to-end, you're saying that PNVI-ae-udi allows a superset of the programs that PVI does.

I think the only case where that might not be true has to do with sneakily converting a pointer to an integer using some form of type punning, without going through pointer-to-integer instructions. Per previous discussion, it's not clear whether this can be allowed or not without breaking optimizability, especially if you don't have type-based alias analysis to fall back on. Yet there is no reason to disallow it in PVI. I suppose you could take PVI and then artificially disallow such sneaky conversions, in an attempt to guarantee it doesn't allow any programs that are not allowed by PNVI.

However, there are reasons to want to allow programs that are allowed by PNVI but not PVI. For one, it's nice to be able to reassure users that an integer is just an integer and doesn't carry any spooky hidden state around with it.

In addition, contrary to your statement at the beginning of the thread, separating integers from pointers can make it easier to implement CHERI. That's because it allows clearly differentiating pointer operations, which can be guaranteed to preserve runtime provenance, from integer operations, which are more expansive and cannot in general have that guarantee.

For example, breaking a pointer into individual bytes and then reconstituting it is something that compile-time provenance models typically want to allow, in order to support naive implementations of memcpy. But under CHERI this is in general impossible. (If those bytes stay in registers, then it can work, because at least in Morello, both the lower and upper bounds of a capability are tracked separately from the value itself. But if the user sticks the byte in memory somewhere, then there's nowhere to put the bounds.)

So the rule would be: on CHERI, there is no uptr or uintptr. If you want an opaque value that can store either a pointer or integer, use a pointer type. Sure, that breaks compatibility with Rust and C programs that assume that usize or uintptr_t can be used for opaque values. But even in C, many of those programs also assume that pointers and/or uintptr_t can fit in uint64_t, and will break anyway. In Rust, essentially every existing program is broken because they all use usize to store pointers – unless we change usize to 128-bit, but that would have its own problems.

That said, Rust may be forced to allow some subset of ptr->int->ptr if C implementers do, and they may want to allow it because it will make it easier to port some C programs, and because C doesn't have wrapping_add so integers are the only way to do potentially-overflowing pointer arithmetic. So I guess we'll see?

Also, even in Rust, the use case of sticking extra data in pointer low/high bits is easier with integer types – though we could always add convenience methods to the standard library to do that with pointer types.
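
For illustration, low-bit tagging can be sketched with pointer operations only, using `wrapping_add` and `wrapping_sub` on a byte pointer so that the tagged value is always derived from the original pointer. The helpers `tag` and `untag` are hypothetical, and the sketch assumes the untagged pointer is aligned for `T`. The address is still read through a ptr2int cast, but no pointer is ever produced from an integer.

```rust
use std::mem::align_of;

/// Store `tag` in the unused low bits of an aligned pointer.
fn tag<T>(p: *mut T, tag: usize) -> *mut T {
    assert!(tag < align_of::<T>(), "tag must fit in the alignment bits");
    // Pointer arithmetic, not integer arithmetic.
    (p as *mut u8).wrapping_add(tag) as *mut T
}

/// Split a tagged pointer back into the original pointer and its tag.
fn untag<T>(p: *mut T) -> (*mut T, usize) {
    // Reading the address only inspects the pointer.
    let tag = (p as usize) & (align_of::<T>() - 1);
    ((p as *mut u8).wrapping_sub(tag) as *mut T, tag)
}

fn main() {
    let raw = Box::into_raw(Box::new(7u64));

    let tagged = tag(raw, 3);
    let (clean, bits) = untag(tagged);

    assert_eq!(bits, 3);
    assert_eq!(clean, raw);
    assert_eq!(unsafe { *clean }, 7);

    // Reclaim the allocation through the untagged pointer.
    unsafe { drop(Box::from_raw(clean)) };
}
```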

Wouldn't this be solved by the provenance being adjacent in memory to the pointer, or even stored as part of the bytes of the pointer? Stepping away from CHERI specifically, and towards, say, Miri's model, copying the bytes of a pointer copies the pointer, because the provenance is encoded directly in the bytes. And a model involving two hardware-level pointers, one to the data and one to the provenance, would work if they were adjacent in memory and both part of the pointer data.

Oh hey, [Pre-RFC] usize is not size_t.

Somehow this statement seems to ignore the cases where usize is already not 64 bits, such as RISC-V RV128, MSP430, etc. My understanding is that usize is nominally adequate to hold a CPU virtual address; nothing more. usize was never intended to also carry non-address information such as CHERI capabilities, since there is no inherent size constraint on such address-associated metadata.

So, how it works with CHERI is:

  • The provenance is stored as part of the bytes of the pointer (aka "capability"), making them a total of 128 bits.
  • But there's a secret 129th bit. Not only does every general-purpose register have this bit, used when a capability is stored in a register, RAM itself has one extra bit for every 16 bytes, used when a capability is stored in memory.
  • That bit is the "valid" bit, which determines whether the corresponding register, or the corresponding 16 bytes of RAM, contain a valid capability, as opposed to any other arbitrary data. Doing a load or store from/to a capability will fault unless its valid bit is set.
  • If you load or store a full 128-bit register, from/to a 16-byte aligned address, the valid bit is copied from source to destination. If a load or store is smaller than that, or unaligned, the destination valid bit is simply cleared. This is what ensures that programs cannot forge capabilities no matter what.
  • The valid bit cannot be set/cleared manually, except with a special instruction that only works in kernel mode.

Thus, copying a capability one byte at a time will preserve the pointer and provenance but clear the valid bit, making it unusable. Only copying it as a full 128-bit unit will keep it usable.
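
A toy software model of the rules as described in the list above may help. This is not how any real CHERI implementation is structured; `TaggedMemory` and its methods are invented for the sketch, and nothing stops a caller from passing `valid: true` for arbitrary bytes, which real hardware would not allow.

```rust
const CAP_SIZE: usize = 16;

/// Toy memory: plain bytes plus one hidden valid bit per 16-byte granule.
struct TaggedMemory {
    bytes: Vec<u8>,
    tags: Vec<bool>,
}

impl TaggedMemory {
    fn new(granules: usize) -> Self {
        TaggedMemory { bytes: vec![0; granules * CAP_SIZE], tags: vec![false; granules] }
    }

    fn load_byte(&self, addr: usize) -> u8 {
        self.bytes[addr]
    }

    /// A store smaller than a capability clears the granule's valid bit.
    fn store_byte(&mut self, addr: usize, value: u8) {
        self.bytes[addr] = value;
        self.tags[addr / CAP_SIZE] = false;
    }

    /// Full-width load: the valid bit comes along only if `addr` is aligned.
    fn load_cap(&self, addr: usize) -> ([u8; CAP_SIZE], bool) {
        let mut cap = [0u8; CAP_SIZE];
        cap.copy_from_slice(&self.bytes[addr..addr + CAP_SIZE]);
        (cap, addr % CAP_SIZE == 0 && self.tags[addr / CAP_SIZE])
    }

    /// Full-width store: aligned stores copy the valid bit, unaligned
    /// stores clear the bit of every granule they touch.
    fn store_cap(&mut self, addr: usize, cap: [u8; CAP_SIZE], valid: bool) {
        self.bytes[addr..addr + CAP_SIZE].copy_from_slice(&cap);
        if addr % CAP_SIZE == 0 {
            self.tags[addr / CAP_SIZE] = valid;
        } else {
            self.tags[addr / CAP_SIZE] = false;
            self.tags[(addr + CAP_SIZE - 1) / CAP_SIZE] = false;
        }
    }
}

/// Copy 16 bytes one at a time.
fn copy_bytewise(mem: &mut TaggedMemory, src: usize, dst: usize) {
    for i in 0..CAP_SIZE {
        let b = mem.load_byte(src + i);
        mem.store_byte(dst + i, b);
    }
}

/// Copy 16 bytes as one full-width unit.
fn copy_whole(mem: &mut TaggedMemory, src: usize, dst: usize) {
    let (cap, valid) = mem.load_cap(src);
    mem.store_cap(dst, cap, valid);
}

fn main() {
    let mut mem = TaggedMemory::new(4);
    let cap = [0xAB; CAP_SIZE];
    mem.store_cap(0, cap, true);

    // Bytewise copy: same bytes, but the valid bit is gone.
    copy_bytewise(&mut mem, 0, CAP_SIZE);
    assert_eq!(mem.load_cap(CAP_SIZE), (cap, false));

    // Full-width aligned copy: bytes and valid bit both survive.
    copy_whole(&mut mem, 0, 2 * CAP_SIZE);
    assert_eq!(mem.load_cap(2 * CAP_SIZE), (cap, true));
}
```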

In principle it's not that different from Miri, in the sense that user-accessible bytes are augmented with hidden provenance information. But unlike Miri it has to work with reasonable memory overhead, hence these limitations...

So storing a pointer using write_unaligned to unaligned memory or using the safe #[repr(packed)] is invalid? That is even stricter than forbidding int2ptr or bytewise memcpy.

Yep.

This isn't true. CHERI defines no such instruction; it is completely pure, with no escape hatches. Arm's Morello does include such a wart in the architecture, but it causes all kinds of issues: you can have multiple different encodings of the same tagged-valid capability, and tagged-valid capabilities with nonsense encodings, all because you can just set arbitrary bits on memory if you are suitably privileged and a higher privilege level hasn't disabled that functionality for you (for Arm that could be EL3, restricting it to firmware only). I don't think anyone wants it to stay. It's only there because of a concern that it might be needed to avoid expensive re-derivation of capabilities when swapping in from disk, but we have alternative instructions specifically for that in CHERI, with no such holes.

Yes, whilst we'd like to be able to support that, it's impossible without causing serious performance issues for memcpy. Currently memcpy works because it can blindly copy capability-sized-and-aligned words at a time, without needing to know if there's a valid capability there or not, but if they're unaligned then you don't know where the valid capabilities are (unless you just add a memcpy instruction to your architecture, of course, then you can ignore the problem...). Not to mention that, for security, there's a strict requirement that tags be atomic with the full capability they protect, which would cause serious headaches in microarchitecture for unaligned capabilities that cross cache line boundaries.

Fortunately packed is not standard C, only GNU and Microsoft extensions (and even creating an unaligned pointer is strictly undefined in C). Also, it's rare that packed structures contain real language-level pointers, since normally they're used for some kind of serialised message in a communication protocol, where it makes no sense to put a pointer (and, if people do, it can cause security issues in the case of a malicious other party), at least in our experience (though I do know of some nasty exceptions that need dealing with on a case-by-case basis). I make no claims about Rust though.

This restriction prevents some Rust code that doesn't even use unsafe from working, such as:

#[repr(packed)]
struct PackedRef {
    a: u8,
    b: &'static u8,
}

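fn main() {
    let a = Box::new(PackedRef {
        a: 0,
        b: &0, // promoted to a 'static constant, so no unsafe is needed
    });
    // Moving out of the first Box copies the unaligned reference field
    // into a second heap allocation.
    let b = Box::new(*a);
}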

as that would require copying an unaligned reference from one heap allocation to another. Ignoring the #[repr(packed)] wouldn't work as that is observable using an offset_of!() macro.

Maybe it is a contrived example, but it means that CHERI can't completely implement the Rust abstract machine, independent of the provenance model eventually chosen for Rust. #[repr(packed)] is part of the Rust specification.

No need to reach for offset_of (especially since the relative order of fields in memory is, if I remember correctly, still undefined with #[repr(packed)] alone); you might as well just point to mem::size_of::<PackedRef>() or mem::align_of::<PackedRef>().
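
A quick sketch of how that is observable: the packed struct reports alignment 1 and no padding, while the same fields without the attribute do not. `PlainRef` is an extra type added here purely for comparison.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(packed)]
struct PackedRef {
    a: u8,
    b: &'static u8,
}

/// Same fields without the attribute, for comparison only.
#[allow(dead_code)]
struct PlainRef {
    a: u8,
    b: &'static u8,
}

fn main() {
    // Packed: alignment 1, no padding between or after the fields.
    assert_eq!(align_of::<PackedRef>(), 1);
    assert_eq!(size_of::<PackedRef>(), 1 + size_of::<&u8>());

    // Unpacked: the reference's alignment forces padding.
    assert_eq!(align_of::<PlainRef>(), align_of::<&u8>());
    assert_eq!(size_of::<PlainRef>(), 2 * size_of::<&u8>());
}
```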

See rust-lang/rust issues #27060 (repr(packed) allows invalid unaligned loads), #82523 (tracking issue for the future-incompatibility warning `unaligned_references`), and #46043 (tracking issue for the `safe_packed_borrows` compatibility lint).
These seem to be essentially the same issue: creating an unaligned reference into a packed structure. So, if this became actually disallowed, would that mean that at least this element of CHERI is fine with the Rust model?

No, those issues talk about a reference which points to a value which is not sufficiently aligned.

The issue with CHERI is specifically loading a pointer which is itself not sufficiently physically aligned.

It is perfectly valid Rust to copy a reference into and out of an arbitrary memory location, which does not have to be properly aligned to store a reference. By using #[repr(packed)], this is possible to achieve in safe code.

To be super clear, even the following would be problematic (before mem2reg optimization, anyway):

let x = 0;
let buf = PackedRef {
    a: 0,
    b: &x,
    // ^^ valid copy into unaligned memory
};
let r = buf.b;
// ^^ valid copy out of unaligned memory
*r;
// ^^ valid read of aligned reference

The use of Box only serves to highlight that the values (potentially) aren't kept purely in registers, and do end up stored in RAM (or cache).

What those issues refer to is &buf.b, which creates an invalid reference.

(The terminology issue is that "unaligned pointer/reference" typically refers to the value of the pointer not being sufficiently aligned for the pointee type. But in this case, it's storage of the pointer itself which is not physically aligned to a sufficient byte boundary.)
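
To separate the two meanings in code, here is a sketch that forces the storage slot of the reference to a known misaligned address. It uses `#[repr(C, packed)]` rather than plain `#[repr(packed)]` so that the field order is fixed, and a hypothetical `Aligned` wrapper so that the slot's offset from a 16-byte boundary is known; both are assumptions made for the demonstration.

```rust
use std::mem::align_of;
use std::ptr::addr_of;

#[allow(dead_code)]
#[repr(C, packed)]
struct PackedRef {
    a: u8,
    b: &'static u8,
}

/// Wrapper that pins the packed struct to a 16-byte boundary, so that
/// field `b` sits exactly one byte past it.
#[repr(C, align(16))]
struct Aligned {
    inner: PackedRef,
}

static ZERO: u8 = 0;

/// How far the storage slot holding the reference is from being aligned.
fn slot_misalignment(buf: &Aligned) -> usize {
    // addr_of! yields a raw pointer to the slot without creating `&buf.inner.b`.
    let slot: *const &'static u8 = addr_of!(buf.inner.b);
    slot as usize % align_of::<&u8>()
}

/// Copy the reference out of its unaligned slot.
fn read_slot(buf: &Aligned) -> &'static u8 {
    // Safe code can write `buf.inner.b` directly; the explicit unaligned
    // read only spells out what that copy has to do.
    unsafe { addr_of!(buf.inner.b).read_unaligned() }
}

fn main() {
    let buf = Aligned { inner: PackedRef { a: 0, b: &ZERO } };

    // The slot is misaligned...
    assert_eq!(slot_misalignment(&buf), 1);

    // ...but the reference stored in it is a perfectly ordinary one.
    let r = read_slot(&buf);
    assert_eq!(*r, 0);
    assert!(std::ptr::eq(r, &ZERO));
}
```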

Yes, whilst we'd like to be able to support that, it's impossible without causing serious performance issues for memcpy. Currently memcpy works because it can blindly copy capability-sized-and-aligned words at a time, without needing to know if there's a valid capability there or not, but if they're unaligned then you don't know where the valid capabilities are (unless you just add a memcpy instruction to your architecture, of course, then you can ignore the problem...)

....is this why Arm is introducing CPYP / CPYM / CPYE? You don't have to answer that, I just find it... interesting.