Idea / Pre-RFC: Null-free pointers

In Rust, physical address 0x0 is treated as null: an invalid pointer that must never be accessed. However, this is not a universal assumption; it is biased towards the constraints of specific environments.

Treating a particular bit pattern as null contradicts the address space guaranteed by hardware, and this undermines the generality of the language.

Accordingly, I propose the null-free-ptr RFC, which removes the null value from pointers.

Demand and Significance

Robin Mueller, a researcher at the University of Stuttgart's Institute of Space Systems, was developing a Rust bootloader for the Vorago VA108xx and VA416xx - radiation-hardened Cortex-M4 MCUs deployed in aerospace - where programme RAM starts at 0x0. He needed to read the running application image from address 0x0 to flash it to non-volatile memory. Skipping the first few bytes was not an option - the image starts there. Standard Rust pointer operations were impossible; the only path was inline Arm® assembly.

I circumvented the issue by falling back to assembler, but this feels really hacky to me.. Shouldn't Rust be low level enough to allow me to deal with these issues? I found this pre-RFC: Pre-RFC: Conditionally-supported volatile access to address 0 - libs - Rust Internals.

Source: Reading from physical address 0x0

This is a real developer, on a real chip, in a real aerospace mission, working around a real limitation.

The pattern generalises: on bare-metal targets, the address of a hardware structure is not always a design choice - it can be a constraint imposed by the hardware or the firmware, with no way for the Rust programme to negotiate.

Consider a 16-bit target whose device tree is placed at 0x0 by the hardware with 64 kiB of RAM:

// This address is forced by the hardware.
// Rust does not get to choose it.
const BLOB_P: usize = 0;
const _: () = assert!(usize::BITS == 16);

#[unsafe(no_mangle)]
extern "C" fn ignite() -> ! {
    // BLOB can never be read volatilely;
    // There's no available RAM to copy the entire struct.
    let blob = unsafe { &mut *(BLOB_P as *mut DevTreeBlob) };
    // instant UB upon reference construction

    let mapping = blob.foo();
    blob.bar |= 0b1;

    ...
}

Even when spare RAM exists, the address may still come from outside the programme - and cannot be controlled:

use core::num::NonZeroUsize;
use core::slice::from_raw_parts as mkslice;

// `map` address is reported by the firmware.
// Rust does not get to choose it.

// Caller ensures there's at least one entry
#[unsafe(no_mangle)]
extern "C" fn spark(map: *const RamLayout, len: NonZeroUsize) -> ! {
    for entry in unsafe { mkslice(map, len.get()).iter() } {
        // instant UB upon calling `from_raw_parts`
        // as `from_raw_parts` constructs `&T`
        ...
    }

    ...
}

Volatile operations cannot help here: they cover individual values, not reference construction, slice creation, or bulk operations like ptr::copy - and even worse in the first example, there is no spare RAM to copy into in the first place.
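To make "individual values" concrete, here is a minimal sketch of what volatile access does cover. The `RamLayout` fields and the `for_each_entry` helper are hypothetical (the post does not give the real layout). Each entry is copied out by value, so there is still no `&[RamLayout]` and no reference into the table, which is exactly the limitation described above.

```rust
use core::ptr;

/// Hypothetical stand-in for the firmware's table entry.
/// The real field layout is not given in the post.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct RamLayout {
    pub base: usize,
    pub len: usize,
}

/// Calls `visit` once per entry, copying each entry out with
/// `read_volatile` so that no `&RamLayout` and no slice is created.
///
/// # Safety
/// `map` must be aligned and must point to `len` readable entries.
pub unsafe fn for_each_entry(
    map: *const RamLayout,
    len: usize,
    mut visit: impl FnMut(RamLayout),
) {
    for i in 0..len {
        // `wrapping_add` keeps the pointer arithmetic itself free of
        // the in-bounds requirement that `add` carries.
        let entry = unsafe { ptr::read_volatile(map.wrapping_add(i)) };
        visit(entry);
    }
}
```

The caller receives copies only; anything that needs a borrowed view of the table in place is out of reach with this approach.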

The only valid workaround is inline assembly, which cannot be used to construct a reference.

Required Changes (The lifeblood of this proposal)

  • Change the set of valid referenceable address values for *const T / *mut T to the full range of usize

  • Remove the invalidity assumption for 0x0 pointers, along with all associated validity checks and optimisations

Downstream considerations (Further opinions are welcome)

  • Change pointer-related assumptions in the Allocator API

    Since the full range of usize can be a valid pointer, Allocators should be able to return 0x0 as a valid address.

  • Reconsider the validity basis of &T / &mut T

    If only the pointer definition is changed, UB would be triggered immediately upon conversion to &T solely due to the all-zero bit-pattern condition.

    Possible directions include:

    • Changing the reference invariants directly
    • Introducing a new vocabulary type in core that encapsulates read_volatile safety semantics with lifetime tracking, enabling 0x0 access without altering &T or breaking niche optimisation
  • Et cetera - if there are further downstream effects, additions are welcome.
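As a rough illustration of the second direction, the sketch below wraps a raw pointer together with a lifetime. The name `VolatileView` and its methods are hypothetical and are not an existing core API. Because it stores `*mut T` rather than `&mut T`, an all-zero address is representable without touching the reference invariants.

```rust
use core::marker::PhantomData;
use core::ptr;

/// Hypothetical vocabulary type: a raw address plus a tracked lifetime.
/// It holds `*mut T`, not `&mut T`, so address 0x0 is representable.
pub struct VolatileView<'a, T: Copy> {
    addr: *mut T,
    _lifetime: PhantomData<&'a mut T>,
}

impl<'a, T: Copy> VolatileView<'a, T> {
    /// # Safety
    /// `addr` must be aligned and valid for volatile reads and writes
    /// of `T` for all of `'a`, with no other access in that time.
    pub unsafe fn from_ptr(addr: *mut T) -> Self {
        Self { addr, _lifetime: PhantomData }
    }

    /// Reads the current value with a volatile load.
    pub fn read(&self) -> T {
        unsafe { ptr::read_volatile(self.addr) }
    }

    /// Stores `value` with a volatile store.
    pub fn write(&mut self, value: T) {
        unsafe { ptr::write_volatile(self.addr, value) }
    }

    /// Read-modify-write; not atomic.
    pub fn update(&mut self, f: impl FnOnce(T) -> T) {
        let current = self.read();
        self.write(f(current));
    }
}
```

With a type like this, something like `blob.bar |= 0b1` from the first example would become `view.update(|v| v | 0b1)` on a view over that field. Whether such a type could offer field projection or bulk operations is left open here.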

Cost

upon acceptance

of Required Changes

Zero. Raw pointers can already hold an all-zero bit pattern, carry no niche optimisation, and their in-memory layout (usize) does not change. The only difference is that pointer operations and the reference-producing operations built on them (e.g. ptr::copy, ptr::write_bytes, slice::from_raw_parts) cease to treat 0x0 as unconditionally invalid. The LLVM null_pointer_is_valid function attribute already supports these semantics.

of Downstream Considerations (which are discussed separately)

Should the downstream considerations be accepted, the following costs will be incurred:

  • Loss of niche optimisation opportunities for Option<&T>
  • Breach of the documented layout guarantee for Option<&T>::None
  • Breaking changes to unsafe code and FFIs that depend upon it
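For readers unfamiliar with the niche in question, the following sketch measures it. The `niche_report` helper is hypothetical and exists only for this illustration. It shows that `Option<&T>` and `Option<NonNull<T>>` are pointer-sized because `None` reuses the all-zero pattern, while `Option<*const T>` needs extra space because a raw pointer has no forbidden value.

```rust
use core::mem::size_of;
use core::ptr::NonNull;

/// Returns (type name, size in bytes) pairs for a few pointer-like types.
pub fn niche_report() -> [(&'static str, usize); 4] {
    [
        ("&u8", size_of::<&u8>()),
        // `None` is stored as the all-zero pattern: no extra space.
        ("Option<&u8>", size_of::<Option<&u8>>()),
        ("Option<NonNull<u8>>", size_of::<Option<NonNull<u8>>>()),
        // Every bit pattern is a valid raw pointer, so `None`
        // needs a separate discriminant.
        ("Option<*const u8>", size_of::<Option<*const u8>>()),
    ]
}
```

If references were allowed to be all-zero, the second and third rows would have to grow in the same way as the fourth, which is the layout cost listed above.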

Additionally, costs to be weighed according to scope will include:

  • ABI inconsistencies between crates and ecosystem fragmentation [compiler flag / crate feature / etc...]
  • Migration costs imposed on the entire ecosystem [global only]

This is no small cost and must not be overlooked, but at the same time, whether it is important enough to compromise generality and hardware faithfulness also warrants discussion.

upon rejection

Maintaining the status quo also carries its own cost:

  • Inability to use the core::ptr API on valid hardware addresses
  • Continued barrier to Rust adoption on bare-metal hardware
  • Forced bypass of its safety guarantees on affected targets

This cost is already being paid - silently, as Rust shifts from being an alternative to needing one.

Feasibility

On Compiler Optimisations

LLVM has a function attribute called null_pointer_is_valid, which instructs the optimiser to treat address 0x0 as a valid address for loads and stores. This means my proposal does not conflict with LLVM's assumptions.

On the Nature of the Constraint

Not all constraints are equal. For example, forbidding unaligned access is defensive: the hardware may fault when executing the instruction. Forbidding 0x0 is offensive: no instruction set treats a load from 0x0 as inherently illegal - the fault, if any, comes from the RAM management configuration, not from the instruction being interpreted and executed.

A systems language's constraints should protect the programmer from the hardware instead of attacking them to protect the language's optimisation. The null assumption falls into the latter, and this distinction is central to the motivation of the proposal.

When constructing abstractions, it is desirable to take the greatest common divisor. If counterexamples to a condition exist - and if those counterexamples are numerous - it is inappropriate for that condition to be included in the abstraction. The abstract-machine-level assumption about null pointers faces the counterexamples described in the Demand section, which are too numerous and too widespread to isolate as exceptional configurations.

Prior Discussion

This issue has been discussed on several occasions.

Within Rust

  • unsafe-code-guidelines#29 - prior discussion on null pointer semantics, closed by Rust PR #141260

  • rust-lang/rust#141260 - the volatile stopgap described below

  • rust-lang/rfcs#2400 - The Zero Page Optimization RFC, which proposed expanding the null range to the entire zero page, was closed after pushback from embedded developers who pointed out that the zero page, or even the zero address, contains valid RAM on their hardware.

Outside Rust

  • LWN.net - an article refuting the assumption that a zero pointer is always null

  • MSP430 - a real-world case where standards restricted usage on a specific target

If you need to read/write to address 0, then use read_volatile/write_volatile.

They are the defined ways to access memory outside the Rust AM (like address 0). In fact address 0 is specifically called out as valid for these methods.

That's not a solution, but a workaround. read_volatile / write_volatile cover only single-value reads and writes and do not cover ptr::copy, ptr::write_bytes, or construction of &T, all of which are instant UB for 0x0.

Requiring a special API for every access to a normal address is exactly the kind of limitation the RFC aims to eliminate.

Changing Rust to allow &T to point to null is infeasible. This is because there's a lot of code out there that uses the type Option<&T>. This type is optimized to use a null pointer to represent the None state. This optimization is probably important for performance of many things. Also, FFI code might also depend on this optimization being done, since it is guaranteed by the documentation.

What you're saying is that we need to sacrifice the outbound soundness of the Rust AM for optimisation: the faithfulness of the Rust AM to actual hardware.

I have no idea what you mean. What is "outbound soundness"? What is "soundness of Rust AM"? What is "faithfulness of Rust AM to actual hardware"?

By 'outbound soundness', I mean the faithfulness of the Rust AM's assumptions to the actual hardware, as opposed to 'inbound soundness' within the AM's own rules. On targets where 0x0 is a valid address, the Rust AM cannot accurately model the hardware even though it is internally sound.

Why is it infeasible for the RAM management system to use the identity map, but disallow allocations from being placed at address 0x0? If alignment is a concern (having RAM start at address 0x1 might be awkward), maybe disallow allocations from being placed in the first 4096 bytes of memory or similar? I understand that RAM may be quite limited, but is it limited enough that those 4kB are non-negotiable?

On the language side, I do not believe null references will ever be permitted, and existing unsafe code likely makes assumptions about Rust allocations never being placed at 0x0. This is, after all, a stabilized guarantee. However, I do think there should be better support for volatile versions of functions like core::ptr::copy and core::ptr::write_bytes. Perhaps there could also be a bunch of functions that allow the given pointer to be null without having volatile semantics. (The problem is that it results in a proliferation of function variants, which feels less than ideal. We may also want volatile atomic accesses, so now we need to duplicate all those APIs; we may want atomic null accesses, so yet more; and so on.)
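A minimal sketch of what such volatile variants could look like when written by hand today. The names `volatile_copy_bytes` and `volatile_set_bytes` are hypothetical and are not part of core. The sketch assumes byte granularity is acceptable for the target memory, which may not hold for device registers with access-width requirements.

```rust
use core::ptr;

/// Hypothetical volatile analogue of `ptr::copy_nonoverlapping` for bytes.
///
/// # Safety
/// `src` must be readable and `dst` writable for `count` bytes,
/// and the two ranges must not overlap.
pub unsafe fn volatile_copy_bytes(src: *const u8, dst: *mut u8, count: usize) {
    for i in 0..count {
        unsafe {
            let byte = ptr::read_volatile(src.wrapping_add(i));
            ptr::write_volatile(dst.wrapping_add(i), byte);
        }
    }
}

/// Hypothetical volatile analogue of `ptr::write_bytes`.
///
/// # Safety
/// `dst` must be writable for `count` bytes.
pub unsafe fn volatile_set_bytes(dst: *mut u8, value: u8, count: usize) {
    for i in 0..count {
        unsafe { ptr::write_volatile(dst.wrapping_add(i), value) };
    }
}
```

The trade-off is speed: each volatile access must be performed as written, so these loops cannot be merged into wider moves the way an ordinary `ptr::copy` can. Atomic or null-tolerant non-volatile variants would each need their own copies of these functions, which is the proliferation concern raised above.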

Rust never claims that pure Rust programs can do everything that a valid computer program running on a concrete machine can do, so this does not feel like a soundness hole. It's much like the borrow checker; the borrow checker accepting code is significant, but the borrow checker rejecting code means the code may or may not be sound, which doesn't say much.

There's definitely some better way to express your message than the word "soundness". "Rust doesn't faithfully represent hardware" is a much more clear stance, I think. (And then we can discuss the extent to which that is a problem.)

On most physical hardware: a) 0x0 is a valid address, but b) 0x0 is special in some way that means that general-purpose memory operations there wouldn't really make sense. (For example, it's often part of an interrupt vector - and if you ever place an incorrect value into an interrupt vector, even for a split second, the processor could potentially end up executing from an arbitrary address in kernel mode, which is so close to "anything could happen" undefined behavior that the distinction doesn't really matter.)

So even if you're doing the sort of low-level programming where you might validly want to touch address 0, you wouldn't be using it as a general-purpose address anyway - your program should generally always be aware of whether it's accessing address 0 or not. And if it is, just do a volatile write or read, which are allowed on address 0. (It probably has to be volatile anyway, and possibly even atomic, in order to prevent whatever special meaning is assigned to address 0 from violating assumptions made by the compiler.)

First: if a target has only kibibytes of RAM starting at 0x0 and no paging support, Rust isn't available for that hardware, and that is EXACTLY the problem.

Second: That's the point. API proliferation should be avoided.

Third: null-pointer invalidity can NEVER be avoided; even inside an unsafe block, we have no escape.

And about the term "outbound soundness": fair point, I should have chosen faithfulness instead of soundness.

I've mentioned that 0x0 could be a non-MMIO RAM address; not all hardware assumes 0x0 is reserved for special usage, even on AMD64.

Moreover, volatile R/W can never be a solution. It's a workaround.

Out of all the processors I know, the one on which 0x0 is least special is the 6502. On the 6502, there are addressing modes that use 8-bit addresses, thus can access only memory in the 0x0000-0x00FF range - machine code instructions using these addressing modes run 1 cycle faster than their "anywhere in memory" alternatives (and, if you specify the address literally in the instruction, are 1 byte shorter). 0x0000-0x00FF otherwise aren't special, e.g. you can access them via normal pointers.

What happens in practice on 6502? Almost everyone uses the addresses in question only for important global variables and for register spills - they're too valuable to, e.g., allow a general-purpose memory allocator to allocate them. It would be rare to have a pointer pointing at them - maybe at a few, but you would know which ones you expected to ever be pointable and which ones you wouldn't. Say you decide to reserve 0x0000 as a spill slot - now you know there'll never be a pointer to it (because you can't form a pointer to a register and a spill slot acts like a register), so the fact that the hardware is physically capable of forming a pointer there doesn't matter.

Now say you're writing a Rust program for some other piece of hardware where address 0 isn't special. There are two cases:

  • If the program isn't low-level enough to be doing all the allocation, memory-mapping, etc., itself, then it wouldn't be using address 0 because allocators don't allocate it and a higher-level program like this wouldn't be able to assume that it's valid.
  • If the program is low-level enough to be doing all the allocation and memory-mapping itself, then it gets to choose its own memory map. You can just choose a memory map where you never need to create a pointer to address 0 (thus all accesses to it are from parts of the code that are aware that they're accessing address 0) and use volatile reads/writes on those instructions.

(And in case you're worrying about a potential loss of efficiency from "an allocator never allocates address 0" - all allocators have some amount of internal state that they can't allocate because they're using it themselves, so if address 0 truly is non-special, you can just have the allocator use it to store its own state, and then no memory is wasted.)
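A minimal sketch of that idea, modelling only the address arithmetic. The `Bump` type and the `RESERVED` size are hypothetical. On a real target the bookkeeping would live in the reserved bytes at the start of the region and would itself have to be accessed without forming references; here it is kept in the struct for simplicity.

```rust
/// Bytes at the start of the region set aside for allocator
/// bookkeeping (an assumed figure for illustration).
const RESERVED: usize = 16;

/// Hypothetical bump allocator over the address range `base..base + len`.
pub struct Bump {
    next: usize,
    end: usize,
}

impl Bump {
    /// The first `RESERVED` bytes are never handed out, so a region
    /// starting at address 0 never yields address 0 to a caller.
    pub const fn new(base: usize, len: usize) -> Self {
        Self { next: base + RESERVED, end: base + len }
    }

    /// Returns the start address of a fresh block, or `None` when the
    /// region is exhausted. `align` must be a power of two.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.end {
            return None;
        }
        self.next = end;
        Some(start)
    }
}
```

Under this scheme the reserved prefix is the only part of the region that ordinary pointers never reach. The counterpoint raised elsewhere in this thread is that accessing that prefix still cannot go through references.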

I think the problem is that you're assuming "because address 0 isn't special in the hardware that means that software needs to be able to use it as a general-purpose address". But the conclusion doesn't follow from the premise: even if the hardware doesn't treat address 0 specially, all software will have at least some addresses that it treats specially (allocator internal state, stack, register spills, certain sorts of special global variables like the TLS root, etc.), and it can't use any of those addresses as general-purpose addresses. So there are always going to be some addresses that you can't form pointers to or use with the normal std::mem functions. If you have control over your own memory map, just place one of those special addresses at address 0 - and if you don't, you can't use address 0 anyway. In neither case does Rust need to be able to support general pointers to address 0.

(A good comparison: the x86 series of processors has both physical and virtual addresses, and although physical address 0x0 is special, virtual address 0x0 isn't. Nonetheless, all sensible 32-bit and 64-bit operating systems design their memory map in such a way that the address in question is always unmapped - there are plenty of other places to put things and being able to catch null pointer dereferences is just too useful (in particular, the kernel cares about any potential null pointer dereferences in the kernel hitting a mapping error rather than potentially running non-kernel code). 16-bit operating systems care about space to the extent that they would consider using it, but normally virtual address 0 is used to hold things like system call interfaces that need to be accessible in memory but which aren't available for general-purpose use, meaning that 0 can still be used as a null pointer because it would never be used as a general-purpose memory address.)

Good point - in my case, the identity map was a software architecture choice. Yes, I could choose HHDM or something similar. But my strongest argument here is hardware whose RAM layout is a constraint, not a configuration. An Arm Cortex-M with 2 kiB of RAM and no MMU has no RAM layout to "choose". Telling those users to "just put something special at 0x0" is effectively telling them Rust won't support their hardware.

You still need to choose the RAM layout when programming for such processors! In fact, it's probably even more important, to be able to fit everything in.

In particular, the toolchain needs to know what the memory layout is in order to be able to generate code correctly. (Rust can't "automatically" set up the address space on bare-metal systems - you need some sort of entry point code that at a minimum sets the stack pointer correctly.) Normally this is done using linker scripts, which allow you to outright specify things like what addresses are used for the stack, and put your startup code at the correct hardware address so that the hardware knows how to run it.

On such systems, you probably wouldn't be using a memory allocator at all (it's possible to write one but usually not worth it), and as such you know that pointers will only ever exist to global variables or the stack. So it's usually very easy to set up your conventions for using address space in such a way that you never form a pointer to address 0. (And as I said above, if you are using an allocator, you can just place its internal state at address 0.)

So in a sense, Rust doesn't support this sort of hardware "out of the box" - you need some sort of assembly-code wrapper and a linker script in order to correctly interface your software with the hardware. (You need that in other compiled languages, too, unless the compiler special-cases the specific model of hardware you're targeting.) But when you're doing that, you can make address 0 appropriately special at the same time.

It's "easy" when you have mebibytes of RAM. On embedded hardware like the Cortex-M series, every single byte counts. Avoiding 0x0 is itself a cost.

Moreover, C's null pointer representation is implementation-defined. A freestanding C compiler can treat 0x0 as a valid address on targets where it is one. Rust cannot, because the invalidity of 0x0 is hardcoded into the language semantics rather than left to the target.

you can just place its internal state at address 0.

You'll need to construct a reference to access it with any benefit of Rust's type system, and that's instant UB. Even raw pointer operations like ptr::copy and ptr::write_bytes are UB on 0x0. There is no escape.

Cortex-M a) is 32-bit and b) maps 0x0 to the code memory that is used to boot the system - you aren't going to want to form a reference to that (what would such a reference be useful for?).

You are pointing out a deficiency in Rust, but the deficiency in question is "if you program an entire software stack entirely in Rust, including the kernel and bootloader, there are some points at which you will have to use raw pointers and volatile accesses rather than regular Rust references". I'm one of the biggest advocates of "let's make safe Rust more powerful so that we can use normal type-checked references rather than needing to use unsafe code", but in this case unsafe code is inherently going to be involved - you are configuring a processor, doing that is inherently expected to have weird side effects if done incorrectly, it is very difficult for a computer to know which configurations are sound and which are unsound, thus you will need to write it unsafely. If you think about it, you wouldn't be getting any useful type-system guarantees anyway from having a reference to an important global variable in the kernel or an important memory-mapped hardware address - if you set it to the wrong value, things still go wrong, even if the provenance and lifetime and type are all correct.

And if you're going to need to do something unsafe to correctly set up the hardware (which you are), needing to do something unsafe to access address 0 is not a big burden on top of that - a sensible programmer doesn't use address 0 for general-purpose use anyway because there would be a huge performance loss from doing so (in that it would make null-checks everywhere in the program slower) and because there's no real cost to using it for special-purpose use instead. If you're setting up bare metal from scratch, there are almost necessarily going to be some things you can only do using unsafe code, so whatever you use address 0 for, you may as well make it part of those.

(Note that through all this discussion, I've been assuming that either you have a system with separate code ROM and data RAM or that something about the system inherently prevents you putting code at address 0. If the code and data are in the same address space, you can just make 0 a code address that isn't a function entry point, and then you would never need to form a reference to it or read/write it.)

The scenario you describe - 0x0 as a special-purpose address - is what the Demand section already refuted. My RFC concerns environments where 0x0 is ordinary RAM and every single byte counts, not an interrupt vector or boot ROM.

And your solution of putting code at address 0x0 is exactly "shifting the language's responsibility onto the user".

This is what you wrote in the Demand section:

This doesn't list any platforms on which being able to form a reference to address 0 would be useful:

  • Your identity-mapped operating system is, presumably, going to be using at least one byte of memory for a global variable or for its own code. Being an operating system and thus able to design its own ABI conventions, it is trivial to put something at address 0 that does not ever need a reference to be formed to it - in practice, all operating systems will use at least some global variables that never move in memory and thus prevent their addresses being used by anything else. (And on many processors, physical address 0 is special to the processor, so if you use an identity map, virtual address 0 will be special to the processor too and thus you would be forced to use it for whatever purpose the processor uses it for, and forming a reference there could be very dangerous.)
  • ARM Cortex-M maps the bootloader's code in the segment starting at address 0, and it gets automatically mapped there by the hardware on startup (from a source that's configured using physical wiring because it can't be configured by the bootloader for obvious reasons). There is no purpose to having a reference to the bytes of code that make up the bootloader - reading it is almost useless and writing it would effectively be a firmware reflash, a very unsafe operation. So on Cortex-M, you are not forming a reference to that address, ever.
  • I've worked with embedded systems with less than 200 bytes of RAM (plus a few kilobytes of ROM). On the system I worked with, address 0 was special (theoretically a valid address, but pointers to it didn't work because it was memory-mapped over part of the logic for dereferencing pointers). If it hadn't been special in that way, it wouldn't have mattered - with RAM that small you are storing everything you can in global variables so that your code can use hardcoded addresses rather than paying the cost of storing the addresses, so I would just have put a global variable there (that was accessed hardcodedly rather than through references) in order to still have a valid null pointer.
  • On bare-metal environments you don't have an OS or runtime support libraries, so you will have to create an equivalent to be able to run code. This cannot by definition be provided by the language - because if it were, it wouldn't be a bare-metal environment any more. Your runtime support library will have to use at least one byte of memory, so you may as well make 0x0 part of that.
  • On AMD64, 0x0 is indeed a valid physical address. Doing a non-atomic write to it would inherently cause a race condition (because the processor can atomically read it at any point, if it needs to service an interrupt). Also, by the nature of the hardware special case for it, it is likely to be written once by the bootloader, once by the kernel, and then never again. Needing to use a special instruction for the purpose doesn't seem like a burden here - and again, you are never going to want to form a reference to it.

I think your Demand section is undermining your own request - you've basically made a request to allow references to address 0, and then given a list of circumstances in which forming such a reference would either be useless or actively harmful. Some situations are inherently doing something unsafe, and as far as I can tell, every situation in which you might want to access address 0 is either inherently doing something unsafe, or else a situation in which you can easily make decisions in such a way that you are accessing some other address instead. Your list of examples is not refuting that at all.

If you want to push this further, I would suggest that you specify a concrete, specific processor and scenario for using that processor for which the following statements hold:

  • it is possible to access address 0;
  • the processor does not give address 0 a special meaning that would make it inherently unsound to give safe code arbitrary unchecked access to read/write it;
  • in the scenario in question, the programmer does not have enough control over the memory map to be able to put something at address 0 that you would never want to form a reference to (like variables belonging to runtime support code or code for startup routines), and the memory map that the programmer is forced to use doesn't naturally put such a thing at address 0 either.

Also state the assumptions you have about a) how programs are loaded and run in this scenario, and b) who or what is responsible for specifying the memory map while the program runs (and in particular how much ability the programmer would have to change it).

So far, you haven't given any such examples - you've just been repeatedly stating without proof that they exist (and referred me to a section containing a number of incorrect examples), which is not very convincing. If you want an RFC to succeed, your motivation section will need to contain at least one correct example, and probably more - otherwise it will look unmotivated.

I have two.

First, RISC-V defines no standard RAM layout, which means it's entirely implementation-defined. A conforming RISC-V implementation can place RAM at 0x0, and such implementations exist (e.g. the Micro RISC-V core, 0x0-0x7fb). Whether a particular implementation is a teaching core or a commercial SoC is irrelevant - the point is that the ISA permits it, and Rust should not prohibit what the architecture allows.

Second, the BCM2835 (Raspberry Pi 1, ARM1176JZF-S, tens of millions shipped) maps physical RAM starting at 0x0 per the official Broadcom datasheet. Yes, it has an MMU, but an identity map is a legitimate bare-metal design choice, not a workaround. And requiring every OS developer to remap around a language-level assumption is precisely "shifting the language's responsibility onto the user."

If Rust intends to be a general-purpose systems language across these targets, "sensibility" must never be assumed.