Idea / Pre-RFC: Null-free pointer and Zeroable reference

In Rust, physical address 0x0 is treated as null: an invalid pointer that must never be accessed. However, this is not a universal assumption; it is biased towards the constraints of specific environments.

Treating a particular bit pattern as null contradicts the address space guaranteed by hardware, and this undermines the generality of the language.

Accordingly, I propose the null-free-ptr and zeroable-ref RFC, which eliminates the concept of null from the pointer model and introduces a new reference with extended semantics to the full address space.

Before reading

This proposal SHALL NOT change:

  • Semantics, layouts, and niche optimisations of &T, Option<&T>, and NonNull<T>
  • Validity invariant of non-zero reference primitive - &T and &mut T (0x0 remains null for them)
  • All existing safe and unsafe APIs that do not access 0x0
  • ABIs

Need and significance

Robin Mueller, a researcher at the University of Stuttgart's Institute of Space Systems, was developing a Rust bootloader for the Vorago VA108xx and VA416xx - radiation-hardened Cortex-M4 MCUs deployed in aerospace - where programme RAM starts at 0x0. He needed to read the running application image from address 0x0 to flash it to non-volatile memory. Skipping the first few bytes was not an option - the image starts there. Standard Rust pointer operations were impossible; the only path was inline Arm® assembly.

I circumvented the issue by falling back to assembler, but this feels really hacky to me.. Shouldn't Rust be low level enough to allow me to deal with these issues? I found this pre-RFC: Pre-RFC: Conditionally-supported volatile access to address 0 - libs - Rust Internals.

This is a real developer, on a real chip, in a real aerospace mission, working around a real limitation.

The limitation is not unique to that mission. The W65C02S - an 8-bit processor whose 16-bit address bus spans the entire 64 kiB address space from 0x0000 to 0xffff (datasheet pp. 5, 15) - leaves no room for a sentinel. There is not a single byte to spare or sacrifice. On such a target, "just avoid address zero" is not a workaround; it is a physical impossibility.

The pattern generalises: on bare-metal targets, the address of a hardware structure is not always a design choice. It can be a constraint imposed by the hardware or the firmware, with no way for the Rust programme to negotiate.

Consider a 16-bit target whose device tree is placed at 0x0 by the hardware with 64 kiB of RAM:

// This address is forced by the hardware.
// Rust does not get to choose it.
const BLOB_P: usize = 0;
const _: () = assert!(usize::BITS == 16);

#[unsafe(no_mangle)]
extern "C" fn ignite() -> ! {
    // BLOB can never be read volatilely;
    // There's no available RAM to copy the entire struct.
    let mut blob = unsafe { &mut *(BLOB_P as *mut DevTreeBlob) };
    // instant UB upon reference construction

    let mapping = blob.foo();
    blob.bar |= 0b1;

    ...
}

Even when spare RAM exists, the address may still come from outside the programme - and cannot be controlled:

use core::slice::from_raw_parts as mkslice;

// `map` address is reported by the firmware.
// Rust does not get to choose it.

// Caller ensures there's at least one entry
#[unsafe(no_mangle)]
extern "C" fn spark(map: *const RamLayout, len: NonZeroUsize) -> ! {
    for entry in unsafe { mkslice(map, len.get()).iter() } {
        // instant UB upon calling `from_raw_parts`
        // as `from_raw_parts` constructs `&T`
        ...
    }

    ...
}

Volatile operations cannot help here: they cover individual values, not reference construction, slice creation, or bulk operations like ptr::copy - and even worse in the first example, there is no spare RAM to copy into in the first place.

The only valid workaround is inline assembly, and inline assembly cannot construct a reference.

Changes

Phase I (MUST be applied; MAY be done immediately)

  • Making address 0x0 a valid address to dereference (e.g. *ptr = val;) and use core::ptr APIs

Migration cost

Activation of LLVM `null_pointer_is_valid` flag (performance impact measured)
C benchmark suite & diff
// bench.c

// clang -v | Homebrew clang version 21.1.8
// clang --target=aarch64-unknown-none -O2 -S bench.c -o baseline.s
// clang --target=aarch64-unknown-none -O2 -fno-delete-null-pointer-checks -S bench.c -o zeroisvalid.s
// diff baseline.s zeroisvalid.s

#include <stdint.h>
#include <stddef.h>

// C1: null check elimination after dereference
uint32_t check_after_deref(const uint32_t *p) {
    uint32_t v = *p; // accessed!
    if (p) return v + 1; // validation after access
    return 0; // baseline: this branch is eliminated
}

// C2: two paths merging on null knowledge
uint32_t branch_after_store(uint32_t *p, uint32_t val) {
    *p = val; // accessed!
    if (p) return *p + 1; // validation after access
    return 42; // baseline: this branch is eliminated
}

// C3: loop with pointer increment
uint32_t sum_until_null(const uint32_t *const *ptrs) {
    uint32_t sum = 0;
    while (*ptrs) { // validation before access
        // null-terminated array of pointers
        sum += **(ptrs++); // accessed!
    }
    return sum;
}

// C4: devirtualisation / inlining based on nonnull
void copy_if_valid(uint32_t *dst, const uint32_t *src, size_t n) {
    if (dst && src) { // validation before access
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i]; // accessed!
    }
}

// C5: struct access implying nonnull
uint32_t read_two_fields(const struct { uint32_t a; uint32_t b; } *s) {
    uint32_t x = s->a; // accessed!
    if (s) return x + s->b; // validation after access
    return 0; // baseline: this branch is eliminated
}
8a9,10
>       cbz     x0, .LBB0_2
> // %bb.1:
10a13
> .LBB0_2:
21a25,26
>       mov     w9, #42                         // =0x2a
>       cmp     x0, #0
23c28
<       add     w0, w1, #1
---
>       csinc   w0, w9, w1, eq
111a117,118
>       cbz     x0, .LBB4_2
> // %bb.1:
113a121
> .LBB4_2:
Rust PoC & benchmark suite & diff

Proof-of-Concept code here

//! bench.rs

//! ./x setup
//! ./x build --stage 2
//! ./x build library --target aarch64-unknown-none --stage 2
//! ln -s ./build/aarch64-apple-darwin/stage2/bin/rustc ./rustc
//!
//! rustc -V | rustc 1.93.1 (01f6ddf75 2026-02-11)
//! ./rustc -V | rustc 1.96.0-dev
//!
//! rustc --target aarch64-unknown-none -C opt-level=2 --emit asm -o baseline.s bench.rs
//! ./rustc --target aarch64-unknown-none -C opt-level=2 --emit asm -o zeroisvalid.s bench.rs
//! diff baseline.s zeroisvalid.s

#![no_std]
#![no_main]

/// R1: null check elimination after dereference
#[unsafe(no_mangle)]
extern "C" fn check_after_deref(p: *const u32) -> u32 {
    let v = unsafe { *p }; // accessed!
    if p as usize != 0 { // validation after access
        v + 1
    } else {
        0 // baseline: this branch is eliminated
    }
}

/// R2: two paths merging on null knowledge
#[unsafe(no_mangle)]
extern "C" fn branch_after_store(p: *mut u32, val: u32) -> u32 {
    unsafe { *p = val }; // accessed!
    if p as usize != 0 { // validation after access
        unsafe { *p + 1 }
    } else {
        42 // baseline: this branch is eliminated
    }
}

/// R3: loop with pointer increment
#[unsafe(no_mangle)]
extern "C" fn sum_until_null(mut ptrs: *const *const u32) -> u32 {
    let mut sum = 0u32;
    while unsafe { *ptrs as usize != 0 } { // validation before access
        // null-terminated array of pointers
        sum += unsafe { **ptrs }; // accessed!
        ptrs = unsafe { ptrs.add(1) };
    }
    sum
}

/// R4: devirtualisation / inlining based on nonnull
#[unsafe(no_mangle)]
extern "C" fn copy_if_valid(dst: *mut u32, src: *const u32, n: usize) {
    if dst as usize != 0 && src as usize != 0 { // validation before access
        for i in 0..n {
            unsafe { *dst.add(i) = *src.add(i) }; // accessed!
        }
    }
}

#[repr(C)]
struct TwoFields {
    a: u32,
    b: u32,
}

/// R5: struct access implying nonnull
#[unsafe(no_mangle)]
extern "C" fn read_two_fields(s: *const TwoFields) -> u32 {
    let x = unsafe { (*s).a }; // accessed!
    if s as usize != 0 { // validation after access
        x + unsafe { (*s).b }
    } else {
        0 // baseline: this branch is eliminated
    }
}

/// mock-up panic handler for no_std
#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! { loop {} }

1,4c1,4
< 	.file	"bench.e2ea9eb9f9388208-cgu.0"
< 	.section	.text._RNvCsdBezzDwma51_7___rustc17rust_begin_unwind,"ax",@progbits
< 	.hidden	_RNvCsdBezzDwma51_7___rustc17rust_begin_unwind
< 	.globl	_RNvCsdBezzDwma51_7___rustc17rust_begin_unwind
---
> 	.file	"bench.c8fcca67f257a880-cgu.0"
> 	.section	.text._RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind,"ax",@progbits
> 	.hidden	_RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind
> 	.globl	_RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind
6,7c6,7
< 	.type	_RNvCsdBezzDwma51_7___rustc17rust_begin_unwind,@function
< _RNvCsdBezzDwma51_7___rustc17rust_begin_unwind:
---
> 	.type	_RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind,@function
> _RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind:
12c12
< 	.size	_RNvCsdBezzDwma51_7___rustc17rust_begin_unwind, .Lfunc_end0-_RNvCsdBezzDwma51_7___rustc17rust_begin_unwind
---
> 	.size	_RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind, .Lfunc_end0-_RNvCs3VXMJi7HSJc_7___rustc17rust_begin_unwind
20a21,22
> 	mov	w9, #42
> 	cmp	x0, #0
22c24
< 	add	w0, w1, #1
---
> 	csinc	w0, w9, w1, eq
34a37
> 	cbz	x0, .LBB2_2
36a40
> .LBB2_2:
62c66
< 	mov	x11, x8
---
> 	and	x11, x2, #0xfffffffffffffff8
124a129
> 	cbz	x0, .LBB4_2
126a132
> .LBB4_2:
156c162
< 	.ident	"rustc version 1.93.1 (01f6ddf75 2026-02-11)"
---
> 	.ident	"rustc version 1.96.0-dev"

The result: codegen differs only at sites where the compiler would otherwise eliminate a null check that follows the access instead of gating it, at a cost of 2-3 additional instructions per site. Null checks that gate the access (C3, C4) produce identical output.

I think the default assumption should be that the cost is manageable unless demonstrated otherwise - not assumed prohibitive without evidence to match.

  • Divergence between *const T and &T validity at zero: As seen in rust#138351, the divergence is worsened by Phase I alone; however, this is not resolved by forbidding access to 0x0 either - it requires a primitive that can bridge the gap. This is where Phase II comes in to fix it.

Phase II (MUST be applied after Phase I; SHOULD be at an edition boundary)

  • Introducing a zeroable reference primitive (e.g. @T or &zref T) as an extension of &T (targeting self-hosted bare-metal environments)
  • Adding APIs returning zeroable reference primitive (e.g. slice::zeroable::from_raw_parts)
  • Adding conversion methods between zeroable & non-zero reference primitive

Closed questions

  • Zeroable reference primitive clones the implementation of &T - it shares lifetime semantics, borrowing rules, etc.
  • Conversion from &T to zeroable reference primitive is always safe (dropping an invariant); the reverse requires a runtime check or unsafe.
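The conversion rules above can be approximated today with a library newtype. This is only a sketch under stated assumptions: `ZRef`, `from_ref`, `from_raw`, and `try_into_ref` are hypothetical names, not part of the proposal's final syntax, and because current Rust still treats address zero as null, the sketch can carry a zero address but never dereferences it.

```rust
use core::marker::PhantomData;

/// Hypothetical stand-in for the proposed zeroable reference primitive.
/// It stores a raw pointer plus a borrowed lifetime, so address zero is representable.
pub struct ZRef<'a, T> {
    ptr: *const T,
    _lifetime: PhantomData<&'a T>,
}

// Manual impls so that `ZRef` is `Copy` for every `T`, like `&T` itself.
impl<'a, T> Clone for ZRef<'a, T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<'a, T> Copy for ZRef<'a, T> {}

impl<'a, T> ZRef<'a, T> {
    /// `&T` to zeroable: always safe, it only drops the non-zero invariant.
    pub fn from_ref(r: &'a T) -> Self {
        ZRef { ptr: r, _lifetime: PhantomData }
    }

    /// # Safety
    /// `ptr` must either have address zero, or be aligned and point to a `T`
    /// that stays valid and is not mutated for the whole of `'a`.
    pub unsafe fn from_raw(ptr: *const T) -> Self {
        ZRef { ptr, _lifetime: PhantomData }
    }

    /// Zeroable to `&T`: needs a runtime check, because `&T` must be non-zero.
    pub fn try_into_ref(self) -> Option<&'a T> {
        if self.ptr.is_null() {
            None
        } else {
            // SAFETY: non-zero pointers were validated by the constructor's contract.
            Some(unsafe { &*self.ptr })
        }
    }
}
```

A newtype like this cannot express the borrow-checked `&mut` counterpart or participate in `Deref`, which is part of why the proposal asks for a language primitive.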

Open questions

  • Syntax: Should it be a sigil, or a qualifier?

    • Syntax::Sigil => something like @mut T or %mut T?

      • No keyword reservation needed; compact
      • Mirrors &T / &mut T structure exactly
    • Syntax::Qualifier => something like &zref mut T or &zeroable mut T?

      • Consistent with &raw const T/&raw mut T precedent
      • Requires one keyword reservation
      • zero is commonly used as an identifier; nullable implies null semantics this proposal explicitly avoids
  • Deref trait: Should the Deref trait return a zeroable reference primitive, should a new trait like ZeroDeref be added, or something else?

  • Conversion syntax: Implicit coercion from non-zero reference primitive, explicit .into(), or a dedicated method?

  • Slice integration: Shape of slice::zeroable::from_raw_parts and related APIs.

Considerations (MAY be applied)

  • Non-zero reference primitive could eventually be considered as a non-zero counterpart of a zeroable reference primitive
  • The methodology described at Changes - Phase II may generalise to function pointers where Arm® Thumb interworking can produce a valid odd address of a function

Feasibility

On the claimed optimisation cost

The Linux kernel is built with -fno-delete-null-pointer-checks, which is the GCC/Clang equivalent of LLVM's null_pointer_is_valid. One of the most performance-sensitive C codebases in existence has operated without this optimisation for over a decade - originally adopted after a null-pointer-check removal led to a privilege escalation exploit in 2009 (CVE-2009-1897).

For measurements, see the benchmark in the Changes - Phase I - Migration cost section.

On the nature of the constraint

Not all constraints are equal. For example, forbidding unaligned access is defensive: the hardware itself may fault when interpreting the instruction. Forbidding 0x0 is offensive: no instruction set treats a load from 0x0 as inherently illegal - the fault, if any, comes from the RAM management configuration, not from the instruction being interpreted and executed.

A systems language's constraints should protect the programmer from the hardware instead of attacking them to protect the language's optimisation. The null assumption falls into the latter, and this distinction is central to the motivation of the proposal.

When constructing abstractions, it is desirable to take the greatest common divisor. If counterexamples to a condition exist - and if those counterexamples are fatal - it is inappropriate for that condition to be included in the abstraction. The abstract-machine-level assumption about null pointers has the aforementioned counterexamples, which are not only too numerous to isolate as exceptional configurations but also widespread.

Cost of the status quo

Maintaining the status quo also carries its own cost:

  • Inability to use the core::ptr API on valid hardware addresses
  • Continued barrier to Rust adoption on bare-metal hardware
  • Forced bypass of its safety guarantees on affected targets
  • Bypassed safety guarantee of the platform itself
  • Audit failure on mission-critical environments

This cost is already being paid - silently, as Rust shifts from being an alternative to needing one.

Alternatives?

In bare-metal environments, the proposed workarounds - read_volatile, wrapper crates, and extra unsafe abstractions - may not be applicable at all (refer to Need and significance section).

In mission-critical environments, they are worse than inapplicable - they are disqualifying:

  • MISRA-C:2023 Dir 4.3 (Required): Assembly language shall be encapsulated and isolated - inline assembly interleaved with application logic is a direct violation.
  • IEC 61508-7 Table C.1 (SIL 2+): C is only Highly Recommended when used with a defined language subset, a coding standard, and static analysis - workarounds outside the language's defined semantics break this chain.
  • IEC 61508-3 Table A.4: Use of a language subset is Highly Recommended at SIL 2+; escaping the subset requires documented justification per safety case.
  • DO-178C Sect. 6.4.4.2.b (DAL A): Any code not directly traceable from source to object requires additional verification - inline assembly disrupts source-to-object traceability and MC/DC structural coverage.
  • ISO 26262-6 Sect. 5.4.3, Table 1: Use of a language subset is recommended across all ASILs, with MISRA-C cited as the example - assembly workarounds fall outside this subset.

A workaround is not an alternative when the workaround itself is the defect.

Prior discussion

This issue has been discussed on several occasions.

Within Rust

  • rust#138351 - soundness hole where ptr::replace creates &mut *dst internally, causing raw pointer validity and reference validity to diverge at null; a zeroable reference primitive would structurally resolve this class of issue.
  • rust#141260 - the volatile stopgap described above
  • unsafe-code-guidelines#29 - prior discussion on null pointer semantics, closed by Rust PR #141260
  • rfcs#2400 - The Zero Page Optimization RFC, which proposed expanding the null range to the entire zero page, was closed after pushback from embedded developers who pointed out that the zero page, or even 0x0 contains valid RAM on their hardware.

Outside Rust

3 Likes

If you need to read/write to address 0, then use read_volatile/write_volatile.

They are the defined ways to access memory outside the Rust AM (like address 0). In fact address 0 is specifically called out as valid for these methods.

8 Likes

That's not a solution, but a workaround. read_volatile / write_volatile only cover single-value reads and writes; they do not cover ptr::copy, ptr::write_bytes, or construction of &T, all of which are instant UB for 0x0.

Requiring a special API for every access to a normal address is exactly the kind of limitation the RFC aims to eliminate.

Changing rust to allow &T to point to null is infeasible. This is because there's a lot of code out there that uses the type Option<&T>. This type is optimized to use a null pointer to represent the None state. This optimization is probably important for performance of many things. Also, FFI code might also depend on this optimization being done, since it is guaranteed by the documentation.
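The layout guarantee referred to here can be checked directly. A minimal sketch, with function names invented for illustration:

```rust
use core::mem::{size_of, transmute};

/// True when `Option<&u32>` is exactly as large as `&u32`,
/// i.e. `None` is stored in the pointer's own bit pattern.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u32>>() == size_of::<&u32>()
}

/// True when `None::<&u32>` is represented by the all-zero (null) bit pattern.
pub fn none_is_null_bit_pattern() -> bool {
    let none: Option<&u32> = None;
    // SAFETY: `Option<&u32>` and `*const u32` have the same size, and every
    // bit pattern is a valid raw pointer value.
    let raw: *const u32 = unsafe { transmute::<Option<&u32>, *const u32>(none) };
    raw.is_null()
}
```

This is the niche that a zero-valued `&T` would collide with, and it is why the proposal leaves `&T` and `Option<&T>` untouched.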

11 Likes

What you're saying is that we need to sacrifice the outbound soundness of the Rust AM, that is, its faithfulness to actual hardware, for optimisation.

I have no idea what you mean. What is "outbound soundness"? What is "soundness of Rust AM"? What is "faithfulness of Rust AM to actual hardware"?

1 Like

By 'outbound soundness', I mean the faithfulness of the Rust AM's assumptions to the actual hardware, as opposed to 'inbound soundness' within the AM's own rules. On targets where 0x0 is a valid address, the Rust AM cannot accurately model the hardware even though it is internally sound.

Why is it infeasible for the RAM management system to use the identity map, but disallow allocations from being placed at address 0x0? If alignment is a concern (having RAM start at address 0x1 might be awkward), maybe disallow allocations from being placed in the first 4096 bytes of memory or similar? I understand that RAM may be quite limited, but is it limited enough that those 4kB are non-negotiable?

On the language side, I do not believe null references will ever be permitted, and existing unsafe code likely makes assumptions about Rust allocations never being placed at 0x0. This is, after all, a stabilized guarantee. However, I do think there should be better support for volatile versions of functions like core::ptr::copy and core::ptr::write_bytes. Perhaps there could also be a bunch of functions that allow the given pointer to be null without having volatile semantics. (The problem is that it results in a proliferation of function variants, which feels less than ideal. We may also want volatile atomic accesses, so now we need to duplicate all those APIs; we may want atomic null accesses, so yet more; and so on.)
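A volatile counterpart of core::ptr::copy, as suggested here, might look roughly like the following. This is a sketch only: `volatile_copy` is a hypothetical helper, not an existing core API, and it assumes the source region is readable element by element and that a destination buffer exists.

```rust
use core::ptr;

/// Hypothetical helper: copies `len` elements from `src` to `dst`,
/// performing one volatile read per element.
///
/// # Safety
/// `src` must be aligned and readable for `len` elements, `dst` must be
/// aligned and valid for `len` writes, and the two regions must not overlap.
pub unsafe fn volatile_copy<T: Copy>(src: *const T, dst: *mut T, len: usize) {
    for i in 0..len {
        // `wrapping_add` is used for the source because it has no requirement
        // that the pointer stays inside a Rust allocation.
        let value = unsafe { ptr::read_volatile(src.wrapping_add(i)) };
        // The destination is ordinary memory, so a plain write is enough.
        unsafe { ptr::write(dst.add(i), value) };
    }
}
```

Each element is still a separate volatile access, so the compiler cannot merge the loop into a single bulk copy, and the helper is of no use when there is no spare memory to copy into.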

8 Likes

Rust never claims that pure Rust programs can do everything that a valid computer program running on a concrete machine can do, so this does not feel like a soundness hole. It's much like the borrow checker; the borrow checker accepting code is significant, but the borrow checker rejecting code means the code may or may not be sound, which doesn't say much.

There's definitely some better way to express your message than the word "soundness". "Rust doesn't faithfully represent hardware" is a much more clear stance, I think. (And then we can discuss the extent to which that is a problem.)

8 Likes

On most physical hardware: a) 0x0 is a valid address, but b) 0x0 is special in some way that means that general-purpose memory operations there wouldn't really make sense. (For example, it's often part of an interrupt vector – and if you ever place an incorrect value into an interrupt vector, even for a split second, the processor could potentially end up executing from an arbitrary address in kernel mode, which is so close to "anything could happen" undefined behavior that the distinction doesn't really matter.)

So even if you're doing the sort of low-level programming where you might validly want to touch address 0, you wouldn't be using it as a general-purpose address anyway – your program should generally always be aware of whether it's accessing address 0 or not. And if it is, just do a volatile write or read, which are allowed on address 0. (It probably has to be volatile anyway, and possibly even atomic, in order to prevent whatever special meaning is assigned to address 0 from violating assumptions made by the compiler.)

8 Likes

First: If a target has kibibytes of RAM starting at 0x0 and no paging support, Rust isn't available for that hardware, and that is EXACTLY the problem.

Second: That's the point. API proliferation should be avoided.

Third: null-ptr invalidity can NEVER be avoided; even with unsafe closure, we have no escape.

And about the term "outbound soundness": fair point, I should have chosen faithfulness instead of soundness.

I've mentioned that 0x0 could be a non-MMIO RAM address; not all hardware reserves 0x0 for special usage, even on AMD64.

Moreover, volatile R/W can never be a solution. It's a workaround.

Out of all the processors I know, the one on which 0x0 is least special is the 6502. On the 6502, there are addressing modes that use 8-bit addresses, thus can access only memory in the 0x0000-0x00FF range – machine code instructions using these addressing modes run 1 cycle faster than their "anywhere in memory" alternatives (and, if you specify the address literally in the instruction, are 1 byte shorter). 0x0000-0x00FF otherwise aren't special, e.g. you can access them via normal pointers.

What happens in practice on 6502? Almost everyone uses the addresses in question only for important global variables and for register spills – they're too valuable to, e.g., allow a general-purpose memory allocator to allocate them. It would be rare to have a pointer pointing at them – maybe at a few, but you would know which ones you expected to ever be pointable and which ones you wouldn't. Say you decide to reserve 0x0000 as a spill slot – now you know there'll never be a pointer to it (because you can't form a pointer to a register and a spill slot acts like a register), so the fact that the hardware is physically capable of forming a pointer there doesn't matter.

Now say you're writing a Rust program for some other piece of hardware where address 0 isn't special. There are two cases:

  • If the program isn't low-level enough to be doing all the allocation, memory-mapping, etc., itself, then it wouldn't be using address 0 because allocators don't allocate it and a higher-level program like this wouldn't be able to assume that it's valid.
  • If the program is low-level enough to be doing all the allocation and memory-mapping itself, then it gets to choose its own memory map. You can just choose a memory map where you never need to create a pointer to address 0 (thus all accesses to it are from parts of the code that are aware that they're accessing address 0) and use volatile reads/writes on those instructions.

(And in case you're worrying about a potential loss of efficiency from "an allocator never allocates address 0" – all allocators have some amount of internal state that they can't allocate because they're using it themselves, so if address 0 truly is non-special, you can just have the allocator use it to store its own state, and then no memory is wasted.)
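The idea of keeping address 0 for the allocator's own bookkeeping can be sketched with a bump allocator that works purely on addresses. `Bump` and its methods are hypothetical names for illustration, and the sketch does not show how the allocator would access its state at address 0, which is the point disputed in this thread.

```rust
/// Hypothetical bump allocator over the address range `start..end`.
/// The first `reserved` bytes are kept back for the allocator's own state,
/// so an allocation never begins at `start` (which may be address 0).
pub struct Bump {
    next: usize,
    end: usize,
}

impl Bump {
    pub fn new(start: usize, end: usize, reserved: usize) -> Self {
        // With nothing reserved, the first allocation could land on `start`.
        assert!(reserved > 0, "at least one byte must be reserved");
        Bump { next: start + reserved, end }
    }

    /// Returns the address of a block of `size` bytes aligned to `align`,
    /// or `None` if the region is exhausted. `align` must be a power of two.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round `next` up to the requested alignment.
        let aligned = self.next.checked_add(align - 1)? & !(align - 1);
        let new_next = aligned.checked_add(size)?;
        if new_next > self.end {
            return None;
        }
        self.next = new_next;
        Some(aligned)
    }
}
```

For example, with a 64-byte region starting at 0 and 8 reserved bytes, the first 4-byte allocation is placed at address 8, never at 0.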

I think the problem is that you're assuming "because address 0 isn't special in the hardware that means that software needs to be able to use it as a general-purpose address". But the conclusion doesn't follow from the premise: even if the hardware doesn't treat address 0 specially, all software will have at least some addresses that it treats specially (allocator internal state, stack, register spills, certain sorts of special global variables like the TLS root, etc.), and it can't use any of those addresses as general-purpose addresses. So there are always going to be some addresses that you can't form pointers to or use with the normal std::mem functions. If you have control over your own memory map, just place one of those special addresses at address 0 – and if you don't, you can't use address 0 anyway. In neither case does Rust need to be able to support general pointers to address 0.

(A good comparison: the x86 series of processors has both physical and virtual addresses, and although physical address 0x0 is special, virtual address 0x0 isn't. Nonetheless, all sensible 32-bit and 64-bit operating systems design their memory map in such a way that the address in question is always unmapped – there are plenty of other places to put things and being able to catch null pointer dereferences is just too useful (in particular, the kernel cares about any potential null pointer dereferences in the kernel hitting a mapping error rather than potentially running non-kernel code). 16-bit operating systems care about space to the extent that they would consider using it, but normally virtual address 0 is used to hold things like system call interfaces that need to be accessible in memory but which aren't available for general-purpose use, meaning that 0 can still be used as a null pointer because it would never be used as a general-purpose memory address.)

12 Likes

Good point - in my case, the identity map was a software architecture choice. Yes, I could choose HHDM or something similar. But my strongest argument here is hardware whose RAM layout is a constraint, not a configuration. Arm Cortex-M parts with 2 kiB of RAM and no MMU have no RAM layout to "choose". Telling those users to "just put something special at 0x0" is effectively telling them Rust won't support their hardware.

You still need to choose the RAM layout when programming for such processors! In fact, it's probably even more important, to be able to fit everything in.

In particular, the toolchain needs to know what the memory layout is in order to be able to generate code correctly. (Rust can't "automatically" set up the address space on bare-metal systems – you need some sort of entry point code that at a minimum sets the stack pointer correctly.) Normally this is done using linker scripts, which allow you to outright specify things like what addresses are used for the stack, and put your startup code at the correct hardware address so that the hardware knows how to run it.

On such systems, you probably wouldn't be using a memory allocator at all (it's possible to write one but usually not worth it), and as such you know that pointers will only ever exist to global variables or the stack. So it's usually very easy to set up your conventions for using address space in such a way that you never form a pointer to address 0. (And as I said above, if you are using an allocator, you can just place its internal state at address 0.)

So in a sense, Rust doesn't support this sort of hardware "out of the box" – you need some sort of assembly-code wrapper and a linker script in order to correctly interface your software with the hardware. (You need that in other compiled languages, too, unless the compiler special-cases the specific model of hardware you're targeting.) But when you're doing that, you can make address 0 appropriately special at the same time.

4 Likes

It's "easy" when you have mebibytes of RAM. On embedded hardware like Cortex-M series, every single byte counts. Avoiding 0x0 itself is a cost.

Moreover, C's null pointer representation is implementation-defined. A freestanding C compiler can treat 0x0 as a valid address on targets where it is one. Rust cannot, because the invalidity of 0x0 is hardcoded into the language semantics, not even the target.

you can just place its internal state at address 0.

You'll need to construct a reference to access it with any benefit of Rust's type system, and that's instant UB. Even raw pointer operations like ptr::copy and ptr::write_bytes are UB on 0x0. There is no escape.

Cortex-M a) is 32-bit and b) maps 0x0 to the code memory that is used to boot the system – you aren't going to want to form a reference to that (what would such a reference be useful for?).

You are pointing out a deficiency in Rust, but the deficiency in question is "if you program an entire software stack entirely in Rust, including the kernel and bootloader, there are some points at which you will have to use raw pointers and volatile accesses rather than regular Rust references". I'm one of the biggest advocates of "let's make safe Rust more powerful so that we can use normal type-checked references rather than needing to write unsafe code", but in this case unsafe code is inherently going to be involved – you are configuring a processor, doing that is inherently expected to have weird side effects if done incorrectly, it is very difficult for a computer to know which configurations are sound and which are unsound, thus you will need to write it unsafely. If you think about it, you wouldn't be getting any useful type-system guarantees anyway from having a reference to an important global variable in the kernel or an important memory-mapped hardware address – if you set it to the wrong value, things still go wrong, even if the provenance and lifetime and type are all correct.

And if you're going to need to do something unsafe to correctly set up the hardware (which you are), needing to do something unsafe to access address 0 is not a big burden on top of that – a sensible programmer doesn't use address 0 for general-purpose use anyway because there would be a huge performance loss from doing so (in that it would make null-checks everywhere in the program slower) and because there's no real cost to using it for special-purpose use instead. If you're setting up bare metal from scratch, there are almost necessarily going to be some things you can only do using unsafe code, so whatever you use address 0 for, you may as well make it part of those.

(Note that through all this discussion, I've been assuming that either you have a system with separate code ROM and data RAM or that something about the system inherently prevents you putting code at address 0. If the code and data are in the same address space, you can just make 0 a code address that isn't a function entry point, and then you would never need to form a reference to it or read/write it.)

6 Likes

The scenario you describe - 0x0 as a special-purpose address - is what the Demand section already refuted. My RFC concerns environments where 0x0 is ordinary RAM and every single byte counts, not an interrupt vector or boot ROM.

And your solution of putting the code at address 0x0 is exactly "shifting the language's responsibility onto the user".

This is what you wrote in the Demand section:

This doesn't list any platforms on which being able to form a reference to address 0 would be useful:

  • Your identity-mapped operating system is, presumably, going to be using at least one byte of memory for a global variable or for its own code. Being an operating system and thus able to design its own ABI conventions, it is trivial to put something at address 0 that does not ever need a reference to be formed to it – in practice, all operating systems will use at least some global variables that never move in memory and thus prevent their addresses being used by anything else. (And on many processors, physical address 0 is special to the processor, so if you use an identity map, virtual address 0 will be special to the processor too and thus you would be forced to use it for whatever purpose the processor uses it for, and forming a reference there could be very dangerous.)
  • ARM Cortex-M maps the bootloader's code in the segment starting at address 0, and it gets automatically mapped there by the hardware on startup (from a source that's configured using physical wiring because it can't be configured by the bootloader for obvious reasons). There is no purpose to having a reference to the bytes of code that make up the bootloader – reading it is almost useless and writing it would effectively be a firmware reflash, a very unsafe operation. So on Cortex-M, you are not forming a reference to that address, ever.
  • I've worked with embedded systems with less than 200 bytes of RAM (plus a few kilobytes of ROM). On the system I worked with, address 0 was special (theoretically a valid address, but pointers to it didn't work because it was memory-mapped over part of the logic for dereferencing pointers). If it hadn't been special in that way, it wouldn't have mattered – with RAM that small you are storing everything you can in global variables so that your code can use hardcoded addresses rather than paying the cost of storing the addresses, so I would just have put a global variable there (that was accessed hardcodedly rather than through references) in order to still have a valid null pointer.
  • On bare-metal environments you don't have an OS or runtime support libraries, so you will have to create an equivalent to be able to run code. This cannot by definition be provided by the language – because if it were, it wouldn't be a bare-metal environment any more. Your runtime support library will have to use at least one byte of memory, so you may as well make 0x0 part of that.
  • On AMD64, 0x0 is indeed a valid physical address. Doing a non-atomic write to it would inherently cause a race condition (because the processor can atomically read it at any point, if it needs to service an interrupt). Also, by the nature of the hardware special case for it, it is likely to be written once by the bootloader, once by the kernel, and then never again. Needing to use a special instruction for the purpose doesn't seem like a burden here – and again, you are never going to want to form a reference to it.

I think your Demand section is undermining your own request – you've basically made a request to allow references to address 0, and then given a list of circumstances in which forming such a reference would either be useless or actively harmful. Some situations are inherently doing something unsafe, and as far as I can tell, every situation in which you might want to access address 0 is either inherently doing something unsafe, or else a situation in which you can easily make decisions in such a way that you are accessing some other address instead. Your list of examples is not refuting that at all.

If you want to push this further, I would suggest that you specify a concrete, specific processor and scenario for using that processor for which the following statements hold:

  • it is possible to access address 0;
  • the processor does not give address 0 a special meaning that would make it inherently unsound to give safe code arbitrary unchecked access to read/write it;
  • in the scenario in question, the programmer does not have enough control over the memory map to be able to put something at address 0 that you would never want to form a reference to (like variables belonging to runtime support code or code for startup routines), and the memory map that the programmer is forced to use doesn't naturally put such a thing at address 0 either.

Also state the assumptions you have about a) how programs are loaded and run in this scenario, and b) who or what is responsible for specifying the memory map while the program runs (and in particular how much ability the programmer would have to change it).

So far, you haven't given any such examples – you've just been repeatedly stating without proof that they exist (and referred me to a section containing a number of incorrect examples), which is not very convincing. If you want an RFC to succeed, your motivation section will need to contain at least one correct example, and probably more – otherwise it will look unmotivated.

6 Likes

I have two.

First, RISC-V defines no standard RAM layout, which means the layout is entirely implementation-defined. A conforming RISC-V implementation can place RAM at 0x0, and such implementations exist (e.g. the Micro RISC-V core, 0x0–0x7fb). Whether a particular implementation is a teaching core or a commercial SoC is irrelevant - the point is that the ISA permits it, and Rust should not prohibit what the architecture allows.

Second, the BCM2835 (Raspberry Pi 1, ARM1176JZF-S, tens of millions shipped) maps physical RAM starting at 0x0 per the official Broadcom datasheet. Yes, it has an MMU, but an identity map is a legitimate bare-metal design choice, not a workaround. And requiring every OS developer to remap around a language-level assumption is precisely "shifting the language's responsibility onto the user."
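To make the "language-level assumption" concrete, here is a minimal sketch that runs on a hosted target. It only shows how today's Rust treats the bit pattern 0x0; it does not touch real memory at that address. The helper names `option_ref_uses_null_niche` and `wrap_address` are mine, made up for illustration, not part of any API.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Returns true when `Option<&u8>` is the same size as `&u8`, i.e. the
/// all-zero bit pattern is reserved to encode `None`.
fn option_ref_uses_null_niche() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

/// Wraps a raw address in `NonNull`. Address 0 is rejected, regardless of
/// whether the hardware has ordinary RAM there. (Hypothetical helper.)
fn wrap_address(addr: usize) -> Option<NonNull<u8>> {
    NonNull::new(addr as *mut u8)
}

fn main() {
    println!("null niche in use: {}", option_ref_uses_null_niche());
    println!("0x0 wraps to: {:?}", wrap_address(0x0));
    println!("0x4 wraps to: {:?}", wrap_address(0x4));
}
```

On an identity-mapped BCM2835 the first byte of RAM would fall into the rejected case, even though nothing in the hardware distinguishes it from the byte at 0x4.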

If Rust intends to be a general-purpose systems language across these targets, "sensibility" must never be assumed.

1 Like