[MIR] constant evaluation

eddyb · February 10, 2016, 7:35am

I’m sorry for being late to this discussion, given that @nikomatsakis and I have gone over this in slightly more detail that is available here.

I mostly agree with @matthieum in that allocations are more useful than full-blown IO, but there is a deeper issue: purity.

To be able to use const fn with the typesystem (i.e. [T; ctfe()]), const fn functions must be pure, in the sense that repeated calls with identical arguments must produce identical results.

While it may be possible to design a system where impurity cannot affect type safety, it’s a non-goal for Rust 1.x.

As for claims of complexity, a MIR interpreter capable of evaluating everything but impure/IO code should be simpler than the MIR -> LLVM IR translator and a LLVM JIT/interpreter, and cleaner than the current constant folder (which has been getting better, thanks to @ker).

@scott has started work on an out-of-tree prototype named miri, which reassures me that MIR is great for tooling and that we can do a lot with an order of magnitude less code than, say, the name resolution pass.

Without further ado:

Rust/CTFE Abstract Machine

A proper subset of the Rust Abstract Machine, restricted to deterministic reproducible evaluation.

That is, only const fn functions may be called, which are required to always return structurally equivalent values and produce the same writes to memory reachable by arguments, given structurally equivalent arguments.

C functions are never const fn, as the above property (also referred to as purity) cannot be enforced, although a C-to-Rust transpiler could fill in that gap.

Writes to static (global) memory and inline assembly are also prohibited, and compile-time pointers do not have observable addresses.

The CTFE capabilities offered are similar to those found in C++14 constexpr.

Data layout

The machine works with octets as its fundamental unit of measurement (for sizes, alignments and offsets), which will henceforth be referred to as bytes.
It’s parametrized by the target architecture’s data layout, which consists of:

endianness (the order of bytes in a multi-byte integer value)
size of pointers and usize/isize
alignments of all primitive types

Based on the machine data layout, the layout (size, alignement and field offsets) of aggregate types can be determined.
The layout of non-#[repr(C)] aggregates is implementation-dependent.

Primitive types

Boolean, C-like enum, character codepoint, integer and floating-point values will be represented as exactly one valid value in their respective representable range.

Addresses (pointers, references, all non-local lvalues) will be one of:

integer addresses: x: usize in (x as *T).
Cannot be read/written through, but other operations do work (on the integer value), e.g.:

&(*(0 as *const Struct)).field as *const _ as usize

abstract locations: (region, x: usize) in region.offset(x) with x < region.size.
Can be read/written through, except for writes to static (global) memory.
Can only be compared with another abstract location from the same abstract memory region (the offsets are compared).
No bits of a pointer holding an abstract location may be observed as any other type than a pointer, for which the whole value must be used.

Abstract memory region

A contiguous block of memory with a fixed size, consisting of size constant bytes and a set of non-overlapping special values (undef bytes and relocations).

Undef bytes can be found in uninitialized memory (e.g. mem::uninitialized()) and are also produced by writing an aggregate, at positions not covered by leaf fields (i.e. padding bytes).

Reading undef bytes is prohibited, as code that does this is likely to exhibit undefined behavior at runtime.
Aggregate reads side-step this by only reading bytes covered by leaf fields.

Primitives are read/written at the byte level, and the combined bit pattern must represent a valid value (transmute is no different).

For non-pointer primitives, none of the bytes read may overlap with a relocation.
For pointers, either:

None of the bytes read overlap with a relocation, producing an integer address.
The entire pointer aligns perfectly with one relocation, producing an abstract location, where the constant bytes gives the offset, and the relocation gives the abstract memory region it’s pointing into.

Writing over a relocation is only permitted if the entire relocation is written over in the same operation, replacing it (e.g. a 32-bit pointer relocation can be written over with *mut T, u32 or (u16, u16) but not two u16 writes).

To give an example, the following two abstract memory regions would be produced to hold &[(0u8, 1u16), (2, 3)][1..], for a little-endian target with 32-bit pointers:

Region A:
Data:     04  00  00  00  01  00  00  00
Special:  └──Reloc(B)──┘

Region B:
Data:     00  00  01  00  02  00  03  00
Special:     Undef           Undef

Dynamic memory allocation (optional)

An implementation can provide intrinsics to allocate new abstract memory regions, deallocate them (invalidating dangling references and preventing further uses) and resize them (which can always be done “in place”).

Special care must be taken to ensure that values of types that administer such memory (Box, Vec, String, etc.) do not escape into runtime code while holding a pointer to an abstract memory region.

The implementation can generate runtime allocations from abstract memory regions when a const item “owns” the allocation, although this alone is not enough to get e.g. a &'static str from String CTFE.

EDIT: discussion resulted in the addition of the paragraphs below:

All abstract memory regions reachable from a &'static T can be placed in static memory, with the same restrictions as borrowing in a const item (no UnsafeCell), e.g.:

const WORKS: &'static str = &5.to_string();
const WORKS2: &'static str =
    Box::leak(5.to_string().into_boxed_str());

const FAILS_OLD: &'static AtomicBool =
    &AtomicBool::new(false);
// Arc uses 
const FAILS: &'static Arc<i32> = &Arc::new(0);

No runtime allocations shall be used, only read-only static allocations. static items could also be extended to allow thread-safe shared data, e.g.:

static FOO: Arc<Foo> = Arc::new(Foo {...});

One could only increment and decrement the refcount of FOO, but it would never go below 1, so the allocation being in .data would not be an issue.

The UnsafeCell is more subtle, though, as the following also needs to be forbidden:

static GOOD: Arc<Mutex<String>> =
    Arc::new(Mutex::new(String::new()));
static BAD: Arc<Mutex<String>> =
    Arc::new(Mutex::new(String::from("foo")));
...
mem::replace(BAD.lock(), String::new())

Now you have a runtime String pointing to .rodata.
The root of the problem is a “dynamic” abstract memory region being pointed to from memory owned by the UnsafeCell<String> instance.

I believe we can detect that by finding “mutable ranges” based on the types used to access a memory region.

That does mean, though, that unsafe abstraction providing const fn facilities has to be extra careful: keeping track of an allocation with *mut u8 instead of *mut T may prevent the compiler from detecting UnsafeCell.

We can warn about pointers not reflecting the full size of the allocation they point to (but this breaks Vec, unless that starts using *mut [T] instead, which we might want to do anyway, e.g. to support Vec<DST>), and only point to .rodata in those cases, which would result in page faults instead of unsafety.

The other danger I can see here is querying jemalloc for some information about the allocation of a String and failing (Servo does something like this, cc @Manishearth).

Topic		Replies	Views
How to handle constants in the compiler compiler	18	4887	October 19, 2015
Const-first functions language design	26	1200	July 24, 2026
Thoughts on Compile-Time Function Evaluation and Type Systems language design	31	5789	August 8, 2018
SIMD in `const` easily language design	24	1647	September 4, 2024
Const fn + proc macros language design	29	3900	May 25, 2021