Role of UB / uninitialized memory

What do you mean exactly by "trigger", and can you show an example of this happening?

... I don't know how to put that any more clearly. What do you want to call it when the compiler starts pruning entire control flow paths because they can never be executed without executing a construct whose behavior is undefined?

Sure:


int foo(void);
int bar(void)
{
    int x;
    if (foo())
        x = 3;
    return x;
}

compiles, using GCC 6.3 for x86-64, to

bar:
	subq	$8, %rsp
	call	foo@PLT
	movl	$3, %eax
	addq	$8, %rsp
	ret

See how the return value is unconditionally 3? This optimization would not be allowed if reading an uninitialized variable produced a merely "unspecified" value. If you tag the declaration of foo with __attribute__((pure)) (this means "I solemnly swear that foo has no side effects"), it won't even bother calling foo. And, due to a long-standing pass-ordering problem, you don't get a warning about the use of an uninitialized variable! (The lack of a warning is GCC bug 18501.)

Clang 3.8 does the same thing (the generated assembly language is not quite the same but that's only because it does the stack alignment a little differently), but at least it does warn about the uninitialized variable.
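To connect this to Rust: a hedged sketch of the same situation on the Rust side (using the mem::uninitialized intrinsic discussed later in this thread; whether a given LLVM version folds it the same way is not guaranteed):

use std::mem;

extern "C" {
    fn foo() -> i32; // stand-in for the external function from the C example
}

unsafe fn bar() -> i32 {
    // Rust rejects a plain uninitialized local at compile time (E0381),
    // so the intrinsic is needed to reproduce the situation at all.
    let mut x: i32 = mem::uninitialized();
    if foo() != 0 {
        x = 3;
    }
    // On the `foo() == 0` path, x is undef, so the optimizer is free to
    // return 3 unconditionally, just like GCC does above.
    x
}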


This is not a memory access. Uninitialized locals and uninitialized memory (indirectly accessed through a pointer) have very different properties. In particular, there is no way for the compiler to discern what's on the other side of the pointer (apart from the declared type), so it can't make any optimizations that take that into account.

I understand why you think that, but it's not true anymore. This small modification of the program I posted earlier...

#include <stdlib.h>
extern int foo(void);

int bar(void)
{
    int *x = malloc(sizeof(int));
    if (!x) abort();
    if (foo())
        *x = 3;
    int r = *x;
    free(x);
    return r;
}

compiles to...

bar:
	pushq	%rax
	callq	foo
	movl	$3, %eax
	popq	%rcx
	retq

... with clang 3.8. (gcc6 isn't this clever.)

I suspect clang/LLVM managed to do this by "lowering" the malloc allocation to a local variable, thus converting the program into the previous program, but if there were a distinction between uninitialized locals and uninitialized memory, that transformation would be invalid.


To get this to a somewhat more high-level point, LLVM has undef and poison values that both have very special behavior, and in particular, neither of them corresponds to any particular bit pattern. The sad truth is that we cannot have both “everything is a bit pattern” and keep all the fancy optimizations compilers do. Rust will have to have something similar. miri has Undef, which ironically is more like LLVM’s poison than LLVM’s undef (but that’s good, LLVM’s undef is pretty crazy).
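To make "no particular bit pattern" concrete, here is a hedged Rust sketch (it relies on mem::uninitialized lowering to something undef-like; the exact folding depends on the LLVM version):

unsafe fn undef_is_not_a_bit_pattern() -> u8 {
    let x: u8 = std::mem::uninitialized(); // roughly LLVM's undef
    // If x were some fixed (if unknown) bit pattern, x ^ x would have to be 0.
    // Because each use of undef may be chosen independently, LLVM is free to
    // fold this to any value whatsoever.
    x ^ x
}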

C decided to treat unsigned char* specially to make it possible to implement memcpy. Rust could do the same with u8, but that feels somewhat wrong – why u8 and not the other integer types? And this is why @nikomatsakis brought up this union type, which is supposed to express "either T or Undef":

union Uninit<T> { init: T, uninit: () }
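A hedged sketch of how such a union might be used, assuming the definition above:

fn pass_around_undef() {
    // Create an "undef" byte legally, without reading uninitialized memory:
    let a: Uninit<u8> = Uninit { uninit: () };
    // Moving/copying the union around is fine; we never touch `init`:
    let b = a;
    // let v = unsafe { b.init }; // *this* read would be the UB case
    let _ = b;
}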

I concur: char in C tries to be too many different things at once, and it just causes trouble. std::mem::byte, maybe?

I might be missing something, but if you can't tell statically whether or not something is initialized, how does a union like that help? If you read init from uninitialized memory, it's still UB, and you have no way of knowing whether it is.

Perhaps I should clarify what I’m talking about. Let’s go with a very specific example.

#[inline(never)]
fn memcpy64(dest: &mut [u64], src: &[u64]) {
    for i in 0..src.len() {
        dest[i] = src[i];
    }
}

As far as I can tell, when this function compiles, the resulting machine code is always correct, even if the memory that src points to hasn’t been touched since the motherboard first powered on. Here, the function must never be inlined because there is no other way to express the constraint that the compiler makes no assumptions about src, but it shows that “uninitialized memory” is a dubious concept. Making those reads “UB by definition” doesn’t seem well motivated. There is no actual “uninitialized memory” in a running computer.

The idea is that you wouldn't read the init field. You have a memcpy somewhat like this:

fn memcpy64(dest: &mut [Uninit<u8>], src: &[Uninit<u8>]) {
    for i in 0..src.len() {
        dest[i] = src[i];
    }
}

I have to say though that this does feel pretty hacky. Not sure I am happy with this as the work-around to access Undef.

The machine code being correct doesn't mean anything except that the compiler is not "smart" enough to notice all instances of UB.

That's true. However, uninitialized memory is still a very useful concept in describing the behavior of programs, in particular if we want to consider our programs at a higher level than assembly languages do. I suggest you do some Googling about LLVM's poison and undef; there are numerous examples out there explaining why the LLVM developers considered it necessary to add these concepts even though they do not "exist" in a running computer. A brief search brought up this note and this first of a series of blog posts, for example.

In a running computer, pointers and usize are also "the same" type. If we constrain ourselves to what things are in a running computer, there are many valuable optimizations that we cannot do. For example, we could never move a variable access across a function call -- after all, the unknown function we call could "guess" the address of our local variables in memory, and access them. In a running computer, nothing stops it from doing so. But we don't want to permit functions to do that, precisely because we do want to perform optimizations. To this end, when we specify the behavior of programs, we imagine our computer to have "extra bits" that we use to keep track of things like whether something is a pointer or an integer, or whether memory is initialized. I also recently wrote a blog post motivating this kind of "extra state": https://www.ralfj.de/blog/2017/06/06/MIR-semantics.html
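A small sketch of the optimization mentioned above (the function names are made up for illustration):

fn example(unknown: fn()) -> i32 {
    let x = 42;
    unknown(); // on real hardware, this call could "guess" the address of x
    // With the "extra state" semantics, the compiler may assume the call
    // cannot touch x, and fold the whole function to `return 42`.
    x
}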

"Really" high-level languages like Java have a much easier time with all of this because they don't allow programmers to ever observe things like uninitalized memory. "Really" low-level languages like Assembly do not permit interesting optimizations, so they can get away with a very bare-bone kind of representation. Languages like C and Rust, on the other hand, want to eat the cake and have it, too -- they want to do optimizations based on high-level intuition while at the same time permitting the programmer to observe low-level details (like the fact that padding bytes exist). Properly doing this is a hard and unsolved problem.

The nomicon gives an exhaustive list of UBs in Rust, and it’s pretty short. “Reading uninitialized memory” is the odd man out. Every single one of the remaining sources of UB is motivated by pure technical necessity, but “reading uninitialized memory” is not UB by necessity; that one is an arbitrary choice. It’s perfectly possible, even simple, for rustc to emit code in which reads of uninitialized memory are not undefined.

Think about it. It’s already impossible to read or reference undefined locals on Rust level, except for that one ugly intrinsic. If the intrinsic didn’t exist, Rust would be one of those “really” high-level languages already.

On the other hand, whether or not the heap allocator returns “uninitialized” memory is a completely arbitrary piece of metadata, and while it’s certainly unsafe to access that memory (interpreting it as certain types is UB), there is nothing that would make it inherently broken.

Edit: To clarify, I’m not arguing against the concept of uninitialized (heap) memory. I’m arguing that the blanket statement “reads of uninitialized memory are UB” doesn’t make sense. It should be unsafe, not UB. The result of std::mem::uninitialized() is the obvious exception, and should be considered as a separate category.

I don't think that's the case. Rust permits you to do all sorts of nefarious things with pointers, things that involve observing the offsets of fields and how structs are laid out in memory. Have a look at the code of Rc::from_raw. Another example is HashMap, which allocates one large chunk of memory and then splits it into three pieces, namely arrays of hashes, keys, and values. This is only possible if you take a low-level view of memory, where everything is ultimately just a bunch of bytes.
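To illustrate the kind of offset arithmetic involved, a minimal sketch (a hypothetical layout, not the real RcBox; the hard-coded offset assumes two words of header and no extra padding):

use std::mem;

#[repr(C)]
struct FakeRcBox<T> {
    strong: usize,
    weak: usize,
    value: T,
}

unsafe fn box_from_value_ptr<T>(ptr: *const T) -> *const FakeRcBox<T> {
    // Walk back from &value to the start of the containing box. This is only
    // meaningful if memory is ultimately just a bunch of bytes.
    let offset = 2 * mem::size_of::<usize>(); // assumed byte offset of `value`
    (ptr as *const u8).offset(-(offset as isize)) as *const FakeRcBox<T>
}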

Oh, I see. :slight_smile: I misunderstood you then, sorry. What is your stance on reading uninitialized memory at type bool? That type is essentially an enum, which typically guarantees that the value is either 0 or 1. If you now pass an uninitialized bool somewhere, is it okay for that to be UB?

The point that mem::uninitialized and malloc yield "the same kind of" uninitialized memory has already been made above in this thread. Existing compiler transformations will happily turn a malloc into a local variable if they can prove that the allocation is not used after the function returns. I also feel there's little gained from distinguishing the two.

Definitely UB. There is no limit to the harm you can do by breaking type invariants. This is not strictly limited to fundamental types or uninitialized memory, either: transmuting from an initialized byte will do the same thing, and plenty of library types have internal invariants that will cause UB if broken.
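For instance, a sketch of the bool case (the function name is made up):

fn broken_bool() {
    let raw: u8 = 2; // a perfectly initialized byte
    // UB: bool's invariant is that its byte is 0 or 1.
    let b: bool = unsafe { std::mem::transmute(raw) };
    // The optimizer may assume b is 0 or 1, so code branching on b can
    // misbehave arbitrarily -- no uninitialized memory involved.
    if b {}
}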

Rust doesn't use malloc() (not necessarily, anyway), and LLVM doesn't either, AFAICT. It's just an external function that clang in particular has special optimizations for. That doesn't mean it's a good idea for Rust to treat the heap the same way.

I mean, what are the odds that your code just happens to not require the heap allocations you are making, and you never noticed?

Fewer sources of UB in tricky unsafe code is IMO a gain well worth the distinction.

What I meant is, it doesn't allow any way to create undefined values you can legally read (apart from said mem::uninitialized(), and putting aside the ambiguity of heap allocation). Of course you can always use unsafe code to create pointers to places on the stack you shouldn't see, but that's UB by virtue of breaking the aliasing rules, I think.

The idea of treating heap memory and stack memory differently w.r.t. UB is an interesting one. It goes completely against my intuition, but it took me a while to understand why. I believe the reason stack and heap memory should be the same in this regard is that all memory should be interchangeable. I don't mean that the optimization discussed earlier (demoting mallocs to locals) should be legal; I mean that I want to choose how I allocate memory solely based on considerations like allocation lifetime, performance, and availability (e.g. whether I even have a heap), not by language-lawyering about UB.

Especially when I'm doing low-level, unsafe work where I treat something as a bag of bits and do dangerous things with it, I don't want to care where in memory those bits reside. I have enough trouble getting the actual bit bashing right. I don't want to track down a misoptimization caused by moving a temporary buffer from a Vec to a stack-allocated array (or the other way around).

The virtue of treating heap and stack alike also shows elsewhere: your proposed memcpy64 is only well-defined (under the proposed "reading uninit heap memory is okay" rule) if it copies from a heap allocation into another heap allocation. IIUC, the inline(never) is supposed to force the compiler to conservatively assume the heap-to-heap case, but that's not how this works. Language semantics must be defined independently of "what the compiler can see", and if you say "reading uninitialized bits from the stack is UB", then calling memcpy64 with stack buffers as arguments is UB, period. (And this will inevitably be exploited by a compiler author trying to squeeze a few percent out of a benchmark suite.) So a rule that distinguishes stack and heap memory actually makes it harder to write a correct memcpy, because it's easy to write something that's only UB for stack memory (i.e., where the UB is even harder to observe).

Of course, one could invent arbitrary semantics aimed at preventing certain reasoning steps by the compiler. For example, one solution would be to draw a line at function boundaries (perhaps only those that are inline(never)) and say that pointer arguments are to be treated "as if" they referred to heap memory. Besides being very inelegant and throwing the whole object model into disarray, this will inevitably have fallout for the compiler's ability to analyze and optimize. So I am not a fan of this approach either (at least in principle; I won't rule out that there's some twist that is relatively simple and brings a large advantage).


PS:

That can easily happen in overly general code spread over multiple functions. But aside from the specifics, it doesn't sound very appealing to rule out optimizations just because it seems like good code shouldn't need them.


That's safety vs. definedness again. The language should not "guess" your types' semantics - if you don't use any "magic" attributes, the language should only care about whether your types are "syntactically" valid - e.g. you should be able to use an RcBox to store 2 random words (as long as you don't call any Rc functions in a way that would crash).

That's not what I said; you are attacking a strawman. All memory is the same, but pointers have different sources, and they don't necessarily lower to memory accesses. Specifically, if you reference a local, LLVM is allowed to optimize the memory access away completely in certain cases. In order to do so, it must know statically which value is in the memory, which basically means that the memory is allocated and accessed in the same function (that's why inlining matters in my example).

The problem is that if LLVM knows that the memory is not written to before it is read, it can do anything it wants with the access. But this is only doable with memory allocated by alloca, and the only way to achieve that in Rust is std::mem::uninitialized.

Heap allocation (not heap memory) is different because it's an external function with no special LLVM semantics. Defining heap allocations to be uninitialized is an explicit special case in Rust's definition (lawyering by the language, as you put it). In fact, all memory coming from the heap allocator is just subdivided regions of initialized memory provided by the OS. There are no LLVM optimizations taking place, and Rust-specific optimizations don't benefit from uninitialized reads being UB, since any such optimizable code would almost certainly be invalid by definition, not to mention that most uses of the heap go through safe abstractions anyway.

If we're only talking about accesses, then heap vs. locals (or, if you prefer, the means of memory allocation) is not inherently relevant. The compiler is, and absolutely should be, allowed to remove (or introduce, or shuffle around) memory accesses no matter the source. More importantly, it's absolutely possible for LLVM to deduce that a load through a pointer yields uninitialized bytes even without seeing where that memory is allocated. Consider this function, for example:

unsafe fn make_and_read_undef(p: *mut u8) -> u8 {
    let u: u8 = std::mem::uninitialized();
    *p = u;
    *p
}

Even leaving aside any knowledge of uninitialized memory, it ought to be possible to optimize that code into this:

unsafe fn make_and_read_undef(p: *mut u8) -> u8 {
    let u: u8 = std::mem::uninitialized();
    *p = u;
    u // <- changed
}

... which has the effect of making the memory access *p yield uninitialized bytes, regardless of how and where it was allocated. So as long as "all memory is the same" and there is some source of uninitialized memory, any memory can be made uninitialized, unless we start randomly restricting useful optimizations such as store forwarding.

Again, whether something is UB or not is a semantic question, and so if we want to make certain operations defined or not, we need to lay out the criteria in terms of Rust semantics. What some compiler does or doesn't know, and what it does with that knowledge, is only relevant when testing whether said compiler is faithful to the semantics. These two concerns only intersect when it comes to judging how practical some proposed semantics are.

I tried to guess at what semantics you are proposing by reading it as "reading uninit stack memory is UB, reading uninit heap memory is fine" but apparently that was wrong. I apologize and would like to learn what semantics you are proposing. I believe that would be easier for me if you explained these semantics separately from how they can or cannot be implemented on LLVM.

This is not entirely true: LLVM knows about the symbol malloc (among others) and will treat it as the C function of the same name. So Rust code using the system allocator will also be affected by this, assuming some interprocedural analysis or inlining (which we'd like to have, at the very least to avoid additional overhead from wrapper functions compared to calling malloc directly; zero-cost abstractions and all that jazz).


To add to what @hanna-kruppe said, the C standard explicitly says that memory returned by malloc is uninitialized: "The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate." The motivation for this is to abstract away from details like allocating memory from the OS, and describe what all C programs can rely on when calling malloc.

(Where "it" is "Rust".)

That's not entirely true: if you write something like let x = Struct { f1, f2, f3 };, the padding between the fields can well remain uninitialized. Still, it should be legal to pass &x to memcpy and access mem::size_of::<Struct>() bytes.
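Concretely, a hedged sketch (hypothetical struct and function names; the exact padding depends on the target, but a layout like this is typical):

use std::{mem, ptr};

#[repr(C)]
struct Padded {
    f1: u8, // on most targets, 3 padding bytes follow here
    f2: u32,
    f3: u8, // and 3 trailing padding bytes here
}

fn copy_out(x: &Padded) -> Vec<u8> {
    // Copies mem::size_of::<Padded>() bytes, including the (possibly
    // uninitialized) padding bytes.
    let mut buf = vec![0u8; mem::size_of::<Padded>()];
    unsafe {
        ptr::copy_nonoverlapping(
            x as *const Padded as *const u8,
            buf.as_mut_ptr(),
            mem::size_of::<Padded>(),
        );
    }
    buf
}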

Right, I kind of internalized what @zackw said

I can't see why anyone would want to disagree with this, so I stashed it away under "resolved issues" in my head.