Formal specs and optimizations in general

OK, then lets take a quick vote. Is making a fully formal spec too much effort?

@RalfJung, I think most people here know about your work with RustBelt; what do you think would need to be done to formalize LLVM IR, and then base Rust off of that formal spec?

For the record, I think that while formalizing LLVM IR would be difficult, it is the right thing to do.

As an alternative, is it possible to develop a new, minimal IR that LLVM IR (and Rust) can be defined in terms of? Would that be easier to develop and define (and more importantly, mechanically check with a theorem prover like Coq)?

The question is mostly one of working with the LLVM community to resolve the ambiguities and contradictions in the current specification. We made a concrete proposal for a spec of the subtle parts of LLVM IR that are involved in the discussion here, but this is probably not the spec -- it is just a basis for discussion.

This process is tedious and slow. For example, getting freeze into LLVM took 3 years from when this paper got published. But it is unavoidable; there is no way to just say "here's a formal spec of LLVM, that's it"; this cannot be imposed onto the LLVM community, it needs to be worked out with them. I barely know anyone in that community, so I am not the right person to do that work. Nuno and John, and other people like Juneyoung, are doing great work here, but it takes time.

Officially formalizing these aspects LLVM IR only makes sense once the informal spec is unambiguous and free of contradictions. At that point, formalizing it e.g. in Coq is a matter of putting in the work, which could take a PhD student few months to a year depending on their prior background and how far the spec goes in covering all of the IR. A formalization on its own is not terribly useful either, it needs to come with a way to test that the implementation matches the spec, ideally one cam prove some theorems... there's many things one can do here, but I think the time is not yet ripe for that.

Summary: Formalization doesn't magically make these hard problems go away, and in particular a formalization that makes no attempt at actually matching what LLVM optimizations do is pretty much useless -- and what LLVM optimizations do is not self-consistent, so LLVM needs to change before a formalization can have any chance of matching what LLVM does. This requires working closely with the LLVM community.

I agree with the earlier parts of your post, but the last paragraph is highly speculative. What about ((x+3)-x)-3? Now add other arithmetic and bitwise operations to this, and these questions become quite complicated. You are basically suggesting to turn part of the program into a symbolic computation and decide something about it; I can feel undecidability lurking around the corner. So if the goal is to keep our semantics decidable, I am far from convinced that it is "possible".

And that is not even considering the problem of tracking this symbolic computation through memory or whatever other means of communication there are. How are symbolic names even assigned to values at runtime? What if the program writes x to a file and reads the content back as y and then computes x-y? Your proposal raises more questions than it answers. :wink:

It seems much more acceptable to lose such transformations (converting between control and data dependencies, arithmetic simplifications) on pointer types than on integers (and you seem to agree, based on your last post). That's why I think "provenance on pointers only" is a reasonable trade-off.


I agree with all of this:

and with this:

but not with this:

The difference is that the former defines what is true or false. With work, you can prove (or disprove) a large number of statements1. The latter is a set of relations that need to be proven over the (formal) IR. At the end of the day, an optimization is a mapping from some input to some output. If you can prove the optimization is correct over the formal IR and its rules, then all2 you have to do is prove that the implementation correctly matches the optimization.

In short, a formal IR is more fundamental than, and I would argue, more useful than the LLVM optimizations.

1 Yes, there are such things as undecideable statements, but based on the useful results mathematicians have successfully churned out for millenia, I suspect that the body of decideable problems is large enough that we can profit from a formal spec and a logic over it.

2 For incredibly hard values of all.

1 Like

You're right, I wasn't thinking about it clearly enough, and explained myself poorly. Let me try again.

I guess what I'm thinking about is more along the lines of what I messaged you about privately earlier; if we had a formal IR, then we could use it as the hub of a hub and spoke of language definitions. We could define Rust (and any other language) in terms of this formal IR (FIR). Compiling would be Rust -> FIR -> LLVM IR. Proofs could be done using the FIR, but actual compilation would be done via LLVM IR. The mapping between FIR and LLVM IR would have to be constantly maintained to deal with any changes in definitions of the LLVM IR, but assuming that the FIR was designed properly, it wouldn't change.

The issues with this are obvious:

  • What, yet another IR‽‽‽
  • Is it actually possible to capture all the concepts we need in a formal IR?
  • Would it be more useful than just trying to use a theorem prover directly?
  • What happens if the LLVM IR shifts such that we can't represent a concept in the FIR? Is it possible to extend it without breaking it?
  • etc., etc., etc.

I don't know the answers to any of those; the only thing that I know is that proving things on a shifting and ambiguous foundation doesn't really work.

I think what's needed is to replace the notion of optimizations which aren't allowed to affect the behavior of correct programs, with optimizations which would be allowed to make certain specified transformations in the absence of anything that be specified as blocking them, without regard for whether they would affect program behavior. If transformations were specified as creating barriers to any other optimizations that would be incompatible with them, then optimization steps could safely be performed in any order even if transformations introduced thereby would be incompatible (since the transformations in the earlier steps would generate barriers to any incompatible transformations later steps might otherwise attempt).

For example, an IR could include a comparison operator with semantics that would allow a compiler to treat any comparison between two pointers that were known to be based on disjoint objects as unconditionally yielding false, and also specify that if two addresses are actually compared and found to be unequal, accesses to the storage thereby may be treated as unordered, provided that the output from the first transform included a barrier to any inferences based upon the presumed value of the comparison.

Basing the legitimacy of transformations upon program structure and the presence or absence of anything that would block them would make it much easier for a compiler to identify legitimate transformations and avoid illegitimate ones, while at the same time expanding the range of semantics available to programmers. Further, such an approach would allow many optimizations to be applied to correct programs which would otherwise only be applicable to useless ones. If parts of a program's output will be meaningless (common when writing binary files), allowing a compiler to substitute different meaningless data may allow program requirements to be met more efficiently than would be possible if it had to compute the meaningless data using a precise sequence of steps.

Yep, that was what I meant.

This would be two bug reports then, I guess..
I can do those tomorrow.

1 Like

You need to put some bounds on what optimizations are allowed to do. It is absolutely crucial that as a programmer, I can look at the source code of a program, reason about its correctness, and then rely on the compiler to maintain that correctness during compilation. This is the fundamental, and crucial, property we get from saying that optimizations must not change program behavior.

So, if you propose to drop that notion, you have to propose something else that still lets us be confident that the compiled program will do what we want it to do, without having to inspect its assembly code.

As the examples in my post shows, basing whether or not a compiler is correct on the idea that "it may perform this set of optimizations" is a very bad idea: even a small set of optimizations can quickly lead to unexpected program behavior.

So, I agree with your statement -- that proposal would make it much easier for a compiler to argue that it is correct, because it becomes correct by definition. However, it also makes the compiler useless, because there is no longer any reasonable way for the programmer to predict the behavior of the final program. What your proposal (if I understood it correctly) does is to basically just say "this optimization is correct because I say so; if it breaks your program, tough luck".

Putting it differently: clearly a compiler that permits the three optimizations I used in my blog post is bogus. I can change program behavior in a way that the final program does not what the programmer intended it to do. How do you propose we make sure that the compiler is not bogus, if not by saying "it must not change program behavior"?

Speaking of which... "it must not change program behavior" is not actually what we expect from a compiler. The exact phrasing is more along the lines of "it must not introduce new behaviors". So it is certainly possible to write a spec which says "operation X may do A or B", and then the optimized program will always do A, but in debug mode it does B. Even running the same operation twice in a single build of the program can do A once and B the other time (maybe because the optimization only kicked in in one case). This is called non-determinism, and it is a crucial tool to avoid overconstraining a specification. Maybe what you are proposing is to use non-determinism more liberally in specifications. I can see that point, but as an abstract comment it is not very helpful -- non-determinism has benefits and costs, so it needs to be evaluated case-by-case for each concrete proposal of making some operation non-deterministic.


You need to put some bounds on what optimizations are allowed to do.... So, if you propose to drop that notion, you have to propose something else that still lets us be confident that the compiled program will do what we want it to do, without having to inspect its assembly code.

I agree there should be bounds on what optimizations are allowed to do. The problem is that present discussions are focused on determining the circumstances where optimization may be observably affect the behavior of a program, while failing to recognize any limits as to what may happen when such situations arise.

Returning to the endless loop example, suppose a program is subject to the following constraints:

  1. It should behave usefully when possible ('should' rather than 'must' to avoid having to precisely define "when possible").
  2. When it can't fulfill its primary purpose, it must behave in a fashion that is at worst tolerably useless.

If receipt of a certain input causes a program to get stuck in an endless loop, that might render the entire execution useless, but generally not intolerably worse than useless. If one gives a compiler carte blanche to behave in arbitrary fashion if a program's inputs would cause it to get stuck in an endless loop with no side effects, then the only way to meet requirements would be to ensure that every potentially-endless loop has side effects. If instead one allows that a compiler may defer or omit the execution of certain loops without regard for whether they terminate, without granting special permission in cases where they do not, then compilers would be able to exploit such license much more often within correct programs.

As another example, consider something like:

struct string63 { char st[64]; } x,y;
void test(unsigned number)
  struct string63 temp;
  sprintf(, "%u", number);
  x = temp;
  y = temp;

Following a call test(23), what if anything should be guaranteed about the last 60 bytes of and One could interpret the Standard as saying that behavior is undefined because code copies temp before all portions thereof have been written, but requiring that a programmer assign values to every byte of temp before copying it would hardly make the code more efficient. If one didn't regard the above as invoking UB, however, the fact that an Indeterminate Value is "either an unspecified value or a trap representation" would imply that code must behave as though it stores the same 64 byte values to x and y.

Suppose instead one allowed a compiler to, at its leisure, omit actions that would copy parts of an automatic object whose value is indeterminate. The effects of such an optimization may be observable if e.g. a program were to store different values into[63] and[63], call test(23), and then compare[63] and[63]. If, however, the fact that the values are different would not prevent a program from satisfying its application requirements, then both the program and optimization should be regarded as correct despite the fact that the optimization affected program behavior.

Putting it differently: clearly a compiler that permits the three optimizations I used in my blog post is bogus. I can change program behavior in a way that the final program does not what the programmer intended it to do. How do you propose we make sure that the compiler is not bogus, if not by saying "it must not change program behavior"?

First of all, I think the integer-to-pointer casts are a needless complication. Both clang and gcc will perform the "optimization" even if all such casts are removed.

Otherwise, I'd suggest that it may be appropriate to have multiple ways of comparing pointers, which the utility generating LLVM code could select based upon its semantic requirements:

  1. Most strongly specified would say that if two pointers identify the same address, they must compare equal, and if they don't they must not, but an implementation may not observably use possibly-coincidental equality to justify assumptions about the sets of objects that are reachable using the pointers. This should generally be the default.

  2. Next would say that if two objects are based on different allocations but have the same address, each individual action that compares them may arbitrarily report them equal or unequal, but generated code must behave in a fashion consistent with each individual determination. If generated code decides they are never equal, and thus omits the comparison and code which would performs some action as a result of them comparing equal, it would be entitled to ignore the possibility of such action occurring; if, however, it generates code which would perform some action if the pointers compare equal, it must allow for the possibility that the action might occur.

  3. Weakest would be to say that if a block of code will only be executed if pointers are compared and found to be equal, a compiler may within that block regard them as freely substitutable. This appears to be what clang and gcc actually do.

Each stage would allow some optimizations that would be precluded by the one above, but the later forms would be inadequate for some tasks.

This is called non-determinism , and it is a crucial tool to avoid overconstraining a specification.

In order to support it, however, an abstraction model needs to recognize the concept at a fundamental level. I'm not sure whether it is clearer, however, to say that if x holds Indeterminate Value and y, z1, and z2 are all automatic objects whose address has not been observed, then y = (x & 3) + (x & 9); z1=y; z2=y; would set y, z1, and z2 to the non-deterministic union of values {0,1,2,3,4,8,9,10,11,12}, or whether it would be clearer to say that a compiler may at its leisure process each assignment by either computing some value and storing it, or by causing the target to be treated as a symbolic shorthand for the right-hand expression, but with x replaced by "any value"). The two descriptions would both allow for the same behaviors, but I think it would be much easier to validate a compiler design against the latter description.

A major difference between pointers and integers is that when an address is produced via pointer arithmetic, exactly one of the operands will be a pointer. If one were to specify that the result of such arithmetic is "based upon" that pointer, then there would be no ambiguity about what p1+(p3-p2) would be based upon, even if a compiler happened to know that p2 and p1 were equal. If code were to execute p1[p2-p1] = 5;, a compiler could process that by storing 5 to the address given in p2, but because that address is formed by adding an integer expression to p1, it would need to record that the access was produced using a pointer based upon p1.

The most common scenarios where optimization could benefit from attempting to carry provenance through pointer/integer round trips are those which use such conversions to force pointer alignment. The need to carry provenance through such conversions could be eliminated, however, if in places where their performance impact would be objectionable, programmers rewrite constructs as something like T *alignedBy16 = (void*)((char*)somePtr + (15 & -(unsigned)somePtr). Such rewrites might be a nuisance, but a tool could flag all places where programmers appear to be performing pointer-int-pointer conversions for such purposes, allow programmers to indicate that such a rewrite would be acceptable, and apply the change if so. If one were to specify that inspection of just the bottom part of a pointer's representation need not be regarded as "leaking" its value, it would be fairly easy for a compiler to recognize that a construct like the above would yield a pointer based purely upon somePtr, and would not require that compilers treat future synthesized pointers as though they might be derived from somePtr.

I think formalizing llvm ir, and defining rust in terms of that, is a bad thing to do. A rust implementation should not be required to rely upon llvm, so the specification itself should not be defined in terms of it.

Rust does 100% need a formal specification of some kind, at least in my opinion. I don't think the fact that pointers are pointers changes that.

1 Like

You're right. I actually came to a similar conclusion earlier after talking with @RalfJung:

Yeah, I actually feel like the pointers question is a just a symptom of the problem. The primary cause is that without a formal definition, it's hard to do formal logic. Floating point is also a hard problem. Memory is a hard problem (different levels of caches with different accessibility, memory mapped I/O, memory mapping different address spaces together, etc.) In short, there are an endless series of hard problems, with pointers being just one small facet of the problem.

I think formalizing llvm ir, and defining rust in terms of that, is a bad thing to do.

Languages and dialects which regard Undefined Behavior as a basis for upstream reachability analysis are unsuitable for any purposes where they may be exposed to malicious data, or where malfunctions might have worse-than-useless consequences. Although C has never been so much a single language as a collection of dialects which are suitable for various platforms and purposes, the designers of LLVM have based it around a dystopian dialect which treats every decision by the Standards Committee to waive jurisdiction over some action as mandate that programmers must avoid that action at all costs, any program that fails to do so is defective, and that compilers should feel free to assume programs will never receive input that would cause such actions to be performed.

What is ironic is that such "optimizations" are not only extremely expensive and difficult to perform soundly, but they're often counterproductive except in cases where anything a program might do in response to maliciously-crafted input would be equally tolerable. If some integer calculation will never overflow when a program receives valid input, and any value the calculation might produce in side-effect-free fashion would be equally acceptable when the program is given invalid input, which way of treating overflow would allow optimal performance:

  1. Configure the compiler so that all integer calculations wrap in precise two's-complement fashion.

  2. Require that the programmer manually write code to ensure deterministic behavior in case invalid inputs would cause overflow, and behave in arbitrary worse-than-useless fashion if any input is received for which the programmer has not made accommodation.

  3. Allow the compiler to choose freely from among several ways of performing integer calculations, whose non-overflow behaviors would match but whose behaviors in case of overflow might differ in "weird" but bounded ways, without regard for whether the calculations would overflow or not.

LLVM is based upon the second concept, but to my mind it is by far the most useless of the bunch for any program whose behavior would be required to abide by at least some constraints for all possible inputs.

The exact phrasing is more along the lines of "it must not introduce new behaviors".

Many optimizations involve performing operations in a sequence other than what was requested. The popular way for language rules to allow such optimizations is to attempt to specify that in cases where the behavior of performing X and then Y might differ from that of performing Y and then X, compilers may behave in completely arbitrary fashion.

That approach has at least two major problems:

  1. The first, which often renders it unworkable, is that it is extremely difficult to 100% accurately characterize the situations in which the order in which X and Y are performed might be observable, without needlessly restricting the range of useful actions available to the programmer.

  2. The second is that if the only way to guarantee anything about program behavior would be for the programmer to force X and Y to occur in a particular order in all circumstances where the order might be observable, even if both orderings would be equally acceptable, the need for programmers to force the ordering will negate any freedoms the rules were intended to give the compiler.

If one specifies that certain operations may be arbitrarily reordered in the absence of barriers that would block reordering, but also that certain transformations may introduce barriers to further reordering or other transformations, saying any program behavior that could result from allowable combinations of reordering is correct would a much more straightforward way of describing all possible correct outcomes than anything else I can think of. Further, many rules regarding optimization involve trade-offs between the range of semantics available to a programmer and the range of circumstances where compilers will be able to apply optimizations. Specifying conditions when compilers may reorder certain constructs, and when they would be forbidden from doing so, would make it much easier to make such trade-offs intelligently. The notion that one should restrict the semantics available to programmers to try to facilitate forms of optimizations that might be invented in future represents a form of "premature optimization" which goes far beyond even the craziest kinds that led to the "premature optimization is the root of evil" quote.

Addendum: A fundamental problem with optimizations that regard all UB as equivalent is that the only way to validate that an optimization is correct is to prove that the resulting program's behavior will be defined for all inputs where it's not possible to prove that the original program's behavior was undefined. Unfortunately, there are a lot of combinations of programs and inputs which can neither be easily proven to invoke UB, nor easily proven not to have defined behavior [if some aspects of program behavior are Unspecified, proving that a program invokes UB would require finding an allowable combination of choices that would result in UB, while proving that a program's behavior is defined would require proving that none exist]. While such proofs may be possible in "toy" scenarios, many programs are far too complicated to make them even remotely practical.

I am not sure what exactly you mean by this. Usually, there are no limits. Most of the time, when an optimization changes program behavior, then (in particular when combined with other optimizations), it can change program behavior in arbitrary ways. (In my example: imagine replacing print(q[0]) by if (q[0] == 10) { ... }.) So I think it is somewhat pointless, once an optimization was found to be behavior-changing, to analyze this further -- the optimization is buggy under the current Abstract Machine spec, so either the spec or the optimization needs to change.

Maybe there are other ways to build a compiler, I do not know. Maybe there are other ways to build a bridge, different from what engineers are currently doing. But the fact remains that this is a way that works, and in the case of compilers, no other way has been proposed and demonstrated to work so far. So I propose we remain open for alternatives, but also we make a concerted effort at applying the one methodology that we actually know will work.

This is already the case. There is a theorem one can show about undef/poision along these lines. IOW, what you are saying perfectly fits into the paradigm of "optimizations must not change program behavior".

I disagree rather strongly. Any description in which "the compiler" is even mentioned is much harder to work with for what I consider the primary purpose of a specification: to enable programmers to reason about what their code can do. You are basically asking programmers to imagine every possible compiler and every possible sequence of choices that it could make, and then to consider all of them and make sure the code is correct everywhere. In contrast, precisely specifying a single (possibly non-deterministic) Abstract Machine gives a very concrete model that programmers can follow.

Your proposal is extremely compiler-centric. I don't think that's good. Programming languages are for programmers, and so specifications should be programmer-centric. It is not okay to off-load all the work of "imagining what the compiler might do" onto programmers.


I am not sure what exactly you mean by this. Usually, there are no limits .

Usually, there are, which is a problem because some particular useful optimization would require a guarantee that two constructs behave identically except in certain ways that won't matter, and another would require the ability to replace a construct with another whose behavior might differ in some corner cases, failure to recognize the concept of optimizations might affect program behavior, but only in limited ways, would make it impossible to combine the two optimizations safely.

Further, jump-the-rails UB makes it needlessly hard to validate the correctness of what should be some fairly simple transforms. For example, if C1 is some condition that may be slow to evaluate, replacing (if (C1 && C2 && C1) with if (C2 && C1) may easily improve program performance, but would only be safe if C2 could be guaranteed never to invoke Undefined Behavior. If a huge range of constructs invoke Undefined Behavior, that may be difficult. If, however, one were to specify that within code that is only executed when some condition is true, a compiler would be entitled to make certain assumptions about other related things, then it would be much easier to reason about such substitutions. If the behavior of C1 wouldn't be affected by making its evaluation conditional on C2, and evaluation of C2 had no side effects except for its effects within code whose execution is conditioned upon it, then replacing if (C1 && C2 && C1) with if (C2 && C1) would be safe, since it would be possible to guarantee that evaluation of C2 in cases where C1 is false wouldn't introduce any side effects (including UB).

This is already the case. There is a theorem one can show about undef / poision along these lines. IOW, what you are saying perfectly fits into the paradigm of "optimizations must not change program behavior".

Compilers are allowed to omit copying of any object which could not be read without UB, but such permission would allow a compiler to do anything within a correct program execution that it wouldn't be allowed to do anyway. My point was that if one tightens the semantics of indeterminate values to allow their value to be observed within some correct program executions, that would allow more efficient program execution than would otherwise be possible if a program were required to prevent observations of indeterminate values at all costs.

I disagree rather strongly. Any description in which "the compiler" is even mentioned is much harder to work with for what I consider the primary purpose of a specification: to enable programmers to reason about what their code can do. You are basically asking programmers to imagine every possible compiler and every possible sequence of choices that it could make, and then to consider all of them and make sure the code is correct everywhere.

In many cases ,especially within the embedded and systems programming fields, there are many tasks that can be done on some but not all systems. If programmers were required to ensure that their code will run correctly everywhere, it would be impossible to use languages to accomplish any tasks that couldn't be done on all systems that support the language. What is necessary is not to ensure that a program will run correctly everywhere, but rather that it will run correctly on those implementations that should be expected to usefully support it. Differences between implementations are not an unfortunate failure to adequately standardize everything, but are rather a useful consequence of the fact that implementations intended for different tasks can use dialects which are tweaked to optimally fit such tasks.

Further, most of the transformation are such that their effects would only be observable in situations that, under current rules, would invoke UB. In many cases, it may be easier for a programmer to avoid such situations than try to reason about whether all possible combinations of effects would meet requirements. On the other hand, there are some situations where reasoning about the effects of such situations would be easier than trying to prevent them, especially if a program can include barriers to limit such effects.

As a simple example, suppose it is known that at least two out of three data sets will contain valid data, but the remaining one may be corrupted in arbitrary fashion that may be indistinguishable from valid data. Suppose further that one needs to output a number which either matches what would have been computed for one of the valid data items, or is between the values that would result from the valid inputs. If a program performs the computation with all three data items and outputs the median, any value that could result from the invalid data item would be essentially as good as any other. If corner-case transforms cause the computation involving invalid input to yield a completely different value from what it would otherwise have done, but does not otherwise affect program behavior, then one of the following scenarios will apply:

  1. The new invalid values were both greater than both of the valid values, in which case the program will output the higher of the valid values, which is an acceptable behavior.

  2. The new invalid values were both less than both of the valid values, in which case program will output the lower of the two valid values, which is an acceptable behavior.

  3. The new invalid value falls between the two valid values, in which case the program will output some number that is between the lower and higher valid values, which is an acceptable behavior.

Processing inputs in a way which is precisely specified for valid and invalid inputs alike may cost more than allowing the computations on invalid inputs to yield an arbitrary value, but without offering any additional benefit. Letting programs generate code which could output arbitrary numbers for the invalid inputs may allow more efficient code generation than would otherwise be possible, but only if neither the programmer nor compiler need to worry about which inputs (if any) were invalid.

I thought you were talking about "limits to how much a buggy optimization can change program behavior" -- where I mean "buggy" in the sense of current compilers, i.e., an optimization is buggy if it is behavior-changing. You are proposing to declare such optimizations correct; I am saying that basically means we have a wild west where it becomes impossible to make solid statements for what the final program will do.

You keep bringing up individual examples of where this makes compiler authors life easier, and individual programs that could benefit from this. What I am missing is an overall plan for how to avoid that wild west situation. Predicting what happens when a bunch of optimizations are run, each of which may change behavior "a bit", is basically impossible -- my post shows how hard this can be even for just 3 optimizations.

Now you are mixing up the concerns of platform-specific behavior and optimizations/UB. Please let's stick to one topic, and open a new thread if you want to open this entirely unrelated can of worms.

Historically, UB may have been introduced to abstract over platform differences, but those times are long gone.


We could formalize LLVM IR, but the implementation which is used by the rust compiler needs to prove that it satisfies the rules of the formal llvm IR. And how to do that if llvm is written in C++, formalizing C++?

Formalizing Rust is also neat but likewise it would be kind if we know that our implementation of Rust that we use daily does satisfy these formal rules.

But in the end we simply assume that it follows these rules and that Rust on LLVM follows R_\lambda.

I agree with you completely! Prooving that the mappings and implementations are correct will be a truly herculean task. But without a formal IR, I don't see how to move forwards.

1 Like

I am saying that basically means we have a wild west where it becomes impossible to make solid statements for what the final program will do .

I'm not proposing that one simply declare that anything a compiler does is correct, but instead to notice that if one limits optimizations to those which will yield program behavior consistent with what would have been produced by sequential program execution, that will make optimization needlessly difficult and further give a programmer model that is harder to reason about than one which allows optimizations to change program behavior in some very particular ways.

Suppose, for example, that one wants to allow a compiler to optimize out loops that compute values which end up being ignored, but which cannot be easily shown to terminate. Which would accomplish this better:

  1. A rule that would specify that if no individual action within a loop would be observably sequenced with regard to some later action that is statically reachable from it, the execution of the loop as a whole need not be considered to be observably sequenced before that action either.

  2. A rule that declares that a program which enters a side-effect free loop whose exit is statically reachable, but whose exit conditions can never be satisfied, invokes Undefined Behavior.

It would be easy to contrive a program in which the former rule would cause a program that does not invoke UB to behave in a way that could not have happened if all of the steps therein were performed sequentially, but of the three groups of programs:

  1. Those which would only meet requirements if such loops were executed in purely sequential fashion with respect to the surrounding code.

  2. Those which will never enter such loops with any inputs that would cause them to loop endlessly, in any situation where not even the most malicious possible way an implementation might process the situation would be intolerably worse than useless.

  3. Those which would meet requirements if the execution of such loops was deferred or omitted until/unless later code would observe their side effects, but would fail to meet requirements if endless loops could allow the execution of malicious code.

What I am missing is an overall plan for how to avoid that wild west situation. Predicting what happens when a bunch of optimizations are run, each of which may change behavior "a bit", is basically impossible -- my post shows how hard this can be even for just 3 optimizations.

If a language were to specify that certain constructs may be generally replaced by near-equivalents whose corner-case behaviors were allowed to vary in particular ways, in the absence of barriers or other constructs that would block such substitutions, that would be much easier for programmers to deal with than the notion that if a program does something for which the compiler wasn't prepared, the compiler might make arbitrary changes to program behavior. Further, if the behavioral changes wouldn't usually cause any trouble, but a programmer knew of places where code was relying upon precise corner-case behavior, directives to block transformations in those places could have clearer semantics than directives which would try to make actions that would otherwise invoke UB not do so.

Further, I would say that what your post shows is that the concept that transforms must only change program behavior if there exists some individual action that invokes UB is unworkable, as evidenced by the fact that in the 15 or so years that compiler writers have been really trying to push this approach, there is still no systematic way to identify which optimizations should be considered safe or unsafe. Further, if it's unclear under exactly what circumstances a piece of code should be regarded as invoking Undefined Behavior, it will be very difficult to formally say anything about the correctness of any transformation involving that piece of code. If no stage is allowed to generate code which cannot be proven to be free of Undefined Behavior, except in response to inputs that can be proven to invoke UB, the range of allowable optimizations would be unacceptably limited. If instead one looks at what outputs may result from what inputs, treating ambiguities in "safe" fashion (ensure that the set of behaviors which might allowable consequences of the output from the optimizer is a subset of the set of behaviors which definitely be allowable given its input), such treatment of ambiguities would have a much smaller effect on the range of achievable optimizations.

Historically, UB may have been introduced to abstract over platform differences, but [those times are long gone]

The "modern" ways in which compilers use UB for "optimization" are not only dangerous, but counter-productive outside a few niche situations. There are some programs whose response to certain invalid inputs forms no part of their requirements. Anything such programs might do would be considered acceptable. The design of gets() shows this philosophy quite well. Optimizations which assume a that all ways of handling a wide range of corner cases would be equally acceptable may be useful in situations where all ways of handling such corner cases would in fact be equally acceptable, but are worse than useless otherwise. If a wide but not unlimited range of corner-case behaviors would be equally acceptable, an optimization which allows an implementation to select freely from among tolerable behaviors would allow a program to meet requirements without having to ensure that such corner cases never occur. Optimizations which would allow compilers to process those cases in unacceptable fashion would make it necessary for programmers to add code to guard against those corner cases, thus negating any benefits that the "more limited" optimizations could have offered.

This code compiles:

let v: Vec<i32>;
loop {

Playground link

The example itself is contrived, but using non-terminating code to fulfill a type system obligation is not. It's common for it to be used on one side of a match {} block, in case one of the branches computes a value while another one enters an infinite loop or some other form of divergence (like panicking, or calling an await that never returns).

Since pulling a Vec out of your butt will never be type safe, and making this code fail to compile would be a violation of the backwards compatibility promise, even if LLVM offered a loop construct that behaved the way you described, Rust wouldn't be able to use it.

1 Like