Pointers Are Complicated II, or: We need better language specs

I think it is much harder than you make it sound here. You seem to think of the analysis that the compiler has to do to figure out which aliasing assumptions it can make -- that analysis would involve things like "might be based on". But that's not a definition. The analysis has to be proven correct against the definition.

This is a definition of aliasing rules for Rust references. Note the total absence of the term "based on" in that paper -- "based on" is an emergent property that arises in the compiler, it is not fundamental. The same would be true for a precise spec of restrict.

There's no restrict in my blog post, so I really don't understand what your statement here has to do with my question.

Indeed, and that's why that approach I quoted does not involve tracking provenance through non-pointer-types. So your response seems to have nothing to do with my quoted statement.

Again you seem to be thinking of the analysis the compiler has to make. That analysis must be derived from and justified against the underlying definition of the operational semantics, and it is that underlying definition that is -- or was supposed to be -- the main subject of this thread. For example, in the Coq appendix of this paper, we define an analysis that has a notion of "escaped pointer". But that notion does not appear anywhere in the specification or definition! It is a useful tool for analyses, and the fact that it is useful is verified by proving theorems about it. The notion of "escaped pointers" follows from the definition, but it is not part of that definition.

This thread seems to have long gone past the end of its usefulness, TBH.^^ I will probably stop responding, unless some new points are being made.

6 Likes

I think it is much harder than you make it sound here. You seem to think of the analysis that the compiler has to do to figure out which aliasing assumptions it can make -- that analysis would involve things like "might be based on". But that's not a definition. The analysis has to be proven correct against the definition.

My long-standing point, which I guess I haven't made adequately clear, is that language rules should be defined in terms of a set of core behaviors, plus a specification of the analysis and transformations that compilers may perform. If a language defines the set of allowable behaviors for a piece of code as being the set of behaviors that could result from processing that code according to a certain recipe, then any implementation that follows that recipe (performing allowable transforms, or not, at its leisure) will, by definition, be correct. Further, if it can be shown that all possible behaviors that would result from applying the recipe to a program would meet application requirements, that would imply that the program is correct.

To be sure, in most situations the simplest way to write a correct program would be to do things in such a way that, at each step, all allowable ways of processing the program would yield the same behavior. Such situations could be accommodated by language rules which forbid any actions where applying different sequences of transformations might result in a program behaving differently. On the other hand, for most useful sets of transformations it would be simple to specify a set of guidelines which, if followed, would ensure that a program has only one behavior. One could thus specify two ways of writing a correct program: follow the guidelines without regard for the tricky details underneath, in cases where the resulting semantics would be sufficient for the task at hand; or else identify all of the behaviors that could be produced by allowable transformations and either ensure that they meet requirements, or adjust the program so as to block any particular transformations that would otherwise prove problematic.

With regard to what constitutes a "definition", I'll concede the phrase "might be based upon" is informal; more precise would be "that satisfies conditions requiring that compilers recognize it as being potentially at least based upon". Most such conditions would be specified explicitly, except for the concept of "leaks", discussed below.

For example, in the Coq appendix of this paper, we define an analysis that has a notion of "escaped pointer". But that notion does not appear anywhere in the specification or definition! It is a useful tool for analyses, and the fact that it is useful is verified by proving theorems about it. The notion of "escaped pointers" follows from the definition, but it is not part of that definition.

What's wrong with saying that a pointer "leaks" in any case where a compiler would be unable to prove that constructs intended to do weird things the compiler isn't likely to fully understand (e.g. inspecting pointer representations, or accesses to volatile-qualified objects) never use its value, or the value of a pointer derived from it, in a way that would allow a pointer of unknown provenance to be created with its value, without relying upon it being linearly related to any other pointer? The only time it would matter whether a pointer leaks is when deciding whether it might alias a pointer of unknown provenance. If a compiler can correctly prove that a pointer hasn't leaked prior to some moment in time, then it may safely assume that it will not alias any pointers of unknown provenance which existed at that time.

Although this definition of "leak" may generate a substantial number of false positives, none of them could cause a compiler to generate incorrect code, and most of them would occur in situations where they wouldn't affect optimization too badly. Although it would be impossible for a compiler to identify the precise set of cases where code uses a pointer's value in a way that would allow creation of an identical but seemingly-unrelated pointer, all that is necessary for effective optimization is that it recognize a useful fraction of cases where operations definitely do not leak pointers.
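To illustrate the intent (a hypothetical example of my own, not drawn from any spec): a pointer whose value is only used for ordinary loads and stores has definitely not leaked, while one whose representation is exposed would have to be treated as leaked:

#include <stdio.h>
#include <stdint.h>

int x;

void demo(void)
{
  int *p = &x;
  *p = 1;                         /* ordinary use: p has definitely not
                                     leaked, so no pointer of unknown
                                     provenance can alias it */
  printf("%p\n", (void *)p);      /* exposes p's representation; under the
                                     proposed rule, p must now be treated
                                     as (possibly) leaked */
  uintptr_t addr = (uintptr_t)p;  /* likewise, an integer conversion would
                                     count as a leak */
  (void)addr;
}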

Your point has long been made more than abundantly clear. You just haven't found much support for your goal of redefining the language, because most other people think that the approach is inherently unworkable, and we have apparently been ineffective in convincing you of these issues.

To echo what RalfJung says, it seems clear to me now that there is no point to continuing discussion of this topic, since I don't think anyone will be persuaded by what you or anyone else says at this point.

1 Like

This is roughly equivalent to saying that the language definition should define LLVM, and that LLVM isn't allowed to get better at optimizing $lang without a new publication of the $lang standard (as the standard defines all allowable optimization).

The entire academic branch studying compiler optimization is roughly in agreement that specifying allowable transformations is intractable, and specifying operational semantics (and proving transforms don't change said semantics) is a much more tractable approach.

You are of course allowed to disagree, but until you provide artifacts actually showing how your approach is better than state of art, nobody is going to be much convinced by theoretical arguments.

We're typically also discussing specification moreso than specific implementation (when discussing unsafe semantics) here. Specific implementations are of course allowed to give operations that are operationally not preserved specific meaning on certain platforms. But specification in terms of operational semantics allows differences in implementation; specifying allowed transforms does the opposite.

11 Likes

This is roughly equivalent to saying that the language definition should define LLVM, and that LLVM isn't allowed to get better at optimizing $lang without a new publication of the $lang standard (as the standard defines all allowable optimization).

Reliably determining whether applying any particular transform to a program will affect its behavior in any defined cases is generally an intractable problem, especially the way language rules are written. Is there any reason to believe that the problem is any closer to being genuinely solved now than it was five or ten years ago, or that applying twenty more years of research to that approach won't result in it still being "almost solved" twenty years from now?

The vast majority of bugs I've seen in clang and LLVM would have been avoided if the maintainers adopted a simple principle: if a compiler can't understand and track everything necessary to 100% prove that an optimization is safe, don't do it. Unfortunately, the way language specs are written makes it impossible to 100% prove the safety of many optimizations that would in practice be safe and useful. This forces compiler writers to choose between foregoing optimizations that would be 99% likely to be safe, or performing transforms that will occasionally and unpredictably break valid programs. Personally, I think compilers whose authors would favor the latter approach should be regarded as untrustworthy, but the real solution would be to specify optimization-promoting rules in a way that would allow both compiler writers and users to more easily identify when an optimization might change the behavior of a program whose behavior would have been defined in the absence of such rules.

There aren't really a whole lot of things that optimizers can usefully do that would have the potential to alter program behavior. The vast majority of such optimizations involve reordering, consolidating, splitting, and omitting operations, transforming sequences of arithmetic operations into equivalents, and substituting convenient values for unspecified or indeterminate values. A compiler that could apply all such transformations in the most efficient way that would meet program requirements would blow the socks off any existing compiler, even if the former compiler were required to process all non-critical forms of UB that could occur in a transformed program "in a documented fashion characteristic of the environment".

If what a compiler needs to know is whether it may safely reorder some particular read of *p across a write to *q, and the underlying platform would define behavior for both sequences which might in some cases be usefully different, having a language rule specify in what cases a compiler would or would not be allowed to reorder the write would make things much easier for programmers and compiler writers alike than rules which attempt to characterize as UB any situation where reordering the operations might affect program behavior. For a compiler to take rules of the latter form and determine from them whether two operations can be reordered is much harder and more error-prone than having it evaluate rules which directly specify when compilers must treat operations as sequenced and when they need not.
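For concreteness, a minimal sketch of the reordering in question (a hypothetical example of my own):

/* Original: the read of *p is sequenced after the write to *q,
   so it observes the new value when p == q. */
int original(int *p, int *q)
{
  *q = 1;
  return *p;
}

/* Reordered: equivalent only if the compiler may assume p and q
   never alias; when p == q it returns the old value instead. */
int reordered(int *p, int *q)
{
  int x = *p;
  *q = 1;
  return x;
}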

I'd argue that for code authors, proving code does not do any undefined operation is a lot easier, especially when using a language like Rust that makes it locally reasonable and tools like Miri which implement the abstract machine precisely and can tell you objectively that you did an illegal operation.

Showing yourself correct under every allowed transform is a combinatoric problem that gets intractable at a stupid rate.

But I agree that this isn't going anywhere, so I'm done. Since your approach is apparently better than LLVM's, I look forward to seeing a competing optimizing backend in the future! Consider writing it in Rust.

5 Likes

So language rules need to be written better, preferably formalized in automated systems like Coq, right? For Rust, which doesn't have a formal spec so far, there is still an opportunity to do that.

I'd love to learn how to do things like that but for myself there's a hell of a lot of learning still left to do :frowning: I wish there were sensible MOOC courses for me to take and/or I had a mentor to guide me..

Can't put my finger on it but something here sounds backwards.. Are there any such rules in existence? Saying "this program has UB if reordering this write before that read changes program behavior"? Don't think any document on how programs written in C or C++ or Rust behave ever says that?

I've recently heard [with interest] about at least one person on dev channels looking into starting a project along these lines. The tl;dr being that it was surprisingly difficult for them to make progress. Benefit of the doubt saying they knew what they were getting into.

It might be something to start with: a rewrite of the Rust compiler in OCaml. That would certainly be more direct than an approach using Coq.

Either way, with my hand-wavy 3 am estimation skills, a full-blown theorem-prover treatment of rustc would cost on the order of decades of focused person-time to approach something usable by a small subset of compiler engineers and related experts.

Are there any such rules in existence?

That's the primary purpose of aliasing rules. Given something like:

int test(int *restrict a, int *restrict b)
{
  if (*a)
    *b = 2;
  return *a;
}

the aliasing rules would allow the compiler to reorder the second read of *a across the write to *b (the effects of which could be observable in the absence of the aliasing rules), thus allowing it to be consolidated with the first read (the consolidation not being otherwise observable unless *a fails to behave like ordinary storage).
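For illustration, here is a sketch of the consolidated form those rules permit (my own rendering, assuming *a behaves like ordinary storage):

int test(int *restrict a, int *restrict b)
{
  int tmp = *a;   /* single load: the second read of *a has been
                     reordered across the write to *b and merged
                     with the first read */
  if (tmp)
    *b = 2;
  return tmp;
}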

Additionally, the primary useful function of rules that allow compilers to assume that loops without side effects will terminate is to allow operations which follow such loops to be reordered to occur before them, and the primary optimization benefit of rules that characterize race conditions as UB (rather than simply yielding Unspecified bit patterns on read, or setting objects to Unspecified bit patterns on write) is to allow compilers to reorder or split accesses in ways that could be observable if such actions weren't regarded as UB.
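For illustration, a minimal sketch of the loop-termination case (a hypothetical example):

/* If n is odd, this loop never terminates (i stays even). A rule letting
   compilers assume that side-effect-free loops terminate exists chiefly
   so that the store to *flag may be reordered to occur before the loop. */
int example(unsigned n, int *flag)
{
  unsigned i = 0;
  while (i != n)
    i += 2;
  *flag = 1;
  return (int)i;
}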

Such reorderings are often safe and useful, but there are times when programmers need operations to be performed in ways that uphold certain sequencing guarantees. If a language specified the conditions under which compilers must regard operations as sequenced in the absence of particular exemptions, along with the specific exemptions and counter-exemptions, programmers who need operations to be performed in a particular sequence could write their code so as to ensure that, perhaps via the addition of directives whose purpose is to prevent reordering. If the language were to specify directives that explicitly block reordering, along with a directive indicating that all places which are unusually sensitive to ordering are marked [somewhat analogous to the "use strict" dialect of Javascript], then code written in the latter dialect could be optimized much more safely and aggressively than the present dialect. The present dialect (without the sequence-forming directives) could then be deprecated, while still being processed more reliably than would be possible if compilers tried to aggressively optimize everything rather than just code in the aggressively optimizable dialect.
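With today's compilers, the closest approximation to such a directive that I know of is a compiler-only barrier (GCC/Clang extension shown; the surrounding function is a made-up example):

/* Empty asm with a "memory" clobber: compiles to no instructions, but
   forbids the compiler from reordering memory accesses across it. */
#define ORDER_FENCE() __asm__ volatile("" ::: "memory")

static int compute(void) { return 42; }

void log_then_update(int *status, int *data)
{
  *data = compute();
  ORDER_FENCE();   /* keep the store to *data ahead of the status update */
  *status = 1;
}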

My understanding is that rustc originally was written in OCaml.

1 Like

Was it somewhere on rust-lang.zulipchat.com?

But the rule for restrict is not "the program is UB if reordering changes program behavior". We do unfortunately not have very precise rules for restrict, but very similar reorderings are possible with Stacked Borrows, and as you can see there, UB is defined purely operationally in terms of an Abstract Machine and how it evolves for each step. The fact that reordering does not change program behavior (of UB-free programs) is a theorem that can be proven from these definitions, it is not the definition itself.

2 Likes

But the rule for restrict is not "the program is UB if reordering changes program behavior". We do unfortunately not have very precise rules for restrict, but very similar reorderings are possible with Stacked Borrows, and as you can see there, UB is defined purely operationally in terms of an Abstract Machine and how it evolves for each step. The fact that reordering does not change program behavior (of UB-free programs) is a theorem that can be proven from these definitions, it is not the definition itself.

The purpose of the rules for restrict was to give compilers permission to perform some transformations despite the fact that they may cause the generated machine code to behave in ways that wouldn't be allowable in the absence of "restrict". It did this by characterizing as UB situations where such transforms might affect program behavior.

Although the Stacked Borrows model for Rust is good for many purposes, some tasks require semantics beyond what it can provide. Suppose a language's abstraction model specified that (1) if a write occurs within the lifetime of an unrelated read-only reference, any read through that reference may arbitrarily yield any value the object has held or will hold during the lifetime of that reference, and (2) at any time throughout the lifetime of a reference that will ever be used to write an object, an implementation may behave as though it reads the object (yielding any value that would be possible given the aforementioned freedom) and writes its value back. That would allow nearly all of the optimizations that would be possible if the behavior were treated as UB, but would also allow something like:

unsigned test(unsigned *restrict r, unsigned mask)
{
  unsigned temp;
  if (mask) temp = *r;    
  if (doSomething()) temp++;
  if (mask) return temp & mask; else return 0;
}

to be replaced with:

unsigned test(unsigned *restrict r, unsigned mask)
{
  unsigned temp = *r;
  if (doSomething()) temp++;
  return temp & mask;
}

in cases where r is known to point to an object of type unsigned, but where its target might get changed within doSomething(). If the effect of the conflicting read were limited to yielding a possibly meaningless value, the fact that the value in temp would end up being ignored when mask is zero would mean that a compiler wouldn't be required to skip the read in that case. If the effect of the conflicting read were not thus limited, however, then that substitution could alter program behavior when mask is zero and doSomething() modifies the object to which r points and returns zero: the original code would have defined behavior in that case (call doSomething and return zero), but the transformed program would invoke UB.

And the way one knows if that purpose has been achieved is by proving theorems about the operational semantics that have been defined. But the operational semantics form the ground truth.

It is very easy to define transformations ("purposes") that are simply unachievable. So the desired transformations can only form a starting point, a set of design constraints that the operational semantics is to be evaluated against. Putting it differently, these transformations/"purposes" are a wish list, but not every wish can come true, in particular not all wishes at the same time. That's why we need operational semantics; they are the only tool that has been proposed so far that is able to make sure that everything is consistent.

We have now gone around this circle several times, and you are clearly unwilling to accept the consensus of not just most people discussing here, but also the people that professionally deal with formal semantics or compiler construction. You have not pointed out a single fundamental flaw in the consensus approach, but instead kept moving goalposts by picking apart parts of the C standard, without relating this to the underlying argument. So let's just stop this.

This is the first time that the IRLO discussion for one of my blog posts was less useful than the one on Reddit or hackernews. I hope this is not a sign for this forum in general.

14 Likes

I have found, in life, that I learn more from being proven wrong than proven right. Being "proven wrong", however, does not mean finding that people disagree with my conclusions, but rather identifying ways in which either my premises or supporting arguments are faulty.

I have tried to lay out my premises and arguments in such a way that if there are any faults, they could be identified. It does not appear that is going to happen. Perhaps the premises and arguments are correct, or perhaps they are faulty in ways that have not been identified here, but I would agree that we're stuck at an impasse where we agree about premises and some intermediate conclusions, but disagree about other intermediate conclusions, without being able to find any particular fault in the logic connecting them, and further discussion is unlikely to change that.

Thank you for your time. -- John

This reminds me of mathematical logic, in particular around the topic of axioms and defining models. If one were to start with the allowed program transformations and just define UB in terms of: “These certain transformations are legal and if you can get weird outcomes from those the program is UB”, one would in a sense axiomatize UB. The problem with axiomatic systems like that is always: how do you tell that they aren’t contradictory? Or in particular: if you only have rules that say what UB is, how do you tell that a program isn’t UB? This is the kind of question you really really want to make as easy to answer as possible, as it’s the question of how to write programs that don’t have UB, the question that every programmer using the language has to answer every time they use the language (or in cases like Rust, every time they use unsafe Rust).

On the other hand, the question of which compiler optimizations are legal is one which only compiler writers have to worry about, and also only once for each new optimization rule.

While such an axiomatization of UB around rewriting rules may make life easier for compiler writers, I disagree that it “would make things much easier for programmers”. Usually in mathematics, the way you show that a set of axioms for, say, an algebraic structure isn’t contradictory is by finding/defining a “model” that satisfies the axioms. The analogy would be that while your reordering rules are the axioms, operational semantics could be the model.

I’m saying: The easiest way to find out for sure that a characterization by reordering rules wouldn’t suddenly make pretty much all programs UB—without anyone noticing that the rules actually entail this—might very well be by finding some (e.g. operational) semantics that defines your language: one semantics definition which must then be shown to actually support these reorderings. Of course, after the reorderings are shown to be derivable from that semantics, it makes absolutely no sense anymore to keep the reordering rules as the language specification. Instead the semantics should be the language specification, and then people could perhaps even make up new valid optimizations that no-one thought of before. [Of course your original set of reordering rules could still be included in a specification document as theorems, for demonstration purposes or to help future compiler writers.]

And operational semantics also often mirror the way one thinks intuitively about what a program does. As opposed to statement-reordering in compiler optimizations—those are usually not how a programmer reasons about their program. If the operational semantics fully describe whether a given program execution results in UB, then it is often “fairly straightforward” (famous last words) to find out whether a certain program is UB-free. And when you only wonder about a particular program run on a few particular inputs, you can even try to fully automate the checks, e.g. something that miri aims to achieve.

If you would have to consider “whether certain reorderings of operations might affect program behavior” instead, you would exponentially increase the things you need to validate. You don’t only have to consider every possible program input, but additionally also every possible way to apply “reorderings” that are defined/axiomatized to be legal, in order to rule out UB from your programs.

9 Likes

Well said. Isn't this the trap that attempts to more or less formally describe how concurrent/multithreaded code behaves fell into? I vaguely remember the official formalizations of Java concurrency were written in terms of what re-orderings JVM/hardware is allowed to do?

1 Like

Or in particular: if you only have rules that say what UB is, how do you tell that a program isn’t UB? This is the kind of question you really really want to make as easy to answer as possible, as it’s the question of how to write programs that don’t have UB, the question that every programmer using the language has to answer every time they use the language (or in cases like Rust, every time they use unsafe Rust).

By designing the set of allowable transformations such that if the inputs to a function meet certain constraints, the outputs will be precisely specified, and if the inputs meet looser constraints, the outputs will be less strongly specified but still meet some constraints. Program correctness could thus be demonstrated either by showing that the inputs meet the tighter constraints (and the downstream paths expect the precisely specified output), or that the downstream paths will be satisfied by the more loosely specified outputs that may result from the more loosely specified inputs.

If a language were to specify that integer computations which overflow generally use quiet two's-complement wraparound semantics, but also specified that a compiler may perform certain transforms such as replacing an expression of the form (a*N)/N [for positive integer N] with a, or (a*N)/(b*N) with a/b, one could demonstrate mathematically that such transformations could not affect program behavior in cases where none of the original expressions would overflow. In many cases, it might be easier to demonstrate that a computation like (a*60)/30 would never be performed when a is large enough to make a*60 overflow, and then assume that the expression will behave precisely as written, than to examine all of the ways a compiler might rewrite it (e.g. (a*10)/5, (a*4)/2, (a*6)/3, etc.) and confirm that all of them meet requirements.
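A runnable sketch of that claim (quiet wraparound simulated via unsigned arithmetic, since signed overflow is UB in standard C; assumes 32-bit int, and the hypothetical rule would make the wrapping behavior the defined one):

#include <stdio.h>
#include <limits.h>

/* (a*60)/30 evaluated as written under quiet two's-complement wraparound. */
static int eval_as_written(int a)
{
  unsigned wrapped = (unsigned)a * 60u;  /* wraps modulo 2^32 */
  return (int)wrapped / 30;              /* conversion back assumed two's-complement */
}

/* The transformed expression, after cancelling the common factor 30. */
static int eval_transformed(int a)
{
  return a * 2;
}

int main(void)
{
  printf("%d %d\n", eval_as_written(1000), eval_transformed(1000));
  /* agree: 2000 2000 */

  int big = INT_MAX / 50;
  printf("%d %d\n", eval_as_written(big), eval_transformed(big));
  /* differ: a*60 wrapped, so the cancellation was not value-preserving */
  return 0;
}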

On the other hand, there are situations where processing a mixture of meaningful and meaningless computations may be more efficient than trying to determine beforehand which computations will be meaningful. In such cases, even if overflows could occur in meaningless computations, a programmer could prove that the transforms won't affect program correctness by showing that overflows will never occur in meaningful computations, and that the only requirement for the meaningless computations is that they have no side effect beyond yielding a meaningless result (a condition which the transforms would not affect). That seems more practical, and would allow the generation of more efficient code, than having to write the code in a way that would prevent overflow from occurring in the meaningless expressions (which would in turn prevent the compiler from performing the above transformations).
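A sketch of that pattern (valid only under the hypothetical quiet-wraparound semantics described above; in standard C the overflow in the ignored lanes would itself be UB):

/* Compute every lane unconditionally, keep only the requested ones.
   Lanes with wanted[i] == 0 may overflow, but their meaningless,
   side-effect-free results are simply discarded. */
void scale_masked(const int *in, int *out, const char *wanted, int n)
{
  for (int i = 0; i < n; i++) {
    int v = in[i] * 60 / 30;   /* may overflow in ignored lanes */
    if (wanted[i])
      out[i] = v;
  }
}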

If you would have to consider “whether certain reorderings of operations might affect program behavior” instead, you would exponentially increase the things you need to validate. You don’t only have to consider every possible program input, but additionally also every possible way to apply “reorderings” that are defined/axiomatized to be legal, in order to rule out UB from your programs.

In most cases, the easiest way for a programmer to ensure that allowable reorderings wouldn't prevent a program from meeting requirements would be to avoid any constructs where reorderings could have even localized effects on program behavior. On the other hand, it may sometimes be easier to show that the only things affected by a reordering will be values that will end up being ignored by the current program, or which, if they are included in a blob of binary data that gets output by the program, will be ignored by whatever programs are going to read that blob.

One semantics definition which must then be shown to actually support these reorderings. Of course, after the reorderings are shown to be derivable from that semantics, it makes absolutely no sense anymore to keep the reordering rules as the language specification.

A major problem here is that it is impossible to write a semantics specification which will regard as UB all of the corner cases where an allowable reordering would turn a program which would have met requirements into one that doesn't, without regarding as UB any situations where no allowable reordering would do so. Any practical set of rules will end up putting more restrictions on programmers and/or compilers than one which regards the transforms as axioms.

Well said. Isn't this the trap that attempts to more or less formally describe how concurrent/multithreaded code behaves fell into? I vaguely remember the official formalizations of Java concurrency were written in terms of what re-orderings JVM/hardware is allowed to do?

The optimal way for programs to achieve required semantics will often vary depending upon what guarantees the underlying hardware platform can offer. Programs that are designed to exploit a particular hardware memory model can often be more efficient than programs that are written in hardware-agnostic fashion, but may be extremely difficult to port to hardware with a different model. I don't recall the JVM as being defined in terms of allowable transformations; I haven't looked at it in ages, but I recall it was defined in terms of non-transitive ordering relationships, and it of course avoided allowing programmers any abilities that wouldn't be universally supportable.

BTW, returning somewhat to the subject of the original post, I find myself wondering why high-performance computing would be done with a language which allows pointer arithmetic beyond "form non-persistable reference to sub-object", much less arbitrary conversions between integers and pointers. Support for more generalized pointer arithmetic is helpful if one is trying to use a simple compiler to generate efficient code for a processor with limited addressing modes, or if one needs to perform memory management within the language rather than using an outside library, but from what I understand, high-performance computing generally doesn't require doing either of those things. Allowing more generalized pointer arithmetic or--even worse--arbitrary conversions between integers and pointers makes it impossible for the language to support optimization without adding a lot of complexity to both the language specification and to implementations. What benefit do high-performance computing applications actually receive in exchange for this complexity?

References in Safe Rust are slightly more powerful than that, yet a lot of people still need to use unsafe, so apparently it's not enough for them.

References in Safe Rust are slightly more powerful than that, yet a lot of people still need to use unsafe, so apparently it's not enough for them.

How often would it be necessary to persist freely aliasable pointers to things other than top-level objects, if the language rules treated those specially, and programmers created top-level objects for things that needed to be accessible via persistent pointers that could alias? Treating pointers to top-level objects specially could offer some really huge performance advantages even outside the greatly simplified aliasing model. For example, unless an application would need billions of objects of type T, a 64-bit implementation could place all objects of type T whose address is persisted into a contiguous region of address space, and then have a "packed pointer to T" be a 32-bit index into that region; the cache footprint of such a "packed pointer" would thus be only half that of a 64-bit pointer.
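A sketch of the "packed pointer" idea (all names and the pool scheme here are hypothetical):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct { double x, y, z; } T;

/* Hypothetical scheme: every T whose address is persisted lives in one
   contiguous pool, so a persistable "pointer to T" can be a 32-bit index
   with half the cache footprint of a 64-bit pointer. */
#define T_POOL_CAP 1000000u
static T *t_pool;
static uint32_t t_count;

typedef uint32_t packed_t;

static packed_t t_alloc(void)
{
  if (!t_pool)
    t_pool = malloc(T_POOL_CAP * sizeof *t_pool);
  return t_count++;   /* bounds and allocation-failure checks elided */
}

static T *t_deref(packed_t p)
{
  return &t_pool[p];  /* one scaled add: comparable to a plain dereference */
}

int main(void)
{
  packed_t h = t_alloc();
  t_deref(h)->x = 1.5;
  printf("%f\n", t_deref(h)->x);
  return 0;
}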

If a and b are pointers to top-level objects of array type, and may or may not identify the same array, then:

for i:=0 to 99 do
  a^[i] := a^[i] + b^[i];

could be safely transformed to use a down-counting loop rather than an up-counting loop. While it would be possible for a[i] and b[i] to identify the same object, there would be no way, for any positive integer N, for a[i] or b[i] to identify the same object as a[i+N] or b[i+N]. Such a transformation would not be safe, however, if a and b could point to different elements of the same array, whose indices differed by 99 or less.
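In C-like terms, a sketch of the transformed loop (assuming the hypothetical rule that a and b may only be the same top-level array or disjoint ones, never overlapping at an offset):

void add_arrays(int *a, int *b)
{
  /* Down-counting is safe here: since a[i] can never alias b[j] for
     i != j, no iteration can observe a value written by another, so
     either direction computes the same result. */
  for (int i = 99; i >= 0; i--)
    a[i] = a[i] + b[i];
}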