Pointers Are Complicated II, or: We need better language specs

Tour de force, Ralf. Your responses are a triumph of sanity; thanks a lot.

I wish I could be more familiar with the area. At the risk of imposing on your precious time yet again... if I asked for a single article or book title, would anything come to mind?


My experience with high-level languages is more along the lines of "That XX optimization option is only intended for use with programs that don't do YY. Do ___ to disable that optimization and your program should compile correctly."

For LLVM, a comparable response would be to suggest that the front-end language compiler produce code that blocks the optimization in question, for example by having every integer-to-pointer cast generate a memory clobber. The front-end compiler should then include an option to enable any such extra barriers that may be needed to guarantee correct code generation, unless the user specifies particular cases where such barriers would not be needed.
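
As a rough source-level sketch of that idea (the helper name is invented, and GCC/Clang-style inline asm is only one way to express a compiler-level memory clobber; a front end could emit the equivalent barrier automatically for every integer-to-pointer cast):

#include <stdint.h>

/* Hypothetical wrapper for integer-to-pointer conversion.  The empty
 * asm statement with a "memory" clobber emits no instructions, but it
 * forces GCC/Clang-style compilers to assume any object may have been
 * read or written here, so memory accesses cannot be reordered or
 * eliminated across the conversion. */
static inline void *int_to_ptr_with_barrier(uintptr_t value)
{
    __asm__ volatile ("" ::: "memory");
    return (void *)value;
}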

Unfortunately, many of the free language tools bundled with Linux don't include any optimization setting between "produce code that is more than twice as big and slow as optimal" and "produce code that may fail under unspecified circumstances". Consider an optimization option which would identify automatic-duration objects whose address wasn't taken and hold them in registers when possible, and which would refrain from doing stack setup/teardown on leaf functions whose arguments are received in registers and can be kept there. Such an option would allow many functions to be written in such a way as to achieve a level of efficiency adequate for most purposes, without compilers having to worry about the kinds of weird corner cases that are hard to handle correctly. Further, in many cases a compiler that performed all possible optimizations of this sort would achieve better performance than existing compilers whose creators expend their efforts chasing more complicated optimizations that are harder to get right.

Note also that in many cases, a compiler with options to perform optimizations that aren't allowed by language standards may be able to achieve better performance than would be achievable without such options, and a compiler with options to support constructs beyond those mandated by language standards may, if given code which exploits those constructs, be able to achieve better performance than would be achievable without such constructs.

That will only be true when there is a "machine-readable" way of fully specifying requirements and non-requirements. In many real-world scenarios, it is simple to formulate a machine-readable specification for a sequence of operations that would produce acceptable output, but impractical to produce a machine-readable means of enumerating all acceptable outputs for any given input.

Many practical tasks will be supportable by some implementations but not others. If a program needs to perform some task which some implementations will be able to support efficiently and others won't, there is nothing wrong with having it specify that it is suitable for use only on implementations with certain characteristics. Ideally, there should be machine-readable ways of specifying such requirements, but if no such means exist there is nothing wrong with programs that specify them only in human-readable form.

If, after including a vendor-supplied header, I write the C code [in reality, I'd be prone to define macros for the hard-coded values]:

PTA->PDDR |= 0x10;       /* configure port pin A4 as an output */
PTA->PSOR = 0x10;        /* drive pin A4 high */
PORTA->PCR[4] = 0x146;   /* pin control for A4: select the GPIO function and drive settings */

then on some C implementations that would turn on a green light. Nothing in the C Standard specifies that the above code should turn on a green light, but nothing in the C Standard specifies any other way of turning one on, either. Some people might say that trying to use a C program to turn on a green light is an abuse of the language, but I'd prefer to say that the behavior of the code is specified as turning on a green light when it's run, with interrupts disabled, on a Freescale KL02-family controller which has a green LED wired via a resistor between port pin A4 and ground. When run on other implementations it might do something else.

A programmer will know what possible outputs and behavior from a program will meet requirements and which ones won't, even in cases where it would not be practical to specify a means of enumerating all acceptable outputs. If it is possible to determine that all possible outputs or behaviors that could be produced by a program or subsection will have some traits, and that all outputs or behaviors with those traits satisfy requirements, that will prove that all possible outputs or behaviors will meet requirements. If a transform might affect the behavior of a program, but applying it to any program whose output would possess all required traits would yield a program whose output would possess all required traits, then applying the transform would not affect correctness.

In most cases, nearly all of the transforms a compiler would be allowed to perform would either have no effect upon a function's behavior given any possible inputs, or could be easily shown to always behave in a manner possessing some traits when given input possessing some traits. In some rare cases it would be necessary to enumerate possible behaviors, but in most cases invariants would suffice.

Suppose, for example, that a function will be fed a bunch of files, some of which are JPEG images and some of which aren't, and is required to produce for each file a number which must be the X dimension if the file is a valid JPEG file, and may be any arbitrary numeric value otherwise. If one had a function that met those requirements, which computed immutable values x, y, z, and xy = x*y such that for any valid JPEG image the mathematical product x*y*z would be within integer range, and whose last step was to return xy*z/x, it would be simple to demonstrate that the function produced by replacing the last step with returning y*z would also meet requirements, without having to enumerate all of the possible return values. In all cases where the return value would matter, y*z would yield the same value as xy*z/x, and in all cases where those expressions could yield different values, all numeric results would be equally acceptable. Showing that both of those conditions hold would be much easier than trying to identify all of the cases where the expressions would yield different values.
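
A minimal sketch of that substitution, with hypothetical names (the surrounding JPEG parsing is assumed and not shown):

/* Hypothetical last step of a JPEG-dimension routine.  For valid images,
 * x, y, z are such that x*y*z stays within range (and x is non-zero), so
 * the two return expressions agree; for invalid input, any number is an
 * acceptable result. */
static unsigned original_last_step(unsigned x, unsigned y, unsigned z)
{
    unsigned xy = x * y;     /* previously computed immutable value */
    return xy * z / x;       /* as written by the programmer */
}

static unsigned transformed_last_step(unsigned x, unsigned y, unsigned z)
{
    (void)x;                 /* x is no longer needed after the transform */
    return y * z;            /* equal to xy*z/x whenever x*y*z doesn't wrap and x != 0 */
}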

It would specify conditions under which operations may or may not be reordered. Those conditions would generally be drawn such that no possible reordering could affect code which abided by the normal aliasing rules. The changes would be:

  1. The programmer would not be required to show that conflicts were impossible in cases where it would be easier to show that allowable reorderings could not affect program behavior. As a simple example, suppose a programmer knows that either p and q will identify disjoint storage or they will match, with the latter situation occurring extremely rarely, and wants to copy p to q. If the copy function's behavior is specified as reading all elements of p and writing all elements of q, with reads and writes performed in arbitrary sequence except that no particular element of q will be written before the corresponding element of p has been read, such reordering would not affect behavior if p and q are equal, even though it would affect behavior if they were unequal but overlapping (see the sketch after this list).

  2. In cases where the normal rules would make it difficult for a programmer to accomplish what must be done, it would be possible for the programmer to achieve the required semantics by adding an explicit barrier. If the basic rules are written in terms of permissible reorderings, and one of the conditions is a lack of certain kinds of barriers between the operations to be reordered, the effect of barriers on available program semantics would be clear.
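
A minimal sketch of the copy function described in point 1, under the specification given there (names are hypothetical):

#include <stddef.h>

/* Behavior is specified as: read every element of src and write every
 * element of dst, in whatever order the compiler likes, except that
 * dst[i] is never written before src[i] has been read.  If dst == src,
 * every permitted ordering leaves the array unchanged; only partial
 * overlap would make the choice of ordering observable. */
void copy_words(unsigned *dst, const unsigned *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}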

Consider the design of Java's String.hashCode, which starts by reading an unguarded, zero-initialized word of storage associated with the string. If that word is non-zero, its value is returned. Otherwise the function computes the hash code, stores it, and returns it. If hashCode is called on another thread, that thread may see the stored value and return it immediately, or it may re-compute the hash code itself. The former would be more efficient, but either would be acceptable.
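
A rough C rendering of that pattern, with hypothetical names and with zero reserved as the "not yet computed" marker (a simplification). Note that in Java the racing accesses are permitted by the memory model, whereas the unsynchronized accesses in this literal C translation would formally constitute a data race:

#include <stddef.h>

struct my_string { const char *data; size_t len; unsigned cached_hash; };

unsigned my_hash(struct my_string *s)
{
    unsigned h = s->cached_hash;           /* unguarded read of the cache word */
    if (h != 0)
        return h;                          /* another thread (or earlier call) filled it in */
    for (size_t i = 0; i < s->len; i++)
        h = h * 31u + (unsigned char)s->data[i];
    if (h == 0)
        h = 1;                             /* keep 0 available as the "empty" marker */
    s->cached_hash = h;                    /* racing writers all store the same value */
    return h;
}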

Suppose an immutable collection type were to apply similar logic, and its behavioral specification stated that if its hashCode function is called one or more times, each item's hashCode function will likewise be called one or more times. If an item's hashCode function has an observable side effect, but having that side effect occur twice would be just as acceptable as having it occur only once, then behavior would be affected by whether the first thread's write of the hash was seen by the second thread, but that wouldn't make the program "incorrect".

In C, given something like f()+g(), the execution order of f() and g() is unspecified. This gives a compiler more flexibility if its goal is simply to produce a correct execution: for example, if one of them simply returns the value of a static object, it may be faster to call the other function first and then add the static object's value with a single "add" instruction, rather than read the static object, save its value somewhere, call the other function, and then reload the saved value and add it. If one is trying to prove that all correct executions would have defined behavior, however, one would be required to treat the evaluation order as non-deterministic.

Consider, for example:

static int x;
int silly1(int y) { return x += y; }

int silly(void)
{
  x = 1;
  /* The three calls may be evaluated in any order; x always ends up at 7,
     but the sum of the return values ranges from 13 to 17 depending on the order. */
  return silly1(1) + silly1(2) + silly1(3);
}

If one needs a function that sets x to 7 and returns some number between 10 and 20, with all such values being equally acceptable, would silly() be a correct function for the job? What would a tool like Miri have to do to determine whether the behavior would be defined on all correct implementations?

Then these "traits" are the semantics of your program. You want a large amount of non-determinism in it? Okay, sure. Then you want transforms that will not change those semantics. Certainly. Boring as a yawn. Every computer scientist treats compilers exactly in this manner.

Different sets of optimizations allowed for different programs? So what? Those programs will then be written in different dialects of the language, each with its own semantics but sharing the same syntax.

You want to put some directives into your source code to enable/disable optimizations? Then treat those directives as part of your language and say that the semantics of a code fragment depends on the directives.

You don't like GCC/clang and unsafe Rust defining some programs as not having any semantics at all, as broken, as having UB? Okay, you want a different semantics, possibly controlled via command line switches. And? Not a reason to throw a tantrum!

From my (limited) knowledge, this optimization already follows from the compiler trying to minimize the number of values spilled onto the stack when allocating registers, along with function inlining and codegen not generating things it doesn't need to.

There already exists such a compiler option in some compilers: -ffast-math. It does improve the performance of floating-point operations as promised, but it also breaks code which relies on floating-point accuracy and reproducibility. It also breaks attempts to detect error conditions resulting from those changes, because the transformations that -ffast-math authorizes assume that such conditions never arise. For many individual operations the tradeoff is acceptable, but as a compiler option it's too coarse-grained to use effectively in most cases. And I am pretty sure other similar flags would have the same problem.
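
For instance (a minimal sketch; the function name is made up), a NaN check like the one below can be silently removed, because -ffast-math implies -ffinite-math-only and the compiler may assume the comparison can never succeed:

#include <math.h>
#include <stdio.h>

/* Under -ffast-math, GCC and Clang may assume no NaNs exist, so the
 * isnan() test can be folded to "false" and the error path discarded,
 * even though a/b really can produce a NaN at run time. */
double checked_divide(double a, double b)
{
    double r = a / b;
    if (isnan(r)) {
        fprintf(stderr, "division produced NaN\n");
        return 0.0;
    }
    return r;
}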

1 Like

My point is that a good compiler should have a mode that can do those things while generating reliably-correct code, without having to enable "optimizations" that may lead to erroneous code under unspecified circumstances.

Indeed. Compilers don't need permission from a standard to offer optional non-conforming modes, and there should thus be no need for compilers to treat the standard as an excuse to apply hyper-aggressive optimizations which may be useful for some purposes but totally unsuitable for others. What's important is for compilers to offer modes that perform "safe" optimizations while disabling the dodgier ones.

The "traits" may be chosen to be less restrictive on inputs, or to offer stronger guarantees on outputs, than would be needed to meet application requirements, so as to facilitate inductive proofs of correctness. Further, when evaluating a sequence f(g(x)), it may be possible to replace f() with a function that imposes stronger preconditions than the original f(), provided they would be satisfied by the postconditions of the actual g() for all possible inputs, or to replace g() with a function whose postcondition guarantees are weaker than those of the original g(), provided that they still satisfy all the preconditions needed for the actual f() to meet application requirements.
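
A small illustration of that substitution rule, with made-up functions and ranges:

#include <assert.h>

/* g() is specified to return a value in 0..100 (its postcondition). */
static int g(int x) { return (int)((unsigned)x % 101u); }

/* The original f() was specified to accept 0..255; this replacement
 * imposes the stronger precondition 0..100, which is still satisfied
 * by g()'s postcondition, so the composition remains correct. */
static int f_narrow(int v)
{
    assert(v >= 0 && v <= 100);
    return v * 2;
}

static int composed(int x) { return f_narrow(g(x)); }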

My point is that while one could, in principle, validate a program by enumerating all possible outputs that could be produced at each stage and confirming that they are a subset of the possible inputs that could be accommodated by the next stage, it would often be easier to show that the set of possible outputs from the first stage is a subset of some easily-described set, and that the second stage will behave in a manner meeting application requirements for all members of that same easily-described set.

Although proving the correctness of some programs would entail a combinatorial explosion of cases to be examined, in many practical cases it would not. Under a UB-based analysis, it will often be necessary to analyze in detail everything a program might do in response to invalid data, to ensure that no stage of processing could ever invoke UB. If, however, the only thing a function must do to meet application requirements upon receipt of invalid data is either hang [and eventually get killed by a supervisory process] or return some arbitrary number without side effects, that may allow it to impose much weaker preconditions on its inputs in those cases than would be necessary under a UB-based model.

Correct. The easiest way to make a language "simple" while effectively supporting a wide range of tasks is to use it as a core for a wide family of dialects, each tailored to best suit its intended purposes. For this to work, however, it's important to recognize support for features beyond the core as a "quality of implementation" issue, and to have implementations that claim to be suitable for various purposes make a bona fide effort to support features that would be useful for those purposes, without regard for whether the core language spec would require them to do so.

Doing so would require that the maintainers of the language spec accept the inclusion of features that are not going to be universally supportable. Personally, I think a good spec for a language that is intended to accomplish a wide range of tasks on a wide range of platforms should recognize a very wide range of optional features with the semantics that implementations may reject any program for any reason they see fit (the ability of an implementation to accept any particular program would be a quality-of-implementation issue), but that if an otherwise-conforming implementation accepts a program, failure to process it as required by the language spec will make the implementation non-conforming.

Implementations would not be required to uphold any optional semantic guarantees that they feel would be too difficult to support, but would be required to reject any programs that specify that they need such semantic guarantees. Programs would be required to specify any semantic guarantees they need beyond those mandated by the core language, and would be rejected by implementations that don't support those guarantees, but would be guaranteed to run correctly on any implementations that accept them.
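
As a purely hypothetical sketch of what "specify the guarantees you need" could look like at the source level (the macro name is invented; no real implementation defines it):

/* A program that relies on wrapping signed-integer overflow could refuse
 * to build on implementations that don't advertise that guarantee. */
#ifndef __IMPL_WRAPPING_SIGNED_OVERFLOW   /* hypothetical feature-test macro */
#error "This program requires an implementation that guarantees wrapping signed overflow."
#endif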

The authors of the Standard have explicitly stated that they did not wish to demean useful programs that happened not to be portable. When the Standard says that a program is "non-portable or erroneous", that does not exclude the possibility that it may be non-portable but correct. For compiler writers to cite the Standard as proof that a program which had been processed in the same useful fashion by all previous implementations is "broken" is a gross misrepresentation of what the authors of the Standard actually said.

Closed this topic because it is continuing to go in circles over the same arguments.
