Pointers Are Complicated II, or: We need better language specs

flatffinger · January 3, 2021, 4:39am

So, I think the best thing we can do with this part of the spec is ignore it. Instead, someone should work out a spec for restrict that is actually precise and unambiguous and enables the kinds of optimizations compilers want to do. I once got started on it but then other, more interesting projects came along.

Defining restrict would not be difficult if one recognizes a three-way partition of "definitely based upon", "definitely not based upon", and "at least potentially based upon", and is willing to accept that some things will fall into the third group and won't be optimizable. Saying that the result of an expression that forms a pointer from another pointer is based, not based, or potentially based upon whatever the original was, and that an expression that forms a pointer from something that isn't a pointer yields a result potentially based upon any pointer that may have leaked, would be simple, and would enable 90% of the useful optimizations that restrict could allow.
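To make the three-way partition concrete, here is a rough sketch in C of how such a classification might be modelled. Everything in it (the names Basis, Origin, PtrExpr, classify, may_assume_no_alias) is made up for illustration and is not taken from the Standard or from any compiler; it is a toy model of the rule described above, not a proposal for actual wording.

```c
#include <assert.h>
#include <stdbool.h>

/* Three-way answer to "is this pointer based upon restrict pointer P?" */
typedef enum { NOT_BASED, BASED, MAYBE_BASED } Basis;

/* How a pointer value was formed (hypothetical, deliberately tiny model). */
typedef enum {
    FROM_P,      /* the restrict-qualified pointer P itself            */
    FROM_OTHER,  /* an unrelated object or pointer                     */
    DERIVED,     /* formed from one source pointer: q + i, &q->m, ...  */
    SYNTHESIZED  /* formed from a non-pointer, e.g. (T *)some_integer  */
} Origin;

typedef struct {
    Origin origin;
    Basis source_basis; /* classification of the source; used for DERIVED only */
} PtrExpr;

/* A derived pointer inherits its source's classification; a synthesized
   pointer is potentially based on P only if P may have leaked. */
Basis classify(PtrExpr e, bool p_may_have_leaked) {
    switch (e.origin) {
    case FROM_P:      return BASED;
    case FROM_OTHER:  return NOT_BASED;
    case DERIVED:     return e.source_basis;
    case SYNTHESIZED: return p_may_have_leaked ? MAYBE_BASED : NOT_BASED;
    }
    return MAYBE_BASED; /* unknown origin: stay conservative */
}

/* Only a "definitely based" / "definitely not based" pair may be
   assumed not to alias; anything involving MAYBE_BASED is left alone. */
bool may_assume_no_alias(Basis a, Basis b) {
    return (a == BASED && b == NOT_BASED) || (a == NOT_BASED && b == BASED);
}
```

The point of the sketch is the last function: anything that lands in the third group simply isn't optimized, and nothing more complicated than that is needed.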

I am not sure what this has to do with my blog post, though

The only rationale I can see for a compiler ever judging the clang/gcc aliasing assumptions as appropriate would be if it were trying to aggressively implement the C Standard's semantics for "restrict". That is the only situation where combining the facts that (p is known not to alias r) and (p and q have the same address) could justify an assumption that q can't alias r. I suppose the possibilities are that:

1. the authors of both gcc and clang+llvm implemented logic to process aliasing assumptions for restrict in a way which is contrary to what the authors intended, but would at least be justifiable, or
2. the authors of both gcc and clang+llvm thought it appropriate to invent and impose language rules which are directly contrary to what the Standard specifies, and not allow any way to disable them except by disabling all optimizations altogether.

I'll admit I think #2 is actually more likely, but I would regard such conduct on the part of a compiler development team as a sign that the team isn't interested in producing a sound compiler and that nothing they do should be trusted. The only way I could see the clang/gcc misbehavior as an "honest mistake" would be if the authors were trying to implement something that was in the language spec.

Basically, my sticking to restrict is me bending over backward to view the behavior of gcc and clang+llvm in a light that doesn't present them as untrustworthy garbage. I'll readily admit that it's tenuous, but since many people don't want to view gcc and clang+llvm as untrustworthy garbage, I'm bending over backward to give them the benefit of the doubt. If there's some other way you see that their behavior could be viewed as an "honest mistake" I'd be glad to hear it.

It can easily be combined with [this approach to ptr-to-int casts]

Approaches involving tracking provenance through numbers would seem needlessly complex and dangerous. I would think the most practical approach would be to recognize a category of actions that "leak" pointers and a category of actions that "synthesize" pointers, and say that every synthesized pointer is "at least potentially" based upon any and all pointers that have been leaked. In cases where one needs a type that can hold either a number or a pointer, and any pointer stored in it will be converted back to an actual pointer without any arithmetic having been done upon it, it might be useful to have an "integer or pointer" type which can be used as either without the conversions being regarded as leakage and synthesis. Alternatively, there could be an opt-in mode which would treat uintptr_t like that, for use with code that was known not to use the type for any other purpose. In most cases, though, integer-to-pointer conversions tend to be rather rare outside situations that involve doing things that compilers shouldn't expect to fully understand, and which they should thus refrain from trying to optimize too aggressively.
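As a small illustration of what I mean by "leak" and "synthesize", here is a sketch in C. The function names leak_ptr, synthesize_ptr, and store_through_synthesized are invented for this example, and the comments describe the model suggested above, not what any particular compiler documents.

```c
#include <assert.h>
#include <stdint.h>

/* "Leak": the address escapes into an integer, so from here on the
   compiler should stop assuming it knows every pointer to the object. */
uintptr_t leak_ptr(int *p) {
    return (uintptr_t)(void *)p;
}

/* "Synthesize": a pointer is formed from something that is not a pointer.
   Under the suggested model the result is at least potentially based
   upon every pointer that has been leaked so far. */
int *synthesize_ptr(uintptr_t n) {
    return (int *)(void *)n;
}

int store_through_synthesized(void) {
    int x = 1;
    uintptr_t n = leak_ptr(&x);   /* &x has now leaked */
    int *q = synthesize_ptr(n);   /* q potentially based upon &x */
    *q = 2;                       /* so this store may modify x */
    return x;                     /* x must be reloaded, not assumed to be 1 */
}
```

Under the leak/synthesize rule, a compiler would have to treat the store through q as possibly touching x, because &x leaked before q was synthesized. No tracking of where the number n came from is required.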
