I am not sure what exactly this is supposed to be a problem for.
If you take the address of a stack slot and pass that address to an unknown function, of course that function could do all sorts of hilarious things, including
memcpy or casting it to an integer, so you have to consider the address leaked – you have to do that anyway because the function could also put the address into a global variable. All of the analysis about which optimizations you can do that I did above was under the assumption that we see and know all the code where the stack slot and its address are used, so you can know that no conversion to integer is going on, by whatever means.
Yes, that is a problem. And fundamentally, I do not see a way to avoid it – casting (and maybe comparing) to an integer has big significance, because once it happens, the compiler no longer has any wiggle-room arguing about where the pointer may have escaped. These operations show that pointers literally are integers, and there is no such thing as a “secret integer that nobody knows about”.
I don’t follow. What “is okay”? Which operation would not “be okay”? /me is confused
To answer your question, “concretization” happens when a ptr-int-cast is executed. (In principle, it could be delayed, to make ptr-int-ptr-casts truly NOPs. Miri currently handles ptr-int-ptr casts without ever concretizing. But when the integer should be e.g. printed, there just is no alternative, we have to cough up an actual integer value.) Dead code never matters.
Miri, even with intptrcast in any of the variants I have mentioned, is fully “pointers are just memory locations”. There is no data being kept around about how the value was computed, only the result of the computation matters. I should clarify that I do not equate “memory location” and “integer” – in Miri, for example, a “memory location” is a pair of an allocation ID and an offset. You can cast arbitrary integer literals to pointers, but they will never, ever point to valid storage – essentially, the two namespaces of abstract and integer locations are disjoint. In intptrcast, the “integer namespace” of locations is populated on-demand, equating some range of integers with some abstract memory locations. However, not all integers correspond to an abstract memory location, and there can still be abstract memory locations that do not correspond to integers.
(Well, in principle one could go full-integer-only. But that would mean there is no such thing as a private memory location, breaking local reasoning about state and many, many optimizations.)
Right, so this is what miri currently implements – address arithmetic can never, ever move you from one allocation to another. We’d have to revisit this when intptrcast is implemented.
Absolutely, this is the kind of discussion I was hoping we could have, and it far exceeds all expectations . <3
Funny enough, much of the discussion here is not even about the topic of the thread (Types as Contracts), but about lower-level issues concerning pointer representation and casts to integers. But given that I am suggesting to take a formal take on the higher levels, it seems like a prerequisite to also have a formal idea of the lower levels – we can’t build on sand, after all.
It’s certainly related. The difference is that in this example, we know how both values are computed (they are both freshly allocated), whereas in your example, there was an old value flowing in and a new fresh value. The former provides more information and thus should be easier to handle/optimize.
Now you are surprising me again. Didn’t you write further above, in the same post, that pointer equality not being stable was “at least as surprising as a (tame) provenance model”?