Pointers Are Complicated II, or: We need better language specs

Well, by the same token, for pointer variables p and q, you can replace p with if p == q { q } else { p }… or so LLVM and other compilers assumed, but in fact this breaks provenance and needs to be banned.

In theory, one could also ban the integer version, or at least force it to attach explicit provenance tags to the output, e.g.

if x == 0 { copy_provenance(0, x) } else { copy_provenance(1, x) }

More generally…

I don't know much about information-flow control, but AIUI it's focused on keeping information away from an attacker, who can observe any difference in program output regardless of whether the difference comes from a control dependency or a data dependency. Thus the two types of dependencies must be considered equally bad.

On the other hand, in a provenance semantics, you can require the programmer to be more upfront with the compiler and carry provenance only through data dependencies, which can be captured by observing a single program execution. You do have to forbid transforming data dependencies into control dependencies, but again, the "provenance for pointers only" model already requires forbidding that for pointers.

In theory, it should also be possible by observing a single execution to learn which data dependencies are trivial, like x - x. I suppose the abstract machine would have to associate each value with the entire formula used to compute it, and then a dependency on a given input is lost if evaluating that formula returns the same numeric output for all possible numeric values of that input. That way you account for all possible optimizations, not just the currently-implemented ones. Not exactly something Miri could implement, but the abstract machine could, and Miri could at least approximate it.

I'm not saying any of this is a good idea. Just that it's possible. :slight_smile:

1 Like