Pointers Are Complicated, or: What's in a Byte?

Interesting post.

However, this simple model starts to fall apart once you consider pointer-integer casts. In miri, casting a pointer to an integer does not actually do anything, we now just have an integer variable whose value is a pointer (i.e., an allocation-offset pair). Multiplying that integer by 2 leads to an error, because it is entirely unclear what it means to multiply such a pair by 2. A full definition of a language like C++ or Rust of course cannot take this shortcut, it has to explain what really happens here

I don't understand why that is, or at least why it is a problem. If you cast a pointer to an integer, then the resulting value is an integer, now isn't it? Why would you care where it came from? After all, it's not a pointer anymore, you don't want to use it as a pointer – you want to use it as an integer, that's why you cast it away. Why would you need to specify (in a formal model) what pointer operations you can perform on an integer, or what integer operations you can do with a pointer? Probably none, they are distinct types…

However, this means that a simple operation like loading a byte from memory cannot just return a u8. What if that byte is part of a pointer?

Again, why would you care? If you wanted the pointer, you would have read the entire pointer, not only its first byte. If you want to copy a pointer, using memcpy for example, you would just… copy all of its bytes, and you could be sure it's the same pointer that you started with. I don't see the point in this, or why it would be useful.

3 Likes