Types as Contracts


#61

Even then, consider what I edited into my last post (maybe after you read it): memcpying the bytes of the pointer, as a backdoor to achieve inttoptr/ptrtoint without the frontend necessarily knowing about it. LLVM of course optimizes memcpy(&a, &b, sizeof(pointer)) into a = b, and ditto even for a manual byte-by-byte copy. And that optimization is extremely desirable for a number of reasons. I can’t think of any sane way to block it without massively inhibiting optimization.

An alternative would be passing all allocas (of variables to which references escape) through an opaque function, but that too would be really bad. Maybe some robust interprocedral analysis on the frontend could make it semi-OK. But it really doesn’t sound practical.


#62

Ah, of course! I even knew that Miri has “abstract pointers”, but somehow didn’t connect it to this.

The problem with that model is that it means casting a pointer to an integer has a big significance, which unsafe code authors do not expect at all. If we change your two snippets A and B to compare the pointers, without casting to integers (a distinction which I’ve glossed over before, precisely because I thought it unimportant), I would predict that nobody without a formal methods (or possibly C standard lawyering) background would expect that to make a difference.

Even I have a very strong feeling that there shouldn’t be a difference. And I have passing knowledge of some formal methods topics and in particular have heard of the intptrcast model and understand its justification. (And unlike with other questions, my compiler background shouldn’t cloud my judgement too much, because this difference doesn’t seem to matter for any analysis or optimization that I know/care about.)

That seems at least as surprising as a (tame) provenance model. In particular, when it comes to calculating the address of an object in a weird manner, the main difference I can see (vs. some notion of provenance rule) is. Suppose we have a really weird calculation that somehow results in the address of a object. When is this defined?

  • With the kind of rule you describe, it’s okay if and only if the object’s address was taken and turned into an integer. (Q: does that mean anywhere at all? Even in dead code? That seems… difficult to implement. IIRC we currently already strip some dead code during MIR construction.)
  • In contrast, picking LLVM’s rules for inttoptr as an example, it’s okay if the object’s address factored into the calculation in any way, “directly or indirectly”.

Neither option is fully compatible with “pointers are just memory locations”. However, in my opinion, the latter is equally easy to explain, more intuitive, and at all possible to implement efficiently on top of LLVM (but @comex already argued for that).

Also, I thought some more about the expectation about using ptr.wrapping_offset/GEP without inbounds to cross object boundaries. I would like (and claimed earlier in this thread that this is the case without checking) that ptr1.wrapping_offset(ptr2 - ptr1) is a perfectly fine way to (redundantly) compute ptr2, and have the resulting pointer Just Work. But in LLVM, it turns out you still can’t leave the object the base pointer refers to and jump to another one, at least as far as access is concerned (the lack of inbounds does mean, though, that the address calculation is not UB).

I guess this is rather easy to solve with no loss of precision in the common cases (unlike other mismatches with LLVM) by just lowering wrapping_offset into int<->ptr casts and arithmetic, instead of GEP.

Yes, I don’t want different storage categories for locals. Locals just made the example simpler and more poignant.

Yes, this is a big and important concern. OTOH, I also have concerns about whether LLVM is, or will be in the foreseeable future, compatible with the model you’re describing. That’s why I’m very glad that formal methods people and people with LLVM knowledge are meeting in threads like this :heart:

That seems pretty much the same issue, if we agree that fresh locals and fresh heap allocations should be treated similarly?

Yes, and this is a cost I am willing to accept, because I’ve yet to see or come up with a program that feels like it does something “legitimate” and useful with the kind of pointer comparisons that would be affected. (And yet, don’t want to make the comparison UB, because that seems more risky for no additional gain in optimization power.)


#63

I’d actually thought about a variant of this model in the original memory model RFC (https://github.com/rust-lang/rfcs/pull/1578) - search for “head reference” - it’s written in a “aliasing rules” sort of way instead of in a “validation” sort of way, but it should be equivalent.

IIRC my conclusion at the end was that we need both a “locking” component (to allow “speculating” accesses) and an “aliasing” component (to deal with LLVM), and I wasn’t sure whether the best way was to have 2 separate rules or some sort of unified rule. I think that we could integrate “locking at the beginning of a function” into a “provenance-data-race”-style ruleset.


#64

So I think that I explained this poorly. For one thing, calling things “unsafety levels” is probably wrong. It’s not that unchecked code is any “more safe” than unsafe code. Both unchecked and unsafe code can easily create UB. The distinction is rather than “unchecked” code is more optimizable – and it’s more optimizable precisely because it’s more restricted in what it can legally do (without creating UB)! In particular, an unchecked function is restricted to actions whose legality is implied by the signature of the function. Perhaps it’s better to think of these as “variants” of unsafe.

Also, I was wrong to say that there are no intrinsics that would be at the “unchecked” level – I think that the offset function is exactly such an intrinsic. It doesn’t access memory or require any additional locks, but if you fail to abide by its conditions – i.e., you give an offset that is outside the range of the current object – you create UB.

In any case, with respect to your objection: the distinction between unchecked and unsafe that I was trying to draw is not meaningless. I tried to state it very precisely. It has to do, ultimately, with what the requirements are on the caller!

In short, for both safe and unchecked functions, the range of memory that the function can access is the same: effectively, it includes whatever safe code could do using the arguments that are given. This means that if you have: fn deref(x: *mut T), and that function is either safe or unchecked, then accessing *x would potentially cause UB. If deref() were declared unsafe, then it would (effectively) be allowed to access whatever memory its caller is allowed to access; if that includes *x, we’re OK.

There is of course a good reason to draw this distinction: if you have code that is calling an (opaque) function f(..), and that function is declared as safe or unchecked, we don’t have to know anything about its body in order to limit what range of memory it may (legally) access. This allows us to optimize based on those assumptions. The same is not true of an unsafe fn. There, we have to come up with some rules about what range of memory the unsafe function is allowed to access.


#65

Yes, I recall us discussing something like this with @stoklund at the compiler sprint week. I think this is similar to what Ralf has been saying about intptrcast being orthogonal – that is, we can combine a locking-based rule with other rules, and say that accessing a given bit of memory is allowed if (a) the locks allow for it and (b) the other rules allow for it too.

At the work week, I recall us digging into the LLVM source in order to try and get to the bottom of what the “built-in” LLVM aliasing rules were, but it’s out-of-cache now. @rkruppe, can you provide a pointer to the LLVM source perhaps where I could read up?


#66

My go-to source is the language reference section, supplemented with experiments (write IR, run it through the optimizers). I’ve never had the need to look closely at the implementation, but a lot of it seems to live in BasicAliasAnalysis (but that also contains a lot of stuff related to specific attributes or intrinsics, which are opt-in).


#67

I am not sure what exactly this is supposed to be a problem for. :slight_smile: If you take the address of a stack slot and pass that address to an unknown function, of course that function could do all sorts of hilarious things, including memcpy or casting it to an integer, so you have to consider the address leaked – you have to do that anyway because the function could also put the address into a global variable. All of the analysis about which optimizations you can do that I did above was under the assumption that we see and know all the code where the stack slot and its address are used, so you can know that no conversion to integer is going on, by whatever means.

Yes, that is a problem. And fundamentally, I do not see a way to avoid it – casting (and maybe comparing) to an integer has big significance, because once it happens, the compiler no longer has any wiggle-room arguing about where the pointer may have escaped. These operations show that pointers literally are integers, and there is no such thing as a “secret integer that nobody knows about”.

I don’t follow. What “is okay”? Which operation would not “be okay”? /me is confused

To answer your question, “concretization” happens when a ptr-int-cast is executed. (In principle, it could be delayed, to make ptr-int-ptr-casts truly NOPs. Miri currently handles ptr-int-ptr casts without ever concretizing. But when the integer should be e.g. printed, there just is no alternative, we have to cough up an actual integer value.) Dead code never matters.

Miri, even with intptrcast in any of the variants I have mentioned, is fully “pointers are just memory locations”. There is no data being kept around about how the value was computed, only the result of the computation matters. I should clarify that I do not equate “memory location” and “integer” – in Miri, for example, a “memory location” is a pair of an allocation ID and an offset. You can cast arbitrary integer literals to pointers, but they will never, ever point to valid storage – essentially, the two namespaces of abstract and integer locations are disjoint. In intptrcast, the “integer namespace” of locations is populated on-demand, equating some range of integers with some abstract memory locations. However, not all integers correspond to an abstract memory location, and there can still be abstract memory locations that do not correspond to integers.

(Well, in principle one could go full-integer-only. But that would mean there is no such thing as a private memory location, breaking local reasoning about state and many, many optimizations.)

Right, so this is what miri currently implements – address arithmetic can never, ever move you from one allocation to another. We’d have to revisit this when intptrcast is implemented.

Absolutely, this is the kind of discussion I was hoping we could have, and it far exceeds all expectations :slight_smile: . <3

Funny enough, much of the discussion here is not even about the topic of the thread (Types as Contracts), but about lower-level issues concerning pointer representation and casts to integers. But given that I am suggesting to take a formal take on the higher levels, it seems like a prerequisite to also have a formal idea of the lower levels – we can’t build on sand, after all.

It’s certainly related. The difference is that in this example, we know how both values are computed (they are both freshly allocated), whereas in your example, there was an old value flowing in and a new fresh value. The former provides more information and thus should be easier to handle/optimize.

Now you are surprising me again. :slight_smile: Didn’t you write further above, in the same post, that pointer equality not being stable was “at least as surprising as a (tame) provenance model”?


#68

Accessing a local (or object, whatever) through a pointer that you created via some shady means (e.g., guessing an integer and casting it to a pointer, or doing wrapping_offset on pointer to another object that ought to be “nearby”, or similarly silly things). Under what conditions is this UB (“not okay”) or not UB (“okay”, this includes implementation-specified and unspecified behavior)?

Sounds good for a formal model. However, this is useless for optimizations, which will (and should) delete the ptr->int cast if the result is unused, or move it around. So for that use case, we’d have to somehow encode which objects are affected by pointer->int casts, during MIR construction (and carry that information over into LLVM IR generation). Presumably this would be based on a (very stupid, since it can’t rely on the optimizer to simplify the code first) static approximation of whether (and for which objects) a ptr->int cast is executed. That sounds somewhat unsatisfying, though it might also work out okay.

Yeah, after I wrote the part you quoted I realized that’s probably the case. I should not have blindly taken your slogan. I was talking about treating pointers as integers, because that’s what programmers do and expect. Anything which deviates from that, anything that might make pointers to be treated differently from the same pointer casted to an integer, regardless of how it is justified, is stupid compiler nonsense that gets in the way of systems programming. As we’ve seen two times in this thread alone, even the rule that ptr.offset can’t legally take you to another object catches people by surprise!

Now, obviously, some of this “stupid compiler nonsense” is well justified by the optimizations it enables (which programmers also intuitively expect) and by how rarely it breaks real code. But it’s still unexpected, regardless of the model justifying it.

I’m not sure what you mean. For the record, “as surprising as a tame provenance model” is not an indictment, as I am in favour of (some) provenance-ish rules. I was merely arguing that the kind of model you describe also is not obviously more intuitive for programmers than provenance rules.


#69

I… think I see. In my model, it doesn’t mater at all how the pointer was created. The final pointer has a value an (abstract or concrete) location; if that location is in-bounds and the access does not violate any locks, the access is okay.

Right, so that would be okay if we tune the intptrcast so that “concretization” is delayed – e.g., it’s not the ptr-int-cast that makes the pointer concrete, it’s the first time this resulting integer is actually used in an arithmetic operation.

Of course, that only shifts the problem – now turning x * 0 into 0 could turn a pointer from guessable to unguessable. We could hard-code this in the model to not concretize on multiplication with 0, but this won’t scale, it won’t cover everything, and it will get more and more complicated. So, probably not a good idea.

Aren’t you saying here that programmers have mutually contradicting expectations? In that case, we need to work not only on the model, but also on changing the exepctations.

And that’s despite the fact that this is explicitly documented. It’s not like in C, where you just write +.

I start to wonder whether we have to do user studies to figure out which model is least interrupting for programmers…

Just to be clear, we are talking about a model where pointer comparison is non-deterministic under some circumstances? I can certainly see that this is surprising. The only reason I prefer it over provenance is that it is easier to model. :wink: (Sometimes that’s actually a good argument – easier-to-model leads to correct-for-simpler-reasons – but I wouldn’t dare claiming that this is a universal property.)


So, if I could use you as an “why does LLVM what it does” oracle here :wink: , I would have another question as part of the freshness discussion we have had above. I have been trying to trick LLVM into what I think is an invalid optimization, and so far failed, but I do not understand why.

So consider https://godbolt.org/g/cg41qa :

// int_str_cmp compares two size_t in a way that hopefully the compiler does not understand.
// (It converts them to a string and compares the strings.)

static void test(size_t i1, size_t i2) {
    if (i1 == i2) {
        if (int_str_cmp(i1, i2) != 0) {
            // So the ints are equal but the strings are not? Impossible!
            *(volatile size_t*)NULL = 0;
        }
    } else {
        if (int_str_cmp(i1, i2) == 0) {
            // So the ints are inequal but the strings are equal? Impossible!
            *(volatile size_t*)NULL = 0;
        }
    }
}

void foo(int *xp) {
    int y = 0;
    int *yp = &y;
    test((size_t)xp, (size_t)yp);
}

So what I would expect to happen is that the compiler argues that xp and yp cannot possibly be equal, and so the first conditional in test always evaluates to false. So after inlining test into foo, it should just unconditionally execute the else branch of test. I was then going to argue why that is an incorrect optimization. (Basically, if we now call foo with every possible integer as argument, one of them will trigger the NULL pointer deref – but in the unoptimized program, that deref is unreachable.)

Now, I cannot read assembly very well, so I don’t actually know what the compiled code does. But if I hover clang’s assembly, it marks C source code in the then branch of test, so it seems the optimization I expect to happen does not happen. Notably, if I just comment out the content of the else branch in test, clang optimizes foo entirely to a NOP. I do not understand why adding code after the optimized conditional changes the way that conditional is optimized. Help? :wink:


#70

Sure. Having miri complain about UB will certainly help. I’m just saying, the kind of model you’ve proposed is not inherently more intuitive to programmers than certain provenance-style rules.

:+1:


re: Why LLVM doesn’t optimize the way you want, first off I should stress that there may not be a good reason.

Between missing simplifications, unimplemented edge cases, pass ordering issues, passes intentionally not being as aggressive as they could be because it broke too much real code, odd interactions between different canonicalizations, etc. there’s a lot of room for failing to optimize even obvious things. For that matter, if the optimizer was maximally aggressive, it would just delete both of the volatile stores, since they are UB if executed. That it keeps them is probably a concession to real software using it to intentionally provoke crashes.

Now on to specifics. After some reduction, I’m left with this and looking at the LLVM IR. There’s one interesting thing about it: The comparison of i1 and i2 was simplified to a pointer comparison. Yet LLVM refuses (even when running the whole optimization pipeline over it again) to constant fold this with alias information.

Let’s see if the BasicAA impl does something weird here […] Turns out, nope, it does exactly what I expected:

On to instcombine! Some grepping for pointer reveals this comment:

// It would be tempting to fold away comparisons between allocas and any
// pointer not based on that alloca (e.g. an argument). However, even
// though such pointers cannot alias, they can still compare equal.
//
// But LLVM doesn't specify where allocas get their memory, so if the alloca
// doesn't escape we can argue that it's impossible to guess its value, and we
// can therefore act as if any such guesses are wrong.
//
// The code below checks that the alloca doesn't escape, and that it's only
// used in a comparison once (the current instruction). The
// single-comparison-use condition ensures that we're trivially folding all
// comparisons against the alloca consistently, and avoids the risk of
// erroneously folding a comparison of the pointer with itself.

So, there’s your reason. The patch that introduced this has some discussion which you’ll probably find interesting. I would be careful to extrapolate any larger policy or rules from this, though, as it might just be an implementation detail partially motivated by “we could be more aggressive but that would break too much real code”.


#71

(Edit: While I was sitting on this as a draft, @rkruppe found the same thing. Oh well :slight_smile: )

Ooh. I played with your test case a bit and ultimately went to the LLVM source. And here’s the reason:

  // It would be tempting to fold away comparisons between allocas and any
  // pointer not based on that alloca (e.g. an argument). However, even
  // though such pointers cannot alias, they can still compare equal.
  //
  // But LLVM doesn't specify where allocas get their memory, so if the alloca
  // doesn't escape we can argue that it's impossible to guess its value, and we
  // can therefore act as if any such guesses are wrong.

…I’m not sure how principled this limitation actually is.

For one thing, as noted in the PR that added this check, you can fool it by iterating through the entire address space in a loop, or indeed just by ‘getting lucky’. (Interestingly, I was able to replicate the former, but only by disabling certain unrelated optimizations.)

More importantly, the limitation doesn’t apply to optimizing memory accesses based on aliasing assumptions, so this would seem to be enforcing a model of “provenance matters for loads/stores, but not for comparisons”. But that doesn’t really seem very useful.

Still, it’s interesting.


#72

Wow, that’s indeed interesting! So they consider it fine to do this optimization if the alloca address is used only for comparisons, and they ensure that all comparisons are optimized consistently by not touching this if there is more than one comparison. In other words, they seem well aware that they are stretching the limits here. :wink:

However, this means that it is actually conceivable that this could hold up in a sufficiently complex formal model. @comex pointed out that “one comparison” does not mean “one comparison executed”… but I failed to exploit this: https://godbolt.org/g/zFH4nU

I see. Well that’s a pity because these tricks are much harder to play with actual loads/stores. Which is probably why LLVM feels more free to do stuff. :wink:

In that discussion at LLVM, somebody writes

I also verified that it didn’t make the optimization fire any less over Chromium (this patch reduced binary size by about 10k).

oO I didn’t expect this to have any actual impact, really. (Of course, the case I consider troublesome is only the one where the ptr it cast to an int before the comparison happens.)

we can act as if it allocated the memory in a place that the programmer isn’t finding with their guesses.

… except they may actually be exhaustively searching the entire possible space…


#73

Given that they ban ptrtoint, the “intended” model is that allocas can be allocated in “imaginary space” and have an address that is not a number. I’m still not sure how to square “imaginary space” with DCE of arbitrarily-complex expressions including memory accesses, but it seems like something that “could” be done.


#74

My thought was that one would reacquire everything that is still live when leaving the unsafe block. Thus, if there were two &muts that got released going into the block, they would be reacquired when leaving. If the unsafe code changed their value to point to the same location, that would UB and detected as soon as execution left the unsafe block. If the unsafe code stashed a pointer to the referenced memory, and a later call to a safe function tried to access it, that would also be UB and be detected at the moment of access.


#75

I suspect giving semantic significance to the scoping of unsafe blocks would break way too much existing code, given that up to now, it has had none, and as a consequence plenty of code has made the purely stylistic(!) choice to e.g. only put unsafe { } as a marker around the individual unsafe operations themselves (like unsafe { *p }), which would now qualify as being hilariously wrong in a lot of cases.

(Which is a shame because I think it’s an elegant formulation - the fact that the scoping of unsafe has no significance is irksome to the language designer in me.)


#76

So this is copying the value out of a location pointed to by the raw pointer p? (As far as I understand, the code fragment is only valid if *p is Copy.)

If p doesn’t alias a safe reference, then this is completely fine. If p aliases a non-mutable reference, this is also fine. If p aliases a mutable reference in the current function, this is fine with both function-level and unsafe-block-level granularity, as either way the lock will be released at the point the access occurs. If p aliases a mutable reference somewhere higher in the call stack, then this will be undefined behavior with either level of granularity.

I don’t disagree that there will almost certainly be some existing code that would be invalid with unsafe-block-level granularity but would be valid with function-level granularity, but unless it is a significant percentage of current unsafe code (which I doubt), I think the benefit of being able to say “types can be trusted outside of code explicitly marked unsafe” is very much worth it.

There will undoubtably be some code that breaks regardless of what we do. (There were some examples of existing code put forward in previous unsafe discussions that I think should be invalid in any model.) Thus, I imagine we’ll want to implement the sanitizer long before we implement any of the optimizations, giving a good amount of time for invalid unsafe code to be fixed.

Once the basic sanitizer framework is implemented, hopefully it will be fairly straightforward to run cargobomb with both granularities and determine how much unsafe code is affected.


#77

They ban ptrtoint, but they also optimize away ptrtoint before this check, right? My original example hitting this specifically did ptr-int-casts.

@rkruppe: Btw, I realized something over the weekend: If pointer comparison is indeed non-deterministic in some cases, then the equivalence I stated above is violated. I am not sure how to feel about this. I have come to accept non-deterministic pointer comparison mostly on the grounds of “well, that’s just what it is in Rust right now”. OTOH it was nice to have some concrete benchmark to “test” whether a model does provenance tracking.

Liveness has not been relevant so far in my model, so that sounds like a significant extension.

What I would want to is something where essentially the unsafe block becomes a separate function, and then we release what flows into the function and acquire what comes back. But that of course does not work, the entire point of unsafe blocks is that there can be *mut flowing into them that are dereferenced even though there is no ownership transfer.

For for now, I just implemented two modes: Functions that contains unsafe or are marked unsafe do not hold on to any locks (i.e., they acquire-release arguments and other function’s return values, and that’s it) – or they are treated like safe functions (which leads to a bunch of failures in libstd). This still breaks mem::uninitialized which returns invalid data, but from what I understand that was considered a problem anyway and the plan is to move to a MaybeInitialized union type.


#78

Sure. That’s what makes it hard to find a formal model to…


#79

Right. However, if we had a flag to turn off just that one transformation – make it never optimize away ptr-to-int casts (well, they can be removed in dead code etc., but the not in some form of simplification) – then something like intptrcast should be pretty much correct. Furthermore, Rust can still emit tons of noalias information based on types, so we should still be able to get most (all?) of the memory-related optimizations we want.

Of course, I have no idea of such a flag would even be something LLVM would be willing to accept.


#80

This needs to be made more precise to talk about how likely it is to be accepted by LLVM. By “dead code” you presumably mean statements that are dynamically never executed, but in LLVM terms, “dead code” generally includes side-effect free computations that are unused (because often times, “code” ~= SSA values, and unused definitions could rightly be called dead in a different way). Being able to strip such computations is a very basic and essential transformation, so trying to inhibit it will run into big practical problems:

  • The easy, hacky ways of making DCE keep int-ptr casts (such as marking them as having side effects, or touching memory) will adversely impact many other optimizations.
  • There are probably quite a few places that do some ad-hoc DCE or something morally equivalent, making it invasive to (conditionally or not) exempt int-ptr casts specifically from DCE.

And it’s even worse if you instead try to find a definition that’s based on whether statements are executed, because then you have to account for code motion. For example, if the result of an int-ptr-cast is only used in dead code (as in, the cast is always executed, but none of its uses are), then many optimizations will gladly move the cast closer to the use [1], potentially moving it into a dead block. Disabling motion of int-ptr casts will probably require one of the aforementioned hacks (side effects, memory access, etc.), with the same adverse effects.

Not turning comparisons of inttoptr results into comparisons of the casted pointers seems simpler (though I haven’t thought about it as much), but I guess that may not be enough?

[1] This is assuming the dead block hasn’t been eliminated already, but that can easily happen if the block isn’t trivially dead and the code motion runs before a more powerful pass that can prove the block dead.