Operational semantics and high-level vs low-level

What otherwise ends up happening in practice, if we go for an axiomatic approach, is that the implementation becomes the spec, and there is really nothing (at least not a formal mathematical definition) to judge its correctness against, because you may have mutually incompatible stated properties.

I don't think Rust needs to solve every problem, especially those that cannot be well-specified. It seems more appropriate to provide a solution at the level where guarantees can actually be made (and not just claimed to be made).

I also don't see what's specific to systems programming in the use case here. Why would it e.g. not apply to Haskell but apply to Rust? Moreover, I do not find arguments of the form "A is a systems language, A has X feature. B is a systems language, so therefore it must also have X" persuasive. Ideas need to stand on their own merits, not on whether e.g. C or C++ also do something.

Characterizing languages like C, C++, and Rust as high-level languages is not a new phenomenon. These are general-purpose languages as opposed to domain-specific because... well, just compare Rust to a language tailored for creating music.


Which is not a problem in practice. Rust is a counter-example on its own, given that it does not have (or hasn't had, if that is not true anymore) a formal spec; and yet that does not stop us from using it.

Those claiming that languages need a formal specification covering every single corner (rather than an informal one, or even no specification at all) are the ones that have to prove such a level of formality is needed. Note that having such a formal specification is, of course, great, but please don't let that get in the way of engineering.

I think most people would agree that Rust's current main goal is to be a better, safer replacement for C and C++, i.e. a better alternative for systems programming. Given that a subset of systems programming projects need to solve this problem (e.g. Linux, OpenSSL), it would be ideal for Rust to provide a solution.

Rust (and C, and C++) compilers can very much provide this guarantee. A particular abstract model may not, but that is a problem of the spec or the chosen model being unsatisfactory. Again, as I said in the first point above, that is not a bad thing: coming up with those models is hard.

I only said systems programming languages need a solution for that problem. Whether non-systems languages require a solution or not is another (interesting) discussion.

That is not the argument I presented, though.

The motivation presented in my paper is largely independent of whether we are discussing Rust, C or C++. In fact, I am going to send a similar RFC for Rust with pretty much the same content.

What that article argues is controversial, and many experts in the field definitely do not agree with trying to mold languages like C into "high-level" languages without regard for well-established semantics [1]. The reason is simple: C was born for, and is still used for, actual low-level systems programming, and some of those changes and new characterizations are effectively backwards-incompatible with code written years back. In the process, they have introduced bugs, more work for engineers and, in the worst case, security issues [2] [3] [4].

Or, picking this from another angle: if we were to call C, C++ and Rust "high-level general-purpose", then the moniker would not be very useful as there would be no "low-level general-purpose" languages around, at least those common and/or intended for usage by humans.

I was discussing "high-level general-purpose". Rust isn't a DSL, of course.


I'd very much like to have such a guarantee. But this is a hard problem, and it doesn't go away by saying "there must be a solution".

Not caring about the (complex and subtle) operational behavior of languages such as C, C++, Rust and LLVM-IR is how we ended up with bugs like this or this: without a precise definition of what exactly the observable behavior is that has to be preserved according to the "as-if" rule (what we call "[contextual] refinement" in academia), we are doomed to use a flawed intuition and end up with sets of optimizations that all seem to make sense in isolation, but are just plain wrong when combined.

Doing so is the only technique that I know of that can hope to prevent the kind of bugs I cited above. I'm open to other solutions to this problem, but I don't know any.


Doing "just engineering" is what got us where we are: all major compilers have serious bugs in their optimizations that stem from fundamental misunderstandings about the actual semantics of the language they are implementing. The bugs I cited in my previous response were all related to pointer provenance, and similar bugs are present in ICC and MSVC (based on statements of some of my coauthors on this paper). I doubt pointer provenance is the only such case, it is just the biggest one we found so far.

The fact that all major compilers have the same kind of bug shows that this is not the usual bug where someone made a small mistake in implementing a particular optimization; this is systematic. Something about how these compilers are developed is fundamentally flawed. I believe the flaw is in not having a proper idea of what the actual semantics of the language being optimized are, using a flawed mix of "what the C spec says" and "what the hardware does" instead. I think this is proof that some "level of formality is needed".

We should prevent such issues from arising again. The only technique that I know to ensure this is to take an operational view, to equip the IR on which the optimization happens with a precise operational semantics with a notion of "observable events" and argue that the optimizations do not change the observable behavior. This requires making explicit many things that are usually left implicit, like the fact that memory in these languages doesn't just store plain bytes.

Note that we do not necessarily have to make this fully formal and mathematically tight. Having, for example, a reference interpreter for a language (one that adequately models not just the "good case" but also all possible causes of UB), would probably be sufficient. Something like Miri -- though Miri does not currently explicitly model what is observable and what is not.

I am not entirely sure what your argument here is. Your sources all give good reasons for why C maybe shouldn't be a high-level language, but that doesn't change the fact that these days, it is (at least the variant of C implemented by compilers such as gcc and clang), and applying low-level reasoning leads to logical contradictions.

There is a fundamental trade-off for a language: either it is low-level or it is aggressively optimized. Today's C/C++ are the latter, and so is Rust. Any way we can find to break this trade-off is welcome, but it should be a proper solution, not hand-waving.

As a case in point, when it comes to union semantics (one of the issues raised in your sources), I strongly argued for giving Rust the simplest possible semantics I can imagine, allowing all forms of type punning and whatnot, precisely because of how bad an idea the C rules turned out to be. Due to this, Rust does not have a notion of an "active field" of a union, or other surprising forms of "hidden state". Being explicit, operational, formal and high-level doesn't contradict taking into account low-level concerns. But not being explicit, operational and formal is how C and C++ got where they are: the language semantics are so unclear that informed and intelligent people can argue for hours about their details. I strongly think that we can do better than that in Rust, but we'll have to adjust the methodology to achieve that.
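To illustrate that simpler semantics, here is a minimal sketch of type punning through a Rust union (the `Bits` type is made up for illustration); no "active field" is tracked, only the bytes matter:

```rust
// Rust unions have no notion of an "active field": writing one field
// and reading another is plain type punning, with no hidden state.
union Bits {
    f: f32,
    u: u32,
}

fn main() {
    let b = Bits { f: 1.0 };
    // Reading any union field is `unsafe` (the bytes must be valid for
    // the type being read), but there is no C-style active-member rule.
    let bits = unsafe { b.u };
    println!("{bits:#x}"); // 1.0f32 is 0x3f800000 in IEEE-754
}
```

Contrast this with C, where reading a member other than the one last written has historically been a source of endless standard-lawyering.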

As I said before: this is a hard problem, it doesn't go away by pretending it doesn't exist.


Agreed, but the actual problem (users implementing some solution for that) does not go away either. And we end up with projects being wrong and unsafe, and sometimes not known until months have passed because the compiler didn't expose it.

Some of those bugs are created by standards and compilers trying to be too smart and introducing complex and subtle behavior that nobody asked for to begin with. We all want powerful transformations, but we should not let that make (unsafe) code almost impossible to reason about with confidence. While Rust can attack this problem nowadays from a much better position, the way it was done over the years for C and C++ was definitely wrong, since it broke existing projects.

What about not implementing a model or optimizations that are not in line with what people use the compiler for? Note that I say this as someone that thinks Rust can get away with a lot more than C/C++ in this, because it is a "new" language (in the sense that there is not decades of code around that were written with other expectations of behavior).

I am not a compiler writer/researcher so I cannot really address the rest of your quote. However, what you wrote in this sentence is precisely what I asked about in a reply above. Yes, having a hole in the model is bad, and it may lead to bugs in the compiler and its passes at some point or another. However, those dealing with that (compiler writers) are way more capable to solve it than users of the compiler, and they can solve it centrally for all their users.

Those sources are arguing to go back to the previous behavior, for good reason. The fact that some compilers implement a broken language (for a particular purpose) does not mean C is high-level either. In fact, as you very well know, GCC and Clang have had to backtrack on some of their decisions and/or add compiler switches so that they didn't break existing code.

So, what is "C"? Whatever a compiler team thinks it should be based on some reading of the standard today, or what the users and their millions of LOC have been using it for, for decades? Should we ignore all existing code just because it allows us to implement a fancy transformation? Should we demand engineering all over the world to change their code?

An example is Linux, as I mentioned in the reference, but it is not by any means alone. I have worked in some projects where optimizations were completely disabled because "legacy" code was not going to be replaced/rewritten at all and very expensive equipment depended on it. Yes, we can come today and say that it was "broken code" according to the standard or the newer compilers' reading of it, but the fact is that code like that exists everywhere.

Note that I don't really care whether C is one thing or the other. But I do care about compilers and standards bodies changing the semantics. I am all for enabling new behaviors and opportunities in new C/C++ standards (and I enjoy them in some projects!), but please do not say the new standards are (all of) C/C++ (even if ISO tells us they are :).

[In any case, I think this is a bit off-topic and quite controversial]

I acknowledge it is a hard problem, but similarly, we can say: pretending Linux, OpenSSH and other major projects are not facing the ("actual") problem does not make it go away, either.

Note that in my discussions on this topic with the C++ committee, some experts agree with your assessment and its difficulty, but many (including compiler writers) also agree that the problem should be solved by the standard anyway, one way or another (i.e. even if a formal approach cannot be found right away, the standard should provide the means to users to deal with the issue).


Here's a simple, high level overview over where I think the disconnect between you and @RalfJung is.

Compilers have two main jobs: compile code to get the correct result, and compile code to get the correct result quickly.

The former is more or less specified by the language specification, however it exists, in a somewhat formal definition or just by the behavior of existing compiler(s).

The latter is achieved by applying optimizations to the generated code/IR. This is made possible almost exclusively by "dead space" in the language semantics: undefined behavior.

It is impossible to add new (meaningful) optimizations without exploiting this dead space. This means that any new optimization (or bug fix!) will lead to code that existed on the old compiler giving a different result. (At the very very minimum, timing is different, and that is technically observable.)

Though it may not be your position, it comes off as if you're effectively advocating for disallowing any change in the result of a compiled program. Because the compiler is given to the entire world to work with, that is basically isomorphic to every possible program being written. The argument thus simplifies down to no more optimizations, permanently, because any change will break someone.

There's a reason that there aren't any (good?) raw assembly optimizers: it's downright impossible to optimize "what the machine does" programmatically. Any higher-level language (here: one with some abstraction layer between it and the hardware) adds extra rules (and thus, undefined behavior for breaking those rules). That dead space is exactly what allows optimizations to work.

Your argument is basically that the formalization effort needs to consider practical uses and lend itself towards comprehensibility. Nobody disagrees with you there. In any language that exposes a way to write something that is UB, it should be as simple as possible for the author (with the help of static tooling) to avoid causing UB.

At this point I'm going in circles, so I'm not going to go off the point of this thread any further.

But the key point of this is that a feature like this which is intrinsically dependent on the actual hardware (as it sits exclusively within the designed dead space of the language) is not and cannot be specified/guaranteed by the language specification. At maximum, it can be promised by a specific implementation/backend to fulfil specific hardware-level goals.

Optimizations don't break code, they make logical inferences based off of UB impossibilities that make sense locally to transform code to be quicker. This breaks when the author does not come in with the same exact assumptions. And not even all optimizations are solely modelable as source transformations.


The current state of the real world made me disconnect from this for a while -- I am restarting this before the thread is automatically closed. :slight_smile:

I know you are trying to build a bridge between us by giving an overview, so I thank you, but I think everyone in this discussion knows how optimizers work and what UB is, so I am not sure where you are going with the explanation.

No, that is not my position.

No, the feature I am proposing does not depend on the actual hardware. It can be implemented as a restriction on the optimizer without knowledge of hardware (and it is, in fact, how it is implemented in most cases as shown in the paper).

It can also be specified in a language, as shown by the C standard (whether that specification is not "proper" for any number of reasons according to some people is a different matter which I leave to spec experts to deal with).

The whole memory clearing goal is somewhat misguided, since for security you need the whole memory of your program to be secret at all times, and if the memory is secret, then it doesn't matter whether you clear it or not, since it's always secret anyway; at best, clearing is a mitigation for other security issues.

Also, the idea that you can somehow identify some data that is "more sensitive" than other data is also quite dubious, since in general all data is sensitive.

Hence, if we really want this, then the best bet is to engineer the compiler so that ALL memory is cleared "as soon as possible", and in particular so that "dead values" are kept alive at most a constant number of instructions after they are dead.

This essentially means that, when an option is enabled, just before any CFG backedge and any call instruction, all unused registers and parts of the stack must be cleared, all data structure assignments must clear all padding, the memory allocator also must clear memory at free time, and data structures must be implemented so that they clear unused parts.

This would require modifying LLVM to clear registers and stacks, modifying rustc to clear padding, adding a "discard" intrinsic that zeroes memory whenever the compilation option is enabled, changing the Rust allocators to use it, and changing liballoc to use it to clear unused parts of Vec, HashMap, etc.
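For reference, the library-level half of this is roughly what crates like zeroize do today: volatile writes plus a compiler fence, so the optimizer cannot prove the stores dead and delete them. A minimal sketch, with a hypothetical `Secret` wrapper and an arbitrary 32-byte size chosen for illustration:

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical wrapper that clears its contents when dropped.
struct Secret([u8; 32]);

impl Secret {
    fn zeroize(&mut self) {
        for byte in self.0.iter_mut() {
            // A volatile store counts as observable behavior, so the
            // optimizer may not eliminate it as a dead store.
            unsafe { ptr::write_volatile(byte, 0) };
        }
        // Keep later code from being reordered before the clearing.
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for Secret {
    fn drop(&mut self) {
        self.zeroize();
    }
}

fn main() {
    let key = Secret([0xAB; 32]);
    // ... use key.0 for cryptographic work ...
    drop(key); // the buffer is zeroed before the stack slot is reused
}
```

Note that this only covers the bytes the wrapper owns: compiler-made copies, spilled registers and padding are exactly the parts a library cannot reach, which is why the proposal asks for compiler support.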


A colleague of mine employs an analogy. Cryptographic primitives are various forms of "secrecy compression". Sensitive plaintext tends to be, over time, quite large (such as the sum total of a long session). Cryptography allows us to compress the exposure down to the key used to encrypt the data, and it turns out that it is easier to keep short values secret than large values. Unfortunately, this also magnifies the danger of losing a key, which is why keys, rather than plaintexts, tend to be primary targets. It is much easier to leak a poorly managed key through a side channel (100s or 1000s of bits) versus a plaintext (HTTP requests and responses are huge).

That is to say, yes, certain classes of data are more sensitive, because the value of an individual bit of a secret key is much higher than that of a single bit of the plaintext.

Stolen keys can also be used to defeat authentication schemes like MACs; the actual MAC doesn't exist for very long, so stealing it just lets you replay a message, which, odds are, is nonced.


It would be good to have an explanation/source for this claim.

In any case, even if that were to be true, it does not mean clearing memory is futile. We clear because it lowers the chance of the information getting out through any other means, like hardware bugs or unintended core dumps, as explained in the paper.


Agreed, but the point is that since such a problem is not trivial at all to solve and will take a long time if at all successful (see e.g. a prototype in GCC referenced in the paper that clears the stack), the idea is to handle "better" rather than "perfect". Clearing memory is what current projects do anyway, so we are simply standardizing current practice.

I seriously question this premise

Disclaimer: I am reciting stuff that I learned in history class, not personal experience. And I am definitely oversimplifying.

You're missing a second aspect, which seems like a more obvious driving force behind early cryptography development. It's not that it makes the secret smaller; it's that it removes the time-based coupling of the security measures and the information that you wish to secure.

In a military application, the soldiers would be given the key while they are still at the heavily-guarded base, at whatever time is convenient. So when the soldiers are away in the battle field, they exchange encrypted messages by courier. The reason they didn't just share the messages while still at home base isn't because the messages are too big, but rather because the information they wanted to share didn't exist yet.

If the cryptosystem itself is perfectly secure, then the whole system is as secure as the pre-deployment base, while being almost as convenient as an unsecured courier. If it was all about size compression, the Soviet Union would never have bothered using the one-time pad (which requires keys, plaintexts, and cyphertexts to be exactly the same size; the OTP's extreme inconvenience is cited as the reason why they frequently used it wrong, but that's because doing it right would have required them to return to base too often to rotate keys, not because of the size of the key per se).

This time-based view of the underlying purpose of cryptography applies to other situations, too. Some of them, like Secure Boot, are almost perfectly analogous (the factory is the base, the internet is the courier). Others, like TLS, make the assumption that the internet link between the LetsEncrypt servers and your servers is more secure than the one between your server and your end user. Even trust-on-first-use, as tenuous as it is, exists because they assume that the time when you first connect will be more secure than all of the subsequent connections. But in all cases, the key exchange happens over a communication channel (secured through methods that have nothing to do with cryptography, or through a different cryptography system that, eventually, reduces to a "root of trust" that isn't based in cryptography) that is no longer accessible at the time when the cyphertext needs to be sent out.


I should clarify that the person who likes to make this analogy is the author of a major TLS library, so it mostly applies to that domain. Security and cryptosystems are a complicated topic that can't be summarized by a pithy proverb; "secret compression" is mostly a way to think of what a modern cryptographic primitive gives you, as a black box. You still need to get the relevant key to the person receiving the message, which is all about the cryptosystem and nothing to do with the cryptographic algorithm.

While that is true of OTPs, it is not true of modern cryptographic algorithms (though key size is an increasing worry for those of us working in the post-quantum domain). Key size is actually a huge concern, both for TLS, which doesn't want to waste bandwidth on handshaking, and for tiny secure micro-controllers that need to generate keys (having most of your 512kB of RAM get filled with key material is very obnoxious). After all, the benefit of OTPs is that they are unbreakable: this is not known to be true even of AES, though we're pretty confident. True security comes at a high information-entropy cost.

Also: I don't really want to get into an online argument about threat models. I think that's a topic we could unproductively bikeshed for a while. Even this small sidebar is a bit of a derailment.

Also, small nitpick:

This is not true. TLS does not trust any TLS connection more than another. Your browser trusts the LE cert because of a certificate chain that starts with a root cert burned into the browser. You do eventually need to bootstrap trust through the operating system distribution, but that's a matter of trusting an OEM's supply chain or the Debian image you just downloaded.


I believe @notriddle is referring to the certification issuance process, where Let's Encrypt checks who should be issued certificates for a given domain by making an HTTP connection to that domain (among other possible verification methods).


Ah right, yes. My bad. =P

I believe that's actually more of an LE thing than a TLS thing? I'm not familiar with how LE actually issues certs? shrug

What I mean by "dependent on the actual hardware" is that you cannot observe the difference between (stack) memory being zeroed after final use and it not being zeroed. So by the as-if rule, an optimizer is free to just not do the zeroing. Because, by definition, you can't read the memory anymore, so it's just uninitialized space that can't be read within the language specification.
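To make the as-if point concrete, here is a hypothetical sketch (function and values invented for illustration): once `key` is never read again, the clearing loop is a dead store, and a conforming optimizer may silently delete it.

```rust
fn use_key() -> u8 {
    let mut key = [0x42u8; 32]; // stand-in for secret material

    // Pretend this fold is the "real" work that consumes the key.
    let digest = key.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));

    // After the fold, `key` is never read again within the program, so
    // under the as-if rule this loop has no observable effect and the
    // optimizer is allowed to remove it entirely.
    for byte in key.iter_mut() {
        *byte = 0;
    }
    digest
}
```

Nothing inside the language can tell the difference; only an external observer of the process's memory could, and that observer is outside the abstract machine.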

Specifying this operationally (iow, in the language specification) requires allowing something along the lines of defining what an external process reading arbitrary memory of your process sees. And that's roughly isomorphic to "no more optimizations," because then any new optimization is observable and breaks the as-if rule.

Now, you probably can define what it means to zero (stack) memory after last use in a formal way,

But this is not part of the language specification, though it may be part of the "language specification documents;" it is not specifying (observable, as-if optimized) behavior of the language, but rather implementation requirements for a conforming compiler. These are subtly different.

The language itself would not be the provider of the zeroing guarantee, as it, by design, cannot be observed in the language and thus can be legally optimized out. Instead, this is a guarantee of compiler/backend pairs, provided on a case-by-case basis; just potentially one that is required if you want to be called a conforming compiler.

(That took me a while to remember my train of thought and how to clarify it better. I hope this helped clarify the problem here: I think it's disagreement on what the language specification actually is.)


The art is to carefully define what is and is not observable. For the related question of "(cryptographic) constant-time code" and the compiler guaranteeing not to break that, this paper carries out the exercise in full formality (extending CompCert to obtain a machine-checked formally verified crypto constant-time preserving compiler). I believe similar mechanisms could be used to also define a "memory-zeroing-preserving" compiler.


We cannot under the current specification.

The as-if rule (and the definition of what observable behavior is, etc.) is part of the specification and therefore can be circumvented or modified if the spec wants. That is, after all, the point of proposals! :slight_smile:

In fact, if you take a look at how memset_s was specified in C11, that is exactly how it was done. Another example is volatile accesses, which are part of the observable behavior.

It is, because the language specification defines how those functions are special with respect to the as-if/observable behavior, and that is what defines the restrictions on the optimizers.

For instance, if you had no observable behavior at all, your optimizer could output a noop.

No, they aren't. In fact, the entire point of the as-if rule and observable behavior is to describe what implementations can and cannot do. These are their definitions (emphasis mine):

  • As-if: "In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not (...)"
  • Observable behavior: "The least requirements on a conforming implementation (...)"

Therefore, if you argue that the language specification should not talk about conforming implementations, then the as-if rule and observable behavior wouldn't be part of the specification either.


I think the problem is that there is a fundamental tradeoff between properly supporting low level features like "secure clear" and performance.

For performance, you want your abstract model to be as high level as possible so it is possible to transform the code to make it faster. Meanwhile, caring about stuff like external views of memory or whether instructions execute in constant time essentially makes optimization impossible. At that point, your model is so low level that you're basically writing assembly.

If you're designing the language from scratch, you could introduce some notion of "volatile memory region" which the compiler isn't allowed to copy or optimize, along with appropriate operational semantics, and then any code which isn't using volatile memory can mostly still be optimized. However, it still adds complexity to the compiler, since every single optimization pass has to consider whether the code in question might be using (even indirectly) volatile memory, and some popular optimizations are still ruled out entirely. And it's still possible to accidentally leak the contents of volatile memory when you read it into a variable and that variable gets spilled to the stack. And this is the best possible case, when you design it into the language from the ground up.


That's not inherently correct. The requirement is to suppress data-dependent timing variance for sensitive data, not to inhibit all optimization. For example, loop-independent computations can be lifted out of loops, and expressions that do not involve sensitive data can be optimized as before. The primary need is to eliminate conditional branches based on sensitive data, followed by eliminating indexing/addressing based on sensitive data. (Note that this latter implies crypto algorithms that do not use table lookup, unless the entire lookup table can be pre-loaded and then locked in the level-1 cache.)
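As a small illustration of "no conditional branches on sensitive data", here is a sketch of the classic branchless select (a toy; real code should use a vetted primitive such as the subtle crate, since nothing today stops an optimizer from re-introducing a branch here):

```rust
/// Select `a` when `choice == 1` and `b` when `choice == 0`,
/// using masking instead of a secret-dependent branch.
fn ct_select(choice: u8, a: u32, b: u32) -> u32 {
    // choice == 1 -> 0xFFFF_FFFF, choice == 0 -> 0x0000_0000
    let mask = (choice as u32).wrapping_neg();
    (a & mask) | (b & !mask)
}

fn main() {
    assert_eq!(ct_select(1, 7, 9), 7);
    assert_eq!(ct_select(0, 7, 9), 9);
}
```

The whole point of a constant-time-preserving compiler, as in the CompCert work cited above, is to guarantee that such source-level discipline survives compilation.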

A reasonable approach would be to declare which variables and struct fields are sensitive, to track which objects, including hardware registers and temporaries, are "infected" with sensitive data via computation and assignment, and to zero those objects when they are dropped or go out of scope unless the compiler can prove that they (e.g., registers) are reused either A) in the same basic block or B) in all successor chains of basic blocks. I suspect that this latter optimization would be implemented by inserting such clearing code everywhere it is potentially needed, then removing it in a late compilation pass only when A) or B) is observed.

That doesn't quite work, since some amount of "infection" is desirable. It would be kind of sad if, once you were done using a secret key to sign a message, the compiler helpfully zeros out not only the key, but the signature as well!