Annotations for zeroing the stack of "sensitive" functions which deal in transient secrets

Centril · January 11, 2020, 5:02am

Which is why, e.g., bench_black_box was accepted as a hint and not as a language-level guarantee. As long as things cannot be specified operationally, I'm not in favor of providing such guarantees.
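For readers unfamiliar with the hint in question, here is a minimal sketch of how it is used. It assumes the stabilized form `std::hint::black_box` (to my understanding, the name under which bench_black_box eventually landed); `sum_to` is a hypothetical function made up for illustration. The hint asks the optimizer to treat a value as opaque, but nothing about the result is guaranteed at the language level.

```rust
use std::hint::black_box;

/// Hypothetical example function: sums the integers in `0..n`.
/// `black_box` asks the optimizer to treat `n` and `i` as opaque values,
/// which discourages constant-folding the loop. This is a best-effort hint,
/// not a guarantee: the program's result is the same either way.
pub fn sum_to(n: u64) -> u64 {
    let mut total = 0u64;
    for i in 0..black_box(n) {
        total = total.wrapping_add(black_box(i));
    }
    total
}
```

The observable result is identical with or without the hint; only the generated code may differ, which is exactly why it cannot be relied on for security properties.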

Which silicon is that, specifically? For architecture-specific intrinsics, to the extent we provide guarantees, we only do so on that architecture. Beyond that, when guarantees, rather than something best-effort, are sought after, I think the right level to make the acknowledgement is not in a high-level general-purpose language like Rust.

mcy · January 11, 2020, 5:28am

Do you think volatile access should not be a language-level guarantee?

Centril · January 11, 2020, 6:10am

My understanding from @RalfJung's notes re. "externally observable events" is that this is specifiable operationally:

github.com/rust-lang/unsafe-code-guidelines

What about: volatile accesses and memory-mapped IO

opened 09:59PM - 09 Oct 18 UTC

closed 07:44PM - 06 Jun 23 UTC

cramertj

C-discussion-topic A-memory C-open-question

Folks who want to write drivers and embedded code using Rust need to have a way …to guarantee exactly-once access to certain memory locations. Today, the embedded wg makes extensive use of @japaric's [`VolatileCell`](https://japaric.github.io/vl/vcell/struct.VolatileCell.html) crate, along with [RegisterBlock](https://github.com/japaric/stm32f103xx/blob/d22b9c25f3e685c56969da4111d4260a70790338/src/fsmc/mod.rs#L3) structures containing `VolatileCell` wrappers around each field of the register block, and a function to provide [a single access to](https://github.com/japaric/stm32f103xx/blob/d22b9c25f3e685c56969da4111d4260a70790338/src/lib.rs#L1486) the register block at a [fixed address](https://github.com/japaric/stm32f103xx/blob/d22b9c25f3e685c56969da4111d4260a70790338/src/lib.rs#L436). The API exposed in the `stm32f103xx` crate and similar only exposes `*const RegisterBlock` values ([example](https://docs.rs/stm32f103xx/0.10.0/stm32f103xx/struct.FLASH.html)) from the overall [`Peripherals`](https://docs.rs/stm32f103xx/0.10.0/stm32f103xx/struct.Peripherals.html) object. This then requires unsafe code to access and mutate any particular field.

Asks:

- Is this pattern sufficient to guarantee that the number of writes to IO-mapped memory will exactly match the number of calls to `unsafe { (*x.volatile_cell_field).set(...) }`, and that the number of reads will exactly match the number of calls to `unsafe { (*x.volatile_cell_field).get(...) }`? It seems like it should be.
- Is it possible to provide the same guarantee while exposing the register block via a safe reference type such as `&`? It would be possible to provide a custom `RegisterRef<'a, T>` that consisted of a raw pointer internally as well as a custom derive for projecting this to fields of the register block, but this seems unfortunately complicated and unergonomic.

Complicating factors:

- LLVM's precise definition of "volatile" is a bit shaky. It says that optimizers must not change the number of volatile operations or change their order of execution relative to other volatile operations. However, it doesn't seem to specify that non-volatile operations can't be inserted. This is something we need to prevent, but which LLVM might insert in an attempt to pre-load a value (as allowed by the "dereferenceable" attribute that we apply to references). Can we make sure that LLVM doesn't do such a thing? If we fail in that, could we potentially make the compiler understand that `VolatileCell` is special, similar to `UnsafeCell`, and cannot have "dereferenceable" applied to references to it (and objects that contain it), in order to prevent this misoptimization? This seems potentially more complicated and intrusive, but IMO still worth considering.

cc @RalfJung @kulakowski @teisenbe @rkruppe
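To make the pattern in the quoted issue concrete, here is a minimal sketch of a `VolatileCell`-style wrapper built on `core::ptr::read_volatile` and `core::ptr::write_volatile`. It is a hypothetical stand-in, not the actual crate's source; the `RegisterBlock` layout and its field names `control` and `status` are invented for illustration, and the block lives in ordinary memory rather than at a fixed MMIO address so the example can run anywhere.

```rust
use core::cell::UnsafeCell;
use core::ptr;

/// Hypothetical minimal stand-in for a `VolatileCell`-style wrapper.
#[repr(transparent)]
pub struct VolatileCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> VolatileCell<T> {
    pub const fn new(value: T) -> Self {
        Self { value: UnsafeCell::new(value) }
    }

    /// Performs one volatile read per call.
    pub fn get(&self) -> T {
        // SAFETY: the pointer comes from a live `UnsafeCell`, so it is
        // valid, aligned, and points to an initialized `T`.
        unsafe { ptr::read_volatile(self.value.get()) }
    }

    /// Performs one volatile write per call.
    pub fn set(&self, value: T) {
        // SAFETY: same reasoning as in `get`.
        unsafe { ptr::write_volatile(self.value.get(), value) }
    }
}

/// Hypothetical register block; the field names are illustrative only.
#[repr(C)]
pub struct RegisterBlock {
    pub control: VolatileCell<u32>,
    pub status: VolatileCell<u32>,
}
```

Note that this sketch does nothing about the issue's second concern: a `&RegisterBlock` is still a reference, so the question of whether extra non-volatile reads may be introduced through it remains open here.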

rpjohnst · January 13, 2020, 10:58pm

This may be the real disconnect: @Centril does seem to disagree, and suggests we exclude some tools and scenarios because they are hard or impossible to specify this way. That suggestion is what gets this kind of pushback, and arguments about "systems programming."

bascule · January 14, 2020, 2:19am

If there are concrete questions about how a feature like this relates to e.g. the Rust Abstract Machine or otherwise, I am probably the wrong person to ask, but if people who are curious/skeptical about a feature like this can put together a concrete list of them, I know there are people who would understand such questions better than myself who are interested in making this proposal more concrete.

(Specifically some of them are LLVM developers who want to work on a similar feature for C++. I don't want this feature to be simply "LLVM errata", but if we can specify things correctly I think we can potentially achieve this feature in a way which works seamlessly across Rust and C++/"C compiled as C++")

comex · January 14, 2020, 10:42am

This is a fun argument to have, but this is not the right place to be having it.

For other features, ranging from volatile and asm to boring old FFI, I think it's possible to formally specify their behavior, but I don't know if you can call the manner of specification "operational". You have to define a mapping between Rust Abstract Machine and the lower-level process state, and say that at certain points they're required to "sync up" to a certain extent.

For example, suppose you write some data to a buffer, then pass that buffer as an argument to a system call. The kernel obviously has to be able to read the data you wrote to the buffer. But it doesn't know about the Rust Abstract Machine; it has its own, lower-level model of process memory*. For the Rust program to behave correctly, the compiler needs to guarantee that when you perform the FFI call, the lower-level memory state contains data at the buffer's address corresponding to the data you wrote there within the Abstract Machine. The abstract and lower-level states aren't always the same; they can diverge due to compiler optimizations such as reordering writes or eliminating redundant ones. But they have to sync up when you perform the call.
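A small sketch of that "sync up" point, using a call into the C library's `strlen` instead of a real system call so that it stays portable. The function name `fill_and_measure` is hypothetical, and the sketch assumes a hosted target where the C library is linked and a toolchain that accepts `unsafe extern` blocks.

```rust
use std::ffi::c_char;

unsafe extern "C" {
    // `strlen` from the C standard library. The callee knows nothing about
    // the Rust Abstract Machine; it just reads bytes from memory.
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical example: write into a buffer, then hand it to foreign code.
pub fn fill_and_measure() -> usize {
    let mut buf = [0u8; 16];
    // These writes happen in the abstract machine...
    buf[..5].copy_from_slice(b"hello");
    // ...and at the call below the foreign code must observe them, with the
    // untouched zero byte at index 5 acting as the NUL terminator.
    // SAFETY: `buf` is a valid, NUL-terminated byte sequence.
    unsafe { strlen(buf.as_ptr().cast::<c_char>()) }
}
```

Before the call, the compiler is free to reorder or merge the writes to `buf`; whatever it does, the foreign function has to see `hello` followed by a zero byte.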

But none of that matters in this case, because buffer clearing – at least in the form of C's memset_s, the proposed secure_clear for C++ that was mentioned, or any of the nonportable ways that C/C++ programs often do it in practice – is only best-effort anyway. That means we simply don't need to worry too much about precise guarantees.

Why is it best-effort? In C, if you call memset_s or secure_clear, the compiler may be forced to clear out some buffer in the lower-level memory state. But:

- Any previous operations that operated on the buffer may have left traces of the data in registers, on the stack, etc. This isn't just a theoretical concern; in fact it almost always happens. Usually only parts of the buffer are leaked, e.g. the most recently accessed word or byte, rather than the entire thing; and usually the relevant registers or stack locations will be clobbered by other data eventually. But there are no guarantees.
- On the more theoretical side of things, the compiler is generally within its rights to make multiple copies of the entire buffer, especially (but not only) if it's stored in a local variable whose address never escapes. In other words, a single buffer in the Rust Abstract Machine state can correspond to multiple buffers in lower-level state, and the memset will only clear out one of them. It's just that this is not usually a profitable optimization.
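The copies the compiler makes on its own are not visible from source code, but a source-level analogy shows the same effect. In this sketch (both function names are hypothetical), a `Copy` buffer is passed by value, so clearing it inside the callee leaves the caller's copy untouched:

```rust
/// Hypothetical helper: takes the buffer by value, so it works on its own copy.
pub fn use_and_clear(mut key: [u8; 4]) -> u8 {
    let checksum = key.iter().fold(0u8, |acc, b| acc ^ b);
    key.fill(0); // clears only this function's copy
    checksum
}

/// Hypothetical caller: its buffer is still intact after the call.
pub fn caller_copy_survives() -> [u8; 4] {
    let key = [0xDE, 0xAD, 0xBE, 0xEF];
    let _ = use_and_clear(key); // `[u8; 4]` is `Copy`, so `key` is duplicated
    key
}
```

Here the duplication is explicit and easy to spot; the point in the list above is that an optimizer may introduce equivalent duplicates that no source-level clearing call can reach.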

Despite those objections, best-effort buffer clearing is still a useful operation, because in practice it usually does reduce the amount of sensitive information left in memory. I think Rust should support it, but it should be clearly documented as best-effort.
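As a rough illustration of what such a best-effort operation might look like as a library function, here is a sketch using volatile writes followed by a compiler fence. The name `clear_best_effort` is hypothetical, and this is only one possible approach; it clears the one buffer it is given and does nothing about copies left in registers, spilled stack slots, or duplicated buffers.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical best-effort clearing routine.
pub fn clear_best_effort(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // A volatile write is not removed as a dead store, even if the
        // buffer is never read again.
        // SAFETY: `byte` is a valid, aligned, exclusive reference.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Restricts compiler reordering of memory accesses across this point.
    // It emits no machine instruction of its own.
    compiler_fence(Ordering::SeqCst);
}
```

A caller would invoke it on a key buffer right before the buffer goes out of scope, e.g. `clear_best_effort(&mut key)`, with the documented understanding that it reduces, but does not eliminate, the secret's footprint in memory.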

Alternately, clearing the entire stack range rather than just one buffer – as has been discussed – would, I think, solve most of the practical leaks, but it's still playing with fire. If we implemented that natively in Rust (which it doesn't really need to be; it works fine as a library feature), I'd still be reluctant to call it more than best-effort, even if I were talking purely about 'how the implementation works today' as opposed to 'what we can guarantee forever'.

Now, if someone comes up with a design, based on Cranelift or something, that precisely tracks where sensitive data is stored and can truly guarantee when it's gone... that would be awesome, and much more elegant. Though even that wouldn't be perfect: even if the data is gone from userland's view of memory, it may be around in state only the kernel can see, e.g. swap files. Regardless, the delta from here to there is a lack of implementation, not philosophical questions about how low-level Rust is.

* Though note that the kernel's view of process memory is still several abstraction layers away from "the hardware", such as virtual memory, CPU caches, etc.

CAD97 · January 14, 2020, 5:02pm

To be perfectly clear:

@Centril, as per our out-of-band discussion, is fine with a best-effort hint with the proposed semantics. It is the specified guarantee that they take issue with.

RalfJung · January 15, 2020, 7:09pm

Agreed, I was going to post the same. Adding a best-effort hint without guarantees is totally fine.

Of course, getting actual guarantees is a really interesting problem, but I think that's just further out there and doesn't have to block a best-effort attempt.

bascule · January 23, 2020, 7:10pm

FYI, a related RFC for adding a set of secret integer types is now up:

system · April 22, 2020, 7:20pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
