Idea: traits for zeroizing before and after move

The idea is quite simple: let's add two special derivable marker traits, ZeroizeMoveSrc and ZeroizeMoveDst (names are up to bikeshedding), which slightly modify the "move is a simple memcpy" rule.

If ZeroizeMoveSrc is implemented for a type, then immediately after the memcpy the source bytes on the stack will be overwritten with zeros using a volatile write. After a value of that type goes out of scope or after it's dropped, the previous location of that value gets zeroized as well (i.e. we can view those operations as moving data into nothing). If a type implements this trait and contains another type which also implements it (the Keys type in the next example), then zeroization will be done only once.

#[derive(ZeroizeMoveSrc)]
struct EncryptionKey([u8; 16]);

#[derive(ZeroizeMoveSrc)]
struct Keys { enc_key: EncryptionKey, mac_key: [u8; 16] }
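
To make the intended semantics concrete, here is a rough sketch of roughly what the compiler would conceptually emit for a move of Keys under this proposal; move_and_zeroize is purely illustrative and nothing like it exists today:

use core::{mem, ptr};

// Purely illustrative: roughly what the compiler would emit for `let b = keys;`
// when `Keys: ZeroizeMoveSrc` (hypothetical semantics, not a real API).
fn move_and_zeroize(mut keys: Keys) -> Keys {
    // the move itself stays a plain bitwise copy
    let moved = unsafe { ptr::read(&keys) };
    // volatile overwrite of the source bytes so the optimizer cannot elide it;
    // done once for the whole struct, not separately for each nested field
    unsafe {
        let src = &mut keys as *mut Keys as *mut [u8; mem::size_of::<Keys>()];
        ptr::write_volatile(src, [0u8; mem::size_of::<Keys>()]);
    }
    mem::forget(keys); // the old location has been "moved into nothing"
    moved
}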

Note that only bytes on the stack will be zeroed out, so for the following type the heap data behind b will not be erased:

#[derive(ZeroizeMoveSrc)]
struct Foo { a: [u8; 16], b: Box<[u8]> }

But if data is moved out of the heap, then the original heap location will be zeroized.
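
For example, a hypothetical illustration of this under the proposed semantics:

fn take_out_of_box(boxed: Box<EncryptionKey>) -> EncryptionKey {
    // the source of this move is the heap allocation owned by the Box, so
    // under the proposal that allocation would be zeroized before it is freed
    *boxed
}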

The main use case for this trait is handling of sensitive secret data (e.g. cryptographic keys). We already have a number of crates which target this problem (e.g. zeroize) via a specialized Drop implementation, but they have several shortcomings. The most important one is that they can't deal with moves at all. Another one is related to performance: with the Keys type, zeroization on drop will be done twice (once by the Drop impl of the nested EncryptionKey and once by that of Keys), which may hurt performance.
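
For reference, a minimal sketch of the Drop-based approach in use today, assuming the zeroize crate's Zeroize trait (the impl itself is illustrative):

use zeroize::Zeroize;

struct EncryptionKey([u8; 16]);

impl Drop for EncryptionKey {
    fn drop(&mut self) {
        // wipes the bytes at the *final* location of the value only;
        // copies left behind at old locations by earlier moves are not touched
        self.0.zeroize();
    }
}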

ZeroizeMoveDst will zeroize destination memory before moving data into it (the write can be non-volatile). This will also be done before a value of a type implementing this trait is created in the first place. Similarly to ZeroizeMoveSrc, zeroization of destination memory will not be duplicated if a type contains another type which implements ZeroizeMoveDst. The main use case for this trait is safe transmutation of a value into raw bytes using a function like this:

use core::mem::size_of;

// the array length in the signature relies on const expressions in generics
fn into_raw_bytes<T: ZeroizeMoveDst>(val: &T) -> &[u8; size_of::<T>()] {
    // reinterpret the reference as a pointer to a byte array of the same size
    unsafe { &*(val as *const T as *const [u8; size_of::<T>()]) }
}

This approach would allow us to work around the problem of undefined padding bytes without debating the semantics of freeze, and with only minor performance impact (since zeroization of destination memory is not volatile, it can be removed if it is not observed).
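
As an illustration, a hypothetical example building on into_raw_bytes above, using a made-up type with internal padding (Padded and demo are not part of the proposal):

#[derive(ZeroizeMoveDst)]
struct Padded {
    a: u8,  // three padding bytes follow `a`
    b: u32,
}

fn demo(p: &Padded) -> &[u8; size_of::<Padded>()] {
    // the destination memory was zeroized before the value was written into
    // it, so the padding bytes are zero rather than uninitialized and this
    // byte view is well-defined
    into_raw_bytes(p)
}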

Relevant links:

2 Likes

@newpavlov curious if you saw the "stack bleaching" approach as an alternative to this:

(Edit: d'oh you linked it already! :sweat_smile:)

See the list of relevant links. :wink: I think they are complementary to each other, since sensitive data may originate from outside of a bleached area.

1 Like

How do you expect unsafe code to handle this? If I put one of these types in a Vec, what happens when it reallocates?

6 Likes

Hm... It's a really good question. I don't have a good solution right now.

The best I could come up with is something like this. Let's merge the traits into a single one:

trait ZeroizeMove {
    // true if source should be zeroized after move
    const SRC: bool;
    // true if destination should be zeroized before move
    const DST: bool;
}

By default all types will implement this trait with both constants equal to false, and all generic type arguments will be implicitly bound by where T::SRC == false, T::DST == false (hopefully this will be possible with an extension of #20041). So by default generic structures will not work with types which override the ZeroizeMove defaults. If a generic type/function/etc. would like to support such types, then adding an explicit bound on ZeroizeMove will suppress the implicit bound.
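
A rough sketch of how this could look in generic code (hypothetical; the const-equality bounds do not exist today and are only shown in comments):

// Without any bound this is treated as if it were written with
// `where T: ZeroizeMove, T::SRC == false, T::DST == false`,
// so it rejects types that opt into zeroizing moves:
fn plain_store<T>(val: T) -> Box<T> {
    Box::new(val)
}

// An explicit bound suppresses the implicit one; this function accepts
// zeroizing-move types and promises to move them accordingly:
fn secret_store<T: ZeroizeMove>(val: T) -> Box<T> {
    Box::new(val)
}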

I know it's probably a bit too much complexity for such a feature... Ideally it would be nice to make correct support of zeroizing moves one of the requirements for unsafe code, but I couldn't find a way to do that without breaking the safety properties of already existing code.

UPD: Maybe it would be possible to do something like this. Let's say we get a 202x edition in which core gains ZeroizeMove; all pre-202x crates will get an implicit bound on it with both constants equal to false. Crates on the 202x edition and later will also get such an implicit bound, but without constraining the constants. So for a collection outside of std to support zeroizing-move types it will have to migrate to the 202x edition. Since this trait will be auto-implemented for all types, such an implicit bound should not cause any breakage.

But that means it will almost always be removed, which means it doesn’t protect data in practice. What’s the point?

1 Like

This part is about zeroizing destination memory before a move. The main (if not the only) use case for it is safe transmutation of types with padding bytes into raw byte arrays. For secret data you use the other trait, which zeroizes source memory after a move.

I think the answer would be that a "typed copy" (e.g. ptr::copy::<T>) would apply the zeroize-before, an "untyped copy" (e.g. ptr::copy::<u8>) would not, and we introduce a new explicit "typed move" ptr::move::<T> that applies the zeroize-after as well as the zeroize-before, as requested. It would mean that 99% of uses would have to migrate from ptr::copy to ptr::move to respect the zeroing request, but the "worst" that happens for using copy instead is bytes remaining in memory longer than strictly required.

Adding the zeroing behavior onto ptr::copy would be both breaking and surprising, but it makes sense on a ptr::move (which I think captures the intent of most uses of copy better). The only issue with ptr::move is that move is a keyword. (Just use the _overlapping and _nonoverlapping suffixes?)
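
For illustration, a rough sketch of what such a typed move could do in terms of the ZeroizeMove trait sketched above (zeroizing_move_nonoverlapping is a made-up name, not a real or proposed std API):

use core::{mem, ptr};

// Hypothetical "typed move": zeroize the destination before the copy and
// the source after it, as requested by the type's ZeroizeMove constants.
unsafe fn zeroizing_move_nonoverlapping<T: ZeroizeMove>(src: *mut T, dst: *mut T) {
    if T::DST {
        // zeroize the destination before the move; a plain write is enough
        ptr::write_bytes(dst, 0, 1);
    }
    ptr::copy_nonoverlapping(src, dst, 1);
    if T::SRC {
        // volatile zeroization of the source so it cannot be optimized away
        let mut byte = src as *mut u8;
        for _ in 0..mem::size_of::<T>() {
            ptr::write_volatile(byte, 0);
            byte = byte.add(1);
        }
    }
}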

1 Like

I proposed an experimental approach to "user-defined move constructors" that uses Pin + macros to build "unmovable" types, and then a second unsafe Transfer trait to execute user-defined constructors.

I even use "safely erased numbers" as a unit test example (without actually providing a volatile write for zeroing, as that is not the point of these tests).

Would such approach help with the use case of your proposal?

“Move constructors” have been proposed dozens of times in the past, for example:

They’ve been consistently rejected because:

  • “moves are just memcpys” is considered an important and valuable guarantee of Rust, and it’s one lots of unsafe code already relies on, so this is likely not even allowed by our stability policy
  • so far, the major problems for which move constructors have been suggested as a solution (especially self-referencing types) aren’t actually solved by them at all on closer inspection

The zeroing discussions I’ve read so far strongly imply that the second point is true of zeroing too, because there are apparently cases where zeroing on every move is far too slow and instead you want to zero when exiting a function.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.