Are `mem::{replace, swap, take}` unsound?

Mokuz · March 23, 2023, 9:08am

This post is about wether mem::{replace, swap, take} are unsound because they allow getting an bitwise copy of an otherwise !Copy type.

The following code is sound since without calling unsafe code causing undefined behavior is not possible as Token::new_pending is marked as unsound.

pub mod private {
	pub struct Token(bool);

	impl Default for Token {
		fn default() -> Self {
			Self::new_ready()
		}
	}

	impl Token {
		/// # Safety
		///
		/// The caller of this function is responsible for making sure that
		/// `Self::make_ready` is called **before** calling `Self::cleanup`.
		///
		/// Othewise calling `Self::cleanup` will result in undefined behavior.
		///
		pub unsafe fn new_pending() -> Self {
			Self(true)
		}
		pub fn new_ready() -> Self {
			Self(false)
		}
		pub fn make_ready(&mut self) {
			self.0 = false;
		}
		pub fn cleanup(self) {
			if self.0 {
				unsafe {
					// SAFETY: The caller of `Self::new_pending` is responsible 
					// for making sure that this never happens
					std::hint::unreachable_unchecked()
				}
			}
		}
	}
}

We can then create a safe wrapper around Token to allow creating a pending token without risking undefined behavior:

mod pirvate {
	pub struct Token(bool);

	impl Default for Token { .. }

	impl Token { .. }
	
	pub struct Wrapper(Token);

	impl Wrapper {
		pub fn new_pending() -> Self {
			Self(unsafe {
				// SAFETY: this is safe because we don't give public access to `Token`
				Token::new_pending()
			})
		}
		pub fn new_ready() -> Self {
			Self(Token::new_ready())
		}
		pub fn cleanup(mut self) {
			self.0.make_ready();
			self.0.cleanup();
		}
		pub fn get_mut(&mut self) -> &mut Token {
			// `&mut Token` cannot be converted to `Token` since `Token != Copy`
			&mut self.0
		}
	}
}

This relies both on Wrappers field 0 being private and Token not implementing Copy

let mut wrapper = Wrapper::new_pending();

// compile error: "field `0` of struct `private::Wrapper` is private"
let token: Token = wrapper.0.cleanup();

// compile error: `cannot move out of a mutable reference`
wrapper.get_mut().cleanup();
^^^^^^^^^^^^^^^^^^---------
|                 |
|                 value moved due to this method call
move occurs because value has type `private::Token`, which does not implement the `Copy` traitimplement the `Copy` trait

The problem occurs if we mix our api with functions from mem which use unsafe internaly, mainly by calling ptr::read on a mutable reference to Token which on could leak a pending Token thus allowing safe code to cause undefined behavior.

let mut wrapper = Wrapper::new_pending();

// this is ub
std::mem::replace(wrapper.get_mut(), Token::new_ready()).cleanup();

I came up with this example as I was thinking about the soundness of take_mut::take which can cause ub with an even simpler code:

pub mod private2 {
	pub struct Token { _lock: () };

	impl Token {
		pub fn new() -> &'static mut Self {
			unsafe {
				// SAFETY: `Token` is a ZST
				std::ptr::NonNull::dangling().as_mut()
			}
		}
		pub fn cleanup(self) {
			unsafe {
				// SAFETY: There is no way to construct an instance of 
				// `Self` so this code is never run
				std::hint::unreachable_unchecked()
			}
		}
	}
}

use private2::*;

pub fn ub() {
	take_mut::take(Token::new(), |token| {
		token.cleanup();

		loop {}
	});
}

These examples are ment as purely theoretical and don't reflect real world code that should be, only what could be writen.

So the question is whether private is unsound or mem::replace is unsound.

zirconium-n · March 23, 2023, 9:54am

Strictly speaking, soundness is the property of the whole program, not a particular function. But if some part of the program can interact with pure safe code (without unsafe inside) and cause UB, then we can call that part of the program unsound.

In this case, both mod private and mem::replace contains unsafe code inside, and the whole program is unsound. You can choose "freely" which one is bad abstraction. But since mem::replace exists since the beginning of Rust, and the whole Rust ecosystem is built up respecting mem::replace exists. So we choose to reject mod private.

In general. We don't call something in std unsound unless you can produce UB with std only. (We may consider the ecosystem impact when adding new function in std though.)

farnz · March 23, 2023, 11:45am

Writing this out slightly differently, you get:

let mut wrapper = Wrapper::new_pending();

let mut old_token = std::mem::replace(wrapper.get_mut(), Token::new_ready());

old_token.cleanup();

Expressed like this, you can see where the potential for unsoundness comes in - &mut Token can be converted to Token, because that's what std::mem::replace exists to allow you to do - if you can get a Token from somewhere, then you can swap the Token in your hands for the one behind your &mut Token reference.

Thus, the comment in get_mut is wrong, because I can get a new Token, and swap it for the one behind the reference. Since that comment is in the same module as unsafe code, that's where the bug lies.

This is why Pin::get_unchecked_mut is the only unconditionally implemented way to get an &mut T from a Pin<&mut T> - otherwise I could use mem::replace or mem::swap to move the pinned value.

nacaclanga · March 23, 2023, 12:13pm

I feel like the issue lies in the way you define Token. In my opinion calling Token::new_pending() must be save. There are no invariants as to the input parameters that would cause it to engage in UB.

On the other hand, Token::cleanup() should be unsafe. This function may engage in UB depending on the invariant. The safety comment actually describes the invariant the user of this function has to check before calling it in order to avoid UB, this should be written in the functions documentation.

For comparision, it is safe to create dangling pointers, but unsafe to dereference a pointer.

Notice that in you wrapper type you actually tackle both invariants. You make the new_pending safe and avoid the UB casusing input for cleanup by calling make_ready() directly before.

Mokuz · March 23, 2023, 12:25pm

this is assuming mem::replace is sound. Because it is implemented using unsafe code internaly it isn't inherently sound. After all the above code can be rewriten as such:

let mut wrapper = Wrapper::new_pending();

let old_token = unsafe {
    let temp = std::ptr::read(wrapper.get_mut());
    ptr::write(wrapper.get_mut(), Token::new_ready());
    temp
};

old_token.cleanup(); // ub

here we use unsafe code and have reached ub, but all we have done is inlined the mem::replace function, and there is nothing that explicitly states that this kind of unsafe code must be sound, so private assumes that it isn't.

crate some_other_crate_foo {
    fn copy_t(t: &T) -> T {
        unsafe { std::ptr::read(t) }
    }
}

// user code
let mut wrapper = Wrapper::new_pending();

some_other_crate_foo::copy_t(wrapper.get_mut()).cleanup();

Here in user code we don't call any unsafe functions yet our code has ub, so either some_other_crate_foo or private have ub, its just that in this case copy_t is obviously unsound, but only when called with types that are !Copy. The same applies to mem::replace it is safe assuming the type it is called on doesn't expect not being copied like Token does.

RalfJung · March 23, 2023, 12:34pm

This is not a valid assumption. !Copy just means that Token cannot be duplicated, but it can still be moved. And Rust never intended to prevent moving data behind &mut references.

There might exist hypothetical alternative universes where !Copy data behind &mut cannot be moved. In that alternative Rust, indeed some of the mem functions are unsound. But the real Rust we have today is not like that. This is one of these regularly occurring cases where the language has to make a choice to allow either one or the other library to be sound, but not both -- and Rust very clearly decided that mem is sound, and hence your assumption quoted above is unsound. (Another one of these cases, famously, is leakpocalypse: the pre-1.0 scoped thread API that relied on sound code not having memory leaks, vs Rc which allows memory leaks in safe code.)

Put differently: for all intents and purposes, mem::{replace,swap,take} are language primitives. The fact that they are actually in libcore and implemented via unsafe code is an implementation detail.

dlight · March 23, 2023, 4:03pm

In this case, Token is fine, it's Wrapper that is unsound. But this just means that Wrapper can't return a &mut Token, right?

There are other unsafe abstractions in Rust that have similar restrictions on not being able to return a reference to the wrapped type. For example we can't turn a &Cell<T> into a &T. So that is not too unusual.

farnz · March 23, 2023, 4:07pm

The reason for suspecting mod private over core::mem::replace is that the broken assumption lives in mod private; none of the assumptions that core::mem makes are broken.

Given that all of core::mem's assumptions (including those in replace) are valid in this code, but some of the assumptions that your unsafe code depends on are not valid, locality suggests that it's more likely that your code is assuming something that's not actually guaranteed by the rest of the Rust language, and that's broken by something else.

kpreid · March 23, 2023, 5:44pm

An observation: I think Wrapper would be sound if it offered only pin-projected &mut instead of bare &mut:

pub fn get_mut(self: Pin<&mut Self>) -> Pin<&mut Token> {
    unsafe { Pin::new_unchecked(&mut self.0) }
}

This is because the guarantees of Pin include that the value will not be moved until it is dropped; therefore, for any operation available on Pin, either

the Token is not moved out of Wrapper, so it is not possible to call Token::cleanup which requires moving, or
the Token is dropped, so nothing else will be done with it.

I would not recommend doing this, however, because it isn't exactly what Pin is meant for, and there might be surprises I haven't thought of. (I had a similar such surprise in the past, when I had an idea that Pin might prohibit any kind of replacement of the value, but in fact Pin::set() exists and is sound because the Pin contract allows dropping.)

Additionally, doing this is only interesting if Token offers some Pin<&mut Self> methods (otherwise you might as well leave off get_mut() entirely and offer only an immutable get()) and if those methods are sound to call under the rules enforced by Wrapper.

SkiFire13 · March 23, 2023, 7:11pm

Also, Token needs to not implement Unpin (which it does by default), otherwise you can get an &mut Token out of a Pin<&mut Token>

dhm · March 27, 2023, 9:17pm

For what is worth, I agree there may be a legitimate need for the OP's use case.

For that, there are two options:

1. Expose some `TokenMut<'_>` type instead of `&'_ mut Token`,

precisely to avoid the &'_ mut API with mem::replace() and whatnot.

Note that often you can make type TokenMut<'__> = Pin<&'__ mut Token>; play that role (provided that Token not be Unpin, of course).

In your case, for instance, this could involve having a .get_pin_mut() accessor instead of the "raw" get_mut() one (c.f., pin_project - Rust).

Brand the `Token` type so that two owned instances never have exactly the same type:

pub
struct Token<'brand> {
    // ...
    _brand: PhantomInvariant<&'brand ()>,
}

impl Token<'_> {
    pub
    fn with_new<R>(scope: impl FnOnce(Token<'_>) -> R)
      -> R
    {
        let yield_ = scope;
        yield_(Token { ..., _brand: <_>::default() })
    }

    /* you could also offer a macro-constructor */
}

with type PhantomInvariant<T> = PhantomData<fn(T) -> T>;

Now, the scoped APIs are indeed quite annoying to use in practice; because they don't play well with ownership, given that we are often used to owned stuff being : 'static, which a branded type won't be.

On the other hand, for &mut-temporary access, such as with your get_mut() example, a scoped APIs comes at a very lightweight ergonomic cost:

Optionally brand Token:

type Brand<'id> = &'id();
type Unbranded = Brand<'static>;

// By default, `Token` will be usable as a standalone and `: 'static` type
pub //            👇👇👇👇👇
struct Token<Brand = Unbranded> {
    ...
    _brand: PhantomInvariant<Brand>,
}

/* All the `-> Self` constructors must only return unbranded tokens! */

// but make sure to keep, in your crate, the `impl ... Token` generic over `<'brand>`.
impl<'id> Token<Brand<'id>> {

Make the mut accessor of Wrapper yield a 'brand-ed Token:

impl Wrapper {
    /* Remember: no non-unsafe `get_mut()` ! */

    pub
    fn with_token<R>(&mut self, yield_: impl FnOnce(&mut Token<Brand<'_>>) -> R)
      -> R
    {
        yield_(&mut self.0)
    }
}

Usage:

let mut wrapper = Wrapper::new...();
wrapper.with_token(|t| {
    /* you can use `t: &mut Token<_>` here */
});

Playground

withoutboats · March 29, 2023, 1:00pm

(NOT A CONTRIBUTION)

Ralf's point is spot on, but I just wanted to add that this specific decision (that you can move out of a mutable reference) has already been a case where we could not make other desirable APIs sound. It is exactly why we had to add Pin to encode the contract of a mutable reference that you cannot move out of. This was the problem with self-references: mutable references had been made "too powerful." And so suggestions to use Pin for this are probably on point.

Rust has committed to the soundness of certain APIs by making them available on stable, and any new APIs anyone wants to add must be compatible with them. Safe stable std APIs simply cannot be unsound unless they conflict with another safe stable std API or a language feature - they define what is sound and what is not.

T4r4sB · April 4, 2023, 5:33pm

I know the simpler example without std::mem::take:

pub fn get_dangling() -> &'static mut i32 {
 unsafe {
  std::ptr::NonNull::dangling().as_mut()
 }
}
...
let a = get_dangling();
let b = *a;

UB in safe code!

I want to say: if you have unsafe section in your code, you should be very carefull. If you broke some rules inside unsafe block, the program will have ub outside this block.

dlight · April 4, 2023, 5:38pm

But get_dangling is unsound, right?

T4r4sB · April 4, 2023, 6:30pm

Yes, of course. And topic-starter post has unsound code in unsafe sections.

Mokuz · April 5, 2023, 5:35am

I want to say: if you have unsafe section in your code, you should be very carefull. If you broke some rules inside unsafe block, the program will have ub outside this block.

I know but that wasn't the point, because the following code is completely sound:

pub fn get_dangling() -> &'static mut () {
    unsafe {
        std::ptr::NonNull::dangling().as_mut()
    }
}

unlike get_dangling which is always unsound.

And my example is sound assuming you cannot get a Token from &mut Token, and my point was that you can only do so by using unsafe code.

We can even rewrite Token::new to not use unsafe code

pub mod private2 {
	pub struct Token { _lock: () };

	impl Token {
		pub fn new() -> &'static mut Self {
			Box::leak(Box::new(Token { _lock: () }))			
		}
		pub fn cleanup(self) {
			unsafe {
				// SAFETY: There is no way to construct an instance of 
				// `Self` so this code is never run
				std::hint::unreachable_unchecked()
			}
		}
	}
}

T4r4sB · April 5, 2023, 7:57am

Anyway, Token::cleanup has wrong assumption. And it is your assumption, that "pending" case will never run, but it is wrong. You made cleanup method public, and you gave public mutable access to the field, so your code violate language rules. And any &mut T can produce T, you should remind about it.

Mokuz · April 5, 2023, 10:01am

yes

farnz · April 5, 2023, 10:06am

This is misleading as written; both core::mem::replace and core::mem::take can be used from safe code, and have been designed so that their safety preconditions are always met by safe code. Thus, if I assume that core, alloc or std is available, then this claim is false.

If you're saying that the implementation of these functions uses unsafe code, then your reasoning becomes dangerous to follow; the implementation of process start-up uses unsafe code, so all safe code ultimately uses unsafe code. And plenty of parts of the standard library also use unsafe code - Vec does, for example, so asserting that any use of unsafe code by an implementation makes it "unsafe code" for the purposes of reasoning about safety means that I can't (for example) use much of the standard library. Box is even worse - not only is it not implemented fully in terms of safe code, it's actually implemented by the compiler directly.

A more accurate phrasing of your safety precondition is "if you avoid using core, alloc or std crates, then there is no way to get a Token from &mut Token without using unsafe code". And while you could in theory have a Rust dialect that doesn't use core at all, right now, core is required because it implements language items that the compiler depends upon, and is thus a core part of the Rust language.

Mokuz · April 5, 2023, 10:39am

I'll admit that at this point I'm just being pedantic.

My point is that what is sound to do in rust is not same as what is done in std, core or alloc. (while assuming so is almost always the right thing to do)

For example dereferencing a reference is not safe because its done inside std, its safe because its explicitly stated as being always safe to do so.

But as far as I know there is no language rule saying you can go from &mut T to T (ofc there is no rule saying you can assume it not to happen so my example is is unsound in that regard)

Topic		Replies	Views
Replace_with RFC Unsafe Code Guidelines	4	1508	March 25, 2019
You_can::turn_off_the_borrow_checker should not be allowed without declaring unsafe Unsafe Code Guidelines	13	2811	January 24, 2022
Less Unsafe	9	2263	March 25, 2019
Is it possible to be memory-safe with deallocated `*self`? Unsafe Code Guidelines	20	2966	March 25, 2019
Unsoundness in `Pin` language design	113	41284	April 9, 2020

Are `mem::{replace, swap, take}` unsound?

1. Expose some TokenMut<'_> type instead of &'_ mut Token,

Brand the Token type so that two owned instances never have exactly the same type:

Playground

Related Topics

1. Expose some `TokenMut<'_>` type instead of `&'_ mut Token`,

Brand the `Token` type so that two owned instances never have exactly the same type: