Auto trait to mark types without interior mutability

So I've been thinking about making cheap copies of types that are not Copy.

It appears to me that if I create a bitwise copy of a value, ensure that no copy (including the original) is mutated, and drop only one of the copies, it should be perfectly valid and safe, no matter what the type is or what constraints it has, because semantically, accessing the copies would be no different from moving the original value around. Correct me if I'm wrong.

And this is possible with some unsafe code.
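
A minimal sketch of what such unsafe code might look like, assuming a plain-data type with no pointers and no interior mutability. `Settings` and `BitCopy` are hypothetical names made up for illustration, not an existing API; only `ptr::read` and `ManuallyDrop` come from std.

```rust
use std::mem::ManuallyDrop;
use std::ops::Deref;
use std::ptr;

/// Hypothetical non-Copy type: plain data, no pointers, no interior mutability.
pub struct Settings {
    pub id: u32,
    pub weights: [u64; 4],
}

/// Hypothetical read-only bitwise duplicate of a `T`.
/// `ManuallyDrop` keeps the duplicate from running `T`'s destructor,
/// so only the original is ever dropped.
pub struct BitCopy<T>(ManuallyDrop<T>);

impl<T> BitCopy<T> {
    /// # Safety
    /// The caller must ensure that the original is neither mutated nor
    /// dropped while this copy is in use, and that `T` tolerates existing
    /// at two addresses at once.
    pub unsafe fn new(original: &T) -> Self {
        // `ptr::read` copies the bytes without moving out of `original`.
        BitCopy(ManuallyDrop::new(unsafe { ptr::read(original) }))
    }
}

impl<T> Deref for BitCopy<T> {
    type Target = T;

    // Only shared access is exposed; there is deliberately no `DerefMut`.
    fn deref(&self) -> &T {
        &self.0
    }
}
```

The constructor stays `unsafe` here because nothing in this sketch ties the copy to the lifetime of the original; a safe API would need the extra bounds discussed in the rest of the thread.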

If I want to build a library around this trick with a nice, safe API, a wrapper of some kind would have to take the value by mutable reference or by ownership and then hand out only shared references to the identical copies. In the presence of interior mutability, however, this becomes unsound: if some bits of one copy are changed, the change is not reflected in the others, which makes it possible to break internal invariants. I could therefore add a trait bound that restricts the safe API to types without interior mutability, i.e. with no UnsafeCell among their fields. Such a marker trait could be implemented like the other unsafe markers Send and Sync: auto-implemented for every type whose fields all implement it, and explicitly not implemented for UnsafeCell.
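
To make the interior-mutability problem concrete, here is a small sketch. It assumes `Cell<u8>` as the offending type, and it uses a hand-written unsafe marker trait as a stand-in for the proposed auto trait (real auto traits are unstable). The names `NoInteriorMutability`, `accepts` and `diverging_copies` are hypothetical.

```rust
use std::cell::Cell;
use std::ptr;

/// Hypothetical hand-written stand-in for the proposed auto trait.
///
/// # Safety
/// Implementors promise that the type contains no `UnsafeCell` by value.
pub unsafe trait NoInteriorMutability {}

unsafe impl NoInteriorMutability for u8 {}
unsafe impl NoInteriorMutability for u32 {}
// Deliberately no impl for `Cell<T>`.

/// A safe API would carry this bound; `accepts(&Cell::new(1u8))` does not compile.
pub fn accepts<T: NoInteriorMutability>(_value: &T) {}

/// Shows why the bound is needed: after a bitwise copy, a write through
/// one `Cell` is not visible through the other.
pub fn diverging_copies() -> (u8, u8) {
    let original = Cell::new(1u8);
    // Fine for `Cell<u8>` specifically: no destructor and no pointers inside.
    let copy: Cell<u8> = unsafe { ptr::read(&original) };
    original.set(2);
    (original.get(), copy.get())
}
```

Here the two copies end up holding different values, which is exactly the divergence that could break an invariant in a less trivial type.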

The only other problem I've imagined is that some unsafe code may treat two pointers with different addresses, each pointing to a value that represents some resource, as pointing to different resources.

Neat rationale: "if I access the copies in some sequence, it's indistinguishable from if I moved the original value around in the same order as the accesses, right? We can view the other copies of the value as being merely bitwise identical but not 'real' in some sense."

But I'm pretty sure it overlooks at least one case. Suppose you have Box<Foo> where Foo is Send but not Sync. (That normally happens with interior mutability, but hypothetically Foo could be something else, like a handle to a resource that uses FFI for something that's only allowed from one thread at a time). Now you can copy the box and move the copies into 2 different threads at once, violating the !Sync guarantees.
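
A sketch of the legitimate half of this scenario, assuming `Cell<u32>` as a stand-in for `Foo` (it is Send but not Sync). The function names are hypothetical. Moving the single owner to another thread is allowed; the problem described above only appears once a second, bitwise-identical `Box` pointing at the same heap `Cell` exists.

```rust
use std::cell::Cell;
use std::thread;

/// Compiles only if `T: Send`.
fn assert_send<T: Send>() {}

/// Moves the one and only owner into another thread and mutates it there.
/// This is fine because `Box<Cell<u32>>: Send`.
pub fn bump_on_other_thread(boxed: Box<Cell<u32>>) -> u32 {
    assert_send::<Box<Cell<u32>>>();
    thread::spawn(move || {
        boxed.set(boxed.get() + 1);
        boxed.get()
    })
    .join()
    .unwrap()
}

// If a bitwise copy of `boxed` were handed to a second thread at the same
// time, both threads would call `set` on the same heap `Cell` without any
// synchronization, which is the data race that `!Sync` is meant to rule out.
```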

First, this is broken in the presence of raw pointers too.

Second, I'm not really convinced by your argument (I can't explain why, but I don't feel good about that).

Third, it is not even decided yet whether you're allowed to access just the copied-from object (that is, whether it is not invalidated), not to mention both it and the new object (see "What about: use-after-move and (maybe) use-after-drop", Issue #188 in rust-lang/unsafe-code-guidelines), and you should avoid doing that in the meantime.

Fourth, I think the solution is to (eventually) stabilize auto traits, not create an auto trait in std for each use.

Fifth, this trait already exists (compiler-internal): Freeze.

How exactly? In the copies, the pointers would point to the same place. It won't work with Unique<T> (and therefore not with Box<T>) though, because accessing two copies of a Unique<T> at the same time would cause problems. But for raw pointers it should be OK.

I guess you can break my argument in a multithreaded environment.

I'm talking about copies created with ptr::read or ptr::copy, not by the language's move operation. So no uninitialization could happen.

That would be too good.

It may be easier to convince the lang team to make Freeze public than to create the trait in the first place, even if my current use case is more restricted than I originally thought (i.e. it would require Freeze + Copy + Send to allow sending copies to other threads).

Because they can also be used to mutate. They are, in fact, exterior mutability. Relying on them not being used for mutation may (and will) break the invariants of types.
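
A small sketch of that point, using a hypothetical `Counter` type: it has no UnsafeCell anywhere, so a "no interior mutability" marker would accept it, yet a method taking `&self` still mutates state through the raw pointer.

```rust
/// Hypothetical type with no `UnsafeCell`, but with a raw pointer it owns.
pub struct Counter {
    slot: *mut u32,
}

impl Counter {
    pub fn new() -> Self {
        // Leak a heap allocation into a raw pointer; `Drop` reclaims it.
        Counter { slot: Box::into_raw(Box::new(0)) }
    }

    /// Mutates through `&self`: legal, because the write goes through the
    /// raw pointer rather than through the shared reference itself.
    pub fn bump(&self) -> u32 {
        unsafe {
            *self.slot += 1;
            *self.slot
        }
    }
}

impl Drop for Counter {
    fn drop(&mut self) {
        // Reconstruct the `Box` exactly once so the allocation is freed.
        unsafe { drop(Box::from_raw(self.slot)) }
    }
}
```

In this particular sketch a bitwise copy would still observe the updates, since both copies share the allocation; the concern is rather that the type's author never had to consider a second handle existing at all.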

This is also discussed in the above-mentioned issue.

First, Freeze doesn't restrict raw pointers.

Second, I don't really think the lang/libraries teams will be happy to stabilize Freeze (quite the contrary), but you can try.

This is a somewhat common pattern:

There are other cases where a type can reference memory addresses beyond its own bounds in unusual ways, that wouldn’t work if the object is copied to a different address:

  • A struct that’s expected to only ever exist as a specific field of another, larger struct, so unsafe code expects it can go from one to the other by subtracting an offset from the pointer (search for “container_of”).
  • XOR linked lists.
  • Relative pointers. (Ironically, these are usually used as a way to ensure a type can be moved just by copying bytes, assuming the relative pointer points to another part of the same allocation, but it could still be the case that that other part is outside of the object you’re copying. Alternately, a relative pointer might point to some other allocation entirely, which would defeat the usual purpose of using relative pointers, but there might be other reasons for their use.)
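
The first pattern in the list can be sketched as follows, assuming hypothetical `Outer` and `Inner` types and using `std::mem::offset_of!` (stable since Rust 1.77). The pointer passed in must have been derived from a pointer to the whole `Outer`, which is part of the stated safety contract.

```rust
use std::mem::offset_of;

/// Hypothetical field type that only ever lives inside an `Outer`.
pub struct Inner {
    pub value: u32,
}

/// Hypothetical containing struct; `repr(C)` keeps the field order fixed.
#[repr(C)]
pub struct Outer {
    pub tag: u32,
    pub inner: Inner,
}

/// Recovers the containing `Outer` from a pointer to its `inner` field.
///
/// # Safety
/// `inner` must point at the `inner` field of a live `Outer` and must have
/// been derived from a pointer to that whole `Outer`.
pub unsafe fn container_of(inner: *const Inner) -> *const Outer {
    // Step back by the field's byte offset to reach the start of `Outer`.
    unsafe { inner.cast::<u8>().sub(offset_of!(Outer, inner)).cast::<Outer>() }
}

// A bitwise copy of an `Inner` placed at some other address would violate
// the safety contract: the computed `Outer` pointer would refer to
// unrelated memory.
```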

In many cases, though not all, these patterns are also incompatible with strict provenance and/or subobject slicing. But there’s a reason there was a backlash against strict provenance: plenty of real code does not follow it.

Arguably, you might even want a weaker condition for your use case: only ruling out interior mutability that isn't behind at least one indirection. E.g. Cell<u8> is problematic, but Box<Cell<u8>> should be fine again.
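
A sketch of why the indirection helps. It uses a hypothetical non-Copy `Handle` that holds a `&Cell<u8>` instead of a `Box<Cell<u8>>`, because duplicating a `Box` is itself contested elsewhere in this thread; the point about the shared address is the same.

```rust
use std::cell::Cell;
use std::ptr;

/// Hypothetical non-Copy handle whose interior mutability sits behind a pointer.
pub struct Handle<'a> {
    pub cell: &'a Cell<u8>,
}

/// Writes through the original handle and reads through both handles.
pub fn shared_through_indirection() -> (u8, u8) {
    let storage = Cell::new(1u8);
    let original = Handle { cell: &storage };
    // Bitwise duplicate: both handles now hold the same address.
    // `Handle` has no destructor, so no double drop can occur.
    let copy: Handle<'_> = unsafe { ptr::read(&original) };
    original.cell.set(2);
    (original.cell.get(), copy.cell.get())
}
```

Both handles observe the write, because only the pointer was duplicated and not the `Cell` it points to.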

I've wanted a feature along these lines as well, but for different use cases.

The main one would be the ability to ensure a value is truly immutable.

Something like this:

use std::str::{self, Utf8Error};

pub struct Foo<T: AsRef<[u8]> + NoInteriorMutability> {
    inner: T
}

impl<T> Foo<T>
where
    T: AsRef<[u8]> + NoInteriorMutability
{
    /// Precheck that T represents a valid string
    pub fn new(storage: T) -> Result<Self, Utf8Error> {
        // Ensure storage contains valid UTF-8
        str::from_utf8(storage.as_ref())?;
        Ok(Self { inner: storage })
    }
}

impl<T> AsRef<str> for Foo<T>
where
    T: AsRef<[u8]> + NoInteriorMutability
{
    fn as_ref(&self) -> &str {
        // Safety: checked to be valid UTF-8 when constructed.
        // Using unsafe ensures this function is panic-free.
        unsafe { str::from_utf8_unchecked(self.inner.as_ref()) }
    }
}

With guaranteed immutability, a value can be checked in advance to confirm that a given conversion is safe, which eliminates a potential panic when performing the conversion later.

That specific example is easily exploited, though; you need a more stringent promise, like the one StableDeref provides:

static STR: &str = "foo";
static NOT_STR: &[u8] = &[0x00, 0xff];

struct Foo;
impl NoInteriorMutability for Foo {}
impl AsRef<[u8]> for Foo {
  fn as_ref(&self) -> &[u8] {
    // `rand()` is a placeholder for any nondeterministic choice.
    if rand() { STR.as_bytes() } else { NOT_STR }
  }
}
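
A compilable variant of the same exploit, as a sketch: the undefined `rand()` is replaced by a global `AtomicBool` toggle (an assumption made here so the behavior is deterministic), and the hypothetical `Flaky` type has no fields at all, so it trivially has no interior mutability of its own.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

const VALID: &[u8] = b"foo";
const INVALID: &[u8] = &[0x00, 0xff];

/// Global state that lives outside the type, so no marker on `Flaky` sees it.
static TOGGLE: AtomicBool = AtomicBool::new(false);

/// Hypothetical type with no fields and therefore no interior mutability.
pub struct Flaky;

impl AsRef<[u8]> for Flaky {
    fn as_ref(&self) -> &[u8] {
        // `fetch_xor(true, ..)` flips the flag and returns the previous value,
        // so calls alternate: valid bytes first, then invalid bytes.
        if TOGGLE.fetch_xor(true, Ordering::Relaxed) {
            INVALID
        } else {
            VALID
        }
    }
}
```

A UTF-8 check done at construction time would pass on the first call, while a later `as_ref` call returns bytes that are not valid UTF-8.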

If the value behind a pointer is changed, invariants are not broken: the pointer itself is unchanged, so the copy points to the same place and the change is visible through it. So raw pointers would be permitted, as well as references and NonNull<T>, but not Unique<T>. Hence Freeze is implemented in exactly the way I'd like, except for Unique.

But I understand that some unsafe code may rely on the fact that there are no immutable copies of the value, i.e. that there's only one safely accessible address where the value is stored. That is not, and should not be, covered by Freeze.

Exactly. That's why I said that all kinds of pointers should implement it. Except I didn't think of Unique<T> at the time.

The code you describe requires that the value is not moved either, and moving is a totally safe operation unless Pin is involved. That means I should not copy pinned values, which is OK: I was going to require a mutable reference to, or ownership of, the value to perform the copies, and that definitely means it isn't pinned. Unsafe code that takes &mut T out of Pin<&mut T> should know better than to put it into the copy machine :)

While Unique<T> is a lang item and has special Stacked Borrows (SB) treatment, it can still be viewed as just a pointer. No one can, or should, prevent unsafe code from relying on the same invariant, and your code would allow people to break it.

Copying values is such a fundamental operation that unsafe code relies heavily on invariants about it; that is why we have Copy. Copying things that are not Copy, that is, things that were never meant to be copied and in many cases were meant not to be copied, is going to break a lot of things. If you want a trait for "can be safely duplicated", well, it is named Copy.

Now, there can be (though I doubt it, since Copy is that essential) a type that is mistakenly not Copy and that you cannot fix for some reason. In that case, make a newtype and unsafely copy it; that's fine (assuming we suppose that's fine; see the above UCG issue). But doing that generically is going to hurt you.