[Idea] 'Immutable' marker trait

I've run into a use case where I need to guarantee that an object is not mutated during its lifetime after it's created. This is beyond merely putting it behind a & or in an Arc as there is always the possibility of interior mutability gumming up the works. Basically, I need something like const, but which has a non-'static lifetime, which is also recursive through everything that the object owns for the duration of its lifetime. I thought about things like OnceCell, but as its own documentation shows, it still permits interior mutability. I want to prevent that from happening at all.
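
To illustrate the OnceCell point, here is a minimal sketch using the standard library's `std::cell::OnceCell`. The function name `fill_through_shared_ref` is made up for this example. It only shows that the cell's contents can change while nothing but a shared reference is held.

```rust
use std::cell::OnceCell;

/// Returns the cell's contents before and after writing through `&OnceCell`.
fn fill_through_shared_ref() -> (Option<i32>, Option<i32>) {
    let cell: OnceCell<i32> = OnceCell::new();
    let shared = &cell; // only a shared reference from here on

    let before = shared.get().copied();
    // `set` takes `&self`, so the observable state changes without `&mut`.
    shared.set(5).expect("cell was empty");
    let after = shared.get().copied();

    (before, after)
}
```

The write happens only once, but a reader holding `&OnceCell<i32>` can still see two different states over the value's lifetime.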

This made me think about something like Pin and Unpin that would declare that the object, once created, was permanently immutable for its lifetime. Let's call this marker trait std::marker::Mutable. If an object is marked Mutable, or any object it contains (including by reference) is Mutable, then it is also marked as Mutable. If the compiler cannot determine whether an object is mutable or not, then by default it is Mutable. Whatever is left over is immutable. Note that all of this is done at compile time, not at runtime, which is why code that the compiler can't prove to be immutable gets the Mutable marker.

So, why bother? My hope is that the compiler team could take this information and use it to produce faster/better code. If you know for certain that something and all of its contents are const for a given lifetime, can you (meaning the compiler team) produce faster/better code?

Unsafe problems

The biggest issue is how unsafe code is dealt with. In theory, someone can transmute a usize into a pointer to anywhere in memory, including to things that are supposedly immutable, and thereafter mutate it. My vote is to ignore this issue as unsafe is unsafe for a reason; you're expected to know what you're doing, including knowing that if you mutate stuff that is supposed to be immutable, then you're in UB territory.

So, thoughts? I'm mainly interested in thoughts from the compiler team as the idea for this is driven by a need for speed. If it isn't likely to be useful in speeding up code, then there really isn't much point in pursuing it further.

I don't see how/why this would be useful. The compiler already knows the contents of all types, and has the information as to whether any type or field recursively contains interior mutability. (It sees through even private fields, mind you.) If something doesn't contain an interior mutability primitive, then it can optimize based on the assumption that it doesn't change behind immutable references.

There already is (internally) the Freeze trait

Can you elaborate on why you have this need?

That would help clarify whether it needs physical immutability or conceptual immutability -- would it be ok if what's behind it was memoized, for example?

What about trait objects?

I'm unfamiliar with that, where is it documented?

I'm not sure what you really mean by physical vs. conceptual immutability; I'm guessing that the former means marking the page in memory as being read-only, while the latter means that the Rust abstract machine[1] knows that the variable is immutable. If so, then I'm talking about the latter idea.

I want to create an immutable tree-like data structure, kind of like how im works (I have some changes I want to make which is why I'm not using im directly). To make it generally useful, I'm making it generic across both the key and value types, which only need to implement the minimum traits required for their roles. The issue is that I can't currently prevent someone from passing in keys or values that implement those traits AND have interior mutability. The closest I can come to this is to put in lots and lots of doc comments about how that's a really, really bad idea.

Now, that's probably enough for my own use case, but that was just the thing that got me thinking about the idea of a marker trait. The rest was when I realized that there might be performance gains to be had if the compiler was able to use the information, which is why I created this topic, to gauge if there could be any utility to the idea. Is there any? From what @H2CO3 said earlier, I'm guessing that the utility is limited, in which case we can drop the idea, but if it would make code faster...

[1] I'm aware that there is no such thing at the current time, but this is the closest analogy that I can think of to what I think you mean.

The difference between whether the representation of the value cannot change, or whether the representation can change but in a way that represents the same value.

Note that that's the approach that HashMap takes. It's a really, really bad idea to have keys that change while they're in the HashMap, but nothing stops you from making a HashSet<Cell<i32>>.
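
A small sketch of what that looks like in practice. As far as I know `Cell<i32>` does not implement `Hash` itself, so this example assumes a hypothetical wrapper `Key` that hashes the current contents. The names `Key` and `lookup_after_mutation` are invented for illustration.

```rust
use std::cell::Cell;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical key type with interior mutability.
#[derive(PartialEq, Eq)]
struct Key(Cell<i32>);

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash whatever value the cell holds right now.
        self.0.get().hash(state);
    }
}

/// Returns whether a lookup for `1` succeeds before and after the stored
/// key is changed to `2` through a shared reference.
fn lookup_after_mutation() -> (bool, bool) {
    let mut set = HashSet::new();
    set.insert(Key(Cell::new(1)));
    let before = set.contains(&Key(Cell::new(1)));

    // `iter` only hands out `&Key`, yet the key can still be changed.
    for key in set.iter() {
        key.0.set(2);
    }

    // The stored key no longer compares equal to `1`.
    let after = set.contains(&Key(Cell::new(1)));
    (before, after)
}
```

The set stays memory safe, but its contents no longer match what was inserted. The entry is now filed under the hash of its old value, so a lookup for `2` is not guaranteed to find it either.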

Well, I don't know, what about them? I don't think the compiler/optimizer needs to treat them specially wrt. mutability.

That... is an interesting and difficult question that I hadn't thought about before. I can only think of one use case where mutating the instance, even with an identical bit pattern, can cause issues. Consider a memory-mapped physical device of some kind in an embedded system. Reading values out of the address range is fine at any time, but writing to the address range, even writing an identical bit pattern to the same range, can trigger the device to activate in some manner. So, you transmute() the address range into an immutable type, simply to ensure that there is no way to write to that address range while the object is alive. Now you have a quasi-mutex guard that is compiler checked.
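
A rough sketch of that guard idea, with ordinary memory standing in for the device's address range. `ReadOnlyRegs` and its methods are hypothetical names. It borrows a slice instead of transmuting an address, which is an assumption made here to keep the sketch safe to run on a normal machine.

```rust
use std::marker::PhantomData;
use std::ptr;

/// Hypothetical read-only view over a block of registers.
/// While it is alive, the borrow it holds keeps safe code from writing.
pub struct ReadOnlyRegs<'a> {
    base: *const u32,
    len: usize,
    _borrow: PhantomData<&'a [u32]>,
}

impl<'a> ReadOnlyRegs<'a> {
    /// Assumption: a borrowed slice stands in for the mapped address range.
    pub fn new(block: &'a [u32]) -> Self {
        ReadOnlyRegs {
            base: block.as_ptr(),
            len: block.len(),
            _borrow: PhantomData,
        }
    }

    /// Volatile read of one register; `None` if `index` is out of range.
    pub fn read(&self, index: usize) -> Option<u32> {
        if index < self.len {
            // SAFETY: `index` is in bounds of the slice borrowed for `'a`,
            // and that borrow outlives `self`.
            Some(unsafe { ptr::read_volatile(self.base.add(index)) })
        } else {
            None
        }
    }
}
```

The type offers no write method, so the only way to write while the view exists is through unsafe code that bypasses it.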

I'll be the first to admit that example is a real stretch, and that there are other options available; it was just my attempt at thinking out loud about the original question, exploring what immutable really needs to be... It's a good question, I really like it!

Yeah, for my current use-case, this is how I'm going to handle the issue. If the end user wants to shoot themself in the foot, that's their issue, at least I've warned them of the potential issue.

That's what I wanted to find out from those that are more knowledgeable than me. I don't know enough to know if something like an immutable marker trait would allow further optimizations. If it really can't improve code generation, then there really isn't any point to it.

Interior mutability isn't the only way to have inconsistent Hash/PartialEq/Eq/etc results - types can also access globals. So I don't think it would help for your use case anyway.

Can you please give me an example of how this would work?

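// Assumes the rand 0.8 API; `gen` comes from the `Rng` trait.
use rand::Rng;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{self, AtomicU8};

struct Bad;
impl PartialEq for Bad {
    // Ignores both operands and returns a random answer.
    fn eq(&self, _other: &Self) -> bool {
        rand::thread_rng().gen()
    }
}
impl Hash for Bad {
    // Feeds a different byte to the hasher on every call.
    fn hash<H: Hasher>(&self, state: &mut H) {
        static COUNTER: AtomicU8 = AtomicU8::new(0);
        state.write_u8(COUNTER.fetch_add(1, atomic::Ordering::Relaxed));
    }
}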

OK, I see what you're saying. And you're right, simply marking something as being immutable (or mutable) isn't perfect. However, if I choose to implement Send or Sync on a struct manually while violating their contracts, I'll have the same issues. So I'd put this down as user error more than anything else.

Note that Send/Sync are unsafe, so implementing them incorrectly is a violation of soundness. Pure (no observable state changes without &mut) could also be an unsafe trait, but it couldn't be an auto trait, because any type could write a method that accesses the global environment. This would effectively mean that either every type has to unsafely opt in with impl Pure, or a much more sophisticated auto trait pass is needed. It would also make it easy to break semver accidentally, by implicitly losing the auto trait.

And then there's also the question of what is the scope of Pure? It can't cover trait impls, because those can be added by downstream crates, irrespective of whether upstream implemented Pure, and access global state, breaking the promises of Pure!

So Pure would only apply to "upstream trait impls" which... is not a concept relevant to safety currently, and while I think it could maybe work (as coherence rules might prevent bad impls from being used[1]), I think it's way too subtle.

I think what you need is to either just eat the cost and be resilient enough to misbehaving impls to maintain safety, or make your own unsafe trait a la TrustedExactIterator which is just a marker that specifies that specifically how you want to use the types is well behaved, and eat the cost/benefit tradeoff[2] of that.
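
A minimal sketch of the "own unsafe trait" option. `StableOrd` and `SortedSet` are hypothetical names invented for this example, not anything from std, and the container is deliberately simplistic.

```rust
/// Hypothetical marker. Implementors promise that comparison results for a
/// value never change while it is only reachable through shared references.
///
/// # Safety
/// Code using this bound may rely on that promise, so a wrong impl is a
/// soundness bug in the implementor.
pub unsafe trait StableOrd: Ord {}

// Explicit opt-in for types whose ordering depends only on plain data.
unsafe impl StableOrd for i32 {}
unsafe impl StableOrd for String {}

/// A sorted, deduplicated set that only accepts vouched-for keys.
pub struct SortedSet<K: StableOrd> {
    keys: Vec<K>,
}

impl<K: StableOrd> SortedSet<K> {
    pub fn new() -> Self {
        SortedSet { keys: Vec::new() }
    }

    /// Inserts `key`, returning `false` if it was already present.
    pub fn insert(&mut self, key: K) -> bool {
        match self.keys.binary_search(&key) {
            Ok(_) => false,
            Err(pos) => {
                self.keys.insert(pos, key);
                true
            }
        }
    }

    pub fn contains(&self, key: &K) -> bool {
        self.keys.binary_search(key).is_ok()
    }
}
```

With this bound, `SortedSet<Cell<i32>>` is rejected at compile time unless someone writes an `unsafe impl` for it. That is the cost side of the tradeoff: every key type needs an explicit opt-in.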

[1] It's not possible for a child crate using a lib with a Trait + Pure bound to add impl Trait to any type which might have an exploitable Pure impl. The only place to mess up for a given Trait I believe would be in the crate that defines the Trait, and I think that would maybe be on them? Though they could still implement it themselves on an upstream type with Pure in a nonpure way, so... maybe not.

[2] A "witness type" can be used to remove the need for a wrapper type; have a unit/phantom type parameter whose only purpose is to implement IKnowThatThisTypeIsWellBehaved<T>. You can then also have a IKTTTIWB<Self> convenience constructor, and a constructor that takes a proof argument.
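
One way the witness-type idea from [2] might look. `WellBehaved` is a shortened, hypothetical stand-in for IKnowThatThisTypeIsWellBehaved, and `Tree`, `new`, `with_proof` and `StringIsFine` are likewise invented for this sketch.

```rust
use std::marker::PhantomData;

/// Hypothetical: a type `W` implementing this vouches that `K` is well behaved.
///
/// # Safety
/// The implementor takes responsibility for `K`'s `Ord` being consistent.
pub unsafe trait WellBehaved<K> {}

// A type may vouch for itself...
unsafe impl WellBehaved<i32> for i32 {}

// ...or a unit "proof" type may vouch for a type it does not own.
pub struct StringIsFine;
unsafe impl WellBehaved<String> for StringIsFine {}

pub struct Tree<K, W> {
    keys: Vec<K>,
    _witness: PhantomData<W>,
}

// Convenience constructor for keys that are their own witness.
impl<K: Ord + WellBehaved<K>> Tree<K, K> {
    pub fn new() -> Self {
        Tree { keys: Vec::new(), _witness: PhantomData }
    }
}

impl<K: Ord, W: WellBehaved<K>> Tree<K, W> {
    /// Constructor taking a proof value; the value is only used for its type.
    pub fn with_proof(_proof: W) -> Self {
        Tree { keys: Vec::new(), _witness: PhantomData }
    }

    pub fn insert(&mut self, key: K) {
        if let Err(pos) = self.keys.binary_search(&key) {
            self.keys.insert(pos, key);
        }
    }

    pub fn len(&self) -> usize {
        self.keys.len()
    }
}
```

The key type itself is stored unwrapped; only the extra type parameter carries the promise.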

Note that Hash and Eq traits don't give you consistency as a guarantee. Your code is not allowed to do anything unsafe even in presence of actively evil Hash or Eq implementations.

There's the Rudra project, which focuses on detecting unsafety bugs caused by unsafe code trusting safe traits too much.

(Nit: Freeze is about immutability within the very first (no-indirection) layer of data. For instance, &'static Cell<u8> : Freeze does hold. Indeed, it's a marker trait that exists to answer the direct question of: can I store a value of type T in read-only memory by virtue of only lending shared access to it? Which is a very legitimate question for statics, since indeed they only lend shared access, and since the compiler can tell the linker to emit a static in, for instance, .rodata.)
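
To make the "first layer only" point concrete without touching the internal Freeze trait, here is a small stable-Rust sketch. `Holder` and `observe_through_shared` are hypothetical names. The struct has no UnsafeCell directly inside it, only behind a reference.

```rust
use std::cell::Cell;

/// No interior mutability in the first layer, only behind an indirection.
struct Holder<'a> {
    inner: &'a Cell<u8>,
}

/// Reads the same field twice through `&Holder`, with a write in between.
fn observe_through_shared() -> (u8, u8) {
    let cell = Cell::new(1);
    let holder = Holder { inner: &cell };
    let shared = &holder;

    let before = shared.inner.get();
    cell.set(2); // the bytes of `holder` itself are untouched
    let after = shared.inner.get();

    (before, after)
}
```

The bytes of `Holder` never change, which is all a first-layer property can speak to. What it points to does change.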


The OP here seems to want to cover all indirections so that &'static Cell<u8> : !ImmutableIfShared. And the answer to that is that some forms of indirections are not necessarily structurally present: mainly, globals.
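
A sketch of such a non-structural indirection. `Handle` and `GLOBAL` are made-up names. The type has no fields at all, so no structural analysis of it would find anything mutable.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Global state that is not part of any value's structure.
static GLOBAL: AtomicU32 = AtomicU32::new(0);

/// A zero-sized type: nothing inside it to inspect.
#[derive(Clone, Copy)]
struct Handle;

impl Handle {
    /// Observable "value" of the handle, read from the global.
    fn value(&self) -> u32 {
        GLOBAL.load(Ordering::Relaxed)
    }

    /// Changes what `value` returns, using only `&self`.
    fn bump(&self) {
        GLOBAL.fetch_add(1, Ordering::Relaxed);
    }
}
```

Two calls to `value` on the same `&Handle` can disagree if `bump` runs in between, even though `Handle` contains no UnsafeCell and no pointers.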

That being said, while unsafe code would not be able to rely on "purity" of such a type and of its by-shared-ref operations, it could still be a nice thing to "nudge" users into fixing their APIs, similarly to how Borrow nudges users to pay more attention to their implementation when compared to AsRef.

For that "lint", we can thus use a structural trait / auto-trait approach, although only on nightly since such traits are feature-gated and unlikely to get stabilized:

#![feature(auto_traits, negative_impls)] // nightly only
use std::cell::UnsafeCell;

pub auto trait ImmutableIfShared {}
impl<T : ?Sized> !ImmutableIfShared for UnsafeCell<T> {}
// EDIT: Added these as well, since raw pointers also
// act as a "`&`-barrier", like `UnsafeCell` does (as @H2CO3 suggested)
impl<T : ?Sized> !ImmutableIfShared for *mut T {}
impl<T : ?Sized> !ImmutableIfShared for *const T {}

and that's it![1] You can now use the safe / not-trustworthy trait ImmutableIfShared for whatever you want (note that then types such as Rc<u8> won't be ImmutableIfShared even though the data they ref-count will be immutable).

[1] (you may add a positive impl for PhantomData<T> and [T; 0])

Nit: The auto trait should also be unsafe to prevent invalid implementations

You probably want to add a negative impl for raw pointers as well, since they too can be used for interior mutability (primarily in FFI, for informing Rust that C can and does change arbitrary things behind pointers).

I think that would work... I'm still getting hung up on the wording, the IfShared suffix keeps making my brain jump to the contrapositive ("if not shared" -> "mutable"). I want to make sure that even if the owner never shares the object (no references, no pointers, nothing), then once the object is created it can't be mutated during its lifetime. That said, based on the prior discussion, I'm not sure if the compiler could figure this out or not.

Beyond that comes the question of utility; I don't want to suggest something become a permanent part of Rust (even on nightly) unless I'm fairly certain that a largish chunk of the user base could really use it. I don't think this could lock memory address ranges as written, so I don't know how useful this would be to embedded systems and OS writers (being able to set some bytes in a given address range, and then 'lock' them for some time could be a very useful lint for some purposes). I also don't know if this would permit the compiler to emit better, faster code. So, given that (as @dhm showed) this can be put into nightly, would it be worth the effort? Does anyone think it would be useful to add to Rust?