[pre-RFC] Remove static mut

Summary

Remove static mut and replace uses with static in combination with interior mutability, such as atomics, or UnsafeCell directly.

Motivation

Ever since static gained the ability to contain interior mutability, which places it in writable static memory rather than always read-only memory, static mut has been redundant with it.

There is also an inconsistency around &mut borrows: they are never allowed in a global, except for &mut [...] in static mut, which exists to support holding an array in a static mut without specifying the length in the type.

Last, but not least, static mut is an unsafety trap. It is too easy to end up with aliasing &mut borrows from it or to have unsynchronized reads/writes in a multi-threaded application, both of which are undefined behavior. Keeping a feature that is more unsafe than one would initially assume, and which we see people abuse all the time, could turn out to be a mistake in the long run, and no doubt some may even call it an irresponsible move.

Detailed design

Remove static mut and require all users to move to static combined with interior mutability:

static mut FOO: Foo = Foo { x: 0, y: 1 };
/* very */ unsafe { FOO.x += FOO.y; }

// becomes (first approximation)

// somewhere in libstd:
struct RacyCell<T>(pub UnsafeCell<T>);
unsafe impl<T> Sync for RacyCell<T> {}
impl<T> RacyCell<T> { pub fn get(&self) -> *mut T { self.0.get() } }
macro_rules! racy_cell {
    ($x:expr) => (RacyCell(UnsafeCell { value: $x }))
}

static FOO: RacyCell<Foo> = racy_cell!(Foo { x: 0, y: 1 });
unsafe {
    (*FOO.get()).x += (*FOO.get()).y;
}
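For readers who want to compile the replacement, here is a minimal sketch of the same idea. It assumes `const fn` constructors such as `UnsafeCell::new`, which did not exist when this was written, so the `racy_cell!` macro is replaced by a hypothetical `RacyCell::new`. The helper `add_y_to_x` is also made up for illustration, and `RacyCell` is not an actual libstd type.

```rust
use std::cell::UnsafeCell;

struct Foo {
    x: u32,
    y: u32,
}

// Hypothetical wrapper mirroring the RacyCell sketched above.
#[repr(transparent)]
struct RacyCell<T>(UnsafeCell<T>);

// SAFETY: this is a promise made by the programmer, not something the
// compiler checks. Every access through `get` must be synchronized by hand.
unsafe impl<T> Sync for RacyCell<T> {}

impl<T> RacyCell<T> {
    // Assumption: a const constructor stands in for the racy_cell! macro.
    const fn new(value: T) -> Self {
        RacyCell(UnsafeCell::new(value))
    }

    fn get(&self) -> *mut T {
        self.0.get()
    }
}

static FOO: RacyCell<Foo> = RacyCell::new(Foo { x: 0, y: 1 });

/// Adds `FOO.y` to `FOO.x` and returns the new value of `x`.
///
/// # Safety
/// The caller must guarantee that nothing else accesses `FOO` concurrently.
unsafe fn add_y_to_x() -> u32 {
    let foo = FOO.get();
    unsafe {
        (*foo).x += (*foo).y;
        (*foo).x
    }
}

fn main() {
    // SAFETY: single-threaded demo, no other access to FOO is live.
    let x = unsafe { add_y_to_x() };
    println!("FOO.x = {x}");
}
```

The unsafety has not gone away. It has moved to the `unsafe impl Sync` and to each dereference of the raw pointer, where it is at least visible.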

In some cases, the replacement is entirely safe:

static mut COUNTER: usize = 0;
unsafe { COUNTER += 1; }
// can be replaced with:
static COUNTER: AtomicUsize = ATOMIC_USIZE_INIT;
COUNTER.fetch_add(1, Ordering::SeqCst);
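As a runnable sketch of the safe variant: current Rust deprecates `ATOMIC_USIZE_INIT` in favor of the const constructor `AtomicUsize::new`, so that is what is used here. The `bump` helper and the thread count are illustrative choices only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Increments the global counter and returns the value it held before.
fn bump() -> usize {
    COUNTER.fetch_add(1, Ordering::SeqCst)
}

fn main() {
    // Four threads increment the same global without any unsafe code.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..1000 {
                    bump();
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    assert_eq!(COUNTER.load(Ordering::SeqCst), 4000);
}
```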

There are also designs of containers which are unlocked with an entirely safe and zero-cost proof that interrupts have been disabled, being used in kernels written in Rust. It works something like this:

struct InterruptGuarantor<'a> { marker: InvariantLifetime<'a> }
struct InterruptLocked<T> { value: T }
unsafe impl<T> Sync for InterruptLocked<T> {}
impl<'a, T> Index<InterruptGuarantor<'a>> for InterruptLocked<T> {
    fn index<'b>(&'b self, _: &InterruptGuarantor<'a>) -> &'b T {
        &self.value
    }
}

static COUNTER: InterruptLocked<Cell<usize>> = InterruptLocked {
    value: CELL_USIZE_ZERO // pretending this exists
};
// assembly could call this function as if it had no arguments:
fn timer_interrupt_handler(guarantor: InterruptGuarantor) {
    COUNTER[guarantor].set(COUNTER[guarantor].get() + 1);
}
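A version of this pattern that compiles on a hosted target might look roughly like the following. It is a sketch under several assumptions: `PhantomData` stands in for the old `InvariantLifetime` marker, `Cell::new` is usable in a static initializer, the guarantor is passed by reference so it can index more than once, and the constructor `assume_disabled` is a made-up name for whatever code actually disables interrupts.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ops::Index;

/// Zero-sized proof that interrupts are disabled for the lifetime 'a.
struct InterruptGuarantor<'a> {
    // fn(&'a ()) -> &'a () makes the type invariant in 'a.
    _marker: PhantomData<fn(&'a ()) -> &'a ()>,
}

impl<'a> InterruptGuarantor<'a> {
    /// Hypothetical constructor.
    ///
    /// # Safety
    /// The caller must ensure interrupts really are disabled for 'a.
    unsafe fn assume_disabled() -> Self {
        InterruptGuarantor { _marker: PhantomData }
    }
}

struct InterruptLocked<T> {
    value: T,
}

// SAFETY (assumption): only sound on a single core, where holding an
// InterruptGuarantor rules out any concurrent access.
unsafe impl<T> Sync for InterruptLocked<T> {}

impl<'a, 'g, T> Index<&'g InterruptGuarantor<'a>> for InterruptLocked<T> {
    type Output = T;

    fn index(&self, _proof: &'g InterruptGuarantor<'a>) -> &T {
        &self.value
    }
}

static COUNTER: InterruptLocked<Cell<usize>> = InterruptLocked {
    value: Cell::new(0),
};

/// Increments the tick counter and returns the new value.
fn timer_interrupt_handler(guarantor: &InterruptGuarantor<'_>) -> usize {
    COUNTER[guarantor].set(COUNTER[guarantor].get() + 1);
    COUNTER[guarantor].get()
}

fn main() {
    // SAFETY: demo only. This program is single-threaded and has no interrupts.
    let guarantor = unsafe { InterruptGuarantor::assume_disabled() };
    timer_interrupt_handler(&guarantor);
    timer_interrupt_handler(&guarantor);
    println!("ticks = {}", COUNTER[&guarantor].get());
}
```

The guarantor is zero-sized, so passing it around costs nothing at runtime. The only unsafe steps are creating it and the `Sync` impl.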

Drawbacks

Certain cases would be more verbose, though this would force users to consider moving away from globals where possible, and prevent new uses out of habit, which tend to become (more or less) subtly unsafe.

Initializing globals is still problematic, as we don’t have a UFCS story. As such, we can’t use Mutex or RefCell (the latter in a #[thread_local] static only) at all right now. However, this isn’t a downgrade from the status quo, just an impediment to making even more use cases safe.

Alternatives

Don’t do anything, keep static mut in 1.0, alongside const and static.

Feature-gate static mut to allow its removal after 1.0.

Unresolved questions

We need a better story for creating values of encapsulated types. This would allow static globals to contain atomics with arbitrary initial values, as well as types like Mutex and RwLock, or, in the case of #[thread_local] (or other abstractions like the interrupt-based one above), Cell and RefCell.

Could we have a simplistic CTFE system with const functions and leave the trait/impl interactions to be resolved at a later time? Associated constants could also provide similar functionality, albeit less clean and compact. I should point out that this can be implemented rather easily, as long as they are not true type-level constants, e.g. [u8; pow(2, 12)] is disallowed.
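To make the question concrete, here is roughly what such a const function looks like in today's Rust, where `const fn` exists. Note one difference from the restriction suggested above: the stabilized feature does allow the result in array lengths. The `pow` function and the static names are illustrative only.

```rust
/// Integer exponentiation that can be evaluated at compile time.
const fn pow(base: usize, exp: u32) -> usize {
    let mut result = 1;
    let mut i = 0;
    while i < exp {
        result *= base;
        i += 1;
    }
    result
}

// A static initialized by a function call, evaluated by the compiler.
static PAGE_SIZE: usize = pow(2, 12);

// The same call used as a type-level constant (an array length).
static BUFFER: [u8; pow(2, 12)] = [0; pow(2, 12)];

fn main() {
    assert_eq!(PAGE_SIZE, 4096);
    assert_eq!(BUFFER.len(), PAGE_SIZE);
}
```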


What are you doing about the array-pointer inconsistency? Does *mut [T] work? Otherwise sounds excellent. Hope to see this, and a better global initialization story, soon. (Finally the static locks can contain a value!)

Ah, yes, I mentioned that for a reason and then proceeded to forget about it when I realized I have no immediate solution. &racy_cell!([a, b, c]) as &RacyCell<[T]> would work if it weren’t for the fact that you can’t take a reference to a value with interior mutability. We’d have to either lift that restriction inside static (unlike const, it can only mean one thing), or allow static to be unsized (in this case, RacyCell<[T]>), which is okay since it’s always an lvalue.

What do you mean?

Hmm, so the static is still defined with a fixed size, and thus there is no extra indirection? I like this: abstraction for privacy rather than dynamism (like the opaque type synonyms I also want). Reminds me of how I wanted to hide dimensions in QuiltOS/src/arch/x86/vga.rs at master · QuiltOS/QuiltOS · GitHub

Also, do we need RacyCell, or could accessing statics that don’t impl Sync just be unsafe?

I mean that &UnsafeCell { value: () } for example is not valid inside const, static or static mut, only inside functions.

Well, the interface is already unsafe even with just UnsafeCell - actually, in my original version I had just UnsafeCell, but it was since made non-Sync and for a while there was a RacyCell (which seems to have been removed already) as a Sync wrapper around UnsafeCell, so I replicated that. There’s room for improvement here, I’m sure.

Oh, in the static/const itself, duh.

Oh yeah, good point. I was thinking taking a reference to the UnsafeCell itself would also be unsafe, but that is not necessary. Maybe the Sync requirement can just be dropped altogether? Implementing Sync doesn’t make the encapsulation of an unsafe cell any safer, it just declares one’s confidence in that abstraction. (There could probably be a lint about missing unsafe impl Syncs, in general, but that is another discussion.)

The requirement is there because of types like Cell which aren’t safe in a static, and missing Sync is the only difference between those and types which can be safely shared between threads.

Hmm, that’s awkward. Cell, RefCell, and Rc are all weird like that. I wonder if something like the ST monad could keep them in check.

Essentially the only legit use case for static mut that I’m aware of is modeling an extern C global. I suspect this could be done by just using an UnsafeCell on the Rust side – the memory layout is compatible after all – but it’s less direct.

eddyb points out that I’m sort of conflating “extern” globals and globals declared in Rust, which is true. I think I’m fine with removing static mut overall.

Links for those like me struggling to follow:

One thing to note: UnsafeCell says it should only be used with mut statics to avoid trying to mutate invariant memory. How instead then will a static item be placed in read-write memory?

From my point of view, “mutable statics” are weird enough that I would want to hide them behind some access function anyway, so the main consequence of this proposal would be to make ugly code slightly uglier.

Oh wow, those docs are very wrong. If a static has interior mutability (i.e. it contains UnsafeCell), it will be placed in read-write memory (automatically). The only restriction is that the type is Sync (hence the use of RacyCell).

Fix is in the queue, though:

https://github.com/rust-lang/rust/pull/21663
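A small sketch of the Sync rule as it plays out in current Rust, assuming const constructors for `Mutex` and `Vec`: a `Sync` type with interior mutability can sit in a plain static, while a non-`Sync` type such as `Cell` has to go through a per-thread static. The stable `thread_local!` macro is used here instead of the `#[thread_local]` attribute, and the names `LOG`, `HITS`, `record`, and `hit` are made up.

```rust
use std::cell::Cell;
use std::sync::Mutex;

// Mutex<Vec<u32>> is Sync, so it is accepted in a plain static and the
// compiler places it in writable memory automatically.
static LOG: Mutex<Vec<u32>> = Mutex::new(Vec::new());

// Cell<u32> is not Sync. Writing `static HITS: Cell<u32> = Cell::new(0);`
// is rejected, so each thread gets its own copy instead.
thread_local! {
    static HITS: Cell<u32> = Cell::new(0);
}

/// Appends a value to the shared log and returns the new length.
fn record(value: u32) -> usize {
    let mut log = LOG.lock().unwrap();
    log.push(value);
    log.len()
}

/// Bumps this thread's hit counter and returns the new count.
fn hit() -> u32 {
    HITS.with(|hits| {
        hits.set(hits.get() + 1);
        hits.get()
    })
}

fn main() {
    record(7);
    hit();
    println!("log entries: {}", LOG.lock().unwrap().len());
}
```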

Just to clarify (as someone who just found this discussion) would this horrific mess still work with the new system?

(lidt2() is

lidt idtDesc
ret

which is why idtDesc is public)

You can use inline assembly to handle that - and then you don’t need any globals at all.

If you want to keep using globals (or you have other places where you need them), an “interrupts disabled” lock around a static should also work, and it can be a newtype around the type (which would have to contain Cell or RefCell I guess - but the former is 0 overhead).

However, this pre-RFC needs to be updated to make use of const fn and change from “remove” to “deprecate”, sadly, as it missed the mark (unless the core team thinks otherwise). And in my absence, someone else would have to take over for it to stand a chance (cc @nikomatsakis).

I can see how the cell types will be sufficient for almost all of my use cases (if I roll my own Mutex), especially if the proposed -Z abort-on-panic is implemented.

However, I’m still uncertain about how I would replace #[no_mangle] public static mut idtDesc with a cell type. The problem with the interrupt table descriptor is that the instruction for loading it has one form, which takes a fixed memory address. This means that the linker has to know where idtDesc will be at link time in order to generate the necessary code.

Can’t you just use the address of the cell, since the memory layout is compatible?

You’re right, that actually does work!
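The layout claim can be checked directly. The sketch below uses a made-up `Descriptor` struct (not the real IDT descriptor layout) and a hypothetical `RacyCell` wrapper, and leaves out `#[no_mangle]` so that it compiles on any edition. It relies on `UnsafeCell<T>` having the same in-memory representation as `T`.

```rust
use std::cell::UnsafeCell;
use std::mem::size_of;

// Hypothetical stand-in for a descriptor the hardware reads from memory.
#[repr(C)]
struct Descriptor {
    limit: u16,
    base: usize,
}

// Hypothetical Sync wrapper. repr(transparent) keeps the layout of the
// inner UnsafeCell, which in turn has the layout of Descriptor.
#[repr(transparent)]
struct RacyCell<T>(UnsafeCell<T>);

// SAFETY: a manual promise that accesses are synchronized elsewhere.
unsafe impl<T> Sync for RacyCell<T> {}

static IDT_DESC: RacyCell<Descriptor> =
    RacyCell(UnsafeCell::new(Descriptor { limit: 0, base: 0 }));

/// Address of the static itself, which is what a linker symbol resolves to.
fn cell_addr() -> usize {
    &IDT_DESC as *const RacyCell<Descriptor> as usize
}

/// Address of the wrapped value, as seen through the cell.
fn inner_addr() -> usize {
    IDT_DESC.0.get() as usize
}

fn main() {
    // The wrapper adds no bytes and no offset.
    assert_eq!(cell_addr(), inner_addr());
    assert_eq!(size_of::<RacyCell<Descriptor>>(), size_of::<Descriptor>());
}
```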

I was pleasantly surprised that my Notex type (it has a mutex-like API, but no thread safety) had no memory overhead. (current state of the IDT code for reference).

Well, there goes one use case for static mut, assuming the compiler stays sufficiently smart.