Pre-RFC: Deprecate then remove `static mut`

dyslexicsteak · December 23, 2023, 8:37am

Summary

Deprecate usage of static mut for the 2024 edition of Rust, directing users to switch to interior mutability with subsequent removal of the syntax entirely in the 2027 edition. (This is not pertinent to &'static mut)

Motivation

The existing static mut feature is difficult to use correctly (it's trivial to obtain aliasing exclusive references or encounter UB due to unsynchronised accesses to variables declared with static mut) and is becoming redundant as the interior mutability ecosystem expands and can easily replace static mut.

Guide-level explanation

static mut is meant to provide statics that can be modified after their initial value is set; variables declared with static mut can prove quite problematic when used, however:

static mut X: i32 = 0;

fn main() {
    let a = unsafe { &mut X };
    let b = unsafe { &mut X };

    println!("{a} {b}");
}

Recall Rust's borrowing rules:

At any given time, you can have either one mutable reference or any number of immutable references.
References must always be valid.

The first rule is violated as we have 2 exclusive (mutable) references to the same datum at the same time and are actively using them in an entirely overlapping fashion. This violation means that our code's behaviour is undefined, which means the optimiser is free to do with it as it wishes, potentially breaking it. The code is not guaranteed to print "0 0" and may fail to do so under some circumstances.
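
For comparison, here is a minimal sketch of the same program written with interior mutability instead of static mut. It assumes a std::sync::Mutex is an acceptable replacement here (an atomic would work just as well for an i32); the helper add_to_x is a hypothetical name introduced only for illustration.

```rust
use std::sync::Mutex;

// A plain `static` whose contents are protected by a lock.
// `Mutex::new` is a const fn, so no lazy initialisation is needed.
static X: Mutex<i32> = Mutex::new(0);

/// Hypothetical helper: adds `delta` to X and returns the new value.
fn add_to_x(delta: i32) -> i32 {
    // The guard is the only way to reach the data, so two live
    // mutable references to X cannot exist at the same time.
    let mut guard = X.lock().unwrap();
    *guard += delta;
    *guard
}

fn main() {
    let a = add_to_x(1);
    let b = add_to_x(1);
    println!("{a} {b}");
}
```

No unsafe block is needed anywhere, and the borrow rules are enforced at run time by the lock rather than left to the programmer.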

static mut also allows for unsynchronised access across multiple threads, which can cause data races; these are also undefined behaviour:

use std::thread::spawn;

static mut X: usize = 0;

const N: usize = 16;

fn main() {
    let mut thread_pool = Vec::with_capacity(N);
    
    for i in 0..N {
        thread_pool.push(spawn(move || {
            unsafe {
                X = i;
            }
            println!("{}", unsafe { X });
        }));
    }

    for thread in thread_pool {
        thread.join().unwrap();
    }
}

Here, since the usize is neither an atomic (with predictable and defined relative ordering) nor synchronised with a Mutex or RwLock, a data race takes place. In practice the program tends to print numbers between 0 and 15 in a vaguely increasing fashion, but nothing about its output is guaranteed. This is undefined behaviour and means that our code is not correct. This and the previous example show UB that is almost trivial to cause, which makes it prone to occur by accident in a large codebase.
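
As a sketch of how the race can be removed, the same program can use an AtomicUsize in a plain static. The helper run_threads and the choice of Ordering::Relaxed are assumptions made for this illustration; a stronger ordering may be needed if other data depends on X.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread::spawn;

// An atomic can live in a plain `static`; every access is well defined.
static X: AtomicUsize = AtomicUsize::new(0);

const N: usize = 16;

/// Hypothetical helper: spawns N threads that each store their index
/// into X and then read X back; returns the values the threads observed.
fn run_threads() -> Vec<usize> {
    let handles: Vec<_> = (0..N)
        .map(|i| {
            spawn(move || {
                X.store(i, Ordering::Relaxed);
                X.load(Ordering::Relaxed)
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    for seen in run_threads() {
        println!("{seen}");
    }
}
```

The order in which threads run is still unspecified, so a thread may read a value written by another thread, but every value read is one that some thread actually stored and there is no undefined behaviour.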

Let's try to use static mut for FFI purposes (a common application of it). This is usually achieved in the following fashion:

// using a symbol exported by C code
extern "C" { static mut _c_symbol: Ty; }

// exporting a symbol from rust code for use by C code
#[no_mangle]
pub static mut _rust_symbol: Ty = val;

This puts our code at risk of causing UB on access, as we saw before. Accesses to static mut can become difficult to track and reason about very quickly as the size of the codebase increases. As such, in the 2024 edition we get a deprecation warning (or even a deny-by-default lint):

// WARNING: `static mut` syntax is deprecated as of edition 2024 and is slated
// for removal in edition 2027. Consider using std::cell::SyncUnsafeCell<T> instead. 
// Read more at (somewhere, maybe rust blog post).
// Note/fix: 
// - extern "C" { static mut _c_symbol: Ty; }
// + extern "C" { static _c_symbol: std::cell::SyncUnsafeCell<Ty>; }
extern "C" { static mut _c_symbol: Ty; }

// WARNING: `static mut` syntax is deprecated as of edition 2024, and is slated
// for removal in edition 2027. Consider using SyncUnsafeCell<T> instead. Read 
// more at (somewhere, maybe rust blog post).
// Note/fix: 
// - pub static mut _rust_symbol: Ty = val;
// + pub static _rust_symbol: std::cell::SyncUnsafeCell<Ty> = std::cell::SyncUnsafeCell::new(val);
#[no_mangle]
pub static mut _rust_symbol: Ty = val;

If we try to do the same thing in the 2027 edition we get a hard syntax error for not migrating:

// ERROR: expected identifier after `static` (or similar)
// Note:
// `static mut` syntax has been removed as of edition 2027, use std::cell::SyncUnsafeCell<T> instead.
// Fix:
// - extern "C" { static mut _c_symbol: Ty; }
// + extern "C" { static _c_symbol: std::cell::SyncUnsafeCell<Ty>; }
extern "C" { static mut _c_symbol: Ty; }

// ERROR: expected identifier after `static` (or similar)
// Note:
// `static mut` syntax has been removed as of edition 2027, use std::cell::SyncUnsafeCell<T> instead.
// Fix:
// - pub static mut _rust_symbol: Ty = val;
// + pub static _rust_symbol: std::cell::SyncUnsafeCell<Ty> = std::cell::SyncUnsafeCell::new(val);
#[no_mangle]
pub static mut _rust_symbol: Ty = val;

Migrating from static mut to SyncUnsafeCell makes code easier to audit: some operations that were unsafe on a static mut (such as obtaining a raw pointer to the static) become safe, which shifts the focus to the places where problems can actually arise, namely where raw pointers are dereferenced, either to create references or to access the underlying data directly. Keep in mind, however, that while SyncUnsafeCell is a less obvious type to find (so beginners are less likely to reach for it) and more verbose to use, it is still highly unsafe and still allows someone determined to create aliasing exclusive references to a place; users of SyncUnsafeCell, and of UnsafeCell in general, should take care.

If we follow the diagnostics given by the compiler, we can migrate our code to a safer version of itself and make it easier to audit for any mistakes by better isolating where they can occur. The use of intermediate raw pointers to obtain references also produces marginally better output from the Miri tool which allows for better automated detection of problems in the code.
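
A rough sketch of what migrated code could look like follows. Because std::cell::SyncUnsafeCell is still unstable, the sketch uses RacyCell, a hypothetical stand-in written here to mirror its new and get methods; the static RUST_SYMBOL and the two accessor functions are likewise illustrative names, and the FFI attributes are left out to keep the example small.

```rust
use std::cell::UnsafeCell;

/// Hypothetical stand-in for the nightly-only `std::cell::SyncUnsafeCell`.
#[repr(transparent)]
pub struct RacyCell<T>(UnsafeCell<T>);

// SAFETY: assumption of this sketch: callers synchronise every access
// to the contents themselves, exactly as they had to with `static mut`.
unsafe impl<T: Sync> Sync for RacyCell<T> {}

impl<T> RacyCell<T> {
    pub const fn new(value: T) -> Self {
        Self(UnsafeCell::new(value))
    }

    /// Obtaining the raw pointer is safe; dereferencing it is not.
    pub const fn get(&self) -> *mut T {
        self.0.get()
    }
}

// Was: `pub static mut RUST_SYMBOL: u32 = 7;`
pub static RUST_SYMBOL: RacyCell<u32> = RacyCell::new(7);

fn read_symbol() -> u32 {
    let ptr = RUST_SYMBOL.get(); // safe: no access happens here
    // SAFETY: assumes no other thread is writing at the same time.
    unsafe { *ptr }
}

fn write_symbol(value: u32) {
    // SAFETY: assumes no other thread is reading or writing at the same time.
    unsafe { *RUST_SYMBOL.get() = value }
}

fn main() {
    write_symbol(42);
    println!("{}", read_symbol());
}
```

The unsafe blocks now mark exactly the two places where the data is touched, which is where an audit would need to look.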

Reference-level explanation

A lint can be declared for 2024 with a FutureIncompatibilityReason of EditionError triggered upon detection of a declaration in HIR or the AST. For 2027 a check can be added to the Parser struct in the rustc_parse crate.

There is little to no use of static mut in the compiler. It is mostly present in std and in the implementation of the runtime, in a fashion similar to the above code examples; the few points at which static mut is used can be migrated to SyncUnsafeCell without causing too much ado.

Drawbacks

Verbosity increases slightly as values need to be wrapped in order to be placed in statics and methods need to be used to get at the underlying data.

Rationale and alternatives

Do nothing.
Deprecate static mut but don't remove it.

Doing nothing would result in a redundant feature that serves no purpose other than being a potential trap for users. Deprecation without removal could be done, but after the presumed migration of at least a majority of the ecosystem after the deprecation lint there is no reason to keep the feature in the language.

Prior art

Work/discussion directly relevant

Consider deprecation of UB-happy static mut at #53649
Disallow references to static mut [Edition Idea] at #114447
Deprecate static mut (in the next edition?) on IRLO

Notable, not directly relevant

SyncUnsafeCell at #95439
LazyCell/Lock at #109736
OnceCell/Lock at #74465

Many have tried to remove or deprecate static mut before. The feature is now sufficiently redundant and replaceable to put forth a plan for its eventual removal.

Unresolved questions

Should the deprecation lint be warn-by-default or deny-by-default?
Should we have cargo fix support for migration? It's technically possible, but reasonably high effort.

Future possibilities

I can't think of any besides what was mentioned as of yet.

END RFC TEXT

Hi everyone, this is the first RFC I've written and my first post here on IRLO, so any feedback is welcome and highly appreciated! I've also created a short, fully anonymous survey to collect data about this proposal; it's now on This Week in Rust #527.

Yoric · December 23, 2023, 2:54pm

Are there guarantees that std::cell::SyncUnsafeCell<Ty> has the same representation as Ty?

Nemo157 · December 23, 2023, 3:36pm

It's #[repr(transparent)] (other than presumably inheriting the niche suppression behavior of UnsafeCell, but that doesn't affect cases where it's the top-level type of a static allocation).
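
A small sketch of how the layout claim can be checked. Since SyncUnsafeCell is not available on stable, this uses the stable UnsafeCell, which is documented to have the same in-memory representation as its inner type; same_layout is a hypothetical helper name.

```rust
use std::cell::UnsafeCell;
use std::mem::{align_of, size_of};

/// Hypothetical helper: true if wrapping T in UnsafeCell changes
/// neither its size nor its alignment.
fn same_layout<T>() -> bool {
    size_of::<UnsafeCell<T>>() == size_of::<T>()
        && align_of::<UnsafeCell<T>>() == align_of::<T>()
}

fn main() {
    println!("u32: {}", same_layout::<u32>());
    println!("[u8; 3]: {}", same_layout::<[u8; 3]>());
}
```

Matching size and alignment is a necessary check rather than a full proof of identical representation; the guarantee itself comes from the type's documentation and the repr(transparent) attribute.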

Jules-Bertholet · December 23, 2023, 5:15pm

A pub static mut can't be migrated to SyncUnsafeCell without breaking public API. I think this would be the first instance where adopting a new edition requires making a semver-breaking change.

carbotaniuman · December 23, 2023, 6:25pm

I don't think this is for any reason other than it hasn't come up yet. I can think of Range changes and set_env also potentially requiring public API changes.

Yoric · December 23, 2023, 8:56pm

Well, this could start with private / function-scoped static mut.

Nemo157 · December 23, 2023, 9:33pm

It would be interesting to know if anyone has ever done that, that seems fraught with even more potential for footguns than crate-local static mut.

Jules-Bertholet · December 23, 2023, 9:59pm

https://sourcegraph.com/search?q=context:global+lang:Rust+pub+static+mut&patternType=standard&case=yes&sm=0&groupBy=repo

Nemo157 · December 23, 2023, 10:05pm

That doesn't detect whether those exports are actually reachable from the crate root. It does look like there are some FFI bindings that are, but generally FFI binding crates should be single-private-use (especially if they have unguarded static mut) so breaking version bumps of them when they want to change edition shouldn't be an issue.

pitaj · December 24, 2023, 2:20am

Yeah this seems like a case where actual breakage is very unlikely, and any breakage is probably deserved.

dyslexicsteak · December 24, 2023, 9:37am

The other idea for disallowing references to static mut at #114447 is also semver-breaking.

jhpratt · December 24, 2023, 11:32am

That wouldn't affect public API, though.

dyslexicsteak · December 24, 2023, 12:49pm

Why? I heard that if a large portion of usages of static mut broke then it's considered a breakage of semver even if there was no way to fix it by changing your API.

carbotaniuman · December 24, 2023, 4:33pm

Each use case can switch to taking a raw pointer and then turning that into a reference.
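
A minimal sketch of that pattern, assuming single-threaded use; COUNTER and bump are hypothetical names chosen for the example.

```rust
use std::ptr::addr_of_mut;

static mut COUNTER: u32 = 0;

/// Hypothetical helper: increments COUNTER and returns the new value.
fn bump() -> u32 {
    // SAFETY (assumption of this sketch): only one thread ever calls
    // `bump`, so nothing else accesses COUNTER while `r` is alive.
    unsafe {
        // Take a raw pointer first; no reference to the static is
        // created directly from its name.
        let ptr: *mut u32 = addr_of_mut!(COUNTER);
        // Then turn the raw pointer into a short-lived reference.
        let r: &mut u32 = &mut *ptr;
        *r += 1;
        *r
    }
}

fn main() {
    bump();
    println!("{}", bump());
}
```

The reference only lives inside the unsafe block, so the window in which aliasing could go wrong is explicit and small.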

kpreid · December 24, 2023, 5:07pm

It would be possible to deprecate static mut without actually removing it. There is precedent to do this to address safety-usability mistakes; mem::uninitialized() is deprecated but not scheduled for removal.

dyslexicsteak · December 24, 2023, 5:18pm

Yes, this could even be achieved by cargo fix though I'm not sure if we really want to do that or if it's assuming too much in a way.

CAD97 · December 24, 2023, 5:47pm

...but it's brought up for discussion whether we can / should be hiding it from the new edition every cycle. But mem::uninitialized is also quite special in being essentially unfit for any use [1]. static mut: T is a giant footgun almost always better expressed as static: SyncUnsafeCell<T> [2], but it's at least fit for its advertised purpose.

[1] And essentially any use where it isn't immediate UB being an alias for MaybeUninit::uninit() or MaybeUninit::uninit_array().
[2] FWIW, I don't think it's practical to deprecate static mut until SyncUnsafeCell is available in stable std, as that's the direct replacement, since UnsafeCell can't be used in static without a Sync wrapper of some kind.

dyslexicsteak · December 24, 2023, 5:50pm

To be clear, we only want to do this if SyncUnsafeCell is also coming along for 2024.

Sahl · December 24, 2023, 5:54pm

I think there is a way to remove pub static mut across an edition boundary without breaking backwards compatibility.

Imagine you have an existing pub static mut in your crate:

pub static mut STATIC_MUT: u32 = 123;

This is used by downstream crates, using any edition in {2015, 2018, 2021}. In their code, usage looks like this

let r = &mut STATIC_MUT;
// or
let r = &STATIC_MUT;

With my proposal, you can change your crate in the following way:

pub static STATIC_MUT: SyncUnsafeCell<u32> = SyncUnsafeCell::new(123);

Now the downstream crates look the same, but you backport some new syntax:

let r = &mut STATIC_MUT;
// desugars to 
let r = STATIC_MUT.get_mut() as &mut _;

let r = &STATIC_MUT;
// desugars to
let r = &STATIC_MUT.get() as &_;

This new syntax will exist for the {2015, 2018, 2021} editions but not for the 2024 edition. Additionally, this is very easy to apply in an automated fashion à la cargo fix.

There's precedent for this kind of syntax backfill with IntoIter for arrays.

Details:

Assuming pub static mut is pretty rare, this sugar won't come up often, likely significantly less often than the array into_iter example.

If you still want to get a &SyncUnsafeCell<u32> on those earlier editions, we add a method:

let r: &SyncUnsafeCell<u32> = SyncUnsafeCell::ref(STATIC_MUT)

This method will be a lint error on the 2024 edition, and cargo fix should replace it with a simple &.

We can (and should) also skip the sugar for the defining crate, since you're changing the static yourself you can change the rest of your code to use SyncUnsafeCell's accessor methods.

CAD97 · December 24, 2023, 6:53pm

Not really? The edition difference is exactly and only that <[T; N] as IntoIterator>::into_iter is hidden from method lookup in editions 2015 and 2018. There's no extra special desugaring going on there.

The closest equivalent for static mut wouldn't be special sugar to make &[mut ]sync_unsafe_cell do &[mut ]*sync_unsafe_cell.get(), it'd be magic to make old editions see static: SyncUnsafeCell<T> as if it's defined with static mut: T instead.

It's honestly unnecessary to remove static mut from new editions. If it's safe to addr_of[_mut]! it and unsafe to &[mut] it (i.e. how we treat #[repr(packed)] fields with non-trivial alignment requirements), that puts a sufficient safety lock on the footgun and makes it effectively equivalent to using SyncUnsafeCell.
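
To illustrate the packed-field analogy, here is a small sketch in which the address of a possibly misaligned field is taken safely, and only the read through the raw pointer is unsafe. Packet and read_value are hypothetical names used for the example.

```rust
use std::ptr::addr_of;

// With `packed`, `value` sits at offset 1 and may be misaligned.
#[repr(C, packed)]
struct Packet {
    tag: u8,
    value: u32,
}

/// Hypothetical helper: reads `value` without ever creating a reference to it.
fn read_value(p: &Packet) -> u32 {
    // Taking the address is safe; writing `&p.value` instead would be
    // rejected by the compiler because the field may be misaligned.
    let ptr: *const u32 = addr_of!(p.value);
    // SAFETY: `ptr` points into a live Packet, and `read_unaligned`
    // does not require the pointer to be aligned.
    unsafe { ptr.read_unaligned() }
}

fn main() {
    let p = Packet { tag: 1, value: 0xDEAD_BEEF };
    let tag = p.tag; // copying a field out by value is fine
    println!("{tag} {:#x}", read_value(&p));
}
```

The split is the same one suggested for static mut: obtaining the pointer needs no unsafe, and the unsafe block marks the single point where the data is actually accessed.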
