[Discussion] Adding an `init` function that could modify static variables directly

farnz · June 20, 2023, 11:22am

But we don't refuse to compile other code just because it's complex, and requires knowledge of evaluation order.

There's no undefined behaviour; by definition, since we're in Safe Rust, there cannot be undefined behaviour, and if there is, that's fatal to this feature. So I'm assuming that you'll find a way to structure this such that there is no undefined behaviour.

So, if you're going to make it refuse to compile, you need to find some way to clarify the difference between the following code which compiles today, and the code that you've decided won't compile, in a way that's friendly to the end-user:

mod one {
    pub static name: &str = "one";
    pub static total_len: usize = crate::two::name.len() + name.len();
}
mod two {
    pub static name: &str = "two";
    pub static total_len: usize = crate::one::name.len() + name.len();
}

This is code that compiles today, and where the behaviour is fully defined. If changing the type of one::name and two::name is enough to make it not compile, you need to justify this in terms of the language definition, in a way that permits code that you do want to allow to compile to compile.

Today's rule is that the code that initializes a static must be const-evaluated, and thus changing the types fails because I can't const-evaluate everything I can evaluate at runtime; your proposal needs a similar rule that I can follow to understand why my code doesn't compile - but defining such a rule is going to be complicated, because your example of how to fix it (with let mod_one_name) doesn't actually change types in any noticeable fashion, and thus doesn't necessarily help.

mathstuf · June 20, 2023, 11:52am

In my experience, global constructor and destructor problems are among the worst to deal with. First, it is shared global state which…Rust was very wise to put some speed bumps around. Second, controlling the order of construction is very difficult, but the order of destruction (should you have any problems) is even more difficult to control.

FWIW, C++20 "fixed" the problem insofar as global constructors of imported modules are called before the current module's global constructors^[1], so at least one can make an order there. However, Rust does not ban cycles between modules like C++ does.

IMO, global constructors just don't pull their weight to make up for the Pandora's Box they otherwise end up becoming.

I don't think there's any intra-module ordering possible beyond the typical "use function calls to initialize 'later' objects" and hoping things work out everywhere. ↩︎

Neutron3529 · June 20, 2023, 12:17pm

This paragraph is copied from here, which explain the reason why rust does not have increment and decrement operators like i++ or n--.

for(int i=0;i<10;i++) compiles in C, and where the behaviour is fully defined. when we change to Rust, it does not compile since it is complicated. This could be part of the language definition, in a way that permits us trust our code does not have any UB, not just allow anything could compile to compile.

I've tried that, rust does not allow cycle dependencies. Thus provide a crate::init to dynamically allocate things at runtime and later move them to static variables could be suitable.

My initialization could be regard as a combination of "executing function that return tuple of values", and "deconstruct move those values into static fields".

since rust could deal well with (de)allocate variables in functions, the order of destruction is not difficult to control.

farnz · June 20, 2023, 12:46pm

Yes, exactly. But for us to have a sensible discussion about your proposal, you need to explain how you're going to change the language definition to support #[init] such that it has no UB, and has a well-defined meaning.

Your code is potentially unsound, because I can call table(99, 88) without first calling init(); you can fix this without additional runtime costs by using a ZST trick:

mod tables {
    use std::sync::OnceLock;
    /// SAFETY: It must be impossible to construct this without initializing TABLE below
    #[derive(Copy, Clone)]
    pub struct TableInit {
        _private: () // A zero-sized member to prevent this being constructed unsoundly
    }

    static INIT_DONE: OnceLock<TableInit> = OnceLock::new();

    pub const N: usize = 1000; // or whatever
    static mut TABLE: [[i32; N]; N] = [[0; N]; N];

    pub fn init() -> TableInit{
        *INIT_DONE.get_or_init(||
            unsafe {
                // initialize TABLE

                TableInit { _private: () }
            }
        )
    }

    pub fn table(_: TableInit) -> &'static [[i32; N]; N] {
        // SAFETY: You cannot create a TableInit without initializing TABLE.
        unsafe { &TABLE }
    }
}

In this code, TableInit is a zero-sized type (thus no runtime cost). You get one whenever you need it by calling tables::init, which uses a OnceLock internally to guarantee that initialization happens only once; it's Copy, so that once you have a TableInit, you can copy it. If you reach a point where you need a TableInit, but don't have one, you can call init() to get one; you will need to do this at least once in your code if you use table()

Each call to init() pays the cost of the OnceLock, and this is unavoidable. But if you have a TableInit from somewhere, you can copy it as many times as you like - it's just a witness that your initialization function has run. And because it's a ZST, it doesn't exist at runtime - no instructions will be emitted to handle it at all, no matter how many copies you make.

mathstuf · June 20, 2023, 2:11pm

Crates can't be cycles, but modules can.

This compiles just fine:

% for x in src/*.rs; do echo "==> $x"; cat $x; echo; done
==> src/a.rs
use crate::b;

==> src/b.rs
use crate::a;

==> src/main.rs
mod a;
mod b;

fn main() {
    println!("Hello, world!");
}

If a and b have static initializers, which runs first?

Neutron3529 · June 20, 2023, 2:16pm

This is the question. You could only write only one static initializers.

If you want to initialize static fields in either a or b, you could write things like

mod a{
    static a:i32=crate::a;
}
mod b{
    static b:i32=crate::b;
}
let a=1;
let b=2;
static static_field:i32=3;
fn main(){...}

No matter what you wrote, you are not intend to run several different init functions in arbitry order.

toc · June 20, 2023, 2:16pm

Unsound! But yes I have looked at init-token and considered the approach. In this case (coding a chess ai) there is a big public init which calls a number of private ones and it will all definitely blow up if you haven't called init. Most paths towards calling that API force you through some bottleneck which will call init (for instance, creating a new UCI interface). I considered it important to be able to call individual free functions fluently; simple_calc(x, y) looks the same as table_based_calc(x, y, z). And a part of me is more confident that this will optimize correctly:

#[inline]
pub fn table(x: usize, y: usize) -> i32 {
    // Safety: assume init.
    unsafe { TABLE[x][y] }
}

Neutron3529 · June 20, 2023, 2:40pm

Firstly, one init for one module. I provide 3 ways:

add #[init] to ensure there is a specific init function and disable other possibility (such as syntax sugar)
wrote let at the top of the crate. Everything could not be calculated in const runtime should belong to the init function. If such grammar exists, a function named init is automatically defined.
add #![no_init] lint, which disable grammar sugar, but it cannot disable explict #[init]. A warning should be generated if both #![no_init] and #[init]fn init() exists.

For calling init function from seperate crates, maybe we could write a #[no_init] attribute to disallow unauthorized static initializers (is it really needed?). But if we have no good rules to stop some initializers, adding a no_init feature may help.

Since the initializer executed by the dependency order, its calculation have a well-defined order thus no UB is generated.

Since the init function could be easily rewritten follow and each part is sound, thus it is well defined:

static mut field:i32=0;
fn init(){// initialize field;
    field=42;
}
fn main(){
    init();// call init function that mutate static variables
    // here field is initialized. No proof is needed.
    // function body
    let x=*{unsafe{&field}};// every visit of field is read-only, and field is initialized, thus it is safe.
    println!("field is {}",x);
}

Where is the problem?

the8472 · June 20, 2023, 3:04pm

FYI, the the chrome devs have asked to minimize use of static initializers in std, citing dependency/ordering and program startup speed issues with initializers in C++. While that isn't necessarily relevant for external crates it's still an indicator that their use is questionable.

farnz · June 20, 2023, 3:50pm

I can promise you that the compiler will optimize correctly with or without ZSTs; by the time you get to optimizing, the ZSTs have disappeared anyway, since they have no runtime representation.

The rest of your points are situational, and coding it my way or your way is a matter of taste, and of whether or not users will be OK with a runtime blow-up if they forget to call init() or not - the token is just a compile-time proof that they've called init(), not anything more expensive, but if you can get that proof another way (e.g. "I'm the only person who calls into this module, and I get it right"), that's perfectly fine, too. It's just worth knowing about the ZST trick, since it compiles down to nothing, but forces users to prove that they called init(); you'd also get this if mod tables is an implementation detail of a struct ChessEngine, and there's no way to create a ChessEngine that doesn't call tables::init().

Neutron3529:

Firstly, one init for one module. I provide 3 ways:

add #[init] to ensure there is a specific init function and disable other possibility (such as syntax sugar)

wrote let at the top of the crate. Everything could not be calculated in const runtime should belong to the init function. If such grammar exists, a function named init is automatically defined.

add #![no_init] lint, which disable grammar sugar, but it cannot disable explict #[init]. A warning should be generated if both #![no_init] and #[init]fn init() exists.

For calling init function from seperate crates, maybe we could write a #[no_init] attribute to disallow unauthorized static initializers (is it really needed?). But if we have no good rules to stop some initializers, adding a no_init feature may help.

Since the initializer executed by the dependency order, its calculation have a well-defined order thus no UB is generated.

There's no dependency order between modules, so you've not fully defined the initializer order; this is a significant problem:

static mut field:i32=0;
fn init(){// initialize field;
    field = mod1::field + mod2::field;
}

mod mod1 {
    static mut field: i32 = 1;
    fn init() {
        unsafe { field = mod2::field };
    }
}

mod mod2 {
    static mut field: i32 = 2;
    fn init() {
        unsafe { field = mod1::field };
    }
}

fn main(){
    // Which of the following is correct?
    {
        mod1::init();
        mod2::init();
        init():
    }
    // or 
    {
        init():
        mod1::init();
        mod2::init();
    }
    // or
    {
        mod2::init();
        mod1::init();
        init():
    }
    // or another permutation

    // here field is initialized. No proof is needed.
    // function body
    let x=*{unsafe{&field}};// every visit of field is read-only, and field is initialized, thus it is safe.
    println!("field is {}",x);
}

Further, you've still got unsafe accesses - ideally, you'd remove the unsafe in let x=*{unsafe{&field}}; since the field isn't really static mut, because it stops being mut after initialization time. But that's another set of details to iron out to make a good proposal.

On top of that, you have the problem of things like WASM, where you don't have a main at all, nor do you have static initializers called by the runtime; you have to handle the case where an exported function is called directly, without any other code called first. The current system handles this nicely, because of the nature of lazy initialization; if you're the first person to pull on a OnceCell or OnceLock, then you pay the initialization cost.

And this isn't just a WASM problem - it's a useful property to be able to assert when I create a cdylib crate, since my users may be surprised when a simple dlopen/LoadLibrary takes excessive time due to constructors, especially if those constructors relate to code that I'm never going to use. Again, lazy initialization handles this nicely for me - I don't pay the initialization cost until I use the thing.

Neutron3529 · June 20, 2023, 4:01pm

Thus, model level initialize function should not exists. Because you cannot simply assume the initialize order. And you could initialize them in the main init function safely.

If you really want to write complete logic as you mentioned above, then write them explictly is better.

Neutron3529 · June 20, 2023, 4:07pm

we could wrote a feature or just something like #[no_init], to provide fallbacks when default initialize is not allowed.

For UI, they want to start as soon as possible, and calculate things in a acceptable time. But actually, for scientific computations, the start time is not relevent, it is stop time that taking concerned. If static initializer could make program run faster, then it should be used.

farnz · June 20, 2023, 5:41pm

This then makes the feature difficult to hold correctly - if I disable your crate's #[init] from running, how does your crate cope with the fact that its init function hasn't been run? This needs design effort to make sure I can't introduce unsoundness this way, either.

Note that static initializers are never going to be faster than explicitly passing the "global context" around.

In other words,

static mut data_table: [usize; 1024] = [0; 1024];

fn init_table() {
    // Initialize the elements of `data_table`
}

fn do_maths(more_params: Params) {
    // uses `data_table`
}

fn main() {
    init_table();
    for params in … {
        do_maths(params);
    }
}

will never be faster than

fn get_data_table() -> [usize; 1024] {
    // Initialize the table
}

fn do_maths(more_params: Params, data_table: &[usize; 1024]) {
    // uses `data_table`
}

fn main() {
    let data_table = get_data_table();
    for params in … {
        do_maths(params, &data_table);
    }
}

As the latter is easier to prove correct, and it's at least as fast, it's the right thing to do.

toc · June 20, 2023, 7:41pm

Threading parameter globals and contexts all the way through from main can be a nonstarter though. For a simple math function, that may be used in many scattered places, this requirement might essentially disable optimizing the function with a table lookup, unless one is willing to perform a wildly disruptive refactoring. Even more so if the optimization is merely speculative.

Imagine asking for f64::sin to now accept a TableInit. Obviously the scale here is less disruptive than that would be, but this is an optimization I would like to be able to do and undo relatively easily. This is a valid use case that is not well supported in rust.

I personally think global initializers would add value to the language. I don't necessarily think they need to solve all of the theoretical problems to be useful. A simple explicitly-run rerun-on-reload deadlock-on-reentry solution would solve 100% of my use cases and I expect it would solve 99% of real world use cases. This does seem like a good candidate to solve at the language level, or in the least case to provide hooks such that library authors can implement it.

JeffBurdges · June 20, 2023, 9:39pm

I think static_init covers this nicely, so not really necessary in std.

NoamB · June 20, 2023, 9:43pm

Isn't the crate graph non-cyclic? This provides a well defined init order, provided initializers are per-crate. Static initializers are still problematic - they need to be called recursively, since a transitive dependency may need an initializer. This could work by having an implicit OnceLock per explicitly defined initializer, and 'initializer-glue' that calls all dependency initializers.

A potential issue is crates which assume that there is only one version of themselves linked, and lock some system resource. This can be a problem today, but I feel like static initializers encourage this sort of thing.

Another possibility is to have the feature be opt-in (like std/alloc, but disabled by default). This would cause some fragmentation though.

We could also give the final binary crate the (unsafe) option to disable automatic initialization, and call the initializer glue explicitly from main.

farnz · June 20, 2023, 10:44pm

And for that use case, lazy initialization already exists, via LazyLock (being worked on) and OnceLock (already stable). This is not about the gap between "can't thread a global context through" and "can thread a global context through"; this is about the performance difference between "global context does a cheap check to ensure that you've initialized it, and does the expensive fix-up if you haven't" and "global context causes misbehaviour up to and including UB in some cases if you haven't initialized it, but is faster by the cost of one check of a single enum discriminant on each access to global context as a result".

Further, the compiler already does a very limited degree of value range analysis (it uses VRA to elide bounds checks when indexing, for example, where it's already seen some evidence that the bounds check is guaranteed to succeed). Extending this so that it covers enough cases that OnceLock and LazyLock are completely free as long as they're initialized early enough in main would be enough to make global initializers completely unnecessary - those of your users who care about performance can call crate::init() early enough in main to get VRA to do its magic, and those who forget to call crate::init() just see marginally slower performance. And extending VRA like this benefits more than just global state - it benefits any case where knowing the upper and lower bounds of a value is enough to improve codegen.

Modulo performance, we already have global initializers via OnceLock, and LazyLock is nearly complete and also provides a global initializer mechanism that's easier to get right (since you supply the code that creates the value at initialization time, not use time). These are safe to use, don't deadlock, don't need you to remember to run them, and all at the cost of one enum discriminant check each time you access the global context; this is typically a load instruction, a compare instruction, and a correctly predicted branch over and above global initializers.

This puts us in a difficult corner of design space - the best solution is to use static foo: OnceLock<_> or LazyLock<_>, get guaranteed correctness, and be aware that we're leaving a load, a compare, and a predictable branch's worth of performance on the table (3 machine instructions per function that needs to access the global data) with the current state of the Rust compiler. The feature we're trying to get right here is a more restricted version of global initialization that removes that 3 instruction penalty on access, for the cases where you really, really can't afford that.

The problem is that we want to avoid having any OnceLocks, since those are the thing that we're trying to avoid with global initializers. We'd also like, if possible, to have multiple initializers per crate, since you can do that with const initializers as supported today; this means that we need clear rules on what you can and cannot do with global initialization, to avoid confusion when you try to change your code using const initialization and OnceLocks or LazyLocks to use non-const initialization instead.

If you're happy with this, you can have it today, with a per-crate Once to ensure that initialization only happens once in your crate, and an initializer function you must call from main to avoid your crate's code crashing. But, as Rustaceans, we're not happy with a solution that says "don't forget to do the init dance or some random code later on crashes", preferring solutions that give you "you can't forget to do the init dance" - so the question we're trying to dig into here is whether we can design global initializers such that (allowing for WASM, binaries and cdylibs that don't want their dependencies to run initializers before main but instead run them during main, and other problems from the thread) we can have the safety of LazyLock without the 3 instruction access penalty for an initialized LazyLock.

If the answer is that we can't, then this is functionality that belongs in a crate (like ctor), so that the Rust language isn't bound up implementing a feature that's been made worthless by advances to the compiler, but that still has to exist because we can't port users over to LazyLock and rely on VRA doing the right thing in release builds.

Neutron3529 · June 21, 2023, 1:56am

Same things happened to #![forbid(unsafe_code)]. init could be a feature, and crate maintainers could provide fallbacks if the crate is not allowed to initialize statically.

I strongly doubt about that.

do_maths(params, &data_table); should be slower since it passes one extra parameter(&data_table). If multiple parameters are passed, I can't believe that is no cost.

Further, function pointer could also be passed. and if a func is passed, compiler may do some extra optimization at init step to allow those calls become static call rather than slower version like call %rax

farnz · June 21, 2023, 8:17am

I used Criterion to benchmark the following code;

use std::ops::Range;

pub const SIZE: usize = 16 * 1024;

pub static mut DATA_TABLE: [usize; SIZE] = [0; SIZE];

/// # Safety
///  This initializes DATA_TABLE
pub unsafe fn init_data_table() {
    for (index, elem) in unsafe { DATA_TABLE.iter_mut().enumerate() } {
        *elem = index * index;
    }
}

pub fn use_data_static(range: Range<usize>) -> usize {
    unsafe { DATA_TABLE[range].iter().sum() }
}

pub fn use_data_param(data: &[usize; SIZE], range: Range<usize>) -> usize {
    data[range].iter().sum()
}

comparing the performance of use_data_static to use_data_param:

use benchmark_static::*;
use criterion::{black_box, criterion_group, criterion_main, Criterion};

pub fn bench_global(c: &mut Criterion) {
    unsafe {
        init_data_table();
    }

    c.bench_function("global", |b| b.iter(|| use_data_static(black_box(0..100))));
}

pub fn bench_param(c: &mut Criterion) {
    unsafe {
        init_data_table();
    }

    let data = unsafe { &DATA_TABLE };

    c.bench_function("param", |b| {
        b.iter(|| use_data_param(data, black_box(0..100)))
    });
}

criterion_group!(global, bench_global);
criterion_group!(param, bench_param);

criterion_main!(global, param);

I get:

global                  time:   [7.9314 ns 7.9535 ns 7.9812 ns]
Found 13 outliers among 100 measurements (13.00%)
  1 (1.00%) low mild
  7 (7.00%) high mild
  5 (5.00%) high severe

param                   time:   [7.8996 ns 7.9035 ns 7.9072 ns]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  2 (2.00%) high mild

These are close enough that it's reasonable to say that passing a parameter is the same cost as using a global - while I see better numbers for passing a parameter, it's close enough that this could easily just be system noise.

SkiFire13 · June 21, 2023, 9:09am

#![forbid(unsafe_code)] can't disable unsafe in dependencies, it's not a feature and can't be detected by dependencies. In fact the compiler doesn't even know about it yet while it's compiling the dependencies.

If you allow running arbitrary code at init time then I highly doubt you can make extra optimizations compared to normal code you can run at runtime.

Topic		Replies	Views
Deprecate static mut (in the next edition?) language design	12	1284	March 6, 2024
Static generics	5	1538	March 25, 2019
TaintedCell - interior mutability with shared mutable access language design	8	450	October 16, 2024
Non-static data in `static` language design	10	282	October 24, 2024
Pre-RFC: Deprecate then remove `static mut` language design	106	5193	July 5, 2024

[Discussion] Adding an `init` function that could modify static variables directly

Related topics