Pre-RFC: Add language support for global constructor functions

The interaction with no_std is definitely something that would need to be figured out!

If this is implemented using the same infrastructure as C++ static initializers, then the infrastructure they would run on wouldn’t depend on allocation or anything else in std. It could even still run without the Rust initialization in #![no_start].

I guess if we’re implementing it as part of the Rust start code, then there’s more to figure out here. Thanks for bringing it up!

With the multiple functions, this is necessary in order to allow things like inventory to function. If we were only allowed one function per crate, there wouldn’t be a reasonable way for an attribute-based macro to register multiple things on independent items to run on startup. As this kind of “registration” is one of the main use cases, I would be inclined to say that having multiple functions is needed.
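
To make the "registration" use case concrete, here is a minimal sketch using only std. The names `REGISTRY`, `register`, `register_json`, `register_yaml`, `run_constructors` and `registered` are hypothetical. The constructors are called by hand here, since stable Rust has no built-in way to run them before main, which is the gap this thread is about.

```rust
use std::sync::Mutex;

// Hypothetical global registry that constructors append to.
// Mutex::new and Vec::new are const, so the static has a valid initial value.
static REGISTRY: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

fn register(name: &'static str) {
    REGISTRY.lock().unwrap().push(name);
}

// Two independent items in the same crate, each with its own constructor.
// With the proposed feature, an attribute macro could mark these to run at startup.
fn register_json() {
    register("json");
}

fn register_yaml() {
    register("yaml");
}

// Stand-in for the startup machinery: runs every constructor once.
fn run_constructors() {
    for ctor in [register_json as fn(), register_yaml] {
        ctor();
    }
}

// Snapshot of everything registered so far.
fn registered() -> Vec<&'static str> {
    REGISTRY.lock().unwrap().clone()
}
```

If only one constructor per crate were allowed, `register_json` and `register_yaml` would have to be merged into a single function by hand, which an attribute macro working on independent items cannot do.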

I just mean this as far as it is true in Rust in general. The following isn’t valid Rust, and has never been valid:

static X: &str;

Global constructors wouldn’t allow you to write this either - all statics would need to be initialized to something, still. Even std::mem::uninitialized() is non-const, so I don’t think it’s possible at all to do this?

Maybe using MaybeUninit, but even then, it’s not really “uninitialized” since unsafe code is required to read it later. This is a concern, but I don’t think it’s a large one.

To be clear, I’m proposing not changing these rules. Global constructors would not introduce the ability to have uninitialized statics, nor take it away (since we don’t currently have it).


With all that said, I think going forward with a distributed_slice or linkme-based method is probably more sane. I’ve not put together a proposal, but I’ll probably be trying to do that later next month or sometime soon.

If this is implemented using the same infrastructure as C++ static initializers, then the infrastructure they would run on wouldn’t depend on allocation or anything else in std. It could even still run without the rust initialization in #![no_start].

If it worked out of the box that would be ideal, but I believe it leaves concerns about a dependency on std, as some statics may need memory allocation and friends. It has been a while since I looked at the Rust runtime, but most likely panics would be UB in this case, while heap allocations should still be allowed (I need to check at which step the global allocator is set up).

Even std::mem::uninitialized() is non-const, so I don’t think it’s possible at all to do this?

I believe better support for global constructors would require it to become const, let’s say as an extra feature. While it is true we can use Option and friends, that adds overhead. In short, ideally we need a const way to have an uninitialized static for a global constructor to initialize.

mem::uninitialized is on its way to being deprecated, so I wouldn’t count on it. Instead you could use mem::MaybeUninit, which has no overhead, to deal with uninitialized data.
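
The "no overhead" claim can be checked with `size_of`. This is only a sketch; the helper names `maybe_uninit_overhead` and `option_overhead` are made up for the illustration, and `u64` is just an example payload.

```rust
use std::mem::{size_of, MaybeUninit};

// MaybeUninit<T> is guaranteed to have the same size and alignment as T,
// so this difference is always zero.
fn maybe_uninit_overhead() -> usize {
    size_of::<MaybeUninit<u64>>() - size_of::<u64>()
}

// u64 has no spare bit patterns, so Option<u64> needs extra room for its tag.
fn option_overhead() -> usize {
    size_of::<Option<u64>>() - size_of::<u64>()
}
```

For types with a niche, such as references, Option<T> is the same size as T, so the Option overhead depends on the payload type.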

mem::MaybeUninit is not const-friendly yet either. It doesn’t really matter which one we’d use; whichever it is should become const-friendly.

I intentionally excluded this feature from the RFC - as I understand it, allowing access to uninitialized constants when initializing other constants is one of the main bad things about C++'s static initializers. I hoped that by not allowing global constructors to initialize statics (forcing use of a mutable static with Option or similar instead), we could avoid that problem.

What’s your use case for uninitialized constants in particular?

when initializing other constants is one of the main bad things about C++'s static initializers.

There is nothing bad about it; it is just that you cannot depend on the initialization order of statics.

I don’t understand why you want to prevent the user from intentionally using uninitialized global statics. How is it different from using them on non-global variables? It is not different: when using unsafe, the user takes responsibility for using the uninitialized variable properly when relying on global constructor functions.

What’s your use case for uninitialized constants in particular?

Initializing a global variable that lacks a const fn initializer (ideally I’d like to avoid mut for global statics that need one-time initialization).

Is static X: Option<T> = None; not sufficient? Who cares about mut? This is already unsafe code, and that static definitely can’t be inlined.

Option is not necessary for one-time initialization; it just adds a few bytes to the size of the executable.

Also, Option<T> would currently require a mut static in order to perform the initialization. Basically, we’d want to replace the lazy_static kind of initialization.
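
For reference, the Option-based approach being discussed looks roughly like this. It is a sketch under assumptions: `ANSWER`, `init_answer` and `answer` are hypothetical names, and `init_answer` stands in for a global constructor that is called by hand here.

```rust
use std::ptr::{addr_of, addr_of_mut};

// Starts out as None; the initializer below replaces it exactly once.
static mut ANSWER: Option<u32> = None;

// Stand-in for a global constructor (hypothetical; called by hand here).
// Safety: must run before any reader and never concurrently with one.
unsafe fn init_answer() {
    // Write through a raw pointer so no reference to the static mut is created.
    unsafe { addr_of_mut!(ANSWER).write(Some(42)) };
}

// Safety: must not race with init_answer.
unsafe fn answer() -> Option<u32> {
    unsafe { addr_of!(ANSWER).read() }
}
```

Both functions are unsafe because nothing in the type system stops a reader from running while the initializer is still writing.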

lazy_static also requires an Option<T>-like structure under the covers, so it’s not adding overhead to use an Option<T> with main-entry initialization rather than first-use. In fact, I’d expect a macro to wrap main-entry initialization in an API like lazy_static’s, to allow it to be used safely (unlike static mut, which is next to impossible to get right).
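
As a rough idea of what such a safe wrapper could look like, here is a sketch built on `std::sync::OnceLock` (stable since Rust 1.70, so newer than this discussion). `GREETING`, `init_greeting` and `greeting` are hypothetical names, and `init_greeting` stands in for main-entry initialization.

```rust
use std::sync::OnceLock;

// OnceLock tracks "initialized or not" internally, much like an Option<T>
// with synchronization added.
static GREETING: OnceLock<String> = OnceLock::new();

// Stand-in for main-entry initialization (hypothetical; a constructor would call this).
fn init_greeting() {
    // set returns Err if a value is already present; repeated calls are ignored.
    let _ = GREETING.set(String::from("hello"));
}

// Safe accessor: readers see None until initialization has happened.
fn greeting() -> Option<&'static str> {
    GREETING.get().map(|s| s.as_str())
}
```

No unsafe code or static mut is needed on the user's side, which is the point of wrapping the initialization in an API of this kind.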

And if you’re really determined to use an uninitialized state to start with, I’m fairly certain that MaybeUninit::uninitialized doesn’t have any actual barriers to being const other than deciding what that means.

@drXor Writing to a non-mut static is UB, just like writing to anything that has a non-mut reference pointing at it. The only exception in either case is UnsafeCell.
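
A minimal sketch of the UnsafeCell exception, assuming the writes happen during single-threaded startup. `StartupCell`, `COUNTER`, `set_counter` and `counter` are hypothetical names for the illustration.

```rust
use std::cell::UnsafeCell;

// Hypothetical wrapper: a non-mut static may only be written through UnsafeCell.
struct StartupCell(UnsafeCell<u32>);

// Safety: sound only if all writes finish before any other thread reads the cell
// (for example, if they happen in startup code before threads exist).
unsafe impl Sync for StartupCell {}

static COUNTER: StartupCell = StartupCell(UnsafeCell::new(0));

// Safety: the caller must ensure no other access happens concurrently.
unsafe fn set_counter(value: u32) {
    unsafe { COUNTER.0.get().write(value) };
}

// Safety: must not race with set_counter.
unsafe fn counter() -> u32 {
    unsafe { COUNTER.0.get().read() }
}
```

Because the compiler knows the contents of an UnsafeCell may change, it cannot assume the static still holds its initial value.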

I’m totally aware that impl Freeze statics wind up on a readonly page; my point was that I think there is somewhat unnecessary fear of static mut in the context of library ctors, which are already very hard to do safely.

I’m not talking about readonly pages, I’m talking about miscompilation due to constant propagation et al. assuming you never write to that non-mut static.

Did you mean to include a mut in your post?

Yeah, I think we’re talking past each other here. I also may have written that post before my weekend coffee; I definitely meant a mut static. I blame getting mixed up from C++ having the opposite convention.

It is const. (And it’s called MaybeUninit::uninit but whatever. ;) )
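
Since `MaybeUninit::uninit` is a const fn, it can be used directly as a static initializer. A sketch, with `SLOT`, `init_slot` and `read_slot` as hypothetical names and `init_slot` standing in for a global constructor called by hand:

```rust
use std::mem::MaybeUninit;
use std::ptr::{addr_of, addr_of_mut};

// MaybeUninit::uninit is a const fn, so it can serve as a static initializer.
static mut SLOT: MaybeUninit<u32> = MaybeUninit::uninit();

// Stand-in for a global constructor (hypothetical; called by hand here).
// Safety: must run before read_slot and must not race with it.
unsafe fn init_slot(value: u32) {
    // MaybeUninit<u32> has the same layout as u32, so the pointer cast is valid.
    unsafe { addr_of_mut!(SLOT).cast::<u32>().write(value) };
}

// Safety: init_slot must have run first; reading earlier is undefined behavior.
unsafe fn read_slot() -> u32 {
    unsafe { addr_of!(SLOT).cast::<u32>().read() }
}
```

Unlike the Option variant, there is no tag to check, so nothing detects a read that happens before the constructor has run.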

There is another use case I realized that I have: enumerating test cases. I maintain a library for data-driven tests, and currently the way it enumerates tests is by using the nightly-only feature of providing my own test runner (https://github.com/rust-lang/rust/blob/master/src/doc/unstable-book/src/language-features/custom-test-frameworks.md) and using #[test_case] on my tests to let the compiler enumerate all of them and pass them to a function I control (I also use a nasty hack to allow accepting my own descriptor instead of test::TestDescAndFn).

Seems like global constructors could be an alternative to that? So instead of relying on the syntactic transformation the compiler performs (which is explained here https://blog.jrenner.net/rust/testing/2018/07/19/test-in-2018.html), it would be a global list of "test cases" (or even multiple lists, for different test types, and then the test runner would need to know how to handle lists of different types?).
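
A sketch of what the "global list of test cases" shape could look like from the runner's side. Everything here is hypothetical (`TestCase`, `TEST_CASES`, `failed_cases` and the two sample tests), and the list is written by hand; the idea above is that it would instead be assembled automatically from entries declared next to each test.

```rust
// Hypothetical descriptor for one data-driven test case.
struct TestCase {
    name: &'static str,
    run: fn() -> bool,
}

fn adds() -> bool {
    1 + 1 == 2
}

fn always_fails() -> bool {
    false
}

// Written by hand for the sketch; a registration mechanism would build this list.
static TEST_CASES: &[TestCase] = &[
    TestCase { name: "adds", run: adds },
    TestCase { name: "always_fails", run: always_fails },
];

// Minimal runner: returns the names of the cases that failed.
fn failed_cases(cases: &[TestCase]) -> Vec<&'static str> {
    cases
        .iter()
        .filter(|case| !(case.run)())
        .map(|case| case.name)
        .collect()
}
```

The runner only needs a slice of descriptors, so it does not care whether the list came from the compiler, a linker section, or constructors run at startup.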

I think enumerating test cases (or anything of the form "global list") would be better served by the "Idea: global static variables extendable at compile-time" thread. This sort of thing does not require global constructors.

I saw in that thread that you've basically written a library for that purpose. If it can be done in a library (and it's done really well, with no real ugliness at usage time), why should it be a core language feature? Having a declaration which looks like a regular array but is missing some of its elements sounds confusing enough to me to warrant the extra thought imposed by depending on a library (especially considering how solutions to some other important problems, like date/time handling or random number generation, weren't deemed "core" or "stable" enough to make it into even the stdlib, let alone the core language).

The main reason I would argue for it would be to allow proper support on all platforms, not just those with intrinsics already available to implement the functionality. In particular, I believe only the compiler has the ability to coalesce values together on wasm32-unknown-unknown; that target just doesn't support link sections.

If it was possible to implement it on all platforms in a library, then I would agree with you. But if the alternatives end up being a library supported only on Windows, OSX and Linux, and a std implementation with support for many more platforms, then I'd go for the latter.

Are you talking about https://crates.io/crates/linkme ? It uses similar linker magic to https://crates.io/crates/ctor and https://crates.io/crates/inventory and suffers from the same issues as these two (in my experiments, all three just silently stop working in some cases).

I don't think there is really much one can do without compiler support here :/

I'm writing a plugin in Rust for a C application that expects plugins to do all initialization via static constructors. While the described alternatives solve most problems in pure-Rust projects, they don't help with this kind of FFI.
