Pre-RFC: Add language support for global constructor functions

It works on all tier-1 platforms.

As a normal developer, the code you're writing is likely to last a long time; if you're writing a library, you also want other people to be able to take dependencies on you without any issues, even if they're doing something weird. So it's very beneficial to have guarantees: that your code will work even on obscure platforms, that it will continue to work in the face of compiler upgrades, that it won't have broken edge cases (like this one I previously linked).

As an attacker, you mainly care that it works in practice, on a specific target's system, in the short time window of your attack.

If there's no C compiler, the attacker could just ship static library binaries in the crate for whatever platforms they care about – not that there's a reason to do so when the link_section solution above works fine.

Possible, but unlikely, meaning that omitting #[global_constructor] would only make a difference in an edge case, even if not for the many other workarounds.

1 Like

That's not quite true; they outline the code so they help on anything with a cache.

But also they're not semantic, so "do nothing" doesn't change the behaviour of the code. Having some platforms not run your static constructors is very much a behaviour change.

4 Likes

Rust already has adequate support for putting things in arbitrary sections; that's how rust-ctor works. However, there are benefits to letting LLVM know about them by adding them to @global_ctors:

  • LLVM has an inter-procedural optimization pass for global constructors.
  • It takes advantage of LLVM's existing support without needing to duplicate the logic to pick the right section for each supported platform. The logic in LLVM isn't that complicated overall, but there are some relatively obscure things, such as:
  • Compatibility with weird LLVM things that query global_ctors like ORC (JIT) and KLEE (symbolic execution); even LLDB (debugger) has something that checks for them.
  • Ensures section flags are set right (well, ideally Rust should support this as well, but the syntax is platform-specific and C doesn't support it, so it's a bit messy)
5 Likes

I went and stared at LLVM's documentation for llvm.global_ctors (or, lack thereof...). It allegedly accepts a priority argument (which C doesn't use because there are no meaningful stated dependencies between TUs). Since Rust is aware of an object dependency tree (at least, among crates), is this something we want to use? Of course, within a crate order is a bit sad... we could either

  • Assert order is undefined, which seems pretty reasonable; if you care about order within your crate you can shove it into a single #[ctor].
  • Declare that there may be at most a single #[ctor] per crate (so, the above, but without the undefinedness).
  • Allow ctors to specify a crate-local priority #[ctor(priority)], ties are broken in an unspecified order. This one strikes me as kind of really insane.

Also, I assume that in this whole conversation we tacitly assume all ctors are shoved into llvm.used, right? I think a lot of the sadness in C++ comes from the linker forgetting about your initializers for spooky reasons (at least, that's been my personal experience), and this seems very avoidable.

An extreme example of this is Go: see The Go Programming Language Specification. I'm not sure that I think this approach is totally reasonable for Rust, but it's certainly worth contemplating.

1 Like

Nit: although C doesn’t automatically assign priorities, you can specify a priority in the attribute syntax, e.g. __attribute__((constructor(0))).

(Oh, and just to reiterate, just because I’ve been making counterarguments to various arguments against this feature doesn’t mean I think the feature is a good idea. I think it may be better to start with some kind of metadata-based, type-safe “put this static into a global list” feature, and see if anyone is still clamoring for constructors after that.)

3 Likes

I think it wouldn’t be too hard to define a partial order in which global constructors are called:

#[global_ctor( ensures = [foo_loaded, bar_started] )]
fn first() {}

#[global_ctor( requires = [foo_loaded, bar_started],
               ensures = baz_created )]
fn second() {}

#[global_ctor( requires = baz_created )]
fn third() {}

This is how rustc can determine the partial order:

foo_loaded, bar_started and baz_created are booleans that are set to false at first. Rustc iterates over the global constructors and selects the first one for which all requirements are true. After that, it sets the variables in its ensures list to true. Rustc continues doing this until no global constructors remain (or returns a compiler error if some requirements can’t be fulfilled).
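
For what it's worth, here is a minimal sketch of that selection loop as ordinary Rust, just to check that the idea hangs together. The `Ctor` struct and the `order` function are made up for illustration; they are not part of any proposal or of rustc. Conditions are plain strings here, and "set to true" is modelled as membership in a set.

```rust
use std::collections::HashSet;

/// A global constructor plus the conditions it needs and provides.
/// (Hypothetical type, for illustration only.)
struct Ctor {
    name: &'static str,
    requires: &'static [&'static str],
    ensures: &'static [&'static str],
}

/// Repeatedly pick the first constructor whose requirements all hold, then
/// mark its `ensures` conditions as true. Returns the names in call order,
/// or the names of the constructors that could never be scheduled.
fn order(mut pending: Vec<Ctor>) -> Result<Vec<&'static str>, Vec<&'static str>> {
    let mut satisfied: HashSet<&'static str> = HashSet::new();
    let mut sorted = Vec::new();
    while !pending.is_empty() {
        let next = pending
            .iter()
            .position(|c| c.requires.iter().all(|r| satisfied.contains(r)));
        match next {
            Some(i) => {
                let ctor = pending.remove(i);
                satisfied.extend(ctor.ensures.iter().copied());
                sorted.push(ctor.name);
            }
            // Nothing left can run: this is where a compile error would go.
            None => return Err(pending.iter().map(|c| c.name).collect()),
        }
    }
    Ok(sorted)
}

/// The three constructors from the example above, deliberately listed backwards.
fn example() -> Vec<Ctor> {
    vec![
        Ctor { name: "third", requires: &["baz_created"], ensures: &[] },
        Ctor {
            name: "second",
            requires: &["foo_loaded", "bar_started"],
            ensures: &["baz_created"],
        },
        Ctor { name: "first", requires: &[], ensures: &["foo_loaded", "bar_started"] },
    ]
}

fn main() {
    println!("{:?}", order(example()));
}
```

Note that "selects the first one" makes the result depend on declaration order whenever two constructors are both ready, so this only pins down a partial order, as intended.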

I dislike global constructors but I love the idea of explicitly calling global constructors! I imagine it could work well with the restrictions:

  • there is at most one function that is called before main (the #[global_constructor] defined in the bin)
  • compiler shows warnings if some crate provides a #[global_constructor] fn but it is not called - not sure if this should work for ‘all’ or ‘any’ execution path

With this restriction, why do you need it before main at all? Can't the programmer just make this "constructor" the first thing main calls?

edit: Oh I see, @toc showed it that way, but you two also want a warning/error if that's forgotten.

Perhaps you could accomplish that at the type level if the constructor actually returns some kind of token, which you then require as a parameter where that initialization is required.
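
A rough sketch of what I mean by a token, assuming a made-up library module `lib_a` with hypothetical `init` and `do_work` functions (none of these names come from a real crate):

```rust
mod lib_a {
    /// Proof that `init()` has run. The private field means code outside
    /// this module cannot build the token by hand.
    #[derive(Clone, Copy, Debug)]
    pub struct Initialized(());

    /// The "constructor": does whatever setup is needed, then hands out proof.
    pub fn init() -> Initialized {
        // ... one-time setup would go here ...
        Initialized(())
    }

    /// Any API that needs the setup takes the token as a parameter, so
    /// forgetting to call `init()` is a compile error, not a runtime bug.
    pub fn do_work(_proof: Initialized, x: u32) -> u32 {
        x * 2
    }
}

fn main() {
    let token = lib_a::init();
    println!("{}", lib_a::do_work(token, 21));
}
```

The obvious cost is that the token has to be threaded through every call that needs it.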

2 Likes

This pattern can be used now with global or god objects that just have the entire API as methods. I don't think it can be made to work for typetag though.

Hi! Want to thank everyone for this- there's a lot I hadn't considered, and I'm probably not the most qualified person to be writing this.

Regardless, here are some responses to individual points! Hope this mega-post format isn't the worst thing ever to read through...

I'll edit the text in the initial post too, if that's alright.

Thank you for mentioning this! I was thinking keeping with what rust-ctor does currently might allow for a nicer implementation, but if we can get stdlib to always initialize prior to these, that'd be all better.

I'm going to optimistically edit the pre-rfc text removing this limitation!

This also brings up the question of whether these actually need to be unsafe, or not. If we can fix stdlib being accessed, there might not be anything inherently unsafe about using global constructors.

I was wanting the order to be explicitly undefined so that code wouldn't rely on anything, but I guess having some control could be good.

My only use case for this was as a backend for the inventory crate, but initializing FFI libraries seems like a good use case for this!

Do you know if current libraries using rust-ctor could use lazy_static! or std::sync::Once instead? I guess I'm wondering if there's anything ctors allow which nothing else does, or if it's for the performance benefit.

Replacing lazy_static isn't something that I was aiming at, but I can see this doing that. I'll add it in... somewhere? Not sure.


It would certainly be interesting if we could implement something to back inventory without global constructors!

I guess that wouldn't necessarily help with FFI initialization, but if we could solve at least one problem here without global constructors I would be for that.

I'll add this to the alternatives - at a cursory look, it seems more different than similar to global constructors, but it could definitely be a good alternative solution.


The main problem now is that either a) only the binary crate would be able to add global constructors, or b) binary crates would be forced to call into all global constructors for all the libraries they use (recursively) and adding a new global constructor to a library would be a breaking change.

My main use case for global constructors is coordinating different libraries which don't know about each other, and want to all add to some global data store. Like if library A uses typetag to create a serializable trait, libraries B and C should be able to add data to the "all types implementing this trait" global list without A knowing about them.

Using an explicit solution like this, any binary crate depending on B and C would have to call the global constructors for each of their serializable types manually, even if those types are purely internal and the user shouldn't have to care about them.

The biggest disadvantage that I see, though, is that then library C can't add a new type which uses a global constructor without a new major version. Since adding a global constructor forces all consumer crates to now add a new line to their main function, anything involving global constructors becomes a breaking change.

This is true!

I haven't mentioned lazy_static as I don't believe it solves anything similar to the same problem, but I might not have really given a good explanation for that. It's true that global constructors would be able to do some of the same things lazy_static can do, but they can solve one extra case: when the crate using the global data has no idea the crate providing the data exists.

I'm adding more to the pre-RFC text, but here's another demonstration.

The best example I have is typetag. Say I have a logging crate which allows for various logger configurations, and can serialize those configurations into JSON. In my logger crate, I define a trait SerializableLogger, and use typetag on it.

Another crate, say logger-syslog-adapter, can then define a concrete implementation SyslogLogger.

When the consumer uses logger to deserialize their configuration, they want to be able to have it "just work" and deserialize it. With global constructors, typetag registers SyslogLogger into a static list of implementors of SerializableLogger. Then when the configuration is deserialized, if a syslog logger was specified, SyslogLogger is automatically grabbed and used as the logger for the Box<dyn SerializableLogger>, without logger ever mentioning logger-syslog-adapter in its source code.

This would have been impossible to implement with lazy_static as lazy_static requires the code providing the constructor to have been called at least once. But when dealing with cross-crate data like this, it's natural to only ever specifically call logger-syslog-adapter when setting up and serializing the data, not when deserializing it.
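
To make the limitation concrete, here is a small sketch with modules standing in for the two crates. Everything in it (`IMPLEMENTORS`, `known`, `ensure_registered`) is made up for illustration and is not how typetag is actually implemented. The point is only that a `Once`-style registration does nothing until somebody calls into the providing crate.

```rust
mod logger {
    use std::sync::Mutex;

    /// Stand-in for the "all implementors of SerializableLogger" list.
    pub static IMPLEMENTORS: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

    /// What deserialization would consult to find a concrete type by name.
    pub fn known(name: &str) -> bool {
        IMPLEMENTORS.lock().unwrap().iter().any(|n| *n == name)
    }
}

mod logger_syslog_adapter {
    use std::sync::Once;

    static REGISTER: Once = Once::new();

    /// Lazy registration: runs at most once, but only if someone calls it.
    pub fn ensure_registered() {
        REGISTER.call_once(|| {
            super::logger::IMPLEMENTORS.lock().unwrap().push("SyslogLogger");
        });
    }
}

fn main() {
    // Nothing has touched the adapter yet, so `logger` cannot see it.
    println!("before: {}", logger::known("SyslogLogger"));

    // Only an explicit call from the consumer makes the type visible.
    logger_syslog_adapter::ensure_registered();
    println!("after: {}", logger::known("SyslogLogger"));
}
```

A global constructor (or a linker-built list) is what would remove the need for that explicit call.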


Just want to say I'm super glad to have a different viewpoint on this. I'm not too experienced with these, and I'm really glad to have your input on this!

Done.

Thoughts on using #[unsafe_global_constructor] instead? I wanted to include the word unsafe in some way since, if we don't fix them being before stdlib initialization, they can break things. But I can see how unsafe fn is the opposite of this.

I... kind of get this, but isn't using libraries at all a security concern?

If an end user depends on a library, then it seems reasonable to assume that they are calling at least one of that library's functions. Sure, it might make debugging more annoying if the library's doing something odd on program initialization, but if we're depending on it and including its code in the end binary, I think we're trusting the library.

When reviewing a library, I would think global constructors should be able to stand out. If nothing else, keeping them unsafe in some form or another should highlight them compared to other (safe) code.

If we expect global constructors to be a niche feature, I would agree with you on this. But I feel like the more libraries use it for small things (like typetag traits), the less this would mean.

What would you propose the behavior be when the user doesn't call run_all_hooks()? If we have not calling it being an error, new users will probably just stick it in there anyways - and requiring use of unsafe just to use various libraries will devalue unsafe.

If it's silently allowed, or even with a warning, then suddenly parts of crates people depend on might just not work. If this is used for FFI initialization, we could end up in unsound territory, or if it's just for things like typetag, then deserialization could just fail at runtime.

Unrelated to the above, but I hadn't thought of using LLVM as a downside. I'll add that in.

Sounds reasonable- if this ends up as a full RFC, having it only for one platform would be... bad. I was thinking of this as an alternative for testing, but I guess that's still bad language design.

I usually would too, but I would hope the global aspect of it offsets that. I'm opposed to initializer because of its connotation of initializing a specific value somewhere. Keeping away from initializer would also help differentiate this from C++ static initializers, which are indeed intended to initialize a single static value.

If this initializes a crate, though, it kind of is constructing the crate's global state. I guess that's fairly similar to initializing the global state...

I'll add #[register_main_hook] in as an alternative, but if I found this somewhere in the code I think I'd have even less of an idea of what it does than global_constructor or _initializer.

I mean, I haven't thought through this? Good questions. I will try to add to these sections if I or anyone else comes up with reasons for either side.

This is one use case, but not my primary one. I will elaborate in the edited post.

Thank you for linking this!

I am excited about the possibility of this alternative, and will have to look into it more.

I'm planning on at least expanding this pre-RFC a bit further in its current direction to collect this knowledge and try to explore it? But you're right, the solution here doesn't really match up with the problem. I started approaching this from the point of view of "LLVM has a global_ctors attribute, and we use rust-ctor to solve the problem, so using rustc to take advantage of global_ctors seems like a good idea", not necessarily looking for the best language design solution.

My understanding is that this kind of built-in data store is less charted territory, but that's not necessarily bad. If we can have our cross-crate-coordination cake and (with no runtime cost) eat it too, that would be pretty great.

Adding this to unresolved questions. It seems like this could depend on whether we implement this using LLVM's global_ctors, or as part of the main shim if that proves useful for running after stdlib initialization?

I will be attempting to understand & follow links on @comex's explanation of this. (thanks for that!)

I think this ties into @dtolnay's post above.

My personal use case is directly just "using typetag"...

Just looking at rust-ctor's dependent crates for ideas, there's also

Not sure how useful that is. I can see the ANSI escape code setup being useful, but unless I'm misunderstanding a std::sync::Once check could probably work too? Maybe bad to do that initialization when panicking?

With the emacs crate, it looks like this would still want to use rust-ctor to get literal ctors even if we get support for global constructors in binary crates...

I guess my hope here was that by having global constructors be standalone functions rather than explicitly initializing variables, it would make it much harder to create a library which invokes undefined behavior when called before the global constructor is called. Like, we won't ever have any uninitialized statics unless someone explicitly uses MaybeUninit.

I'd argue that if we can encourage abstractions over this feature enough, leaving the order undefined could be entirely fine. In my perfect world no constructors would ever depend on one another.... Of course, that world won't exist.

One thing I'm worried about if we define the order, though, is seemingly arbitrary changes changing it.

For example, I'm extremely worried about becoming dependent on runtime order of global constructors within a crate- especially if that depends on something like the names of modules, or what order they're declared in. There is exactly one feature right now which depends on module declaration order, and that is macro declaration. When macro declaration fails though, we will get explicit compile time errors.

If a crate has an implicit dependency of one module's global constructor running before another, this all becomes much more hairy. What if rustfmt reorders the mod declarations? What if in refactoring, one module is renamed, making it sort differently and putting it above the other? These seemingly entirely innocent changes could break code depending on this order, and the error wouldn't be discovered till runtime.

Sure, this could happen with an undefined order too. However, a "defined" order which depends on easily changeable things like module order could lull users into a false sense of security worse than just not knowing any order at all.

Ehhh, or at least that's my scenario. Maybe that's unrealistic?


There are a few more posts that I haven't responded to here, will try to do that. Glad to have many more ideas in here! Hope this hasn't been too rambly.

As others have mentioned, something like the "Idea: global static variables extendable at compile-time" thread (post #2 by dtolnay) might be a better idea. But even if it is, I think exploring this one fully still has value.

Again, thanks!

2 Likes

No, typetag needs to create a list of impls of a trait. lazy_static and Once only run when explicitly asked to run, while typetag needs all impls in the whole program. To get it you would need global constructors, as they are the only way to run something without explicitly telling it to run.

1 Like

I too have been burned by C++ global ctors 👴 and because of this, I would also rather see us go after distributed_slice-like features first.

But I also have an evil idea which I can’t resist suggesting: the order of execution of global ctors is specified, but what the spec says is, they will be executed in a different randomly-chosen order on every run of the executable. Thus, if they’re not all independent, you have a good chance of catching it during QA, and nothing comes to depend on some unspecified-but-usually-stable order.
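
A sketch of how the evil version could look, purely as an illustration. `run_in_random_order` and the three dummy constructors are hypothetical. I'm using std's `RandomState` as a cheap per-run seed only because it avoids pulling in an external crate; a real implementation would presumably pick its entropy source differently.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts constructor calls so the effect is observable.
static CALLS: AtomicUsize = AtomicUsize::new(0);

fn ctor_a() { CALLS.fetch_add(1, Ordering::SeqCst); }
fn ctor_b() { CALLS.fetch_add(1, Ordering::SeqCst); }
fn ctor_c() { CALLS.fetch_add(1, Ordering::SeqCst); }

/// A seed that differs between runs, taken from std's randomly keyed hasher.
fn random_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Fisher-Yates shuffle driven by a small xorshift generator.
fn shuffle<T>(items: &mut [T], mut state: u64) {
    if state == 0 {
        state = 0x9E37_79B9_7F4A_7C15; // xorshift must not start at zero
    }
    for i in (1..items.len()).rev() {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        let j = (state % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}

/// Runs every constructor exactly once, in an order chosen fresh on each run.
fn run_in_random_order(ctors: &mut [fn()]) {
    shuffle(ctors, random_seed());
    for ctor in ctors.iter() {
        ctor();
    }
}

fn main() {
    let mut ctors: [fn(); 3] = [ctor_a, ctor_b, ctor_c];
    run_in_random_order(&mut ctors);
    println!("ran {} constructors", CALLS.load(Ordering::SeqCst));
}
```

With only a handful of constructors there are few possible orders, so a dependency between two of them would probably show up within a few runs, though not necessarily on the first one.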

4 Likes

Ah, sorry! I was trying to ask about libraries using rust-ctor for FFI initialization here.

I'm wondering if moving forward with distributed_slice instead of this would leave other libraries besides inventory and things depending on it without a cross-platform solution.

1 Like

I would like to provide my feedback on this pre-RFC (well, on the feature in general, rather) since it is very relevant to what we need.

Our main use case currently is our error handling. In short, we have &dyn failure::Fail (which will be &dyn std::error::Error eventually) and we need to “adapt” it to some other “&dyn ExtraInfo” trait. The crux of the issue here is that we need to know which concrete error types we have so we can check against them (via failure::Fail::downcast_ref). However, at the same time, the size of our system is such that we can no longer rely on manual registration of all these errors: it’s too error-prone.

Another use case would be our deserialization framework: it’s somewhat similar to the typetag crate in its nature. Currently we still rely on “manual” registration, but this is blocking future work on making it more modular (for example, we might try dylibs for pluggable types).

Finally, we are starting to shape parts of the system where we would have “application” developers building “plugins” for our core: again, manual registration becomes too error prone.

Currently we use the ctor crate (I didn’t know about inventory/linkme before). I think linkme would perhaps be the best solution for us, though.

Personally, I would be in favor of this design going in the direction of the linkme crate, for the following reasons (all of them are mentioned in this thread):

  • dodges ordering issue (well, offloads it to the consumer – it’s up to you to figure out the order you actually want / if you care!).
  • avoids running arbitrary code without you noticing (though, I also buy the whole argument of trusting libraries – of course, they can still do sketchy things!)
  • still allows for running “constructors” via Once/lazy_static!, if desired.
  • it generally seems to be something that would be easier to agree on?

I would say that it was a pretty frustrating experience figuring out our current solution, and outside of the ctor/inventory/linkme technique (“linker magic”), it seems like there are no good alternatives. However, all three of these have the same subtle failure mode in edge cases, which @Comex rightfully mentioned in this thread.

4 Likes

It is a good idea for an RFC, but I believe the RFC should state how no_std environments will be able to solve the problem too, or, if it cannot be solved there, say so explicitly. Rather than having the user provide multiple functions marked with special attributes, it might also be better to have a single global initialization hook. Since the idea is to mark free functions rather than to allow non-const static initialization, I don't believe there is a need for multiple functions in a single crate like that.

but all statics must still have some sane and initialized default.

Not necessarily: since we have no concept of running destructors for statics, a static could actually be allowed to be uninitialized.

Go does this for its hashmaps: iteration (as implemented, not as specified) proceeds in a random order per process.

This makes it impossible to write a macro-by-example implementing gflags in Rust. I argue that gflags are the premier use case for library ctors.

Rust's HashMap does this too, at least with the default hasher, as it's randomly seeded.

Ah, I couldn’t remember if Rust did. I vaguely remembered some bug about collision attacks but I couldn’t remember if it was related.

The interaction with no_std is definitely something that would need to be figured out!

If this is implemented using the same infrastructure as C++ static initializers, then the infrastructure they would run on wouldn't depend on allocation or anything else in std. It could even still run without the rust initialization in #![no_start].

I guess if we're implementing it as part of the rust start code, then there's more to figure out here. Thanks for bringing it up!

With the multiple functions, this is necessary in order to allow things like inventory to function. If we were only allowed one function per crate, there wouldn't be a reasonable way for an attribute-based macro to register multiple things on independent items to run on startup. As this kind of "registration" is one of the main use cases, I would be inclined to say that having multiple functions is needed.

I just mean this as far as it is true in rust in general. The following isn't valid rust, and has never been valid:

static X: &str;

Global constructors wouldn't allow you to write this either - all statics would need to be initialized to something, still. Even std::mem::uninitialized() is non-const, so I don't think it's possible at all to do this?

Maybe using MaybeUninit, but even then, it's not really "uninitialized" since unsafe code is required to read it later. This is a concern, but I don't think it's a large one.

To be clear, I'm proposing not changing these rules. Global constructors would not introduce the ability to have uninitialized statics, nor take it away (since we don't currently have it).
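
For reference, this is roughly what the MaybeUninit route looks like today, as a sketch only. `GREETING`, `init_greeting` and `greeting` are hypothetical names, and `init_greeting` is just a stand-in for whatever a global constructor would do. The static does have an initializer (`MaybeUninit::uninit()` is a const fn), and every access after that has to go through unsafe code.

```rust
use std::mem::MaybeUninit;
use std::ptr::{addr_of, addr_of_mut};

// Accepted today: the static has an initializer, even though its contents
// are not yet a valid `String`.
static mut GREETING: MaybeUninit<String> = MaybeUninit::uninit();

/// Stand-in for a global constructor.
///
/// Safety: must not race with any other access to `GREETING`. Calling it a
/// second time leaks the previous `String` rather than dropping it.
unsafe fn init_greeting() {
    unsafe {
        addr_of_mut!(GREETING).write(MaybeUninit::new(String::from("hello")));
    }
}

/// Safety: `init_greeting` must already have run, and nothing may be
/// writing to `GREETING` concurrently.
unsafe fn greeting() -> &'static str {
    unsafe { (&*addr_of!(GREETING)).assume_init_ref().as_str() }
}

fn main() {
    // The caller, not the compiler, is responsible for the ordering here.
    unsafe {
        init_greeting();
        println!("{}", greeting());
    }
}
```

So the "uninitialized" part is opt-in and fenced behind unsafe, which is why I don't think it changes the rules for ordinary statics.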


With all that said, I think going forward with a distributed_slice or linkme-based method is probably more sane. I've not put together a proposal, but I'll probably be trying to do that later next month or sometime soon.

2 Likes

If this is implemented using the same infrastructure as C++ static initializers, then the infrastructure they would run on wouldn’t depend on allocation or anything else in std. It could even still run without the rust initialization in #![no_start].

If it worked out of the box, that would be ideal, but I believe it leaves concerns about depending on std, as some statics may need memory allocation and friends. It has been a while since I looked at the Rust runtime, but most likely panics would be UB in this case, while heap allocations should still be allowed (I need to check at which step the global allocator is set up).

Even std::mem::uninitialized() is non-const, so I don’t think it’s possible at all to do this?

I believe better support for global constructors would require it to become const, let's say as an extra feature. While it is true we can use Option and friends, that is overhead. In short, ideally we need a const way to have an uninitialized static for a global constructor to initialize.

1 Like