Understanding Once

I have always been curious how Once works. I could imagine pseudocode like: `if !inited { init() }; return`.
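Spelled out as a Rust sketch (deliberately not thread-safe, just the pseudocode made concrete — the names here are mine, not std's):

```rust
// A naive, single-threaded "once": check a flag, run the closure
// only on the first call. NOT thread-safe: two threads could both
// observe `initialized == false` and both run the closure.
struct NaiveOnce {
    initialized: bool,
}

impl NaiveOnce {
    const fn new() -> Self {
        NaiveOnce { initialized: false }
    }

    fn call_once<F: FnOnce()>(&mut self, f: F) {
        if !self.initialized {
            f();
            self.initialized = true;
        }
    }
}

fn main() {
    let mut once = NaiveOnce::new();
    let mut runs = 0;
    for _ in 0..3 {
        once.call_once(|| runs += 1);
    }
    assert_eq!(runs, 1); // the closure ran exactly once
}
```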

Really nothing special here. But since Rust makes this part of its standard library, I am wondering whether it does some magic to optimize the init check away after the first init(). For example, modifying a function pointer to point directly at a return after the first init run, so that subsequent calls never check the init state again? Or, even better, eliminating the call altogether? Of course runtime modification is dangerous, but shouldn't it be safe to do inside Rust's own low-level, highly optimized library code?

No, it's not about optimization, but about thread safety. That's why it's in std::sync. If you didn't need thread safety, it would indeed be (almost) as easy as your pseudo-code.

The thing that Once promises: If multiple threads try to init at the same time, only (exactly) one of them should actually do it and the others should wait until the initialization has completed.
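That promise can be demonstrated directly with std::sync::Once — many threads race to initialize, the closure runs exactly once, and every call_once returns only after initialization has completed:

```rust
use std::sync::Once;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static INIT: Once = Once::new();
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

fn main() {
    // Ten threads race to initialize; `Once` guarantees the closure
    // runs exactly once, and the losers block until it has finished.
    let handles: Vec<_> = (0..10)
        .map(|_| {
            thread::spawn(|| {
                INIT.call_once(|| {
                    INIT_RUNS.fetch_add(1, Ordering::Relaxed);
                });
                // Once call_once returns, initialization is complete.
                assert_eq!(INIT_RUNS.load(Ordering::Relaxed), 1);
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(INIT_RUNS.load(Ordering::Relaxed), 1);
}
```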

Well okay, perhaps in a sense it's about optimization, too, but nothing like the suggestions you've made. If you consider thread safety, then you could still follow the pseudo-code you've suggested, if a Mutex is incorporated into the process. But that adds a lot more overhead to the init check once the value is already initialized. So compared to a naive mutex-based implementation, it is a question of optimization: once initialization has happened, later initialization checks shouldn't involve more than a single read of an atomic flag.
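The mutex-based version being compared against might look like this (a hypothetical sketch of my own, not std's implementation) — note that it pays for a lock/unlock on every call, even long after initialization:

```rust
use std::sync::Mutex;

// A mutex-based "once": correct under concurrency, but every check
// after initialization still takes and releases the lock, unlike
// std's single atomic load on the fast path.
struct MutexOnce {
    state: Mutex<bool>,
}

impl MutexOnce {
    const fn new() -> Self {
        MutexOnce { state: Mutex::new(false) }
    }

    fn call_once<F: FnOnce()>(&self, f: F) {
        let mut done = self.state.lock().unwrap(); // locked on EVERY call
        if !*done {
            f();
            *done = true;
        }
    }
}

fn main() {
    let once = MutexOnce::new();
    let mut runs = 0;
    for _ in 0..1000 {
        once.call_once(|| runs += 1);
    }
    assert_eq!(runs, 1);
}
```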

Thread safety is not the issue here. I am referring to the stage after initialization is done. At that point we are still checking whether it is initialized, even though we already know it is. I am just saying: init once, check a million times, is kind of a waste of computing power.

The check isn't that expensive.

Your suggestions included ideas like "modify the function pointer", but the (fast-path, already-initialized) checking code is generally inlined. There is no function pointer at run time; reading a function pointer and calling through it would be more expensive than checking a simple status flag.
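Roughly what that fast-path/slow-path split looks like (a simplified sketch of the double-checked pattern — the real std Once packs several states into one atomic and parks waiting threads instead of using a Mutex):

```rust
use std::sync::Mutex;
use std::sync::atomic::{AtomicBool, Ordering};

static DONE: AtomicBool = AtomicBool::new(false);
static LOCK: Mutex<()> = Mutex::new(());

fn call_once(init: impl FnOnce()) {
    // Fast path: after the first init this is one atomic load plus
    // a branch the CPU predicts essentially for free.
    if DONE.load(Ordering::Acquire) {
        return;
    }
    // Slow path: serialize racing initializers behind a lock,
    // re-check the flag, and run `init` at most once.
    let _guard = LOCK.lock().unwrap();
    if !DONE.load(Ordering::Relaxed) {
        init();
        DONE.store(true, Ordering::Release);
    }
}

fn main() {
    let mut runs = 0;
    for _ in 0..1000 {
        call_once(|| runs += 1);
    }
    assert_eq!(runs, 1);
}
```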

"Eliminate the call altogether" — I'm not sure how that would work. Maybe in a dynamic runtime system with JIT compilation, one could rewrite the assembly so that initialization checks for global singleton values that can never become de-initialized again are eliminated. But not all usages of Once involve global singletons, and Rust isn't a dynamic runtime with JIT compilation anyway.


It's exactly the sort of check that the CPU branch predictor is designed to make essentially free.


Good to know, this makes sense. I was just thinking that this is too much waste no matter how fast it is; it really should be zero cost! In real life, after I have furnished my apartment, I don't want to check whether there is a bed in my bedroom. Every time I walk into my bedroom, I can close my eyes and fall onto the bed.

This is exactly what I am talking about. Based on what you said, dynamic languages could be even more efficient, since they can modify themselves at run time. I am just asking whether Rust could do something like that — of course, only in its own low-level, highly optimized codebase.

On a side note, I am wondering about the following scenario: in an async app (in C#, in my Main.cs) I can initialize all my readonly static variables first; then all my futures and threads can start and use the static variables freely and safely.

It seems Rust does not support this scenario at all? If my app is async, do I HAVE to assume all static variables live in a multithreaded environment, and must I use LazyLock to initialize them?
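What I mean by the LazyLock approach would look something like this (assuming Rust 1.80+, where std::sync::LazyLock is stable; the static here is my own example):

```rust
use std::sync::LazyLock;
use std::thread;

// Initialized on first access, from whichever thread (or async task)
// touches it first; every later access just checks the "done" state.
static CONFIG: LazyLock<Vec<String>> = LazyLock::new(|| {
    vec!["host=localhost".to_string(), "port=8080".to_string()]
});

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| CONFIG.len()))
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), 2);
    }
}
```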

The Linux kernel does this optimization (runtime code patching of branches, known as static keys) and it has a real effect. Here is a Rust crate that does the same: GitHub - Evian-Zhang/static-keys: Static keys for Rust userland applications. I'm not sure how to use it to build Once and family, though.

But I think in the case of Once that optimization makes no sense, because the data and the flag would probably end up in the same cache line / page, so you will touch it even without reading the flag.


You can do absolutely anything; just write your own unsafe abstractions. For that kind of thing, Introduction - The Rustonomicon is compulsory reading.

Also, I advise optimizing only after measuring a real effect — premature optimization is the root of all evil. You can easily build that kind of thing with unsafe, so just benchmark it against the Once* version. Then, only if there is a measurable performance benefit, is the optimization worth pursuing.


By the way, why do you need to rely on statics? Why not pass &'static T to your futures? static_cell - Rust allows you to get a &'static mut T exactly once; you can initialize it, convert it into multiple shared references, and share those between futures.
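The same idea can be sketched with just std by leaking a Box at startup (this is a hypothetical std-only alternative I'm showing for illustration, not static_cell's API) — fine for values that live for the whole program:

```rust
use std::thread;

fn main() {
    // Leak a heap allocation to obtain a `&'static T` with no static
    // variable, no init-flag check, and no Arc refcount traffic.
    let config: &'static Vec<u32> = Box::leak(Box::new(vec![1, 2, 3]));

    let handles: Vec<_> = (0..3)
        .map(|_| {
            // `config` is just a pointer (Copy), so each task gets its
            // own copy of the shared reference.
            thread::spawn(move || config.iter().sum::<u32>())
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), 6);
    }
}
```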


Fair enough. I am wondering whether passing the variable around may be even more expensive than a static variable, even though the static involves a flag check?

I don't see how it can be more expensive. &'static T is a pointer, and a static variable is also accessed via a pointer. The only overhead is passing that pointer around instead of having the linker resolve the address at build time. So it is basically zero.

Again, please do not make premature optimizations; they consume your time, other people's time, and code clarity without any performance benefit.


Indirection through a function pointer is not free either. In fact, in most scenarios it is overwhelmingly likely to be vastly less efficient than a simple, 100%-predicted, extremely local branch — both because it precludes inlining (an incredibly important optimization) and because function pointers are terrible from the perspective of a modern (i.e. this millennium) CPU.


Instead of rewriting assembly one can do page fault shenanigans to initialize a value once without using atomics. But this requires using userfaultfd (linux-specific), making the initialization code async-signal-safe, a helper thread or some other complications.


Rust assumes that statics will be accessed by multiple threads concurrently, so what you described is not supported with plain statics. Rust is pessimistic here because it is designed to support multithreading safely in all scenarios, and it cannot trust the programmer to avoid accessing a static variable at the wrong time.

If the cost of the OnceLock "is_initialized" check is unacceptable (however small), a practical compromise is to use static variables with OnceLock to share read-only data that can be initialized up front. But instead of requiring a check every time the variables are accessed, long running tasks/threads can obtain a reference from the static OnceLock variable when they start or at opportune times, and then reuse the reference freely without "is_initialized" checks from that point onward. The amortized cost of the check would then approach zero. This same approach can be used to share using Arc such that the reference count is only incremented once per task/thread, or at least a very small number of times relative to other work performed.
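The pattern described above, as a sketch — initialize the OnceLock up front, then have each task pay the "is initialized" check once and keep the plain reference for its hot loop (names are my own example):

```rust
use std::sync::OnceLock;
use std::thread;

static TABLE: OnceLock<Vec<u64>> = OnceLock::new();

fn main() {
    // Initialize up front, before spawning workers.
    TABLE.set((0..100).collect()).unwrap();

    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                // Pay the "is initialized" check once at task start...
                let table: &'static Vec<u64> = TABLE.get().unwrap();
                // ...then reuse the plain reference with no further checks.
                let mut sum = 0u64;
                for _ in 0..1000 {
                    sum = table.iter().sum();
                }
                sum
            })
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), 4950); // 0 + 1 + ... + 99
    }
}
```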

But you really should think carefully about the true cost of a single atomic load before worrying about trying to optimize for this.