[pre-RFC] lazy-static move to std

RalfJung · July 21, 2018, 11:13am

Yeah I don't see us even trying to support extern "C" (FFI) calls in CTFE any time soon. Probably never. Non-determinism would break coherence unless we take some super expensive counter-measures.

So, this seems like a good motivation for lazy_static in libstd.

The final output value will probably still have restrictions, @oli-obk knows the status there better. What I have been talking about (and I should have been clearer) is the data that's used during the computation. So you could e.g. build up a HashMap and fill it with all sorts of stuff and in the end compute some number (or some other Copy data), that would work fine.
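As a rough sketch of that point, the const fn below builds intermediate data during evaluation and returns only a Copy value. It does not assume heap allocation is available in const fn, so a fixed-size scratch array stands in for the HashMap; the names N, sum_of_squares and SUM are made up for illustration.

```rust
// Hypothetical example: the scratch array only exists during evaluation,
// and only the final `u32` ends up in the static.
const N: usize = 8;

const fn sum_of_squares() -> u32 {
    // Intermediate data used during the computation.
    let mut scratch = [0u32; N];
    let mut i = 0;
    while i < N {
        scratch[i] = (i as u32) * (i as u32);
        i += 1;
    }
    // Reduce the scratch data to a single Copy value.
    let mut total = 0;
    let mut j = 0;
    while j < N {
        total += scratch[j];
        j += 1;
    }
    total
}

// Evaluated at compile time; no run-time initialization is involved.
static SUM: u32 = sum_of_squares();
```

Only the final number is stored; the scratch buffer never reaches the binary as a separate object.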

However, given that you can only ever get a shared reference to a static, actually I see no problem with just putting the buffer part of the Vec into static memory. You can't move out of this or push to it or otherwise cause reallocation, after all.

newpavlov · July 21, 2018, 11:19am

I mostly was talking about heap allocation (and I as well should've been more clear) in the context of return types.

I think we should be very careful with this approach, as code can assume that the underlying memory is managed by the allocator. But maybe returning &'static Vec<u8> will be fine; I don't know.

oli-obk · July 21, 2018, 11:37am

As an initial step we can allow

const fn foo() -> &'static [u8] {
    let mut v = Vec::new();
    v.push(42);
    v.leak()
}

and the leak method is the one from http://codyps.com/docs/leak/x86_64-unknown-linux-gnu/stable/leak/trait.Leak.html
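For comparison, here is a run-time sketch of the same shape using the inherent Vec::leak method from std rather than the Leak trait linked above. It is not a const fn, so it only illustrates the intended semantics, not the CTFE support being proposed.

```rust
// Run-time analogue of the snippet above (not a `const fn`).
fn foo() -> &'static [u8] {
    let mut v = Vec::new();
    v.push(42u8);
    // `Vec::leak` gives up ownership of the buffer; it is never freed,
    // so the returned slice can live for `'static`.
    v.leak()
}
```

Each call leaks a fresh allocation, so at run time this would normally be called once and the result stored.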

And as a further step we can then consider a form of CtfeAllocator which allows us to return a Vec<u8, CtfeAllocator>. Although that is dangerous due to mutating methods and thus possibly impossible to do safely.

RalfJung · July 21, 2018, 1:31pm

No, that is not correct. For example, this is a fully safe function:

use std::mem;

// `T` must be declared as a generic parameter for this to compile.
fn fake_a_box<'a, T>(x: &'a &T) -> &'a Box<T> {
  unsafe { mem::transmute(x) }
}

The matching thing for Vec is also legal. See "Abomonation: terrifying serialization" for more on this.

AndyGauge · July 23, 2018, 3:27pm

I've removed the point about generated documentation and added

const fn definitions are compile time definitions that can compute statics. If these functions could perform heap allocations lazily, there would be no need for lazy_static! in std.

RalfJung · July 24, 2018, 9:08am

Well, not really no need, as @anp mentioned above. Statics initialized via FFI calls would still need lazy_static.

However, using a CTFE-initialized static is certainly preferable where possible (no run-time initialization checks needed).
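A small sketch of the difference, with std::sync::OnceLock standing in for lazy_static on the run-time side. The names build_table, TABLE and lazy_table are hypothetical.

```rust
use std::sync::OnceLock;

const fn build_table() -> [u8; 4] {
    [1, 2, 4, 8]
}

// Initialized by CTFE: reading it is a plain memory access.
static TABLE: [u8; 4] = build_table();

// Initialized on first use: every access first checks whether
// initialization has already happened.
fn lazy_table() -> &'static Vec<u8> {
    static LAZY: OnceLock<Vec<u8>> = OnceLock::new();
    LAZY.get_or_init(|| vec![1, 2, 4, 8])
}
```

The lazy variant is still what you would reach for when the value comes from something CTFE cannot run, such as an FFI call.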

AndyGauge · July 24, 2018, 3:01pm

The need to move lazy_static to std is based on wide adoption and doing something important to many developers. I’m presuming the lazy_static crate would be the best place for FFI statics if the wide use case went away. I do not mean to say that lazy_static itself would go away.

Manishearth · July 24, 2018, 3:26pm

This RFC even quotes http://rust-lang.github.io/rfcs/1242-rust-lang-crates.html, which makes it very clear that std is an optional, hard-to-reach goal for a crate.

You’re skipping a step here – I’d petition for it to be moved into rust-lang first.

There’s very little motivation given here as to why lazy_static should be uplifted. The motivation given is very generic, stemming from What should go into the standard library? – it applies to many crates, not just lazy_static. Furthermore, that’s a discussion from 2015, and http://rust-lang.github.io/rfcs/1242-rust-lang-crates.html is the result of that; it’s been superseded by the RFC, which basically explicitly considers std inclusion to be not a major goal.

These days the things that go in std are usually things that have to, like futures (for async support) and simd (since they need intrinsics)

I actually feel that petitioning for a lang-level addition will work better here. Crates are assumed to be something you can easily import (not always true, but usually true), so uplifting them isn’t considered necessary or even a good thing (because now you’ve frozen the API). However, language additions can’t be done in external crates – only sometimes simulated as macros, so this may work.

KodrAus · August 1, 2018, 4:00am

These days the things that go in std are usually things that have to, like futures (for async support) and simd (since they need intrinsics)

@Manishearth That's a fair starting point. In my opinion:

std is a beneficial home for lazy_static! because, as a quasi language feature, it's had to track several changes to the language that would be easier to manage if it were under the privileged environment that std offers. Its API has barely changed, but the implementation has several times.
lazy_static! is a worthwhile addition to std because, as a quasi language feature, it's a self-contained cross-cutting concern that has applicability as broad as std itself.

However, I do think we could explore the rust-lang alternative more too.

dcarosone · August 1, 2018, 6:09am

Pardon the dumb question, but if lazy_static! goes into std, how do I use it in #![no_std] contexts? Because embedded code certainly uses a lot of lazy_static!

I suppose you really mean into core, but please say so.

mcy · August 1, 2018, 6:17am

I think that it’s safe to assume when someone says “this needs to go into std!” and “this” doesn’t involve Box or std::sync shenanigans, they mean “this needs to go into libcore!”

shrug

Tom-Phinney · August 1, 2018, 6:38am

It’s also a question of whether it needs to allocate memory. If it doesn’t, then it probably should go in libcore. If it does, then it won’t work in all #![no_std] environments, so it probably does need to be only in std.

mcy · August 1, 2018, 6:39am

Yeah, hence "Box shenanigans" (though we can split hairs and talk about Unique instead… =P)

luojia · August 1, 2018, 8:16am

Please consider some situations, especially operating system development. The lazy_static macro is widely used even without the std crate, in other words under #![no_std]. Moving it into the std crate would make it impossible to use in such situations. (We could still use extern crate lazy_static anyway.)

KodrAus · August 1, 2018, 11:12am

We are talking about libstd here specifically, because the #![no_std] support for lazy_static depends on the spin library. We don’t have the sync primitives in libcore to support lazy_static.

This RFC suggests we keep lazy_static in the ecosystem for no_std environments that could just shim the std one when available. I guess another alternative is to see whether we could internalize spin's simple Once implementation into lazy_static and make it available in libcore.
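To make the "simple Once" idea concrete, here is a minimal spin-based sketch that uses only core. It is not the spin crate's implementation; SpinOnce and its state constants are hypothetical names, and it deliberately leaves out panic handling and Drop.

```rust
use core::cell::UnsafeCell;
use core::mem::MaybeUninit;
use core::sync::atomic::{AtomicU8, Ordering};

const EMPTY: u8 = 0;
const RUNNING: u8 = 1;
const READY: u8 = 2;

/// Hypothetical spin-based one-time initializer built only on `core`.
/// Limitations: if `init` panics, waiters spin forever, and the stored
/// value is never dropped (acceptable for statics).
pub struct SpinOnce<T> {
    state: AtomicU8,
    value: UnsafeCell<MaybeUninit<T>>,
}

// SAFETY: the value is written exactly once, by the thread that moved the
// state from EMPTY to RUNNING, and is only read after READY is observed.
unsafe impl<T: Send + Sync> Sync for SpinOnce<T> {}

impl<T> SpinOnce<T> {
    pub const fn new() -> Self {
        SpinOnce {
            state: AtomicU8::new(EMPTY),
            value: UnsafeCell::new(MaybeUninit::uninit()),
        }
    }

    pub fn call_once(&self, init: impl FnOnce() -> T) -> &T {
        match self
            .state
            .compare_exchange(EMPTY, RUNNING, Ordering::Acquire, Ordering::Acquire)
        {
            Ok(_) => {
                // We won the race: compute the value, store it, publish it.
                let v = init();
                unsafe { (*self.value.get()).write(v) };
                self.state.store(READY, Ordering::Release);
            }
            Err(_) => {
                // Another caller is initializing or has finished: spin until ready.
                while self.state.load(Ordering::Acquire) != READY {
                    core::hint::spin_loop();
                }
            }
        }
        // SAFETY: the state is READY here, so the value has been written.
        unsafe { (*self.value.get()).assume_init_ref() }
    }
}
```

Spinning avoids any dependency on OS threads, which is why it fits no_std, but it wastes CPU under contention and deadlocks if the initializer re-enters call_once.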

matklad · August 1, 2018, 3:36pm

Thinking about it a bit more, I want to argue that:

std must provide a solution for the use-cases covered by lazy_static
lazy_static per se is not the best solution to these use-cases
with the recent stabilization of calling const fn and removal of &'static from std::sync::Once, std can provide the best solution.

The TL;DR conclusion is that we should stabilize lazy_cell and not lazy_static!

Use Cases

Today lazy_static solves two somewhat orthogonal use-cases:

lazy initialization
complex global data

Both of these use-cases are pervasive and not domain specific (some languages have built-in support for denoting laziness). Moreover, while they are easy to implement in other languages, in Rust they require unsafe code.

Note that const fn helps with the second use-case significantly, but still there are cases where you need runtime initialization (something like a global logger config, for example).

I feel pretty strongly that such generally useful unsafe code which expands the language vocabulary (as opposed to just making things faster) belongs to std.

Solution

Now, lazy_static! is not a really nice solution; it suffices to point out that it is a macro. What if we could just have some sort of lazy value container, which could also be used as a static? I think we can have just that now (and such an API was impossible before, due to &'static on Once and the inability to call const fn).

Here’s my rough proposal: std::cell::OnceCell, a non-thread-safe lazy value, and std::sync::OnceCell, a thread-safe counterpart, both of which have const fn constructors. I don’t think that the following snippet is using unsafe correctly. I don’t even think it works; I’ve never tested it. However, I am 90% sure that we have all the necessary language machinery to fix the bugs. And these all are basically copy-paste from lazy_static and lazy_cell.

Playground (EDIT: now with Drop)

Using this API, the example from lazy_static readme could look like this:

use std::collections::HashMap;
use std::sync::OnceCell;

pub fn hashmap() -> &'static HashMap<u32, &'static str> {
    static HASHMAP: OnceCell<HashMap<u32, &'static str>> = OnceCell::new();

    HASHMAP.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert(0, "foo");
        m.insert(1, "bar");
        m.insert(2, "baz");
        m
    })
}

Note that the primary difference from the lazy_static API is that the init function is supplied at access time, not at construction time. This is a feature, which makes the API more flexible: you can now close over stack data to initialize a static lazy value. It should be possible to provide a more traditional API on top:

struct Lazy<T, F: Fn() -> T> {
    cell: sync::OnceCell<T>,
    init: F,
}
impl<T, F: Fn() -> T> Lazy<T, F> {
    const fn new(init: F) -> Lazy<T, F> {
        Lazy { cell: sync::OnceCell::new(), init }
    }
}
impl<T, F: Fn() -> T> ::std::ops::Deref for Lazy<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        self.cell.get_or_init(|| (self.init)())
    }
}

However, I specifically propose not to add such a Lazy API to std just yet: it won’t work out quite well because you’ll need to spell out that F type for global values, which is awkward.
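To illustrate the earlier point that an access-time initializer can close over stack data, here is a sketch using std::sync::OnceLock, which plays the role of the proposed sync::OnceCell here. That substitution is an assumption of this example, and GREETING and greeting are hypothetical names.

```rust
use std::sync::OnceLock;

static GREETING: OnceLock<String> = OnceLock::new();

fn greeting(name: &str) -> &'static str {
    // The closure captures `name`, which is only known at the call site.
    // Only the first caller's closure runs; later calls return the stored value.
    GREETING.get_or_init(|| format!("hello, {name}")).as_str()
}
```

A construction-time initializer could not do this, since a static's initializer has no access to a caller's local variables.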

So, how do folks feel about RFCing the addition of cell::OnceCell and sync::OnceCell to std? Or am I missing some gnarly safety/usability hole?

matklad · August 1, 2018, 10:41pm

WAIT WAT? We actually can provide a nice API with stable Rust! We have associated constants & conversion from lambdas to function pointers!

Playground

use std::collections::HashMap;
use sync::{Lazy, OnceCell};

// macro and indirect fn call
static HASHMAP: Lazy<HashMap<u32, &'static str>> = lazy! {
    let mut m = HashMap::new();
    m.insert(0, "foo");
    m.insert(1, "bar");
    m.insert(2, "baz");
    m
};

// monomorphized and macroless
fn hashmap() -> &'static HashMap<u32, &'static str> {
    static HASHMAP: OnceCell<HashMap<u32, &'static str>> = OnceCell::INIT;
    HASHMAP.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert(0, "foo");
        m.insert(1, "bar");
        m.insert(2, "baz");
        m
    })
}

Guess I am publishing a crate…

Centril · August 1, 2018, 11:05pm

Contributing to the bikeshed:

Thunk

matklad · August 2, 2018, 12:13am

Here we go:

https://crates.io/crates/once_cell

I would really appreciate a code review (EDIT: haha, found a dangling pointer dereference!)

EDIT: keep in mind that the crate requires beta rust, for a non-static Once. EDITEDIT: nope, it doesn’t require the beta anymore, although on stable you need to deref explicitly and can’t use non-static sync::OnceCell value.

EDIT: I probably should mention one improvement over the playground version. Lazy types are now parametrized by F: FnOnce() -> T = fn() -> T, which gives both sweet syntax for the static case, as well as direct calls and the ability to work with closures if you can avoid spelling out a type.
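A minimal sketch of how such a defaulted F parameter could look, built on std::sync::OnceLock and Mutex rather than on the crate's internals. This is not the once_cell implementation; the field layout and closure_example are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::ops::Deref;
use std::sync::{Mutex, OnceLock};

// Hypothetical sketch: `F` defaults to a function pointer, so statics
// do not have to spell out a closure type.
pub struct Lazy<T, F = fn() -> T> {
    cell: OnceLock<T>,
    init: Mutex<Option<F>>,
}

impl<T, F> Lazy<T, F> {
    pub const fn new(init: F) -> Self {
        Lazy { cell: OnceLock::new(), init: Mutex::new(Some(init)) }
    }
}

impl<T, F: FnOnce() -> T> Deref for Lazy<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        self.cell.get_or_init(|| {
            // Take the initializer out so an `FnOnce` can be called by value.
            let f = self.init.lock().unwrap().take().expect("initializer already consumed");
            f()
        })
    }
}

// Static case: the non-capturing closure coerces to `fn() -> T`.
static HASHMAP: Lazy<HashMap<u32, &'static str>> = Lazy::new(|| {
    let mut m = HashMap::new();
    m.insert(0, "foo");
    m.insert(1, "bar");
    m
});

// Local case: `F` is inferred as the capturing closure's type.
fn closure_example(base: u32) -> u32 {
    let local: Lazy<u32, _> = Lazy::new(move || base + 1);
    *local
}
```

With the default in place, the static only names T, while local values can still use capturing closures through inference.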

KodrAus · August 2, 2018, 12:49am

Nice! Looks like you’re already avoiding the soundness issue in lazy_static due to its public fields because the fields of once_cell::Lazy don’t leak their invariants.

