Pre-RFC: Associated statics


#1
  • Feature Name: associated_statics
  • Start Date: 2018-05-14
  • RFC PR:
  • Rust Issue:

Summary

Traits and impls can have associated consts, but not associated statics. This RFC proposes to allow associated statics in traits and impls.

Motivation

consts are useful, but they are limited:

  • They can’t have some useful or necessary attributes applied to them (e.g. export_name, link_section, etc.).
  • They can’t refer to statics.
  • They can’t be imported from some external library.
  • They don’t represent a precise memory location.

These limitations apply to associated consts. Statics address these limitations, and associated statics in particular address these limitations with all the same benefits that associated consts have (compared to non-associated consts).
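The "precise memory location" point can already be observed on stable Rust. The following is a minimal sketch, not part of the proposal; the names CONST_COUNTER, STATIC_COUNTER, bump_const, and bump_static are made up for illustration. Each mention of a const produces a fresh copy of its value, so a mutation through interior mutability is lost, whereas a static is a single location that keeps it (Clippy flags the const half of this pattern for exactly that reason).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical names, for illustration only.
const CONST_COUNTER: AtomicUsize = AtomicUsize::new(0);
static STATIC_COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Increments the const "counter" and reads it back.
/// Every mention of `CONST_COUNTER` is a fresh temporary, so the increment is lost.
fn bump_const() -> usize {
    CONST_COUNTER.fetch_add(1, Ordering::SeqCst);
    CONST_COUNTER.load(Ordering::SeqCst)
}

/// Increments the static counter and returns the new value.
/// `STATIC_COUNTER` is one memory location, so increments accumulate.
fn bump_static() -> usize {
    STATIC_COUNTER.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    println!("const after bump: {}", bump_const());
    println!("static after first bump: {}", bump_static());
    println!("static after second bump: {}", bump_static());
}
```

An associated const behaves like CONST_COUNTER here; an associated static would behave like STATIC_COUNTER.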

Guide-level explanation

Static items can be associated with a trait or type, much like const items can be associated with a trait or type.

A struct may contain associated statics:

struct CustomStruct;

impl CustomStruct {
    // Statics may have attributes (like #[no_mangle]) applied to them.
    #[no_mangle]
    pub static CUSTOM_STRUCT_ID: u32 = 13;
}

fn main() {
    println!("The struct's ID is {}", CustomStruct::CUSTOM_STRUCT_ID);
}

A struct’s associated statics may be provided by an external library by wrapping them in an extern {} block:

struct Struct;

impl Struct {
    extern "C" {
        static EXTERN_STATIC: usize;
    }
}

fn main() {
    println!("The extern static is {}", unsafe { Struct::EXTERN_STATIC });
}

Traits may also have an associated static (though the static may not be extern):

trait Trait {
    static ID: usize;
}

struct Struct;

impl Trait for Struct {
    static ID: usize = 15;
}

fn main() {
    println!("The struct's ID is {}", Struct::ID);
}

The trait may define a default value (and linker-related attributes) for the static, though the implementation may override these:

trait Trait {
    #[link_section = ".default_key_section"]
    static KEY: usize = 1;

    #[link_section = ".default_value_section"]
    static VALUE: usize = 2;

    #[link_section = ".default_key_ref_section"]
    static KEY_REF: &'static usize = &Self::KEY;
}

struct Struct;

impl Trait for Struct {
    #[link_section = ".custom_key_section"]
    static KEY: usize = 16;
}

fn main() {
    // Struct::KEY's link section is ".custom_key_section".
    assert_eq!(Struct::KEY, 16);

    // Struct::VALUE's link section is ".default_value_section".
    assert_eq!(Struct::VALUE, 2);

    // Struct::KEY_REF's link section is ".default_key_ref_section".
    assert_eq!(Struct::KEY_REF as *const _, &Struct::KEY as *const _);
}

Reference-level explanation

Due to a number of complexities and limitations, this RFC proposes that the feature be kept minimal initially. Associated statics are only permitted as follows:

  • Generic structs and traits cannot contain associated statics, even if the static does not use the generic parameter.
    • This is because generic statics cannot be supported on Windows in a dynamic library without explicitly declaring which monomorphizations are provided (like C++ requires).
    • Even if the static does not use the generic type parameter, it is unclear whether all monomorphizations should share a single static or if they should each get their own. For example:
      struct Struct<T>(std::marker::PhantomData<T>);
      impl<T> Struct<T> {
          static STATIC: usize = 42;
      }
      
      It is unclear whether Struct<u8>::STATIC should be the same static as Struct<i8>::STATIC, or if they should refer to unique statics (in which case this hits the same problem on Windows as above). This can be addressed in a later RFC if desired.
  • The static itself cannot be generic. That’s a separate RFC.
  • Associated trait statics cannot be extern. For example, the following is not permitted:
    trait Trait {
        static TRAIT_STATIC: usize;
    }
    
    impl Trait for Struct {
        extern {
            static TRAIT_STATIC: usize;
        }
    }
    
    This is not permitted because extern statics are unsafe, and using an extern static breaks the safety contract of the trait. The following is also not permitted:
    trait Trait {
        extern {
            static TRAIT_STATIC: usize;
        }
    }
    
    It could be permitted, but this RFC proposes we start by not permitting this and leave it as a later expansion point.

Drawbacks

Since this RFC proposes minimal support for associated statics, there are some unsupported situations that are unlikely to be obvious to users. The compiler should take this into consideration when emitting error messages, and perhaps links to additional information resources should be included in the diagnostic messages.

Rationale and alternatives

The rationale for this feature is largely the same as the rationale for associated consts, except associated statics will be more useful for FFI.

Prior art

Associated consts blazed the trail for associated statics, and this RFC attempts to draw as much as possible from the existing associated consts feature.

Associated statics have been discussed before:

The primary roadblock these discussions mentioned is the problem of monomorphizing generic statics in a dynamic library. This RFC attempts to avoid that roadblock by prohibiting generic statics.

Unresolved questions

None (yet).

Future possibilities

The syntax extern { ... } within a struct item opens the door for associated extern functions. This RFC does not propose adding these, but future RFCs may do so.

Generic statics might be feasible if they are not part of the crate’s public API (since all monomorphizations are known at compile time and the previously mentioned dynamic library problem is avoided). A future RFC could explore this further.

Associated statics for generic traits and types that don’t depend on a generic parameter (e.g., impl<T> Struct<T> { static DOES_NOT_USE_T: usize = 42; }) might be feasible to add in a future RFC.


#2

Literally the only reason I want associated statics.

(Also, they can be supported on Windows if you know what you’re doing.)


#3

I’d love to add generic associated statics, but I have to defer to others regarding dynamic library issues since it’s beyond my expertise. If a solution exists, and if it’s not a contentious point that holds up the RFC, I’d be happy to include it.

I expect there to be contention regarding monomorphization for associated statics that don’t depend on a generic parameter, though. Personally, I’d rather keep the initial RFC minimal and get something implemented and stabilized, and then follow up with future RFCs that expand the feature set.


#4

It’s worth noting in the RFC that a trait with an associated static cannot be object-safe, for the same reason a trait with any other associated non-fns cannot be object-safe.

(Also wait we’ve had inherent consts since 1.20??? My life continues to be a lie.)


#5

Can a static have where Self: Sized?


#6

Yes, it is possible to have a common DLL for handling generic statics on Windows. I can talk about this all day.

For one, let’s talk about Any a little. Any is currently unsound, as it uses hashes and stuff. But fixing Any on Windows and keeping it fast is a bit of an issue currently. However, by using the common DLL trick, this can be achieved quite easily.

Then there’s the eventbus crate. It relies on the same proposed mechanism on Windows. As such, assuming eventbus currently works on Windows, we can actually prove that generic statics can work on Windows if you know what you’re doing!

The basic idea is that the common DLL looks something like this:

mod id_map {
    use std::marker::PhantomData;
    use std::sync::atomic::AtomicUsize;
    use std::sync::atomic::ATOMIC_USIZE_INIT;
    use std::sync::atomic::Ordering as AtomicOrdering;
    use std::sync::Mutex;
    use std::sync::Arc;
    use super::Event;
    use super::Handlers;
    lazy_static! {
        static ref EVENT_ID_MAP: Mutex<::anymap::Map<::anymap::any::Any + Send + Sync>> = Mutex::new(::anymap::Map::new());
    }
    struct EventId<T: Event + ?Sized> {
        id: usize,
        _t: PhantomData<dyn Fn(&mut T) + Send + Sync>
    }

    pub fn get_event_id<T: Event + ?Sized>() -> usize {
        static EVENT_ID_COUNT: AtomicUsize = ATOMIC_USIZE_INIT;
        EVENT_ID_MAP.lock().expect("failed to allocate event id").entry::<EventId<T>>().or_insert_with(|| {
            let handlers: Arc<Handlers<T>> = Default::default();
            Arc::make_mut(&mut super::HANDLERS.write().expect("???")).push(handlers);
            EventId { id: EVENT_ID_COUNT.fetch_add(1, AtomicOrdering::SeqCst), _t: PhantomData }
        }).id
    }
}

(This is a snippet taken straight from the eventbus crate. The real common DLL would use a “SlowAny” where this snippet uses “Any”, and “Event” would be replaced with the fast “Any”. One of the caveats is that std::any::Any would have to be the “SlowAny”, because the core::Any can’t work differently just because it’s Windows.)

Because this code would be in a common DLL, it would share properly, and all users would link to the same common DLL. Then, each user can ask for the TypeIds, and store them in a fast lookup scheme (I could talk about this all day, it’s basically a static attached to a function, but it uses a different static for each generic that gets passed to the function - each monomorphization of the same function leads to a different static).

(Ugh, I’m trying to explain this with simple words, but it’s hard.)

Anyway, the basic idea is that if you have a common DLL, say foo.dll, and you have 100s of DLLs that use foo.dll, then whatever is in foo.dll is accessible to all of those. If foo.dll has something like int errno; in it, then bar.dll can set errno=3 and baz.dll can read it back and get 3 as the value. We just use this on steroids.
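The shared-registry idea can be sketched within a single process roughly as follows. This is only an illustration under stated assumptions: registry and slot_id are hypothetical names, std::any::TypeId is used as the key purely for convenience (the real scheme described above would not rely on it), and nothing here deals with actual DLL boundaries. It only shows one shared table handing out a distinct, stable id per type on first use.

```rust
use std::any::TypeId;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Hypothetical single-process stand-in for the state the common DLL would own.
fn registry() -> &'static Mutex<HashMap<TypeId, usize>> {
    static REGISTRY: OnceLock<Mutex<HashMap<TypeId, usize>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Returns a stable id for `T`, allocating a new one the first time `T` is seen.
pub fn slot_id<T: ?Sized + 'static>() -> usize {
    let mut map = registry().lock().expect("registry lock poisoned");
    // Entries are never removed, so the current length is always an unused id.
    let next = map.len();
    *map.entry(TypeId::of::<T>()).or_insert(next)
}

fn main() {
    let a = slot_id::<u8>();
    let b = slot_id::<i8>();
    // Different types get different ids; asking again returns the same id.
    assert_ne!(a, b);
    assert_eq!(a, slot_id::<u8>());
    println!("u8 -> {}, i8 -> {}", a, b);
}
```

The first lookup for a type pays for the lock and the map insertion; a caller could cache the returned id afterwards, which is the "slow the first time, fast every other time" shape.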


Basically, anyone talking about generic statics on Windows should also mention how Windows has an extremely non-compliant C/C++ implementation, especially with regards to errno, and also talk about what Windows does to work around the issue.


#7

No. Proof:

static mut UNSIZED: [u8] = [] as [u8];
unsafe {
  UNSIZED = *(vec![0; user_input()].as_ref());
}

Thus, the allocation for UNSIZED cannot be determined at compile time, so it can’t be shoved into .data or whatever link section you otherwise dream up, unless you believe that resizing a static into the heap is reasonable, which I argue is a hilarious portability disaster.


It’s not clear to me that what you’ve posted is an example of what you’re describing? The items of a function do not monomorphize together with that function:

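fn foo<T>() {
    // Items declared inside a generic function are not generic themselves:
    // there is a single K shared by every monomorphization of foo.
    static mut K: i32 = 0;
    unsafe {
        K += 1;
        println!("{}", K);
    }
}

fn main() {
    foo::<[u8; 1]>();
    foo::<[u8; 2]>();
}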

This code prints 1 followed by 2. Your example is somewhat misleading, modulo an extremely generous reading of “this is what I want this code to do”.


#8

I mean where Self: Sized so as to make the trait object-safe by excluding the static from trait objects. (same way you can have trait Foo { fn bar<T>() where Self: Sized; } and Foo is object-safe)
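For reference, the existing trick for functions looks like this on stable Rust. It is a small sketch with made-up names (Unit, describe, and the body of bar are only for illustration): the generic function is excluded from trait objects by its where Self: Sized bound, so dyn Foo is still allowed.

```rust
trait Foo {
    // A generic function would normally prevent `dyn Foo`;
    // the `Self: Sized` bound excludes it from trait objects instead.
    fn bar<T>() -> usize
    where
        Self: Sized,
    {
        std::mem::size_of::<T>()
    }

    // An ordinary method that stays available through `dyn Foo`.
    fn name(&self) -> &'static str;
}

// Hypothetical implementor, for illustration only.
struct Unit;

impl Foo for Unit {
    fn name(&self) -> &'static str {
        "Unit"
    }
}

// Accepting `&dyn Foo` compiles only because `Foo` can be made into a trait object.
fn describe(foo: &dyn Foo) -> &'static str {
    foo.name()
}

fn main() {
    println!("{}", describe(&Unit));
    // `bar` is still callable on the concrete (sized) type.
    println!("{}", Unit::bar::<u32>());
}
```

The question is whether an associated static could carry the same kind of bound.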


fn foo<T with static mut K: i32 = 0>() {
    unsafe {
        K += 1;
        println!("{}", K);
    }
}
foo::<[u8; 1]>();
foo::<[u8; 2]>();

This code would print 1 followed by 1 if we had these things. eventbus uses painful macros to make this work correctly, and everyone using those macros hates it. But there’s not much I can do in eventbus except push for this “generic parameter with a static pinned to it” thing. (This static then propagates to the caller if the caller is generic, and so on, until it hits a caller that calls it with a concrete type.)


#9

I read Sized? as ?Sized. Oops. Presumably not; you can’t currently do that with associated items that aren’t functions, so that should be its own RFC.

Unfortunately, that’s a breaking change, so I doubt you’ll be seeing function-item-monomorphization without awful contortions. It might be to your benefit to think of ways to make your crate work without getting niche compiler support.

I’d suggest forking this thread to continue discussing generic statics, since this proposal looks like it intentionally doesn’t want to answer those questions for the sake of avoiding exactly this bikeshed.


#10

Pay attention to the syntax. The static wouldn’t be in the function body:

fn foo<T with static mut K: i32 = 0>() {

Also, I don’t think fixing Any is “niche compiler support”. But then you really don’t need any of this to fix Any - you just need to make it excruciatingly slow.


#11

Thanks, I’ve made a note to explicitly mention that in a revised draft.

Good question. I’ve made a note to explicitly mention in a revised draft (under the “Future possibilities” section) that a separate RFC could explore permitting where clauses on trait const and static items to make the trait object safe. This separate RFC could be worked on in parallel with this RFC.

That’s straying pretty far from the RFC I’m proposing here. Let’s explore that (and generic statics) in a separate discussion thread or RFC draft. I’m intentionally trying to keep this RFC proposal as simple as possible and avoid bikeshed-inducing proposals. Currently, associated statics aren’t permitted at all, even in the simplest case. I’m trying to fix that in this RFC to do two things:

  1. Unblock users who only need the simplest case of associated statics.
  2. Open the door to exploration for more difficult cases (like generic statics). I believe it will be easier to explore these different cases once simple associated statics are supported.

Additionally, I’m hoping that by keeping the RFC simple we can get something implemented (and hopefully stabilized) sooner than if we started with the whole kitchen sink. I think the incremental approach here is the ideal approach.

With that in mind, I’m not going to discuss generic statics further in an attempt to keep this discussion focused. I appreciate the points you’ve raised, Soni, and I hope we can discuss and explore them further in a future discussion or RFC.


#12

Thanks for taking the time to write this! Here are some questions & notes I have to aid in making progress…

This section feels thin; before filing it against rust-lang/rfcs I’d encourage you to collect more motivation, especially real world use cases.

I’d like to see some discussion of why, given this restriction, it’s OK to have associated trait statics at all. The implicit first parameter of a trait (Self) makes for an implicitly generic static, so you may end up with:

// crate A:
// ------------------------
pub trait Foo { static Bar: usize; }

pub struct Alpha;
pub struct Beta;

impl Foo for Alpha { static Bar: usize = 42; }
impl Foo for Beta { static Bar: usize = 24; }

// crate B
// ------------------------
pub fn takes_foo<T: Foo>() { ... }

In particular, I’d like some reasoning around how we can guarantee that no matter where <X as Foo>::Bar is being projected, it always ends up at the same memory address (which is something unsafe { ... } should be able to assume).

You’ve discussed the static semantics of your proposed changes, but not the dynamic/operational semantics. So even if it might be obvious, I’d like a discussion in this section of the proposal on the dynamic semantics of all the bits and pieces. You don’t have to specify it formally, but it should be clear what is meant.

I’d also move out any rationale for assorted choices in the reference to the section on… well… the rationale… :wink:

In all discussions hitherto about generic statics it was always clear that each monomorphization would get its own memory location… why else would you have a generic static? And furthermore, given associated types you may run into:

trait Foo {
    type Bar;
    static Baz: Self::Bar;
}

struct Gamma;
struct Delta;

impl Foo for Gamma {
    type Bar = u8;
    static Baz: u8 = 42;
}

impl Foo for Delta {
    type Bar = bool;
    static Baz: bool = false;
}

If a u8 and a bool shared the same memory location, type preservation would be directly violated, and you’d get unsoundness.

Speaking of associated types; I think you’ll need to consider impl specialization + associated types and whether when generics are used it’s reasonable to expect that <X as Foo>::Bar always refers to the same memory address.

Some discussion on statics in C++ might be useful here.

That’s fair and I would be OK with experimenting with associated statics on nightly. However, I wouldn’t be inclined to stabilize associated statics without first having at the very least an accepted design + implementation on nightly for associated statics in generic implementations. I have this constraint to ensure that when we’re done with associated statics, what comes out is coherent and not a patchwork of inconsistencies.

I think the main hurdle for generic statics are dylibs (cc @rkruppe, How about "generic global variables"?, Generic-type-dependent static data, Generic-type-dependent static data, https://discordapp.com/channels/442252698964721669/443151225160990732/498975070518116365). We might want to kill them off entirely; but we should consider whether we are OK with preventing future better dynamic linking models (might also interact with hotloading a la erlang?).


#13

I don’t think there is an issue with dylibs, as I also mentioned the issues with C’s errno. We just need a simple workaround to do runtime monomorphization on subpar platforms. Everywhere else we can safely do it at compile-time. Also, don’t be scared by the term “runtime monomorphization” - it’s about as fast as RTLD_LAZY, that is, slow on the first time, but fast every other time.


#14

Thanks for the thorough review, @Centril! It’s going to take some time for me to incorporate it all into a revised draft.

Sorry the initial draft isn’t clear in this regard, but I’m actually not suggesting that Alpha::Bar and Beta::Bar share the same memory location. I would expect them to have distinct memory locations. In other words, each impl is treated separately and distinctly.
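To make the intended behavior concrete, here is a rough approximation on stable Rust. It is a sketch only: since associated statics don’t exist yet, it uses a hypothetical accessor function bar() that returns a reference to a static declared inside each impl, and address_of is a made-up helper. The Foo, Alpha, and Beta names mirror the example above.

```rust
pub trait Foo {
    // Stand-in for the proposed `static Bar: usize;`.
    fn bar() -> &'static usize;
}

pub struct Alpha;
pub struct Beta;

impl Foo for Alpha {
    fn bar() -> &'static usize {
        // This static belongs to the Alpha impl only.
        static BAR: usize = 42;
        &BAR
    }
}

impl Foo for Beta {
    fn bar() -> &'static usize {
        // A separate static with its own memory location.
        static BAR: usize = 24;
        &BAR
    }
}

// Hypothetical helper: the address seen through a generic projection.
pub fn address_of<T: Foo>() -> *const usize {
    T::bar() as *const usize
}

fn main() {
    // Each impl has its own location, and projecting through a generic
    // parameter always reaches the same location for a given type.
    assert_ne!(address_of::<Alpha>(), address_of::<Beta>());
    assert_eq!(address_of::<Alpha>(), Alpha::bar() as *const usize);
    println!("Alpha: {}, Beta: {}", Alpha::bar(), Beta::bar());
}
```

The expectation is that <Alpha as Foo>::Bar and <Beta as Foo>::Bar would behave like the two BAR statics here.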

I would love to have a mechanism that would allow code to opt-in to merging some statics (which I’ve mentioned in the linkage feature tracking issue), but that’s a separate issue from this RFC.

I’m hoping to avoid discussing generic statics much, but I will say that there is some precedent for this position (not that I’m necessarily arguing for or against it; I’m merely pointing out that I don’t think the answer is universally obvious): if a generic function uses a (locally defined) static, there is still only one static created (which each monomorphization shares).

To be clear, I’m talking about when the static doesn’t use the struct’s/trait’s generic parameter(s) (much like how statics within a generic function don’t (can’t, currently) use the function’s generic parameters). If the static does use the generic parameters, then yes, I would expect each monomorphization to get its own memory address.

Again, I’m not arguing for any particular position here. I just expect we’ll have to discuss this in detail when generics and statics mix, and it’s a discussion I’m hoping to avoid for the initial implementation of associated statics.

Again, this isn’t what I’m proposing. Each impl's statics get their own (distinct) memory locations. I see no problem with the Foo trait and Gamma/Delta implementations; I think this code is compatible with this RFC.

That sounds reasonable to me and I’m perfectly fine with not stabilizing anything until the bigger picture is figured out. If possible, I’d still like to keep this RFC smaller in scope, though, and figure out generics in a follow-up addendum/RFC. Do you think the lack of generics would prevent acceptance of this RFC?


#15

Sure; I’m not saying you are; I’m merely pointing out what would happen if they would have the same memory location and why it isn’t tenable (so that you may use it as writing material… :wink: ) The point of this exercise is to show that even with:

pub trait Foo { static Bar: usize; }

there might be problems because we can think of Foo::Bar as:

static FooBar<Self>: usize;

so you are back to generic statics again as far as I can tell.

I put it to you that having “maybe” as an answer would be difficult for users to learn; and therefore it seems to me that it’s easiest for users to understand any associated or generic static as having a unique memory location per monomorphization. If you want a single memory location instead you can just use a single non-polymorphic static.

At least from my part I wouldn’t block experimentation on nightly on lack of a design for generics.

However, I think that you should try to use the “future possibilities” or “rationale and alternatives” sections to discuss and think about generic statics. We don’t have to settle anything in the initial bid and it doesn’t have to be part of the guide/reference, but forward thinking would be helpful for reviewing.

(I generally appreciate thoroughly written and long RFCs because they typically show that the author has put in the effort and has thought about corner cases and assorted interactions, which is useful because it helps with holistic language design.)


#16

One possibly-dumb question: what does this enable that an associated fn get_baz() -> &'static Baz does not?


#17

What does this enable that a global static does not?

Edit: actually I could think of an example, so this is now mostly a rhetorical question, but this should probably be clarified in the rfc.