How about "generic global variables"?


#1

A generic struct Foo<T> may need a generic global variable instantiated with the same T. Something like:

/// library code

static mut FOO<T>: Option<Vec<T>> = None;

And for all instantiations of FOO::<T> there exists a global variable of the type Option<Vec<T>>. For example:

/// client code

let i32s_inited = FOO::<i32>.is_some(); // type annotation
let first_str: String = FOO  // type inference
    .as_ref()  // borrow instead of moving out of the static
    .and_then( |foo| foo.first().cloned() )
    .unwrap_or_default();

will produce two global variables, as if they had been declared previously:

/// library code

// not valid identifiers, just for demonstration
static mut FOO::<i32> : Option<Vec<i32>> = None;
static mut FOO::<String> : Option<Vec<String>> = None;

Since this is not valid Rust syntax right now, we have to use a trait as an alternative to achieve a similar result:

/// library code

pub trait GlobalFoo where Self: Sized {
    fn foo() -> &'static mut Option<Vec<Self>> {
        unimplemented!()
    }
}

pub struct Foo<T> { /* omitted */ }

impl<T> Foo<T> where T: 'static + GlobalFoo
{
    fn using_generic_global( &self ) {
        let foo = T::foo();
        // omitted
    }
}

/// client code

static mut FOO_I32: Option<Vec<i32>> = None;
static mut FOO_STRING: Option<Vec<String>> = None;

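impl GlobalFoo for i32 {
    fn foo() -> &'static mut Option<Vec<i32>> {
        // Accessing a `static mut` requires `unsafe`.
        unsafe { &mut *std::ptr::addr_of_mut!(FOO_I32) }
    }
}

impl GlobalFoo for String {
    fn foo() -> &'static mut Option<Vec<String>> {
        unsafe { &mut *std::ptr::addr_of_mut!(FOO_STRING) }
    }
}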

let i32_foo = Foo::<i32>::new();
i32_foo.using_generic_global();

let string_foo = Foo::<String>::new();
string_foo.using_generic_global();

The alternative method is more verbose.

So, is it worth introducing “generic global variables”?


#2

I have wondered about associated statics before, something like

struct Foo<T>(Option<T>);

impl<T> Foo<T> {
    static FOO: Self = Foo(None);
}

I’m not sure if there’s some reason this wouldn’t work though. First question is when do they get monomorphised? I think delaying it till all variants are collected in the binary should work.


#3

I’ve had this exact issue in my current project. I have an interning system that’s completely transparent… except that you have to run this macro for every type you want to use it with because I can’t have a monomorphised static. That means exposing the macro, plus all the internal types and interfaces. Blech.

Plus consts, while we’re at it. I had a macro-based RTTI system in stable Rust over a year (?) ago whose sole problem was that it needed monomorphised consts to not be hideously inefficient.

Actually, that reminds me: one nice thing D had was that you could have bare templates. Not templated classes, just templates. The closest approximation in Rust would be generic modules. You could shove anything you wanted in a template, and it would Just Work™. Come to think of it, there’s still a few things Rust could stand to learn from D, but that’s getting off-topic.
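
For readers who have not hit this, the per-type macro workaround described at the top of this post can be sketched roughly as follows. This is only an illustration, not the actual interning code: the Interned trait, the impl_interned! macro, and the intern function are made-up names, and a linear scan stands in for a real lookup structure.

```rust
use std::sync::Mutex;

// Hypothetical trait: each participating type exposes its own table.
pub trait Interned: Sized + 'static {
    fn table() -> &'static Mutex<Vec<Self>>;
}

// Hypothetical macro: generates one ordinary (non-generic) static per
// listed type. A library would have to export this for its users.
macro_rules! impl_interned {
    ($($ty:ty),* $(,)?) => {
        $(
            impl Interned for $ty {
                fn table() -> &'static Mutex<Vec<$ty>> {
                    // `$ty` is concrete here, so a plain static is allowed.
                    static TABLE: Mutex<Vec<$ty>> = Mutex::new(Vec::new());
                    &TABLE
                }
            }
        )*
    };
}

// The macro has to be invoked for every type that should be supported.
impl_interned!(i32, String);

// Returns the index of `value` in its type's table, inserting it if absent.
pub fn intern<T: Interned + PartialEq>(value: T) -> usize {
    let mut table = T::table().lock().unwrap();
    if let Some(index) = table.iter().position(|v| *v == value) {
        return index;
    }
    table.push(value);
    table.len() - 1
}

fn main() {
    let a = intern(String::from("hello"));
    let b = intern(String::from("hello"));
    println!("same slot: {}", a == b);
}
```

With generic statics, the macro invocation line would not be needed at all, which is the point of the complaint above.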


#4

Had this exact issue with my event bus crate. Never used it since.


#5

I had this issue in my trees crate :slight_smile:


#6

(btw I don’t recommend returning &mut but that’s my opinion. it means any crate can reset the variable by calling the function and setting it to whatever. use an & and inner Mutex)
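
A minimal sketch of that suggestion, assuming the trait from the first post is changed to return a shared reference to a Mutex instead of a &'static mut. The push_global helper is hypothetical and only there to show usage.

```rust
use std::sync::Mutex;

// Library side: hand out a shared reference to a lock, never `&'static mut`.
pub trait GlobalFoo: Sized + 'static {
    fn foo() -> &'static Mutex<Option<Vec<Self>>>;
}

// Client side: one ordinary static per type. `Mutex::new` is a const fn
// (stable since Rust 1.63), so no `static mut` and no `unsafe` are needed.
static FOO_I32: Mutex<Option<Vec<i32>>> = Mutex::new(None);

impl GlobalFoo for i32 {
    fn foo() -> &'static Mutex<Option<Vec<i32>>> {
        &FOO_I32
    }
}

// Hypothetical helper: append a value, creating the vector on first use,
// and return the new length.
pub fn push_global<T: GlobalFoo>(value: T) -> usize {
    let mut guard = T::foo().lock().unwrap();
    let vec = guard.get_or_insert_with(Vec::new);
    vec.push(value);
    vec.len()
}

fn main() {
    push_global(1_i32);
    push_global(2_i32);
    println!("{:?}", *i32::foo().lock().unwrap());
}
```

Other crates can still call foo(), but all they get is the lock, so every mutation is synchronized and no two &mut references can overlap.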


#7

I didn’t bother writing a perfect interface in the example code :slight_smile: ; it’s just for demonstration.

And in my case, the global variables are not for public use, only struct Foo<T> has access to them.


#8

static mut is still very problematic when encapsulated by privacy. For starters, you need to enforce synchronization, since other parts of the application may launch additional threads that call into the encapsulating code (so you’d have unsynchronized mutation). Furthermore, even if only a single thread is involved, code using static mut is almost certainly not reentrant because it’s UB to hold two &mut to the same static at overlapping times (see also https://github.com/rust-lang/rust/issues/53639).

Leaving static mut aside, this observation hits at the key problem I see with implementing generic statics:

Because we still (unfortunately) support the dylib crate type, there is not necessarily a single place where all monomorphizations are known, and as far as I know there’s no cross platform way to merge duplicate statics at the linking stage. So we can’t actually guarantee that there is precisely one FOO::<i32> that all code agrees on.


Note that it’s possible (with some run time overhead) to use TypeId and Box<Any> to collect “one value per type” in a single static. See https://crates.io/crates/typemap for example, though I think that’s unmaintained and I have not audited the implementation.
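
To make the TypeId idea concrete, here is a rough sketch using only the standard library rather than the typemap crate. The names registry and with_foo are invented for this example, and it assumes a toolchain with std::sync::OnceLock (Rust 1.70 or later).

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

type Registry = HashMap<TypeId, Box<dyn Any + Send>>;

// The single, non-generic static that backs every "per-type" slot.
fn registry() -> &'static Mutex<Registry> {
    static REGISTRY: OnceLock<Mutex<Registry>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

// Hypothetical helper: run `f` on the `Vec<T>` belonging to `T`,
// creating an empty vector the first time `T` is seen.
pub fn with_foo<T, R, F>(f: F) -> R
where
    T: Send + 'static,
    F: FnOnce(&mut Vec<T>) -> R,
{
    let mut map = registry().lock().unwrap();
    let slot = map
        .entry(TypeId::of::<T>())
        .or_insert_with(|| Box::new(Vec::<T>::new()) as Box<dyn Any + Send>);
    let vec = slot
        .downcast_mut::<Vec<T>>()
        .expect("the entry keyed by TypeId::of::<T>() holds a Vec<T>");
    f(vec)
}

fn main() {
    with_foo(|v: &mut Vec<i32>| v.push(1));
    with_foo(|v: &mut Vec<String>| v.push("a".to_string()));
    let n = with_foo(|v: &mut Vec<i32>| v.len());
    println!("i32 entries: {n}");
}
```

The run time overhead mentioned above shows up here as a lock, a hash lookup, and a downcast on every access. The lock is held while the closure runs, so the closure must not call with_foo again or it will deadlock.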


#9

why not generate functions? it’s still less overhead than AnyMap or w/e, but allows runtime monomorphization, I think.


#10

I’m not sure how useful this is to what you want, but we do have “generic consts” if you abuse the trait system:

trait MyConst: Sized {
    const VAL: Self;
}

impl<T> MyConst for T {
    const VAL: T = ..;
}

<T as MyConst>::VAL

Of course, this only works because consts only live in the compiler’s imagination and don’t actually need to be monomorphized…
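
A compilable variant of the same trick, in case it helps. It swaps the elided `..` for a concrete value, the size of the type in bytes, so the constant is a usize rather than a T. The names ByteSize, BYTES, and bytes_of are made up for illustration.

```rust
use std::mem::size_of;

// Hypothetical trait carrying one constant per implementing type.
trait ByteSize {
    const BYTES: usize;
}

// Blanket impl: every sized type gets its own value of the constant,
// evaluated at compile time wherever it is used.
impl<T> ByteSize for T {
    const BYTES: usize = size_of::<T>();
}

// Generic code can read the constant without any trait bound at the
// call site, because the blanket impl covers all `T`.
fn bytes_of<T>() -> usize {
    <T as ByteSize>::BYTES
}

fn main() {
    println!("{} {}", bytes_of::<u8>(), bytes_of::<u64>());
}
```

No storage is shared between instantiations here, which is why this works for consts and does not carry over to statics.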


#11

Thanks for all the useful information.

The lack of a guaranteed single instantiation of generic statics really is the deal breaker; it rules out declaring generic statics in library code as an acceptable option.

Is it OK to fall back on the alternative of declaring normal statics in client code and simulating generic statics through the trait system, provided the global variables are synchronized and encapsulated in a way that rules out reentrant calls (I know every path that reaches these globals in my library)?

Using typemap or a similar crate, with some affordable run-time overhead, also seems to be an acceptable option. But it just packs many global variables into one single global variable, the map itself. Quoting the example code from typemap’s README:

    let mut map = TypeMap::new();
    map.insert::<KeyType>(Value(42));
    assert_eq!(*map.get::<KeyType>().unwrap(), Value(42));

Library users would not be allowed to access the map, since it’s an implementation detail.


#12

Because we still (unfortunately) support the dylib crate type, there is not necessarily a single place where all monomorphizations are known, and as far as I know there’s no cross platform way to merge duplicate statics at the linking stage. So we can’t actually guarantee that there is precisely one FOO::<i32> that all code agrees on.

How does C++14 deal with this issue? The following C++14 snippet:

#include <limits>
#include <iostream>

template <typename T>
static auto foo = std::numeric_limits<T>::max();

int main() {
	std::cout << foo<float> << std::endl;  // prints 3.40282e+38
	foo<float> = 42.0;
	std::cout << foo<float> << std::endl;  // prints 42
	return 0;
}

appears to be very similar to a generic static mut in Rust.


#13

C++'s static storage class is internal to the current compilation unit, so it’s not at all comparable. However, leaving it out (template<typename T> auto foo = ...;) also works, which is interesting. I don’t have the time to dig in now, but one interesting scenario would be two dynamic libraries trying to share foo<int>, particularly on Windows. I believe the linkers that are relevant for us can unify identically-named symbols statically linked into the same executable, but the setup for dynamic libraries may not be flexible enough to relocate all references to a global to a single instantiation.


#14

damn, you are right with static. For some reason, replacing static in the C++ snippet above with extern does work, but I have no idea how that works in practice (https://ideone.com/3kfXMJ), as in, whether you can then also write template <typename T> extern auto foo; in a different TU and use it.


#15

The fact that you can write template<..> extern does not imply that it is useful. For instance, your code only creates an instance of foo<float>, and nothing else (godbolt). If you create a shared object you will notice that every specialization of foo other than float produces a link error.

In C++ there is no way you can put a templated declaration inside compiled code, for the simple reason that static and dynamic objects cannot contain generic code.

This is the main reason Boost libraries like Hana are header-only libraries. And this is obviously a huge pain in big projects using lots of metaprogramming, because compile times tend to increase a lot.


#16

In C++ you can use a template function declaration, without a definition, from any TU as long as the TU that contains the definition explicitly instantiates it for the types used in the other TUs. I stopped using C++ shortly after variable templates appeared, but I always supposed that, since they are templates after all, you only need a declaration to use them as long as the appropriate explicit instantiations are linked.


#17

You are right, but the library has to explicitly instantiate the function/class for every type that might be useful; otherwise the user will run into errors at link time.


#18

Yes, you are correct, I don’t think something like that will work for Rust. In C++ there is one TU in charge of creating the symbol, and all other TUs must refer to it :confused:


#19

You don’t actually need C++14 to run into problems with this; even in C++98 you can have situations where something can be instantiated multiple times but needs to have a unique address across the whole program. This is known as “vague linkage”. Here’s a good post about the phenomenon and how it adds work for the dynamic linker:

https://ridiculousfish.com/blog/posts/i-didnt-order-that-so-why-is-it-on-my-bill-episode-1.html

On Linux (and other ELF systems) and Darwin, the dynamic linker properly uniques such symbols across dynamic libraries, despite the added cost, using so-called weak linking. Windows doesn’t bother, unless you manually mark one instantiation as dllexport and the others as dllimport.