Generic specialization

Take this code as an example of behavior that would be useful but does not compile:

struct Container<T> {
    data: T,
}

impl<T> Container<T> { // default implementation
    fn report(&self) { // a method that can be overridden
        println!("type name: {}", std::any::type_name::<T>());
    }
    fn get(&self) -> &T { // another method
        &self.data
    }
}
impl Container<String> { // override the default implementation
    fn report(&self) { // a different implementation for String with custom logic
        println!("String length: {}", self.data.len());
    }
    fn tokenize(&self) -> Vec<&str> { // extra method that is not available generically,
                                      // only when T is known to be String
        self.data.as_str().split_whitespace().collect()
    }
    // get is inherited from the default implementation
}

Instead of this syntax, which is safe although not valid in current Rust, we have to use traits or fragile, ugly implementations to get this behavior:

struct Container<T> {
    data: T,
}

impl<T> Container<T> { // default implementation, same as before
    fn report(&self) { // a method that can be overridden
        println!("type name: {}", std::any::type_name::<T>());
    }
    fn get(&self) -> &T { // another method
        &self.data
    }
}
trait Tokenize { // we have to define a trait to get the wanted behavior
    fn tokenize(&self) -> Vec<&str>;
}
// now to implement our override of the default implementation
struct ContainerString {
    data: String,
}
impl ContainerString {
    fn report(&self) { // getting the custom report function, but at what cost
        println!("String length: {}", self.data.len());
    }
    fn get(&self) -> &String { // have to reimplement this to get the custom report
        &self.data
    }
}
impl Tokenize for ContainerString {
    // we get our wanted extra function here, but with extra boilerplate
    fn tokenize(&self) -> Vec<&str> {
        self.data.as_str().split_whitespace().collect()
    }
}

While I could do something with autoderef, the complexity remains. As far as I can see, my design is still safe and does not break any existing code. Additionally, this is not SFINAE, as it is simply overriding a default implementation. I do not develop rustc, but I do think this could and should be implemented in the language at some point. I understand how this could become complicated and would like to know your thoughts.
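For reference, the autoderef-based approach mentioned above could look roughly like the sketch below on stable Rust. The trait names `ReportFallback` and `ReportString` are made up for illustration, and the methods return a `String` instead of printing so the result is easy to inspect. It relies on method resolution trying the receiver type as written before adding another `&`, so the impl on `Container<String>` is found before the fallback impl on `&Container<T>`.

```rust
struct Container<T> {
    data: T,
}

// Hypothetical fallback trait, implemented for a *reference* to any Container<T>.
// It is only reached after method lookup adds one extra `&` to the receiver.
trait ReportFallback {
    fn report(&self) -> String;
}

impl<T> ReportFallback for &Container<T> {
    fn report(&self) -> String {
        format!("type name: {}", std::any::type_name::<T>())
    }
}

// Hypothetical specialized trait, implemented for Container<String> itself.
// It matches without the extra `&`, so it takes priority for that type.
trait ReportString {
    fn report(&self) -> String;
}

impl ReportString for Container<String> {
    fn report(&self) -> String {
        format!("String length: {}", self.data.len())
    }
}

fn main() {
    let s = Container { data: String::from("hello world") };
    let n = Container { data: 7u8 };

    // The explicit `&` is what makes the priority ordering work.
    println!("{}", (&s).report()); // resolves to ReportString
    println!("{}", (&n).report()); // resolves to ReportFallback
}
```

This only works where the concrete type is known at the call site (for example in a macro expansion); inside a function that is generic over `T`, the call always resolves to the fallback, which is part of why the complexity remains.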

I believe this is covered by the in-progress, unstable specialization feature: Rust Playground

Edit: min_specialization is sufficient: Rust Playground

#![feature(min_specialization)]

struct Container<T> {
    data: T,
}

impl<T> Container<T> {
    fn get(&self) -> &T {
        &self.data
    }
}
impl Container<String> {
    fn tokenize(&self) -> Vec<&str> {
        self.data.as_str().split_whitespace().collect()
    }
}

trait Report {
    fn report(&self);
}

impl<T> Report for Container<T> {
    default fn report(&self) {
        println!("type name: {}", std::any::type_name::<T>());
    }
}

impl Report for Container<String> {
    fn report(&self) {
        println!("String length: {}", self.data.len());
    }
}

AFAIK the specialization feature is somewhat unsound because specialization on lifetimes is insufficiently restricted: there are ways to specialize on lifetimes which shouldn’t be allowed. I’m not sure what the current status of specialization is; maybe that’s been fixed? The tracking issue for the feature seems to indicate it hasn’t been fixed yet. However, min_specialization should be fine.

This is definitely something that the compiler intends to eventually support, but I assume it’s years away from stable.

However, specialization (and min_specialization) focus on specializing trait impls rather than inherent methods of a struct. I’m not sure if there’s a good reason for that, or if it’s simply not seen as important. Maybe specialization of inherent methods can be implemented as sugar for defining a private dummy trait with default fn items for the inherent methods. If specialization gets stabilized and nobody’s implemented this by then, I think it’d be worth pursuing.


This is already possible and very useful.

But adding specializations of inherent methods is simply a form of ad-hoc polymorphism (function overloading), and function overloading infamously does not play well with type inference. So much so that traits (type classes) were originally invented to solve the problem of overloading arithmetic operators (for different fundamental types) in Haskell! It's not impossible to reconcile the two; Scala, at least, supports overloaded methods. But it's not exactly trivial either.

specialization is very unsound due to its interaction with lifetimes. Nobody has figured out a good way to solve that problem.

min_specialization is a more limited version of specialization. It’s believed to be sound except:

There’s also ongoing work on a feature to check whether a type implements a trait. It’s not even in nightly yet though.


If I may ask, why does specialization have problems with lifetimes? Also, is there any way to achieve behavior similar to min_specialization in stable Rust without making wrapper types?

Castaway and similar crates can be used for some use cases. Not sure if it would work for your use case.

In short: due to function pointers like for<'a> fn(&'a str), there MUST be exactly one version[1] of a function for any possible generic lifetime instantiations. Combine this with the fact that you're allowed to write impl Trait for &'static str, a trait impl that is not always applicable to any valid instantiation of the generic Self type, and trying to specialize on the presence of a trait impl can run into cases where it cannot specialize a separate code path.

This becomes unsound because it requires post-generic-instantiation validation, and the compiler is very much not set up to handle such with the current monomorphization process. As a result, the version of a lifetime-specialized function that gets called is essentially random, potentially allowing you to get a non-'static lifetime the compiler let you handle as if it were 'static, thus a use after release.
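As a rough illustration of the two ingredients described above (not of the unsoundness itself, which needs the unstable feature), here is a stable sketch. The names `StaticOnly`, `as_static`, and `len_of` are hypothetical. It shows a single higher-ranked function pointer being called with both a `'static` and a short-lived borrow, next to a trait impl that exists only for the `'static` instantiation of `&str`.

```rust
// Hypothetical trait that is implemented only for `&'static str`,
// i.e. for one particular lifetime instantiation of `&'a str`.
trait StaticOnly {
    fn as_static(self) -> &'static str;
}

impl StaticOnly for &'static str {
    fn as_static(self) -> &'static str {
        self
    }
}

// One function, generic over the lifetime 'a.
fn len_of<'a>(s: &'a str) -> usize {
    s.len()
}

fn main() {
    // A single function pointer that is valid for every lifetime 'a.
    // Whatever code sits behind it cannot depend on which 'a is used.
    let f: for<'a> fn(&'a str) -> usize = len_of;

    let local = String::from("abc");
    println!("{}", f("static text")); // called with a &'static str
    println!("{}", f(&local));        // called with a shorter borrow

    // The borrow checker accepts the 'static-only impl here because the
    // literal really is 'static; `local.as_str().as_static()` is rejected.
    let kept: &'static str = "static text".as_static();
    println!("{kept}");
}
```

If a specialized body of `len_of` could be selected based on `&'a str: StaticOnly`, the pointer `f` would have to pick one body for both calls, which is the conflict the explanation above describes.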

Specialization on fully concrete types like String is a sound subset, and that subset is roughly what the min_specialization feature exposes (ignoring the unsound specialization traits that std allows for… reasons). Specialization on traits becomes doable if we get "specializable traits" which are required to be implemented using only specialization-safe bounds.[2]


If all of your types are 'static and specializations concrete, you can use "Any specialization", where you attempt downcasts of a generic type. This will trivially compile out in builds with any optimization as it's a branch on comparison of constant values.
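A minimal sketch of this "Any specialization" on stable Rust, applied to the `Container` type from the original question, might look like the following. It assumes `T: 'static` and returns a `String` rather than printing, purely to make the result easy to check; the downcast goes through `std::any::Any::downcast_ref`.

```rust
use std::any::Any;

struct Container<T> {
    data: T,
}

// `T: 'static` is required so that `T` can be viewed as `dyn Any`.
impl<T: 'static> Container<T> {
    fn report(&self) -> String {
        // Attempt to downcast the generic field to the concrete type String.
        let as_any: &dyn Any = &self.data;
        if let Some(s) = as_any.downcast_ref::<String>() {
            // "Specialized" path, taken only when T is String.
            format!("String length: {}", s.len())
        } else {
            // Default path for every other T.
            format!("type name: {}", std::any::type_name::<T>())
        }
    }
}

fn main() {
    let a = Container { data: String::from("hello world") };
    let b = Container { data: 42u32 };
    println!("{}", a.report());
    println!("{}", b.report());
}
```

Unlike the call-site tricks, this also works inside functions that are generic over `T`, at the cost of the `'static` bound and of listing every special case in one method body.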


  1. That version can be duplicated, but it's always the same code, just potentially at different instruction addresses. ↩︎

  2. As a rough example, consider TrustedLen. This trait still needs to be unsafe, but the specialization could potentially be checked in a future version of Rust by writing something like

    pub unsafe spec trait TrustedLen: Iterator {}
    
    unsafe impl<T: Sized> TrustedLen for Iter<T> {}
    impl<T: Sized> Iterator for Iter<T> { /* … */ }
    

    This would:

    • Tell the compiler that TrustedLen is safe for specialization with a baseline impl of Iterator.
    • For the impl of TrustedLen, verify that all satisfaction requirements are specialization safe.
      • Since the baseline impl Iterator is in the same crate, all bounds implied by Iterator (here just T: Sized) are considered specialization safe for the impl of TrustedLen.
      • If the baseline impl is in an upstream crate, that impl's bounds are also required to be specialization safe at the impl TrustedLen site, unless upstream's impl is marked in a way that makes relaxing those bounds a semver break, perhaps via the final keyword.
      • Specialization safe impl requirements are:
        • const generics/values,
        • lifetimes without any explicit bounds,
        • substituting a generic type parameter with a type that only captures specialization safe generics (if any),
        • requiring a spec trait bound, or
        • bounds implied by required spec traits' base bounds if and only if the providing impl is final (or local).

    Then there's the other kind of specialization, where you want to specialize on the addition of a trait instead of safety refinement, like perhaps:

    spec trait ExactSizeSpec: Iterator + spec ExactSizeIterator {}
    

    This form would again use Self: Iterator as the base bound, but the bounds required to satisfy impl ExactSizeIterator are still checked for specialization safety (either locally or by upstream final impl). (Although note that this specialization probably also wants to be an unsafe specialization trait so that it can actually use the size information in an unsafe manner; it's just a simple example.)

    I'm very slowly collecting all of these concepts into a blog post or two on a theoretically sound (and hopefully complete) system for full specialization. Maybe this will be the push to get me over the long persistent writer's block? (I incidentally solved my last hang-up — distinguishing between base and spec supertraits — while writing this footnote, hence the length.) ↩︎
