Idea: `where match` clauses

zrk · May 9, 2020, 10:40am

Hello everyone,

I wanted to share an idea I had about a formulation for specialization that wouldn't rely on overlapping impls. Given how much work has been put into specialization and how advanced design and implementation are, I don't believe this alternative formulation should be considered now, but I wanted to share it in the hope it would be useful nevertheless.

So, currently specialization relies on having overlapping impls where a "more specific" one can be chosen. In this alternative, there is no "more specifc" impls, and overlapping impls always result in an error like in today's Rust.

Rather, it allows to specify several distinct implementations when declaring an impl<T>, so that the correct implementation is chosen depending on the bounds on T or its concrete value. To do so, we allow a new syntactical construct, where match that allows matching on "type patterns".

For an example, let's consider the AddAssign trait from the specialization RFC:

trait AddAssign<Rhs=Self> {
    fn add_assign(&mut self, rhs: Rhs);
}

And the various specialized impls, using where match:

impl<R, T> AddAssign<R> for T
where match T {
    T: AddAssignSpec<R> => {
        fn add_assign(&mut self, rhs: R) {
            self.add_assign(rhs);
        }
    }
    T: Add<R> + Copy => {
        fn add_assign(&mut self, rhs: R) {
            *self = *self + tmp;
        }
    }
    T: Add<R> + Clone => {
        fn add_assign(&mut self, rhs: R) {
            let tmp = self.clone() + rhs;
            *self = tmp;
        }
    }
}

In this example, we where match on the T generic parameter, which allows us to choose an implementation of the AddAssign trait depending on the bounds on T.

In each "arm" of the match, we have a classic pattern => impl, where pattern is a new kind of pattern, a "type pattern", and impl is the contents of the item being defined (here, the contents of the trait implementation). The T type is evaluated against each pattern in each arm, and the selected implementation is the one of the first arm whose pattern matches T.

So, for instance, in the example above, if T implements the AddAssignSpec<R> trait, then the first implementation, that actually delegates to this AddAssignSpec<R> trait, is selected. This allows for specialization, as a type that wishes to specialize its behavior with regard to AddAssign just needs to implement AddAssignSpec<R>, and this impl will be chosen even if the type is Clone or Copy. The AddAssignSpec<R> trait is a regular, good old trait that requires implementing fn add_assign(&mut self, rhs: R). If the type does not implements AddAssignSpec<R>, the next pattern is evaluated, and so the impl is chosen if the type is Add<R> + Copy. If the type is not Add<R> + Copy, the third implementation is chosen if the type is Add<R> + Clone. Finally, if the type is none of the above, pattern matching fails, and compilation fails.

Now, I did not try to formalize what kinds of type patterns would be allowed for matching, but I guess a first list would be:

A concrete type, e.g. u32
A free identifier name (in the example above, T) with or without bounds (T alone matches anything).
_ (matches anything, not very different from T without bounds, but cannot appear on the right side)
Lifetimes appearing in types would not be allowed to be different from '_ (or elided), as matching on the lifetime would be error-prone (and I heard it might be hard on codegen too )

A where match clause could appear in trait implementations, in functions definitions or in trait definition.

For example, the example above could be rewritten more concisely as:

impl<R, T> AddAssign<R> for T {
    fn add_assign(&mut self, rhs: R) where match T {
        T: AddAssignSpec<R> => self.add_assign(rhs),
        T: Add<R> + Copy => *self = *self + rhs,
        T: Add<R> + Clone => { let tmp = self.clone() + rhs; *self = tmp; }
    }
}

The second motivational example in the RFC is that of the Extend trait, and could be rewritten as follows:

pub trait Extend<A> {
    fn extend<T>(&mut self, iterable: T) where T: IntoIterator<Item=A>;
}

impl<A> Extend<A> for Vec<A> {
    fn extend<T>(&mut self, iterable: T) where match T {
        T: ExtendSpec<Vec<A>> =>  iterable.extend(&mut self),
        &'_ [A] =>  { /* optimized implementation */ }
        T: TrustedLen => { self.reserve(iterable.size_hint().1.unwrap()); for x in iterable { self.push(x) } }
        T: IntoIterator<Item=A> => for x in iterable { self.push(x) }
    }
}

Again, a type can specifically opt-in specialization by implementing the trait ExtendSpec<Vec<A>>. Otherwise, if it isn't done, then if the type is &'_ [A], we can implement an optimized implementation (that the RFC doesn't provide and I'm too lazy to implement myself). Importantly, in this block, the iterable variable is known to be of type &'_ [A], which allows to use its methods etc. etc. Otherwise, we have yet another impl if T is TrustedLen, and finally the "default" impl if T is IntoIterator<Item=A>.

The specialization RFC allows to define "default partial trait implementations" for some types. This is possible with where match clauses, by allowing them in trait definitions.

Using again the example from the specialization RFC:

trait Add<Rhs=Self> {
    type Output;
    fn add(self, rhs: Rhs) -> Self::Output;
    fn add_assign(&mut self, rhs: Rhs) where match Self {
        T : Copy => { *self = *self + rhs; }
        T : Clone => { let tmp = self.clone() + rhs; *self = tmp; }
        _ => impl;
    }
   }
}

In this example, we provide some default implementations for add_assign, under some conditions. If the type is Copy, we provide the first implementation as default implementation. This doesn't force a Copy type to use this implementation, though, they can still manually redefine add_assign in their trait implementation. If the type is not Copy nor Clone, we use the _ => impl arm, where the impl keyword indicates that there is no default implementation provided and that, while types are allowed to implement this type, they should provide an implementation of the add_assign function. It is important not to forget that arm, as otherwise only Copy or Clone types would be able to implement the add_assign function.

So, what do you think? Has something like this already been suggested before (I couldn't find anything with a cursory search in internals)? Certainly there are 1000s of reasons why this design would not work in practice, but I thought the idea was cool, as I really like matching on values in rust, it is a very expressive construct, and so I thought it would be nice to be able to do the same on types.

As I said, I don't expect this to influence the language in any capacity, but feel free to discuss!

RustyYato · May 9, 2020, 10:44am

One issue with this solution is that it doesn't work across crate boundaries. I.e. a downstream crate can't specialize a trait impl. Also, simply disallowing 'a doesn't work becuse there are more subtle ways to specialize based on a lifetime. For example, woud (T, T) specialize (T, U,) if so, it would require lifetime equality. But this particular issue plauges the current implementation of specialization. (You can use min_specialization for a sound subset)

Besides that, I like this formulation.

Ixrec · May 9, 2020, 10:50am

This also seems to strongly imply disjoint / mutually exclusive impls, which are very much in the "maybe someday" pile because lang design is hard. See: https://github.com/rust-lang/rfcs/pull/1148#issuecomment-165425724. It's probably for the best that they're separate from specialization.

zrk · May 9, 2020, 10:52am

One issue with this solution is that it doesn't work across crate boundaries. I.e. a downstream crate can't specialize a trait impl.

My answer to this is the practice of having a TraitSpec trait for any Trait that has a blanket implementation. Is that not enough?

// Crate upstream
pub trait Foo { fn foo(); }
pub trait FooSpec { fn foo(); }

impl<T> Foo for T {
    fn foo() where T {
        T : FooSpec => T::foo(),
        _ => { println!("generic implementation") }
    }
}

fn foo<T : Foo>(t: T) {
    T::foo()
}

// crate downstream
struct A {}

impl upstream::FooSpec for A {
    fn foo() { println!("Specialized"); }
}

fn main() {
    upstream::foo(A); // prints "specialized"
}

Ixrec · May 9, 2020, 10:57am

I believe this does make it work again, though it's not nearly as intuitive or self-explanatory as the current proposal's "just write default impl if you want to allow specializations". I'm also not sure this would compose well when there's more than two crates involved. Besides, cross-crate specialization is arguably the use case for making specialization a first-class built-in feature, since basic specialization use cases within a single crate can already be worked around in pretty much exactly the same way.

zrk · May 9, 2020, 11:24am

Yes, of course, this is a hard issue, one for which I don't personally hope to ever find a solution... We could forbid anything that implies type equality (e.g., matching on (T, T)), but I'm not sure this is enough.

Or, we could allow them in the pattern, but fail the comparison for anything that requires lifetime equality.

Or just "bite the bullet" and pass lifetime information to codegen. But now this is an orthogonal matter.

H2CO3 · May 9, 2020, 5:55pm

I think until specialization is reasonably fleshed out, we shouldn't start designing and adding largely overlapping features.

zrk · July 5, 2020, 8:06am

For cross-referencing purposes: I added a link to this discussion in the tracking issue for specialization.

I don't really understand this comment. Once specialization is fleshed out and stabilized (there was discussions to stabilize min_specialization in the 2021 edition), won't that be too late for "largely overlapping features"? If anything, I would have expected people telling that it is already too late, with all the implementation effort that has been invested, not too early.

Specialization is a long requested feature that is taking time to implement, in part because it is definitely a huge undertaking (and I'm grateful to everyone making useful features happen in rust), in part because of the soundness issues that were discovered since the RFC was accepted. I think it may be useful to discuss alternative designs that would also allow for specialization. Even if the design doesn't directly solve the soundness issues (that would be too easy), taking a different perspective on the problem may allow us to find better solutions, maybe even applicable to the original formulation. Plus, the proposed design seem to have other advantages over the currently accepted design, such as less "action as a distance" (there is less "impl chasing" to be done in order to see which impl actually implies for a given type since upstream picks the order of specialization, and downstream crates just implement another trait themselves). Someone on the specialization issue also indicated that this could be extended to const functions on types (such as matching on the size of a type using the same where match construct).

I don't really see how the proposed design imply disjoint or mutually exclusive impls, can you maybe elaborate? It seems to me that the proposed design wouldn't really change a lot the current situation, the only difference is that a single impl could contain "overlapping bounds", which would be resolved by their order in match arms. Other impls would check for overlap like today, except that for impls containing a where match each arm of the match would be checked for independently.

This could be solved by teaching? If the specialization pattern described in the post were described in the guide, surely it would be more accessible? If this pattern is deemed very frequent, we could even later provide a macro to apply on your trait at definition time to generate the associated specialization trait.

I'm also not sure this would compose well when there's more than two crates involved.

This would definitely be a problem, but as of now I can't see why this would be different from the existing situation? Each downstream crate would be able to specialize its own types or traits, wouldn't they?

zrk · July 12, 2020, 9:19am

Hello again! I'm replying here to the comments I got on the Tracking issue for specialization github issue, because I don't want to derail their thread any further .

@nikomatsakis

Thank you for your interest in the idea, and for linking aturon's blog post! While I worked with the specialization RFC as base for this idea, I wasn't aware of how it fits in the larger picture of inheritance patterns. It is not surprising in the slightest, but this demonstrates that specialization has already been given a lot of thoughts from the rust team.

I tried to adapt the longer example from the blog post to using where match clauses, here's what I got at:

Somewhat long rust code using `where match`

// Node ////////////////////////////////////////////////////////////////////////

trait Node {
    fn parse_plain_attribute(&self, name: &Atom, value: DomString) -> AttrValue where match Self {
        T: Element => {
            match name {
                &atom!("id") => AttrValue::from_atomic(value),
                &atom!("class") => AttrValue::from_serialized_tokenlist(value),
                _ => fallthrough::parse_plain_attribute(self, name, value), // <- This is where we need a way to reference "the next impl"
            }
        } 
        _ => AttrValue::String(value)
    }

    // A very radical alternative that doesn't require fallthrough but is a bit far from the proposed `where match` construct:
    fn parse_plain_attribute(&self, name: &Atom, value: DomString) -> AttrValue {
        match (Self, name) {
            (T: Element, &atom!("id")) => AttrValue::from_atomic(value),
            (T: Element, &atom!("class")) => AttrValue::from_serialized_tokenlist(value),
            (_, _) => AttrValue::String(value),
        }
    }
    // additional virtual methods for Node
}

// non-virtual methods for Node
impl Node {
    fn is_parent_of(&self, child: &Node) -> bool {
        // ...
    }
    // additional methods here
}

// Element /////////////////////////////////////////////////////////////////////

trait Element: Node {
    fn as_activatable(&self) -> Option<&Activatable> where match T {
        T: Activatable => Some(self),
        _ => None
    }
    // additional Element methods
}

// non-virtual methods for Element
impl Element {
    fn nearest_activable_element(&self) -> Option<&Element> {
        // ...
    }
}

// Activatable /////////////////////////////////////////////////////////////////

trait Activatable: Element {
    // moar methods
}

// HtmlAnchorElement ///////////////////////////////////////////////////////////

struct HtmlAnchorElement {
    rel_list: DomTokenList,
    // moar fields
}

impl Node for HtmlAnchorElement {
    fn parse_plain_attribute(&self, name: &Atom, value: DOMString) -> AttrValue {
        match name {
            &atom!("rel") => AttrValue::from_serialized_tokenlist(value),
            _ => default::parse_plain_attribute(self, name, value), // here 'default' works the same without specialization
        }
    }
}

impl Element for HtmlAnchorElement {
    // ...
}

impl Activatable for HtmlAnchorElement {
    // ...
}

// HtmlImageElement ////////////////////////////////////////////////////////////

struct HtmlImageElement {
    url: Option<Url>,
    image: Option<Arc<Image>>,
    // moar fields
}

impl Node for HtmlImageElement {
    fn parse_plain_attribute(&self, name: &Atom, value: DOMString) -> AttrValue {
        match name {
            &atom!("name") => AttrValue::from_atomic(value),
            &atom!("width") | &atom!("height") |
            &atom!("hspace") | &atom!("vspace") => AttrValue::from_u32(value, 0),
            _ => default::parse_plain_attribute(self, name, value),
        }
    }
}

impl Element for HtmlImageElement {
    // ...
}

As you correctly determined, the difficult part is default, aka the ability to refer to "the next" (more generic) implementation. Indeed, in traditional match constructs in Rust, we don't have a way to "fallthrough" the next match arm (which may or may not be applicable depending on bounds in the case of where match) . This is made worse by the pattern of "specialization trait I propose", because in the implementation of the specialization trait, we don't have any explicit link to the base trait. The following example is an adaptation of the blog post's, and demonstrate the problem:

// Crate up

use bar::Bar;

pub trait Foo {
    fn foo(&self);
}

pub trait FooSpec {
    fn foo(&self);
}

impl<T: Bar> Foo for T {
    fn foo(&self) where match T {
        T: FooSpec => FooSpec::foo(self),
        _ => { 
            let bar = self.bar();
            /* do complicated stuff with bar */
        }
    }
}

// Crate down

use up::Foo;
use up::bar::Bar;

pub trait Baz {
    fn baz(&self);
}

impl<T: Bar + Baz> FooSpec for T {
    fn foo(&self) where match T {
        T : FooSpecSpec => FooSpecSpec::foo(self),
        _ => {
            self.baz();
            // Here, we'd like to reference the "generic implementation" of foo that "does complicated stuff".
            // I couldn't find a good option for this, here are some possibilities:
            // Option A: referencing the generic trait in the specialized trait "just works"
            // Very magical, and this is allowed today with another meaning (infinite recursion)
            <T as Foo>::foo(self);
            // Option B: still magical
            <T as default Foo>::foo(self);
            // Option C: explicit the implementation to use, but feels ugly
            <T where T : _ as Foo>
            // Option D: named implementations / conventions :'( or forced ...
            up::foo_base(self);
            // Maybe could do something with a proxy type that doesn't implement FooSpec, too?
        }
    }
}

Ideally, we'd be able to say "I want to call the more general implementation" (that being defined as "the next in the where match that is applicable to T) of Foo from FooSpec, without having to name that implementation. If someone has an idea on how to do this, I'd really like to hear it!

Other than that "thorny" problem, I don't see what existing specialization could do that where match couldn't wrt the larger inheritance story.

@Dark-Legion:

Thank you for finding this idea interesting! I'm open to starting an RFC for where match (maybe in this thread or in a different one as a Pre-RFC first?), but I think I need some advice on the "Motivation" section. To me, the ability to emulate specialization is a strong motivation for the feature. If we don't have this and want where match additionally to specialization, then I think we need another strong motivation for where match. First because it is an increase of syntax surface, second because of @H2CO3's reasonable remark that where match and specialization would be overlapping features, and I think we need a strong motivation before adding multiple ways of doing something. So, while I find where match intellectually pleasing, my main motivation is how it can allow for some form of specialization through the "specialization trait pattern". Do you (or others) see other use cases for this feature? On the github thread, the8472 mentioned a future extension of being able to use where match for matching on const expressions, such as size_of or align_of.

Again, thank for you interest! I may try and start writing a Pre-RFC, at least to specify a bit the bounds of this where match construct, and what a first iteration could be.

system · October 10, 2020, 9:19am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Specialization, coherence, and API evolution language design	9	1571	March 25, 2019
Another specialization discussion: what about add a `or` clause language design	19	863	August 3, 2023
Blog post: intersection impls language design	11	4747	March 25, 2019
Idea: where expression (as poor man’s specialisation) language design	7	867	April 26, 2023
Blog post: Maximally minimal specialization: always applicable impls	31	5865	March 25, 2019

Idea: `where match` clauses

@nikomatsakis

@Dark-Legion:

Related topics