Pre-pre-RFC: match ergonomics for container types — restricted method calls in patterns

scottmcm · November 16, 2020, 12:37am

This is an interesting point.

Note that in C++, you can project shared_ptrs, because they allow an aliasing constructor (#8). Arc/Rc just made different design tradeoffs.

So it might be something valuable to keep even if things like Box can't.

Oh -- and pin-project is a thing, which would be an interesting thing to have work through this feature...

amosonn · November 16, 2020, 2:17pm

That actually sounds quite cool! Of course, then you need more complex syntax, to specify what kind of projection you want - or alternatively, making the "cast" (via Deref, or maybe even AsRef) explicit, so that the projection is simply that of the type which you currently matched on.

scottmcm · November 16, 2020, 6:33pm

I think it would just be leaning in to binding modes. So if you match a Pin<T>, you'd get out Pin<U>s, the same as how if you match a &T you get &Us. (That would need GATs, I assume, to actually make work well.) And if you want to get &Us instead of SharedPtr<U>s, then you'd match on &*p or whatever.

I suppose that wouldn't solve the match-a-Box<T> problem that was the original point of this thread, though.

steffahn · November 19, 2020, 12:56pm

I just had the idea that a deref syntax/feature for match would also allow something like:

match (expr1(), lazy(|| expr2())) {
    (Foo, _) => foo(),
    (Bar, deref Baz) => baz(),
    (Bar, deref Qux) => qux(),
}

mimicking a common idiom from lazy functional programming languages for avoiding nested matches while also avoiding unnecessary evaluation of expr2.

In case that isn’t clear: lazy would return a wrapper enum for the FnOnce() -> T that implements Deref to T and caches the result after the first call. The code above is supposed to be pretty much equivalent to something like this:

match expr1() {
    Foo => foo(),
    Bar => match expr2() {
        Baz => baz(),
        Qux => qux(),
    }
}
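A minimal sketch of what such a lazy wrapper could look like on stable Rust today. The names Lazy, lazy, First, Second and dispatch are hypothetical, and the wrapper is written as a struct around OnceCell rather than an enum, purely for brevity. Since the proposed deref pattern does not exist, the match falls back to the nested form with an explicit *:

```rust
use std::cell::{Cell, OnceCell};
use std::ops::Deref;

/// Hypothetical wrapper: holds a closure until the first deref, then caches its result.
pub struct Lazy<T, F: FnOnce() -> T> {
    init: Cell<Option<F>>,
    value: OnceCell<T>,
}

/// Hypothetical constructor matching the `lazy(|| expr2())` call in the post.
pub fn lazy<T, F: FnOnce() -> T>(f: F) -> Lazy<T, F> {
    Lazy { init: Cell::new(Some(f)), value: OnceCell::new() }
}

impl<T, F: FnOnce() -> T> Deref for Lazy<T, F> {
    type Target = T;

    fn deref(&self) -> &T {
        // Runs the closure at most once; later derefs return the cached value.
        self.value.get_or_init(|| {
            let f = self.init.take().expect("initializer already consumed");
            f()
        })
    }
}

pub enum First { Foo, Bar }
pub enum Second { Baz, Qux }

/// Today's spelling of the proposed `(Bar, deref Baz)` arms: the inner match
/// dereferences explicitly, so `second` only runs when `first` is `Bar`.
pub fn dispatch(first: First, second: impl FnOnce() -> Second) -> &'static str {
    let second = lazy(second);
    match first {
        First::Foo => "foo",
        First::Bar => match *second {
            Second::Baz => "baz",
            Second::Qux => "qux",
        },
    }
}

fn main() {
    println!("{}", dispatch(First::Foo, || Second::Baz));
    println!("{}", dispatch(First::Bar, || Second::Qux));
}
```

With a deref pattern, the two levels of match could collapse into the single tuple match shown earlier, while keeping the property that the closure is skipped in the Foo case.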

dhm · November 19, 2020, 9:17pm

Provided we had new (contextual) keywords, the choice of syntax does not really matter (at least to me). I just chose & since:

it does not involve / require an extra keyword,
using & / &mut allows choosing whether Deref or DerefMut is used.

As I mentioned, if we were to have DerefMove, then my whole "beautiful symmetry" vision would break, so if a deref / deref mut (and future possible deref move) keyword usage is deemed more appropriate, so be it.

That being said, another argument against & / &mut is that, in the expr world, it's always * that is used to perform a dereference. In that regard, a single deref keyword would actually make more sense.

The important part is to have a pattern dual of let binding = &*<expr> as let smth(ref binding) = <expr>; whether smth(<pat>) is &<pat> or deref(<pat>) is of no importance to me.

(That would need either the nonexistent DerefMove, or to bind by ref / T : Copy, though).

amosonn · November 20, 2020, 11:43am

The starting point of that paragraph was agreeing to the "structural/pure dereference" idea, and to the fact that it doesn't matter what syntax that uses. I should have made it more clear.

One question that now arises though is, how pure should the implementation be for that to count? For supporting Rc and friends it has to at least allow access to arbitrary fields. But it would be cool if it could also support (slightly) more complex operations, such as checking an enum variant - for supporting Cow, for example.

Regarding the syntax of the match, I liked your distinction that * is used for all derefs. I guess that makes sense, as when borrowing you need to specify what kind of reference you want (&/&mut/Rc/...) but when dereferencing it's decided what implementation to use depending on what you need. So here the distinction between Deref, DerefMut and DerefMove (when it comes, or for Box right now) could be done by having

let deref (ref binding) = <expr>;
let deref (ref mut binding) = <expr>;
let deref (binding) = <expr>;

as the duals of

let binding = &*<expr>;
let binding = &mut *<expr>;
let binding = *<expr>;

I assume this is what you meant?
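For reference, the three expression forms on the right-hand side already work on stable Rust for Box, which is the one type that supports all three today. A small sketch (the function name three_derefs and the sample value are made up for illustration):

```rust
/// Returns (length seen through `&*`, value after mutating through `&mut *`,
/// value moved out through `*`).
pub fn three_derefs() -> (usize, Option<String>, Option<String>) {
    let mut boxed: Box<Option<String>> = Box::new(Some(String::from("ab")));

    // let binding = &*<expr>;  shared access, the Deref case
    let binding: &Option<String> = &*boxed;
    let len = binding.as_ref().map_or(0, |s| s.len());

    // let binding = &mut *<expr>;  exclusive access, the DerefMut case
    let binding: &mut Option<String> = &mut *boxed;
    if let Some(s) = binding {
        s.push('c');
    }
    let after_mut = (*boxed).clone();

    // let binding = *<expr>;  moves the value out; only Box allows this today
    let binding: Option<String> = *boxed;

    (len, after_mut, binding)
}

fn main() {
    println!("{:?}", three_derefs());
}
```

The proposed deref (ref binding), deref (ref mut binding) and deref (binding) patterns would be the pattern-side counterparts of these three lines.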

dhm · November 20, 2020, 4:09pm

Yes

Aloso · December 3, 2020, 6:06am

My idea is to use * for DerefMove, ref * for Deref, and ref mut * for DerefMut. For example:

match Some(Box::new(4)) {
    Some(*4) => ...
}

match Some(vec![1, 2, 3]) {
    Some(ref *[1, 2, 3]) => ...
}

match Some("hello".to_string()) {
    Some(s @ ref mut *"hello") => ...
}

This seems more consistent to me, because * in patterns works the same as in expressions.

CAD97 · December 3, 2020, 6:24am

Except that patterns are the dual of expressions.

let &x = &5; is x=5. Adding pattern *_ as "the same as" *_ in expression position rather than the dual would be a misstep that makes patterns harder to understand.

I feel that the best approach here is making a "DerefPure" that's guaranteed to just be field offsets and pointer dereferences, and allowing & destructing (and "match ergonomics" pointer chases) to just work with it. The existence of ref is odd enough that we introduced binding modes so you could write the pattern Some(x) instead of &Some(ref x).

I understand that the pattern binding mode model can be difficult to adjust to for developers used to the more explicit model. But I think embracing it is how at this point we get the most usable and consistent pattern match system.
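To make the duality concrete, here is a short sketch of the explicit and binding-mode spellings side by side. Both compile on stable Rust and behave the same; the function names are illustrative only:

```rust
/// Explicit form: `&` destructures the reference, `ref` borrows the payload again.
pub fn len_explicit(opt: &Option<String>) -> usize {
    match opt {
        &Some(ref x) => x.len(),
        &None => 0,
    }
}

/// Binding-mode form: the compiler supplies the `&` / `ref` pair implicitly.
pub fn len_ergonomic(opt: &Option<String>) -> usize {
    match opt {
        Some(x) => x.len(),
        None => 0,
    }
}

/// The `&x` pattern strips the reference that the `&5` expression adds.
pub fn destructure_ref() -> i32 {
    let &x = &5;
    x
}

fn main() {
    let s = Some(String::from("hello"));
    println!("{} {} {}", len_explicit(&s), len_ergonomic(&s), destructure_ref());
}
```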

H2CO3 · December 3, 2020, 8:33am

And that's explicitly the wrong thing to do, because patterns are not expressions.

H2CO3 · December 3, 2020, 8:36am

It's not "odd"; it's crystal clear to anyone who understands pattern matching and pointers. Match ergonomics was a highly controversial feature that was rushed to stabilization by means of pressure from the lang team.

Which is why I have to ask a related question: how does this:

amosonn:

let deref (ref binding) = <expr>;
let deref (ref mut binding) = <expr>;
let deref (binding) = <expr>;

as the duals of

let binding = &*<expr>;
let binding = &mut *<expr>;
let binding = *<expr>;

improve the situation, given that let &val = ptr; and let &Some(ref inner_ptr) = outer_ptr work today? In other words, what does deref do here?

amosonn · December 3, 2020, 1:50pm

let &Some(ref inner_ptr) = outer_ptr only works when outer_ptr: &Option<MyStruct>, but not when outer_ptr: Box<Option<MyStruct>>. Currently in nightly there's also let box Some(ref inner_ptr) = outer_ptr, which would work for the second type but not for the first, and also not for e.g. outer_ptr: Rc<Option<MyStruct>>. The point of this thread is to suggest a syntax which would let us generalize over this, since in the above expression I actually only care about the MyStruct part, not which dereference is done around the Option (as long as it's a pure or trivial one). Hence let deref Some(ref inner_ptr) = outer_ptr, which would work for all of the above types of outer_ptr.
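As a rough illustration of the gap, the sketch below contrasts the reference-only pattern with the workaround available on stable Rust, which is to dereference in expression position before matching. MyStruct's id field and both function names are assumptions made up for this example:

```rust
use std::ops::Deref;
use std::rc::Rc;

pub struct MyStruct {
    pub id: u32,
}

/// Reference-only form from the thread, written as `if let` because the
/// pattern is refutable. It accepts `&Option<MyStruct>` and nothing else.
pub fn id_via_ref_pattern(outer_ptr: &Option<MyStruct>) -> Option<u32> {
    if let &Some(ref inner_ptr) = outer_ptr {
        Some(inner_ptr.id)
    } else {
        None
    }
}

/// Stable workaround: dereference explicitly, then match. This is generic
/// over `&`, `Box`, `Rc`, and any other `Deref` pointer to the Option.
pub fn id_via_deref<P>(outer_ptr: P) -> Option<u32>
where
    P: Deref<Target = Option<MyStruct>>,
{
    match &*outer_ptr {
        Some(inner_ptr) => Some(inner_ptr.id),
        None => None,
    }
}

fn main() {
    let plain = Some(MyStruct { id: 1 });
    println!("{:?}", id_via_ref_pattern(&plain));
    println!("{:?}", id_via_deref(&plain));
    println!("{:?}", id_via_deref(Box::new(Some(MyStruct { id: 2 }))));
    println!("{:?}", id_via_deref(Rc::new(Some(MyStruct { id: 3 }))));
}
```

The workaround only helps when the pointer is the outermost layer of the scrutinee. A deref pattern would allow the same generality when the pointer is nested inside a larger pattern.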

rpjohnst · December 3, 2020, 5:19pm

It really isn't though- most experience with patterns comes from languages without pointers, and vice versa. My previous experience with both was insufficient when I first encountered ref- I had to look it up. So there's one data point against your claim, at least.

On top of that, ref has two more problems (and these are part of why I had to look it up!):

It's sort-of the counterpart to * in expressions, which means you can't "guess" it based on shared syntax like you can other patterns. Conversely, there's no ref expression.
It's not a full pattern, only a binding mode, because it's not actually matching on any sort of structural piece of the scrutinee. For example you cannot write ref Some(x), nor would you want to.

So ref truly is "odd" in the sense that it does not fit in neatly with the rest of the expression/pattern duality, at least not to the degree & does. (Also, "pressure from the lang team" to add a language feature?) Anyway, I don't want to derail this thread into yet another argument about binding modes. I went to this level of detail because it helps explain what deref is for:

Binding modes arguably fit in better than ref- as the counterpart to autoref in expressions! Autoref inserts & sub-expressions to make an expression match the type of its context; binding modes insert & sub-patterns to make a pattern match the type of its scrutinee.

But there is a missing piece here. You simply can't match on owned types like Box, Rc, Vec, String, etc. There's no pattern to insert that would correspond to deref coercion in an expression. Thus this thread, which @amosonn summarized succinctly. It is really unfortunate that you can't use nested pattern matching on owned data structures- it leads to a lot of jumping back and forth between patterns and expressions, reducing the orthogonality of language features.
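A small sketch of that back-and-forth on stable Rust, using a made-up recursive Expr type. The nested pattern Expr::Neg(Expr::Neg(x)) is not expressible because of the Box in between, so the inner match has to be a separate expression:

```rust
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Neg(Box<Expr>),
}

/// Removes double negation. The outer pattern stops at the `Box`, the
/// expression `*inner` steps through it, and a second pattern takes over.
pub fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Neg(inner) => match *inner {
            Expr::Neg(x) => simplify(*x),
            other => Expr::Neg(Box::new(simplify(other))),
        },
        other => other,
    }
}

fn main() {
    let double = Expr::Neg(Box::new(Expr::Neg(Box::new(Expr::Num(3)))));
    println!("{:?}", simplify(double));
}
```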

ekuber · December 3, 2020, 5:43pm

You are absolutely right that patterns are not expressions. That also means that all operators are available as tokens to be given new meaning in patterns. Not that I would propose such a thing.

Talking seriously for a second, there are three problems with reusing operator tokens in patterns:

symbols are harder to search for documentation (same problem applies to expressions)
if they look too close to expressions it might make it harder for people to learn patterns (this is already an issue, we would make it slightly worse)
* in an expression is analogous to a ref in a pattern, so using it for the requested deref functionality would be a bad idea. Using any of the other existing tokens (!, %, ^, /, +, -, ==, ?) wouldn't be evocative enough. Using other tokens could be problematic or is already used (~, @, &, <, >, ., |, \). Introducing a context sensitive keyword could work even without an edition boundary, but the keyword will need to be bike-shedded.

On the other hand, if there's already a "social" expectation that Deref and DerefMut should be bounded in their scope and not do, let's say, network requests under the covers, then autoderef in patterns would be on the table. Whether this is a good idea (I'm hesitant, swapping my opinion depending on the dew point in my local area) is left as an exercise to the reader/lang team/eventual RFC thread.

Edit: just realized that we could reuse an existing keyword, like in, use or move, a reserved one like virtual, become or override, but none of them are great. We could also make ref be an opt-in for autoderef making the current behavior be left to match ergonomics, but at that point I'd rather have autoderef be part of match ergonomics, I think...

H2CO3 · December 3, 2020, 5:54pm

Please, speak for yourself — I do honestly find ref clearer than magic binding modes. On my phone right now, so won't address your other points in detail just yet, but this is one thing that I could not leave without comment.

H2CO3 · December 3, 2020, 5:55pm

Thanks for the explanation. The use case and the request for the Box example to work is clear to me now. (I still don't like the overall solution, because I don't find the syntax evocative of what's actually happening, but that's a different question.)

rpjohnst · December 3, 2020, 6:05pm

I didn't say anything about whether binding modes were clearer than ref, just that ref is certainly not made "crystal clear" simply by understanding patterns and pointers separately.

zackw · December 3, 2020, 7:44pm

Would code that uses this hypothetical deref, then, be generic over the container's type? I'm trying to understand why something like

let &Box(Some(ref inner_ptr_1)) = outer_ptr_1;
let &Rc(Some(ref inner_ptr_2)) = outer_ptr_2;

would not be a suitable solution.

(For purpose of this subthread, please ignore all the reasons why throwing a type name that isn't an enum variant in this position doesn't work right now and/or would be difficult to make work. I want to understand, first, why we need a way to do this without naming the container's type.)

steffahn · December 3, 2020, 8:23pm

I think the simple reason would be that defining one special pattern for something involving Deref[Mut] traits (which usually don’t behave too weird) is way easier than to go all in and introduce a mechanism to introduce fully customizable pattern synonyms that a type like Box or Rc would need to provide in order to make your example code work.

Well, if we had fully fledged pattern synonyms then the standard library could possibly also define one that gets called deref that just uses the Deref[Mut] traits internally accordingly.

Or do you suggest that we’d want all those Rc, Box, etc. patterns to be hard-coded by the compiler?

ekuber · December 3, 2020, 8:32pm

Using let Box(Some(ref foo)) = bar doesn't work today because Box's value is not pub. Using Deref has a bunch of benefits:

We don't need to have a new rule on how visibility operates.
The concept of deref is already needed, explained and used in expressions, the feature would "just" expand it to patterns.
This would make it much easier to go from, let's say Box to Rc, and from Rc to Arc in refactorings. Match ergonomics has really made a good impact on these for me.
If we somehow change how visibility operates for patterns, then the patterns are now suddenly dependent on the inner structure of the types.
Other languages have (de)structuring through interfaces, like Scala's apply and unapply. We could introduce the concept of unapply, but that would likely require us to support structural records, and Deref is right there for us to use.
