Pre-RFC: `#[must_bind]`

The other problem is that in a closure you can't return or ? from the surrounding function or break/continue from the surrounding loop.

Another solution would be a macro, e.g.

guard!(foo());
// expands to
let _guard = foo();

or

guard!(foo(), {
    // do something
}); 
// expands to
{
    let _guard = foo();
    // do something
}

Like @dhm s solution, this doesn't enforce correct usage. So even with this macro, a #[must_bind] attribute would be useful.

2 Likes

To reiterate a point already made in this thread: let _ = foo() does not “drop stuff directly”. It rather does “not bind” anything which may result in dropping the value if it was only a temporary. The most natural way to think about let _ = EXPR; is: it’s exactly the same as an EXPR; statement (and not the same as drop(EXPR)) except for the fact that let _ = EXPR; suppresses must_use warnings.

The reason for why let _ = EXPR; exists in the first place is rather for consistency since _ is an irrefutable pattern (which is of course much more useful in a match statement) and let supports all other kinds of irrefutable patterns, too.

As to why let _ and let _a must be different: That’s because patterns don’t need to move the value they match on but they can also just reference them: There’s ref PAT patterns and ref mut PAT patterns, so you can write something like

fn foo() {
    struct S{field: (bool,T)};
    impl Drop for S {
        fn drop(&mut self) {}
    }
    struct T;
    impl T {
        fn bar(&self) {
            println!("bar");
        }
    }
    let s = S{field: (false,T)};
    
    match s.field {
        (true, ref t) => t.bar(),
        _ => ()
    }
}

The match statement here inspects the field of s without trying to move it (which would be illegal since S implements Drop). Thus the pattern _ cannot do any moving here. You want to always support a default-case with _ in a match like the above and you want {let PAT = EXPR; BODY} to be same as match EXPR { PAT => {BLOCK} }, this the behavior of let _.

There could is a point to be made that ref patterns are just as confusing as let _ and both should be forbidden in a change that makes let and match always move their value (unless it implements Copy, like when it’s a reference, and with an automatic re-borrow inserted for mutable references). In this case, we could now choose to either have matching against _ behave like drop or like binding to some _a. let _ would either be useless now (when it’s like drop -> so we can disallow it and suggest to just use drop) or it wouldn’t be confusing anymore (if it binds to _a). But this isn’t going to happen anytime soon. One would also need to introduce an alternative mechanism for suppressing must_use, or, I suppose, one could use drop dropping must_use temporaries.

Huge language changes aside, something that we could probably do right now:

I’m not 100% sure on that, but wouldn’t let _ = EXPR; be the same as EXPR; in all the cases without a must_use warning, and the same as drop(EXPR) in all must_use cases? I suppose we could just add warnings against every let _ = EXPR; and either suggest EXPR; or drop(EXPR) for replacement depending on whether a must_use warning would be generated. In my opinion, if I’m not missing anything, this would make a lot of code a lot clearer.

7 Likes

That sounds correct to me. I could get behind this being a clippy lint, if there's a consensus that it is better style.

1 Like

I wonder why we have let _ = foo() to drop stuff directly rather than using drop(foo()).

Because of surrounding something with parentheses for adding drop(...) is more annoying text editing operation compared to adding things just to one side as in let _ = ...?

Also drop is rarely mentioned in compiler messages maybe, so it sometimes does not come to mind when I think "how do I do X? There should be some language construct for this.". Similar reason why |_|() exists. Took some time for me to familiarise that drop is not just some pecularity of libstd (i.e. "we can do such function then why not to do?"), but a thing for me to use. Then it also took some time to realise that drop is in prelude and I don't need to write std::mem::drop every time.

Maybe drop(create_semaphore_guard()); should cause the warning as well, leaving the only way to really silence the warning the explicit #[allow(unused_must_bind)] create_semaphore_guard();? (not counting workarounds like user-defined fn my_drop<T>(t:T){})

2 Likes

I think that's going too far. If someone really wants to silence a warning, it should be straightforward, and not require #[allow(...)]. Besides, std::mem::drop() is not at all special, so there is no reason to special-case it.

1 Like

I really dislike how Swift changes the lifetime of let _ = foo() based on whether or not builds are optimized. This caused a huge headache at work a few months ago and wasted a lot of someone else's time debugging, and then later bit me again. Admittedly this involved a library that was wrapping some C++ code and not forcing usage through a guard as suggested above, which would have prevented the issue. But I do think it runs afoul of my expectations that code behavior should be the same regardless of optimizations. In this case that would presumably mean the shorter lifetime in both versions, as this does seem like a valuable optimization for not wasting stack space on no longer used variables in the same scope.

To the Rust topic at hand, until reading this thread, I expected let _ = foo() to keep the return value of foo() alive until the end of the scope, and the current behavior seems unintuitive. However having minimal lifetimes does seem like a good optimization opportunity (provided it does it in debug too). It would be good to expectation set that scoping is just an organizational/visibility thing and not a lifetime thing though if that's the case.

Regarding #[must_use], I think it would be nicer if there was an explicit way to indicate "ignoring result" that didn't look like a variable binding. I don't have any suggestions though.

It's horrible and don't do this, but I've written

do_fallible_operation().ok();

for this before and it does "work" for this purpose.

1 Like

(supposedly .ok() is an "accepted" way to "use" a result, I disagree and really think these should be marked as #[must_use] as they're otherwise no-op function calls, and the rest of Result's no-op functions are marked).

1 Like

14 posts were split to a new topic: Must_use and linear types

Note that #[must_bind] might not solve the problem for a destructuring assignment, e.g. let (a, b, _) = foo(), where fn foo() would be marked #[must_bind]. I was recently bitten by this: the tuple element not-bound to _ actually needed to stay alive until the end of the block, resulting in a couple of hours of debugging.

Off-topic musing:

It would be great if let (_, _) = foo(); were actually semantic sugar for let (_autogenerated_3482, _autogenerated_8904) = foo();, so that it would have the same semantics as let (_a, _b) = foo(). (This was my mental model of let _ until I encountered this behavior.)

1 Like

I think it could (where to me "solve" here means helpfully emitting a warning) if it was attached to the type itself rather than to the function.

I think using a standalone Mutex to protect network or filesystem resources is a relatively common pattern. I agree with folks who think it's a code smell -- there are probably better ways of doing things -- but it's such a ubiquitous pattern in other languages that I expect people will always want to do it in Rust too. I can't imagine a case where let _ = my_mutex.lock().unwrap(); is correct code, and it seems like it would be nice to warn for that, with very little downside.

Another good use case for this feature might be CString. A common gotcha in unsafe code is to write something like let s = CString::new(...).as_ptr();. That immediately drops the temporary CString and makes s a dangling pointer. This can be valid if you're immediately using the pointer on the same line, but I think even there it's a bug waiting to happen, when someone comes along and helpfully refactors your function call into a few separate lines. My personal rule is that CString in particular should always be bound to a local variable. If that's a common rule for other people, it might be nice to turn it into a compiler warning.

Maybe we could generalize from there to think about the category of "things that are usually created so that you can extract raw pointers from them." Another example that comes to mind is MaybeUninit. I've never run into this problem personally, but I can imagine a similar issue with someone writing MaybeUninit::uninit().as_ptr().

5 Likes

I think 99.99999% of users expect _ to just mean "I'm not bothering to name this thing" even if there is technically documentation for this behavior. It's a violation of principle of least surprise for it to change object lifetime; I'd expect the semantics to be the same as if a random unique name replaced the _. Nobody coming from C++ for example is going to expect a "change in variable name" (not what it is but what most people will think of it as) to cause changes in runtime behavior.

5 Likes

If we imagine _ (wild pattern) like a black hole, it all makes sense as there are no way to get back the value after binding with _.

So I think a clippy lint warns when bind a Drop-able value to _ is nice.

1 Like

The problem with "everyone expects _ to prolong an object's lifetime" is that _'s current behavior is absolutely critical when the binding would be by-move. If _ became _random1234, then any code trying to partially match on an object in the default binding mode will break. For example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=622dcd7d1a978cc56e9ef8f139e3fcff

Rust thus needs some syntax for _'s current behavior- it's just too useful a tool in pattern matching to give up. Further, _ is a long-established way of doing this in languages with pattern matching, so renaming it to give _ your "expected" behavior will really only shift the confusion to a different set of people, "violating the principle of least surprise" all over again!

This is why lints are a good solution- they find the places that people aren't getting the expected behavior and teach them to use the existing mechanism for the behavior they want.

13 Likes

FWIW, let __ = ... Just Works :laughing:

More seriously, the pattern in Rust is let _guard, since a value whose only purpose is to do something at the end of the scope through RAII / drop is called a guard.

Actually :thinking:, back to let __, it is currently a horrible hack due to it looking too similar to let _, but we could imagine, within a new edition, to warn against all usages of let _ = (no type annotation, specific let), suggesting to either use let __ / let _guard for the guard pattern, or to use drop to silence a #[must_use]. I do agree that _ as non-binding pattern is just straight up necessary, as @rpjohnst showcased in their example, but let _ itself, without a type annotation, as seen in this thread, is actually a code smell.

For those feelingthat special casing let _ vs. other pattern cases, feels inconsistent, it turns out that it's not really the case! Indeed, the other cases are the following:

  • function parameter, such as fn (Pattern(_): SomeArg) { /* scope */ },

  • if/while let, such as: if let Pattern(_) = ... { /* scope */ };

  • for, such as for Pattern(_) in ... { /* scope */ };

  • match, such as match ... { Pattern(_) => { /* scope */ }}

have all in common that within the /* scope */, the value matched against _ cannot possibly have been dropped!1, which is different than the let Pattern(_) = ...; /* scope */ case discussed in this thread.

1 Yes, it is non-trivial but true (Proof). The fact it is non-trivial is yet another argument w.r.t. let _ being the code smell case.

4 Likes

As you describe, if you use a temporary as the scrutinee, and you compare its drop scope to the pattern binding scope, you do get different behavior across functions/if let/for/match vs let.

However, this appears to be an intentional design choice: Temporaries are consistently dropped at the end of the statement, but only let statements end before their binding scope does. And for good reason- most expression-based languages include let's binding scope in the let expression, as in let pat = val in expr, but in Rust this is tantamount to applying temporary lifetime extension to all temporaries!

The deeper consistency in behavior here is not in whether the scrutinee has been dropped, but whether it has been bound. Binding is the way values are moved, which means _ for temporaries is just refraining from "moving" out of the temporary, and de-special-casing the order of "dropping let temporaries" with "executing let's binding scope" is just special-casing let statements' temporary drop scope to expand beyond the statement!

Thus, the better way to resolve this inconsistency might arguably have been to consider "the rest of the block" as part of the let statement, which would have let us ditch the special case of temporary lifetime extension, and always drop _/..-bound temporaries before executing the bodies of all of functions/if let/for/match/let. (This of course would further cement let _ = as perfectly idiomatic way to ignore a value :slight_smile:)

3 Likes

I’m not 100% sure this is the true reason. At least for if let and while let the temporary lifetime seems to be mainly because these statements are equivalent to match statements, especially if let PAT = EXPR { ... } else { ... } is 100% the same as match EXPR { PAT => { ... }, _ => { ... } }.

And in match statements you need the temporaries because ... well ... you need to keep the value being matched against because there can be multiple arms and special treatment of the last arm would be inconsistent, and you need all the other temporaries because you don’t want all the error messages about temporaries being dropped while borrowed that you would otherwise get when matching against dereferenced mutex or refcell guards etc.

for expressions seem to have a similar reason to match why they need to keep temporaries. The reference presents a translation that keeps the temporaries on purpose.


So now though to my main point ... today I learned:

if EXPR { BLOCK1 } else { BLOCK2 }

is not the same as

match EXPR {
    true => { BLOCK1 }
    false => { BLOCK2 }
}

To expand @dhm’s playground: ⟶ like this, we see that an if statement does drop the temporaries of the condition expression. So a proper translation would be

{
    let cond = EXPR;
    match cond {
        true => { BLOCK1 }
        false => { BLOCK2 }
    }
}

It the rule was about “temporaries are dropped at the end of the statement”, this wouldn’t make any sense. So I guess the actual principle is: Temporaries are dropped at the end of the statement expression for some kinds of expressions where dropping them earlier would cause to many problems.

Now for if expressions, the condition is always an owned bool, so there cannot possibly be any programs not compiling because the condition’s temporaries where dropped earlier.


Honestly, my personal feeling about this is that by pretending there was a consistent and simple rule about temporaries an the end of statements/expressions, we only get less optimization without much actual gain. But it is probably hard to change anything now. The most confusing rule that we have right now, IMHO, is how blocks work. A block { STMTS; EXPR } keeps the temporaries of EXPR in the surrounding expression. This makes { EXPR } similar to a call to an identity function id(EXPR) in that it moves the expression, but it makes it behave different than loop { break EXPR } or (|| EXPR)() or if true { EXPR } else { unreachable!() } or match { _ => { EXPR }} which all drop the temporaries.

It also gives rise to a weird rule that temporaries of a final expression in a function (and probably also a closure) are dropped after the local variables are, in effect this means that

fn blah() {
    STMTS...
    EXPR
}

is different from

fn blah() {
    STMTS...
    return EXPR;
}

(proof)

3 Likes

But you also quite frequently need to keep the temporaries of a let statement! That's why we have temporary lifetime extension to begin with. So why doesn't this match logic apply to let? Because, again, Rust intentionally diverges from the typical let/in to get smaller drop scopes.

An important thing to note here is that if expressions do not have a binding scope, which puts them in an entirely different category than the constructs (functions, if let/while let, for, match, let) I was discussing.

Another important difference around if is that it tends to be used in long chains of else ifs, and while you can view these as grammatically nested, that is not how programmers think about them. Thus these two sub-statement temporary drop scopes in the docs I linked:

Arguably, { ...; EXPR } is just Rust's syntax for the in of let/in- the block provides the binding scope for any contained lets. Viewed from this angle, the let special case is quite similar to the behavior of if- temporaries in the scrutinee are dropped early because both constructs tend to have long-running "bodies."

So if you wanted you could rewrite that section of the reference to group the RHS of a let in with the condition expression of an if. Perhaps you might also rewrite the section on temporary lifetime extension in terms of these shortened temporary drop scopes, and my hypothetical _-drops-temporaries proposal in terms of applying this same shrunken drop scope trick to more constructs.

I know! That’s why my takeaway is that we shouldn’t even try


as can if let statements, that keep the temporaries around throughout all the else ifs and else branches nontheless.