Pre-RFC: `#[must_bind]`

I wonder why we have let _ = foo() to drop stuff directly rather than using drop(foo()). I also bump into this thing when reading an article that _ will drop immediately, otherwise I wouldn't have know about it.

I believe this may be a potential footgun for users since the compiler may compile but the code may not run correctly, one could also argue that it is a logic error but the one new to rust may not know about this.

3 Likes
{
    let _ = guard();
    body();
}
println!("--");
{
    let _guard = guard();
    body();
}
println!("--");
match guard() { _ => {
    body();
}}
println!("--");
guard().with(|_| {
    body();
});
displays
--
Dropped 1
Body 1
--
Body 2
Dropped 2
--
Body 3
Dropped 3
--
Body 4
Dropped 4
--

What I mean by that is that there are other patterns in Rust besides let _ and let _guard. We should look at the last example, for instance, from which we easily come up with the following pattern: MutexGuard<()>-like "empty" guards, rather than marked #[must_bind], could instead use the callback pattern (also called scoped API).

That is, instead of:

#[must_bind]
fn lock (...) -> Guard...

have:

fn with_lock<R> (..., f: impl FnOnce(Guard...) -> R) -> R

This has the advantage of not requiring any new language / lints addition, just that the library authors design their APIs like this. In general having callback APIs can be cumbersome / noisy because of the scopes it requires to write (c.f., thread_local!s), but in this case, having a well-defined scope is precisely what we want to be doing to begin with!

EDIT: I forgot about async contexts; those would require a dedicated function taking a future instead of a callback (impl Future<Output = R> + ...), which would mean the future would not have access to the guard and they would thus lose the ability to drop it earlier if needed. Still, not that big of a loss in general.

2 Likes

I believe this may be a potential footgun for users since the compiler may compile but the code may not run correctly, one could also argue that it is a logic error but the one new to rust may not know about this.

That's why I think a special attribute just for a warning message may be worthwhile.

1 Like

The other problem is that in a closure you can't return or ? from the surrounding function or break/continue from the surrounding loop.

Another solution would be a macro, e.g.

guard!(foo());
// expands to
let _guard = foo();

or

guard!(foo(), {
    // do something
}); 
// expands to
{
    let _guard = foo();
    // do something
}

Like @dhm s solution, this doesn't enforce correct usage. So even with this macro, a #[must_bind] attribute would be useful.

2 Likes

To reiterate a point already made in this thread: let _ = foo() does not ā€œdrop stuff directlyā€. It rather does ā€œnot bindā€ anything which may result in dropping the value if it was only a temporary. The most natural way to think about let _ = EXPR; is: itā€™s exactly the same as an EXPR; statement (and not the same as drop(EXPR)) except for the fact that let _ = EXPR; suppresses must_use warnings.

The reason for why let _ = EXPR; exists in the first place is rather for consistency since _ is an irrefutable pattern (which is of course much more useful in a match statement) and let supports all other kinds of irrefutable patterns, too.

As to why let _ and let _a must be different: Thatā€™s because patterns donā€™t need to move the value they match on but they can also just reference them: Thereā€™s ref PAT patterns and ref mut PAT patterns, so you can write something like

fn foo() {
    struct S{field: (bool,T)};
    impl Drop for S {
        fn drop(&mut self) {}
    }
    struct T;
    impl T {
        fn bar(&self) {
            println!("bar");
        }
    }
    let s = S{field: (false,T)};
    
    match s.field {
        (true, ref t) => t.bar(),
        _ => ()
    }
}

The match statement here inspects the field of s without trying to move it (which would be illegal since S implements Drop). Thus the pattern _ cannot do any moving here. You want to always support a default-case with _ in a match like the above and you want {let PAT = EXPR; BODY} to be same as match EXPR { PAT => {BLOCK} }, this the behavior of let _.

There could is a point to be made that ref patterns are just as confusing as let _ and both should be forbidden in a change that makes let and match always move their value (unless it implements Copy, like when itā€™s a reference, and with an automatic re-borrow inserted for mutable references). In this case, we could now choose to either have matching against _ behave like drop or like binding to some _a. let _ would either be useless now (when itā€™s like drop -> so we can disallow it and suggest to just use drop) or it wouldnā€™t be confusing anymore (if it binds to _a). But this isnā€™t going to happen anytime soon. One would also need to introduce an alternative mechanism for suppressing must_use, or, I suppose, one could use drop dropping must_use temporaries.

Huge language changes aside, something that we could probably do right now:

Iā€™m not 100% sure on that, but wouldnā€™t let _ = EXPR; be the same as EXPR; in all the cases without a must_use warning, and the same as drop(EXPR) in all must_use cases? I suppose we could just add warnings against every let _ = EXPR; and either suggest EXPR; or drop(EXPR) for replacement depending on whether a must_use warning would be generated. In my opinion, if Iā€™m not missing anything, this would make a lot of code a lot clearer.

10 Likes

That sounds correct to me. I could get behind this being a clippy lint, if there's a consensus that it is better style.

1 Like

I wonder why we have let _ = foo() to drop stuff directly rather than using drop(foo()).

Because of surrounding something with parentheses for adding drop(...) is more annoying text editing operation compared to adding things just to one side as in let _ = ...?

Also drop is rarely mentioned in compiler messages maybe, so it sometimes does not come to mind when I think "how do I do X? There should be some language construct for this.". Similar reason why |_|() exists. Took some time for me to familiarise that drop is not just some pecularity of libstd (i.e. "we can do such function then why not to do?"), but a thing for me to use. Then it also took some time to realise that drop is in prelude and I don't need to write std::mem::drop every time.

Maybe drop(create_semaphore_guard()); should cause the warning as well, leaving the only way to really silence the warning the explicit #[allow(unused_must_bind)] create_semaphore_guard();? (not counting workarounds like user-defined fn my_drop<T>(t:T){})

2 Likes

I think that's going too far. If someone really wants to silence a warning, it should be straightforward, and not require #[allow(...)]. Besides, std::mem::drop() is not at all special, so there is no reason to special-case it.

1 Like

I really dislike how Swift changes the lifetime of let _ = foo() based on whether or not builds are optimized. This caused a huge headache at work a few months ago and wasted a lot of someone else's time debugging, and then later bit me again. Admittedly this involved a library that was wrapping some C++ code and not forcing usage through a guard as suggested above, which would have prevented the issue. But I do think it runs afoul of my expectations that code behavior should be the same regardless of optimizations. In this case that would presumably mean the shorter lifetime in both versions, as this does seem like a valuable optimization for not wasting stack space on no longer used variables in the same scope.

To the Rust topic at hand, until reading this thread, I expected let _ = foo() to keep the return value of foo() alive until the end of the scope, and the current behavior seems unintuitive. However having minimal lifetimes does seem like a good optimization opportunity (provided it does it in debug too). It would be good to expectation set that scoping is just an organizational/visibility thing and not a lifetime thing though if that's the case.

Regarding #[must_use], I think it would be nicer if there was an explicit way to indicate "ignoring result" that didn't look like a variable binding. I don't have any suggestions though.

It's horrible and don't do this, but I've written

do_fallible_operation().ok();

for this before and it does "work" for this purpose.

1 Like

(supposedly .ok() is an "accepted" way to "use" a result, I disagree and really think these should be marked as #[must_use] as they're otherwise no-op function calls, and the rest of Result's no-op functions are marked).

1 Like

14 posts were split to a new topic: Must_use and linear types

Note that #[must_bind] might not solve the problem for a destructuring assignment, e.g. let (a, b, _) = foo(), where fn foo() would be marked #[must_bind]. I was recently bitten by this: the tuple element not-bound to _ actually needed to stay alive until the end of the block, resulting in a couple of hours of debugging.

Off-topic musing:

It would be great if let (_, _) = foo(); were actually semantic sugar for let (_autogenerated_3482, _autogenerated_8904) = foo();, so that it would have the same semantics as let (_a, _b) = foo(). (This was my mental model of let _ until I encountered this behavior.)

1 Like

I think it could (where to me "solve" here means helpfully emitting a warning) if it was attached to the type itself rather than to the function.

I think using a standalone Mutex to protect network or filesystem resources is a relatively common pattern. I agree with folks who think it's a code smell -- there are probably better ways of doing things -- but it's such a ubiquitous pattern in other languages that I expect people will always want to do it in Rust too. I can't imagine a case where let _ = my_mutex.lock().unwrap(); is correct code, and it seems like it would be nice to warn for that, with very little downside.

Another good use case for this feature might be CString. A common gotcha in unsafe code is to write something like let s = CString::new(...).as_ptr();. That immediately drops the temporary CString and makes s a dangling pointer. This can be valid if you're immediately using the pointer on the same line, but I think even there it's a bug waiting to happen, when someone comes along and helpfully refactors your function call into a few separate lines. My personal rule is that CString in particular should always be bound to a local variable. If that's a common rule for other people, it might be nice to turn it into a compiler warning.

Maybe we could generalize from there to think about the category of "things that are usually created so that you can extract raw pointers from them." Another example that comes to mind is MaybeUninit. I've never run into this problem personally, but I can imagine a similar issue with someone writing MaybeUninit::uninit().as_ptr().

5 Likes

I think 99.99999% of users expect _ to just mean "I'm not bothering to name this thing" even if there is technically documentation for this behavior. It's a violation of principle of least surprise for it to change object lifetime; I'd expect the semantics to be the same as if a random unique name replaced the _. Nobody coming from C++ for example is going to expect a "change in variable name" (not what it is but what most people will think of it as) to cause changes in runtime behavior.

5 Likes

If we imagine _ (wild pattern) like a black hole, it all makes sense as there are no way to get back the value after binding with _.

So I think a clippy lint warns when bind a Drop-able value to _ is nice.

1 Like

The problem with "everyone expects _ to prolong an object's lifetime" is that _'s current behavior is absolutely critical when the binding would be by-move. If _ became _random1234, then any code trying to partially match on an object in the default binding mode will break. For example: Rust Playground

Rust thus needs some syntax for _'s current behavior- it's just too useful a tool in pattern matching to give up. Further, _ is a long-established way of doing this in languages with pattern matching, so renaming it to give _ your "expected" behavior will really only shift the confusion to a different set of people, "violating the principle of least surprise" all over again!

This is why lints are a good solution- they find the places that people aren't getting the expected behavior and teach them to use the existing mechanism for the behavior they want.

16 Likes

FWIW, let __ = ... Just Works :laughing:

More seriously, the pattern in Rust is let _guard, since a value whose only purpose is to do something at the end of the scope through RAII / drop is called a guard.

Actually :thinking:, back to let __, it is currently a horrible hack due to it looking too similar to let _, but we could imagine, within a new edition, to warn against all usages of let _ = (no type annotation, specific let), suggesting to either use let __ / let _guard for the guard pattern, or to use drop to silence a #[must_use]. I do agree that _ as non-binding pattern is just straight up necessary, as @rpjohnst showcased in their example, but let _ itself, without a type annotation, as seen in this thread, is actually a code smell.

For those feelingthat special casing let _ vs. other pattern cases, feels inconsistent, it turns out that it's not really the case! Indeed, the other cases are the following:

  • function parameter, such as fn (Pattern(_): SomeArg) { /* scope */ },

  • if/while let, such as: if let Pattern(_) = ... { /* scope */ };

  • for, such as for Pattern(_) in ... { /* scope */ };

  • match, such as match ... { Pattern(_) => { /* scope */ }}

have all in common that within the /* scope */, the value matched against _ cannot possibly have been dropped!1, which is different than the let Pattern(_) = ...; /* scope */ case discussed in this thread.

1 Yes, it is non-trivial but true (Proof). The fact it is non-trivial is yet another argument w.r.t. let _ being the code smell case.

4 Likes

As you describe, if you use a temporary as the scrutinee, and you compare its drop scope to the pattern binding scope, you do get different behavior across functions/if let/for/match vs let.

However, this appears to be an intentional design choice: Temporaries are consistently dropped at the end of the statement, but only let statements end before their binding scope does. And for good reason- most expression-based languages include let's binding scope in the let expression, as in let pat = val in expr, but in Rust this is tantamount to applying temporary lifetime extension to all temporaries!

The deeper consistency in behavior here is not in whether the scrutinee has been dropped, but whether it has been bound. Binding is the way values are moved, which means _ for temporaries is just refraining from "moving" out of the temporary, and de-special-casing the order of "dropping let temporaries" with "executing let's binding scope" is just special-casing let statements' temporary drop scope to expand beyond the statement!

Thus, the better way to resolve this inconsistency might arguably have been to consider "the rest of the block" as part of the let statement, which would have let us ditch the special case of temporary lifetime extension, and always drop _/..-bound temporaries before executing the bodies of all of functions/if let/for/match/let. (This of course would further cement let _ = as perfectly idiomatic way to ignore a value :slight_smile:)

3 Likes