Use of #[must_use] in the standard library

I recently answered a Stack Overflow question where the poster wrote this bit of erroneous code:

let mut enter = String::new();
stdin().read_line(&mut enter).expect(r#"invalid key"#);
enter.trim().to_lowercase();

They expected it to convert enter to lowercase and couldn't figure out why their program didn't work. When I saw this I thought they must have ignored a warning, but no, it turns out to_lowercase has neither a #[must_use] attribute nor an associated Clippy warning.

This led me down a rabbit hole investigating the standard library's usage of the #[must_use] attribute. Overall, I'd say the usage is inconsistent. It's used a lot but also omitted a lot. The guidelines on when to add #[must_use] in the standard library don't set a clear policy on whether it should be used a lot or a little.

What do you think: Should it be used liberally on any function that has no use except for returning a value? Or should it be reserved only for error-prone functions that novices are likely to misuse?

As I see it, it could be helpful on any function that (a) has no mutable args; (b) has a non-unit return type; and (c) has no side effects. Side effects include things like mutating statics, spawning threads, and asserting preconditions.

These are essentially the rules the must_use_candidate lint uses, so... I ran Clippy on the standard library. It turned up nearly 800 matches. Yowza!

Adding that many annotations would be a lot of churn so I wanted to poll the community and see what you all thought should be done (if anything). I pinged some library team folks and they were on board with discussing this here. I'm willing to get into the weeds and work on PRs if there is general consensus.

Also, it's important to note that while it might be annoying to be on the receiving end of new #[must_use] warnings, adding them wouldn't break any stability guarantees.

To help guide the discussion I've extracted a set of examples and grouped them into rough categories.

Transformer "mimics"

Like the monsters that pose as treasure chests in Dark Souls and eat you when you get near, these are newbie traps. They sound like they might mutate their input argument but actually return a new value.

str::to_uppercase(self) -> String;
str::to_lowercase(self) -> String;
str::trim_left(&self) -> &str;
str::trim_right(&self) -> &str;
<[u8]>::to_ascii_uppercase(&self) -> Vec<u8>;
<[u8]>::to_ascii_lowercase(&self) -> Vec<u8>;
i32::swap_bytes(self) -> i32;
i32::rotate_left(self, n) -> i32;
i32::to_be(self) -> i32;
i32::to_le(self) -> i32;

Non-trap transformers

These methods also transform their input, but unlike the "mimics" above they don't trick anybody into thinking they might do it in place.

Option::copied(self) -> Option<T>;
Option::cloned(self) -> Option<T>;
Rc::into_raw(self) -> *const T;
Box::<str>::into_string(self) -> String;
String::into_boxed_str(self) -> Box<str>;

Pure functions

Pure functions have clear inputs and outputs. There's no point in calling them if you're going to ignore the result.

f32::is_nan(self) -> bool;
i32::is_negative(self) -> bool;
f32::sin_cos(self) -> (f32, f32);
slice::is_ascii(self) -> bool;
char::is_control(self) -> bool;
memchr(u8, &[u8]) -> Option<usize>;
memrchr(u8, &[u8]) -> Option<usize>;

Getters

Like the pure functions, why call a getter if you don't check the result? I can't think of a good reason.

into_keys is interesting because it does have the side effect of consuming the map. One could use it to drop the map. But that's a crazy use case, no? They ought to be calling drop.

str::len(&self) -> usize;
str::is_empty(&self) -> bool;
BTreeMap::keys(&self) -> Keys<K, V>;
BTreeMap::into_keys(self) -> Keys<K, V>;
Utf8Error::valid_up_to(&self) -> usize;
Formatter::width(&self) -> Option<usize>;

Used for panics?

It's plausible someone might write a[i] to cause a panic if the index is out of bounds. Is that a legitimate use case or should they get a must use warning?

Index::index(&self, Idx) -> &Self::Output;

Allocating constructors

It's wasteful to allocate a new object just to throw it away.

Vec::with_capacity(usize) -> Vec<T>;
Arc::downgrade(&Arc<T>) -> Weak<T>;
String::from_utf16_lossy(&[u16]) -> String;
Backtrace::capture() -> Backtrace;
mpsc::channel() -> (Sender<T>, Receiver<T>);

Non-allocating constructors

These constructors are cheap. If you don't save the return values it doesn't really cost you much. It's still wasteful, but only minorly so.

Vec::new() -> Vec<T>;
iter:empty() -> Empty<T>;
mem::zeroed() -> T;
str::from_utf_unchecked(&[u8]) -> &str;
MaybeUninit::uninit() -> MaybeUninit<T>;
Instant::now() -> Instant;
30 Likes

I think every case you listed should get a #[must_use]. All of them seem like likely mistakes, even if they do different amounts of harm or wasted work.

23 Likes

In my opinion the strongest case is for methods that return Self. If a different type is returned (such as str::len(&self), the signature would prevent many misuses. However, it does seem like nearly every pure method could be #[must_use], as why else would you call it?

1 Like

In my opinion, the only ones here that I think need #[must_use] are the methods that are confusable for in-place mutation.

Is it remotely common that people call pure methods (not confusable with in-place mutators) and don't use the return value by accident?

5 Likes

There's naming conventions for functions that transform stuff. All of these functions starting with as_/to_/into_ return a result. Implementations of the Into trait don't have must_use either.

Im not sure if this is a good argument against adding must_use to those, but at least they are a distinct category IMO.


btw, note that str::trim_(left|right) are deprecated and their suggested replacements do feature must_use.

Make pull requests! Just split the problem into more reviewable chunks.

16 Likes

For me, the best justification for an explicit must_use (as opposed to some heuristic lint) is when there's something specific that the message could say.

When it's just "this function is pure", we ought to have a way to detect that in most cases.

But for things like to_ascii_uppercase(), it seems very reasonable to have an explicit annotation on the method because that annotation can mention make_ascii_uppercase(), which is more useful to the user than a generic "use the return value" message.

19 Likes

It would be nice to have the compiler be smart enough to not require us to manually put #[must_use] on pure functions, but until that's the case* I'd definitely appreciate liberally adding #[must_use].

*I'd be (pleasantly) surprised if that ever changes.

11 Likes

In progress:

3 Likes

That's not really possible. There are edge cases like eg a function that is pure in release mode but might panic or have a side effect in debug mode. I wouldn't want a warning then.

1 Like

Any warning can have false positives, those are typically covered by an

#[allow(should_use_must_use)]
2 Likes

It's entirely possible to make #[must_use] the default, but it probably needs a few things:

  • An attribute like #[discardable] that means the inverse
  • An allow-by-default lint like unused_result but that gets switched off for #[discardable], for folks to opt in to
  • An edition change for making this the default

Swift did exactly this, requiring @discardableResult annotations to disable a default lint that was enabled in Swift 3.

Given the number of changes in #89692, perhaps we should consider it!

9 Likes

Note:

It's probably true that majority of functions can be #[must_use]. However, that alone doesn't imply that #[must_use] is a better default -- the cost of mistake needs to be taken into account.

In particular, cost of missing #[must_use] is low -- the attribute is actually important only for a small fraction of functions, and it's easy to add it manually for them. For majority of the functions which are not confusable, utility of #[must_use] is low.

On the other hand, cost of missing #[discardable] is high. Today we derive a lot of value from almost all warning being true warnings, actionable and useful. Making #[must_use] default, we'll add an unbounded number of false warnings (new written code which forgets about #[discardable] adds false warnings). As a maximally pessimistic scenario, error handling today relies on #[must_use]; let _ = is rare and stands out, and allows you to quickly notice important places where errors are silent. In the #[discardable] world, let _ = might become frequent enough to not make it jump out at you, reducing error handling reliability.

To clarify, I am not arguing one way or another, I don't know how to weigh the tradeoff. I do want to point out that it is not which one is more common, but which one is more common * cost of wrong classification.

22 Likes

This times a million.

I think Swift's experience shows that it's a good default. Truly irrelevant results are rare. False positives will be visible, so they can be fixed. So #[discardable] is possible to get 100% right.

Keep in mind that lacking #[must_use] is not harmless. It causes users to miss a useful diagnostic. This can create silent bugs.

4 Likes

In terms of syntax I'd try something more generic for mass-setting default arguments, e.g.


#[automatically_add_attr_to_every_fn(must_use)]
impl Foo {
   fn must_use_this() -> bool { true }

   #[except_unset_here(must_use)]
   fn discardable() -> i32 { 1 }
}
1 Like

I think the tension on "too much must_use" comes down to warning fatigue and that this is actually a three-state (if not more) problem:

  • using the function for the side effects and ignoring the return value is a bug waiting to happen (e.g. functions returning Result<(), _>)
  • using the function for the side effects and ignoring the return value is a waste (e.g. pure functions)
    • special case: confusable with mutating/eager version; raises importance
  • using the function for the side effects and ignoring the return value is intended

Warning "wasteful" in the same group as "definite issue" runs the risk of making the risk of making the latter seem like the former.

2 Likes

Very much this. For things like Iterator::map and .to_ascii_lowercase() where it's "oops, I called the wrong thing" I'd be tempted to make it deny-by-default.

Whereas for things like an excess Vec::len() that happens to not be used for a bit I'd probably treat that more like the unused functions lints and often just have it off while writing the code.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.