I think the name `Path::display()` should be replaced with `Path::display_lossy()`

I just reviewed this line:

let path_str = path.as_ref().map(|p| p.display().to_string());

I told the author that I think p.to_str()? should be used instead of display().to_string(), because it is better to error here with "non-unicode paths are not supported" than later with "file "fo�.txt" not found".

It happens that people learn that to print a Path, they must use display(). If they get used to it and they are clever, they figure they can use display().to_string() to convert the Path to String, and then the intricacies of the difference between to_str() and to_string_lossy() are lost.

I think the name display() should be deprecated, and the name display_lossy() should be used instead, to make the issue more evident. What do you think?

4 Likes

I think the use case goes clearly against reasonable practice anyways, regardless of the name. The whole pattern of having a separate display method here does, as far as I'm aware, exists precisely only in order to prevent path from having a lossy to_string() method itself, that doesn't call out the lossyness of the conversion. If lossless conversion to a string was possible, Path would just implement Display directly instead.

You might be describing a real problem of teaching this wrong though. Maybe documentation improvements could be made, too. Users should not blindly learn "to print a Path, they must use display()". That's just a bad thing to learn, analogous to things like blindly throwing unwrap at option or result, blindly silencing warnings, or to blindly throw unsafe at code that doesn't pass the borrow checker.

They should learn "paths are not necessarily valid strings and thus don't support Display and ToString, but for use-cases of displaying a path in order to show it to a user, it offers a display method".

Now that I've written this, I'm thinking that display_lossy could be a reasonable name for the method, too, for sure, and it would help make abundantly clear that it's existence relates closely to its lossy nature. It's a bit redundant to put that into the name, as explained above, because if it perfectly represented the value as a string, the display method absolutely would not need exist in the first place, but nonetheless I don't think that redundancy would have to be a bad thing.

However, we'll need to consider the status quo and cost of change, too. Is it really that commonly misused? How far can we get with documentation alone? Should there perhaps be a clippy lint detecting use like .display().to_string()? With all such improvements made will doing the re-name still be worth the migration costs? How does it compare to other instances (if any even exist) of using deprecation in std merely to rename something?

8 Likes

If we could go back in time, would we do things differently? Maybe. But to consider the actual proposal on the table:

  • Deprecation of a very commonly used method like Path::display, and one where most usages are likely just fine, is an enormous cost.
  • Enormous costs require enormous justification for asking everyone to pay it.
  • It is likely that using path.display().to_string() incorrectly is probably not common enough to be a solid justification. And even when it is used incorrectly, it doesn't immediately result in a problem. It's only a problem when there's a path that contains invalid UTF-8.

So I'd say as a libs-api member, I'm rather skeptical of this suggestion. I also agree that a Clippy lint would be a better first step here.

12 Likes

Seconded that a clippy lint for path.display().to_string() suggesting path.to_string_lossy().into_owned() instead, probably as part of the clippy::suspicious lint group, which is for code patterns that aren't necessarily wrong per say, but do betray a likelihood of not matching intent (e.g. 'a'..'z'). Please consider opening a lint request issue, or even try your hand at implementing it if you're interested. (I hear it's not all that difficult to pattern match code structure to lint this!)

Also seconded that the method doesn't really merit a name change. Having the extra .display() should be enough to prompt the reader to ask “why is the extra adapter necessary?”, &Path is (or should be) universally known to be not just &str, and the Display trait now has a documentation section explaining the concept of display adapters. (I'm glad I eventually got time to contribute that.)

4 Likes

For me, this weighs the scales towards making the change, but probably not enough to tip the overall balance in favor. If erroneous code will usually work “correctly,” then it’s actually more important for the API design to call attention to the misuse to compensate for the reduced likelihood of the problem being detected by tests.

Can clippy in addition detect usages such as

println!("cargo:rerun-if-changed={}", path.display());
1 Like

Unfortunately the stringly-typed Cargo interface can't do any better. It parses cargo: directives as &str, so it will fail if you manage to correctly specify a non-UTF-8 path.

1 Like

The point I'm making is to consider the actual impact of it being not fully correct in balance with the cost of making the change. If the actual impact of it being incorrect is, for example, "you'll get UB" in safe Rust code, then that would be extraordinarily bad and would weigh heavily. But in this case, the actual impact is likely limited to a confusing error message provided to an end user.

3 Likes

I'd like us to have a cargo API for build scripts. Thinking it might be worth using camino to help communicate these needs.

1 Like

If I was going to put my design hat on for this, after "meh" I think a ToStringLossy as part of a "stuff to show the user" vs "stuff to send to another program" design might be interesting.

But that sounds more like a crate than a std thing.