Path trailing separator inconsistency

If you push the empty string to a PathBuf it adds a trailing separator. However, all other path methods essentially ignore this trailing separator.

Example program:

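use std::path::PathBuf;

fn main() {
    let mut p = PathBuf::new();
    p.push("directory");
    println!("{}", p.display()); // on Unix: directory

    // Pushing the empty string appends a trailing separator.
    p.push("");
    println!("{}", p.display()); // on Unix: directory/

    // The trailing separator is invisible to components().
    println!("{:?}", p.components().last()); // Some(Normal("directory"))
}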

I think components should return the empty string as the last component in this case. This matters because OSes often do distinguish between paths with and without a trailing slash whereas Rust treats them as equivalent.

To my mind, Path is intended to be a list of components so this can be justified in those terms. However, if this change is too breaking then I think it would be good to at least add a method to detect the trailing slash.

I imagine you could just submit a PR for an ends_with_separator method on Path.
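For illustration, here is a minimal sketch of what such a check might look like as a free function today. The name ends_with_separator is hypothetical (it is not a std method), and going through to_string_lossy is an assumption that is good enough for the last character but not a careful treatment of non-UTF-8 paths.

```rust
use std::path::{self, Path, PathBuf};

/// Hypothetical helper (not part of std): true if the path's raw text
/// ends with a path separator for the current platform.
fn ends_with_separator(p: &Path) -> bool {
    p.as_os_str()
        .to_string_lossy()
        .chars()
        .last()
        .map_or(false, path::is_separator)
}

fn main() {
    let mut p = PathBuf::from("directory");
    assert!(!ends_with_separator(&p));

    // Pushing the empty string appends a trailing separator.
    p.push("");
    assert!(ends_with_separator(&p));

    // Component-based equality still treats both spellings as the same path.
    assert_eq!(p, PathBuf::from("directory"));
}
```

Note that this sketch also returns true for a bare root such as "/", which may or may not be what a real API would want.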

Note that the stdlib "deal with this path semantically" implementation has some quirks and inconsistencies. If you care about this level of detail, you may run into them.

Yeah, at the moment I'm interested in how Rust Path interacts with the OS's path. It's a bit of a problem that, according to Rust, trailing slashes are entirely invisible but to the OS they can have meaning. An ends_with_separator function would help but it also feels a bit like bolting on a workaround.

Do you have an example for this? I can only think of userspace tools which care (e.g., rsync).

It is a workaround. Since you care on that level, you'll likely need a lot of workarounds. (Well, what you really want is a well-designed crate, but sadly I don't know one to recommend.) E.g., a "." component is also invisible, unless it's at the start of your path. And a path ending with ".." has a file_name of None.
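A small sketch of those two quirks, using only std::path and relative paths with forward slashes (the path strings are made up for the example):

```rust
use std::ffi::OsStr;
use std::path::{Component, Path};

fn main() {
    // An interior "." is normalized away by components().
    let interior: Vec<Component> = Path::new("a/./b").components().collect();
    assert_eq!(
        interior,
        vec![
            Component::Normal(OsStr::new("a")),
            Component::Normal(OsStr::new("b")),
        ]
    );

    // A leading "." is kept as a CurDir component.
    let leading: Vec<Component> = Path::new("./a").components().collect();
    assert_eq!(
        leading,
        vec![Component::CurDir, Component::Normal(OsStr::new("a"))]
    );

    // Equality is component-based, so these spellings compare equal.
    assert_eq!(Path::new("a/./b"), Path::new("a/b"));
    assert_eq!(Path::new("a/"), Path::new("a"));

    // A path ending in ".." has no file name.
    assert_eq!(Path::new("a/..").file_name(), None);
    assert_eq!(Path::new("a/b").file_name(), Some(OsStr::new("b")));
}
```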

Rant about extensions

There's no method to add an extension, but you can replace (set) the extension. Except if the path has no extension, then setting the extension adds an extension. It's okay if your "extension" includes path separators.

If you ask what the extension is of a file that ends in ".", it will say it's "". But if you replace an extension (empty or not) with "", this really means remove the extension, including the preceding ".". There is no remove extension method; this is apparently the intended -- but undocumented! -- way to do so.

To actually make a file end in ".", then, you clearly can't use set_extension(""). Instead, either set the extension to something that ends in "." directly, or first set the extension to "." (so it ends with "..") and then set the extension to "" (to remove one of the dots).
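The behavior described above can be checked with a short sketch. The with_ext helper and the file names are hypothetical, introduced only to keep the assertions compact; the separator-in-extension case is left out on purpose.

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Hypothetical helper: apply set_extension and return the result as a String.
fn with_ext(path: &str, ext: &str) -> String {
    let mut p = PathBuf::from(path);
    p.set_extension(ext);
    p.to_string_lossy().into_owned()
}

fn main() {
    // Setting an extension on a path without one adds it.
    assert_eq!(with_ext("foo", "txt"), "foo.txt");
    // Setting an extension on a path that has one replaces it.
    assert_eq!(with_ext("foo.txt", "rs"), "foo.rs");
    // Setting "" removes the extension, including the dot.
    assert_eq!(with_ext("foo.txt", ""), "foo");

    // A trailing dot is reported as an empty extension...
    assert_eq!(Path::new("foo.").extension(), Some(OsStr::new("")));
    // ...and setting "" on it strips the dot as well.
    assert_eq!(with_ext("foo.", ""), "foo");

    // Two ways to end up with a trailing dot:
    assert_eq!(with_ext("foo.txt", "bar."), "foo.bar.");
    let two_dots = with_ext("foo.txt", ".");
    assert_eq!(two_dots, "foo..");
    assert_eq!(with_ext(&two_dots, ""), "foo.");
}
```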

Probably this was all inspired by systems where you have 1 or 0 extensions and never a trailing ".".

Anyway, yeah. This part of stdlib could really use some love IMO.

It's part of POSIX, and comes up a lot around symlinks especially (act on the symlink or act on the directory it's pointing to).
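A rough illustration of the symlink case, assuming a Unix-like system (it uses std::os::unix and will not compile on Windows). The function name trailing_slash_demo and the directory names are hypothetical. The idea is that lstat on the bare link describes the link itself, while the same call with a trailing slash should resolve through to the directory.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: creates `base/real_dir` and a symlink `base/link`
/// pointing at it, then returns
/// (bare link reports as a symlink, link with trailing slash reports as a dir).
fn trailing_slash_demo(base: &Path) -> io::Result<(bool, bool)> {
    let target = base.join("real_dir");
    let link = base.join("link");
    fs::create_dir_all(&target)?;
    std::os::unix::fs::symlink(&target, &link)?;

    // No trailing slash: symlink_metadata (lstat) describes the link itself.
    let plain = fs::symlink_metadata(&link)?;

    // push("") appends the trailing separator discussed in this thread.
    let mut slashed = link.clone();
    slashed.push("");
    let resolved = fs::symlink_metadata(&slashed)?;

    Ok((plain.file_type().is_symlink(), resolved.file_type().is_dir()))
}

fn main() -> io::Result<()> {
    let base = std::env::temp_dir()
        .join(format!("trailing_slash_demo_{}", std::process::id()));
    let _ = fs::remove_dir_all(&base); // start from a clean slate

    let (plain_is_symlink, slashed_is_dir) = trailing_slash_demo(&base)?;
    println!("bare link is a symlink: {plain_is_symlink}");
    println!("link with trailing slash is a directory: {slashed_is_dir}");

    fs::remove_dir_all(&base)?;
    Ok(())
}
```

Both paths compare equal as Rust Path values, yet the OS can answer differently for them, which is the mismatch being discussed.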

I don't know of any OSes that enforce this. Sure, on Windows, Linux and macOS most files have 0 or 1 extension. But on all three, .tar.gz files can be accessed and even created just fine. And other OSes only matter insofar as Rust can actually target them, and there aren't too many of those.

Given all this, it wouldn't and doesn't make sense for Rust to not take that into account. Let's face it, in retrospect the Path and PathBuf types have some footguns that shouldn't have been there. Of course, due to stability guarantees, all of stdlib and pretty much every other crate is stuck with them, at least until someone designs a crate better suited to the task AND convinces the ecosystem to start favoring that over Path/PathBuf. Not exactly an easy task.

Maybe I should have said "mindset" and not "system"... really it's just the only thing I've thought of to explain pretty much all of the surprising-to-me behavior. Set [replace] the final extension but no explicit add or remove extension; set to "" is remove extension and/because a trailing "." doesn't count; set with no current extension is add extension... these all make perfect sense for a none-or-one-nonempty extension environment. (But not for any system I've used since FAT.)

Anyway, whatever the historical reasons, I think we're in agreement. Hopefully a lib subteam takes it up at some point (or someone makes an awesome crate).

Ah, right…symlinks. Always a fun topic in the path handling world. Thanks for the link.

I noticed this thread and thought I should mention normpath. I created it, and it should be able to resolve some of these issues. In particular, BasePathBuf::push doesn't add a separator for empty paths.
