For a CLI program that uses its arguments directly, like grep or find, I wouldn't bother with any cleaning or canonicalisation at all. If the user requests an operation on "foo", it's easiest for them to understand results and responses if they're phrased in identical terms.
As for the directory-tree-as-data-structure case, I came up with my "monotonic" algorithm while I was working on a crate for local HTTP caching, so it's definitely the way I want such a program to work.
That example cuts both ways, though: imagine a symlink pointing to the wrong place, and you get an error message saying "could not read foo/bar/baz: not found" when foo/bar/baz definitely exists and its last-modified time is much older than the error message.
Yeah, a fully-canonicalized path is useful when you care about file contents but less so when you care about the directory structure itself. That said, the "monotonic" algorithm never tries to resolve a symlink at the end of a path (because the end of a path can never be followed by .. or it wouldn't be the end of the path), so I'm not worried about that.
"unsafe" usually means exclusively memory-unsafety, so maybe not those particular names. This is more the "validated" versus "unvalidated" kind of safety, like SQL injection and cross-site-scripting, and (like those problems) the real solution is separate data-types that can't be easily mixed. I think this thread is about extending the (single) PathBuf type rather than designing a new, safer path manipulation API, so I don't think "safe by default" is a practical goal here.
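To make the "separate data-types" point concrete, here is a rough sketch of what I mean. ValidatedPath and resolve_under are names I made up for illustration, and the validation rule (only plain name components allowed) is a deliberately strict placeholder, not a proposal for std:

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical wrapper type: a path that has already passed validation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ValidatedPath(PathBuf);

impl ValidatedPath {
    /// Placeholder rule (an assumption for this sketch): accept a path
    /// only if every component is a plain name, so no prefix, no root,
    /// no leading `.` and no `..`.
    pub fn new(raw: &Path) -> Option<Self> {
        let all_normal = raw
            .components()
            .all(|c| matches!(c, Component::Normal(_)));
        if all_normal {
            Some(ValidatedPath(raw.to_path_buf()))
        } else {
            None
        }
    }

    pub fn as_path(&self) -> &Path {
        &self.0
    }
}

/// Takes a ValidatedPath rather than a &Path, so passing unvalidated
/// input is a type error instead of a runtime surprise.
pub fn resolve_under(base: &Path, rel: &ValidatedPath) -> PathBuf {
    base.join(rel.as_path())
}
```

The only way to get a ValidatedPath is through new(), so the check can't be skipped by accident, which is the same trick used to keep escaped and unescaped strings apart.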
On the other hand, providing the tools to build a safe, higher-level API seems reasonable. How would you feel about an is_relative_descendant(&self) -> bool method that returns true for a path that does not start with a prefix or a root component, and does not escape its prefix with .. components? It would fit nicely with is_absolute() and is_relative(), since on Windows, paths like \foo and C:foo are relative, but not relative descendants.
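Something like the following is what I have in mind, written as a free function since the method doesn't exist on Path. It's a minimal sketch that assumes a purely lexical check: it never touches the filesystem, so it says nothing about symlinks, and whether foo/.. should count as a descendant is a judgment call I've answered with "yes" here.

```rust
use std::path::{Component, Path};

/// Hypothetical free-function version of the proposed method.
/// Returns true if `path` has no prefix or root component and its
/// `..` components never climb above the starting directory.
pub fn is_relative_descendant(path: &Path) -> bool {
    // Number of plain name components we are currently "inside".
    let mut depth: usize = 0;
    for component in path.components() {
        match component {
            // `C:` style prefixes and a leading root both disqualify.
            Component::Prefix(_) | Component::RootDir => return false,
            // `.` changes nothing.
            Component::CurDir => {}
            // `..` is fine only if there is something to pop.
            Component::ParentDir => {
                if depth == 0 {
                    return false;
                }
                depth -= 1;
            }
            Component::Normal(_) => depth += 1,
        }
    }
    true
}
```

With this, foo/bar and foo/../bar pass, while ../foo, foo/../../bar and /foo are rejected. On Windows, \foo fails on the root component and C:foo fails on the prefix, which matches the distinction described above.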
You might look at Python's pathlib API, which has PurePosixPath and PureWindowsPath types, which do not touch the filesystem and can therefore be used on any platform, and PosixPath and WindowsPath types, which do touch the filesystem and therefore can only be used on their respective platforms.