Allow std::fs::DirEntry to vend borrows of its filename

At present, std::fs::DirEntry::file_name(&self) returns an owned OsString, which is safely cross-platform. But on Unix (at least), this involves an unnecessary allocation: the underlying dirent C struct already contains the filename, so we could just return a reference to it. The current Unix implementation of file_name already builds that reference before copying it: OsStr::from_bytes(self.name_bytes()).to_os_string().

I've implemented a variant that works the same as that, minus the call to to_os_string(), and it works in trivial tests on my machines (x64 and arm64 Macs). I'm pretty new to Rust, though, so I might be missing something.
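To make the difference concrete, here is a minimal sketch of the two shapes on Unix. ModelEntry, its name_bytes field, and file_name_ref are hypothetical names used only for illustration; the real DirEntry keeps its name in private, platform-specific storage, so this models the idea rather than reproducing the standard library's code.

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::OsStrExt; // Unix-only: provides OsStr::from_bytes

/// Hypothetical stand-in for a Unix directory entry. The real
/// std::fs::DirEntry stores its name in private, dirent-backed storage.
struct ModelEntry {
    name_bytes: Vec<u8>,
}

impl ModelEntry {
    /// Same shape as today's file_name(): view the bytes as an OsStr,
    /// then copy them into a freshly allocated OsString.
    fn file_name(&self) -> OsString {
        OsStr::from_bytes(&self.name_bytes).to_os_string()
    }

    /// Hypothetical borrowing variant: the same view, without the copy.
    /// The returned reference cannot outlive the entry it borrows from.
    fn file_name_ref(&self) -> &OsStr {
        OsStr::from_bytes(&self.name_bytes)
    }
}
```

Both accessors yield the same name; the borrowing one simply points back into the entry's own buffer instead of allocating a new one.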

I'm unclear on how this would/could work on Windows. Even in the worst case, though, there are two options. The new API could be part of the Unix extensions to DirEntry only, putting the responsibility on the caller to know when this "optimized" API is available. Or the cross-platform façade of DirEntry could gain a new function that returns a std::borrow::Cow<OsStr>, which on Unix would always be a borrow.
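A rough sketch of what the Cow idea might look like, written against a made-up model so that it compiles on any platform. NativeName and file_name_cow are hypothetical names, and both branches are simplified: the Bytes branch assumes valid UTF-8 (real Unix code would use OsStr::from_bytes and accept arbitrary bytes), and the Wide branch converts lossily rather than the way the standard library does on Windows.

```rust
use std::borrow::Cow;
use std::ffi::{OsStr, OsString};

/// Hypothetical model of the two kinds of native filename storage.
enum NativeName {
    /// Unix-like: bytes already in the form an OsStr can point at.
    /// (This model additionally assumes they are valid UTF-8.)
    Bytes(Vec<u8>),
    /// Windows-like: 16-bit code units that must be re-encoded first.
    Wide(Vec<u16>),
}

impl NativeName {
    /// Hypothetical cross-platform accessor: borrow when the stored form
    /// can be viewed as an OsStr directly, allocate when it cannot.
    fn file_name_cow(&self) -> Cow<'_, OsStr> {
        match self {
            NativeName::Bytes(bytes) => {
                // Simplification: panics on non-UTF-8 names, unlike real Unix code.
                let s = std::str::from_utf8(bytes).expect("model assumes UTF-8 names");
                Cow::Borrowed(OsStr::new(s))
            }
            NativeName::Wide(units) => {
                // Simplification: lossy conversion; this branch always allocates.
                Cow::Owned(OsString::from(String::from_utf16_lossy(units)))
            }
        }
    }
}
```

Callers would get a Cow::Borrowed on the Unix-like branch and a Cow::Owned on the Windows-like one, and could treat both as an &OsStr through deref.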

The most obvious use case for this feature is sorting DirEntrys by name (since the order read_dir returns them in is platform/filesystem-dependent), but any situation where an unnecessary allocation can be avoided is a win in my book.
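For reference, here is a small sketch of the sorting case using only today's stable API. The function name sorted_entries is my own. With the owned file_name(), sort_by_key would build a fresh OsString each time the sort asks for a key, so this version uses sort_by_cached_key, which computes each key at most once per entry. A borrowing accessor would, in principle, remove even those allocations.

```rust
use std::fs::{self, DirEntry};
use std::io;
use std::path::Path;

/// Collect the entries of `dir` and sort them by file name.
fn sorted_entries(dir: &Path) -> io::Result<Vec<DirEntry>> {
    // read_dir yields io::Result<DirEntry>; stop at the first error.
    let mut entries = fs::read_dir(dir)?.collect::<io::Result<Vec<_>>>()?;

    // file_name() allocates an OsString per call, so cache the keys:
    // one allocation per entry instead of one per key lookup.
    entries.sort_by_cached_key(|entry| entry.file_name());

    Ok(entries)
}
```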

I'd love to hear some thoughts on this. Like I said, I'm pretty new to Rust, so I may be missing something huge.

There was a previous conversation on this topic, but it trailed off a bit from the main point, I think:

Having this as an OS extension would make sense.

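The Windows DirEntry holds (a data structure which holds) the filename in Windows' native encoding, which is UTF-16 (possibly ill-formed, and often loosely called UCS-2), while an OsString on Windows is WTF-8. Converting between the two requires an allocation. If the DirEntry implementation changed to store the converted name, it would need to allocate on every readdir call, which isn't an improvement. And OsString's encoding can't easily change because of methods like OsStr::to_str, which has to hand back a borrowed &str without converting anything.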

It could be a platform-specific extension for Unix. And it might be possible to get a &[u16] for Windows.

Oh Windows…

Yeah, I'm definitely not advocating to change OsString or OsStr. Can I ask why the idea of vending a Cow from the platform-neutral API seems to have not been as compelling as I was thinking? That seemed, to me, to be a good solution to the problem. Am I misunderstanding why/when to use Cow?

In any case, is this something that would make sense to try to PR into the standard library? I've read a bit of the contribution guide, and it seems fairly doable.

Adding it as a new method on the std::os::unix::fs::DirEntryExt trait sounds reasonable to me!
