Rename file without overriding existing target

I just noticed that std::fs::rename may overwrite the target if it exists. This makes sense for many applications, but for other use cases the opposite is strongly required. The problem is that non-overwriting behavior is hard to implement correctly and easy to get wrong.

  • The naive approach is to have if !target.exists() { rename() }, but this has a TOCTTOU bug, where a race hazard may cause the target file to be overwritten regardless
  • One might want to create an empty file at the target location (which has atomic existence checks) which can then safely be overwritten during the rename. But a quick check with the documentation will show that this approach cannot be implemented correctly either
  • The only correct solution so far is to manually make the corresponding native calls, which needs to be implemented for each platform.
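
To make the first bullet concrete, here is a minimal sketch of the naive check-then-rename approach. The function name rename_if_absent_racy is made up for illustration; the comment marks the window in which the race can happen.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Naive "rename unless the target exists". Racy: shown only to
/// illustrate the TOCTTOU problem, not as something to use.
/// (Hypothetical helper name, not part of std.)
fn rename_if_absent_racy(from: &Path, to: &Path) -> io::Result<()> {
    // Time of check.
    if to.exists() {
        return Err(io::Error::new(
            io::ErrorKind::AlreadyExists,
            "target already exists",
        ));
    }
    // Race window: another process may create `to` right here, and the
    // rename below would then replace it.
    // Time of use.
    fs::rename(from, to)
}
```

This behaves as expected in a single process with nothing else touching the directory, which is exactly why the bug is easy to miss.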

I'd be really happy to see something like rename_noreplace in the standard library. It is implemented for all major platforms, has a lot of valid use cases and would be tricky to manually implement otherwise.

How would this be implemented on Linux? rename always overwrites, so the only way I see is using link followed by unlink on the original path. This has the same sort of issue as you listed: this unlink may well unlink something else, violating the expectation that renaming be atomic.
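
A rough sketch of the link-then-unlink idea using only std, assuming both paths are regular files on the same filesystem (hard links do not work across filesystems or, generally, for directories). The function name rename_via_link is made up for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Emulates a non-replacing rename with link + unlink.
/// (Hypothetical helper name, not part of std.)
fn rename_via_link(from: &Path, to: &Path) -> io::Result<()> {
    // Creating the hard link fails with AlreadyExists if `to` is present,
    // so the existence check and the creation happen in one step.
    fs::hard_link(from, to)?;
    // This second step is NOT atomic with the first: between the two
    // calls both names exist, and `from` could have been replaced by
    // something else that we would then remove.
    fs::remove_file(from)
}
```

The existing target is never overwritten here, but the operation as a whole is two steps, which is the atomicity problem described above.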

With root privileges, you could try to set the immutable flag and, if that succeeds, use rename, which will now fail if the file is still there (it might be that the new name was unlinked between opening the fd and performing the ioctl).

My point is that this is a sufficiently hackish business to justify the absence of the function you desire — if my Linux knowledge is up to date.

You need to call renameat2 with the RENAME_NOREPLACE flag. This has existed since Linux 3.15 / glibc 2.28.

But actually I'm seeing now that this is a Linux-specific option, so no idea about other UNIX platforms.
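
A hedged sketch of what such a call could look like from Rust on Linux. To stay dependency-free it goes through the raw syscall(2) entry point with hard-coded syscall numbers for x86_64 and aarch64; real code would more likely use the libc crate's bindings. The function name rename_noreplace and the locally defined constants are assumptions of this sketch, not an existing std API.

```rust
use std::io;
use std::path::Path;

/// Rename `from` to `to`, failing instead of replacing an existing `to`.
/// (Hypothetical helper; Linux x86_64/aarch64 only in this sketch.)
#[cfg(all(target_os = "linux", any(target_arch = "x86_64", target_arch = "aarch64")))]
fn rename_noreplace(from: &Path, to: &Path) -> io::Result<()> {
    use std::ffi::CString;
    use std::os::raw::{c_int, c_long, c_uint};
    use std::os::unix::ffi::OsStrExt;

    unsafe extern "C" {
        // The generic syscall(2) wrapper provided by the C library.
        fn syscall(number: c_long, ...) -> c_long;
    }

    // Values copied from the Linux headers (assumed for these two arches).
    #[cfg(target_arch = "x86_64")]
    const SYS_RENAMEAT2: c_long = 316;
    #[cfg(target_arch = "aarch64")]
    const SYS_RENAMEAT2: c_long = 276;
    const AT_FDCWD: c_int = -100;
    const RENAME_NOREPLACE: c_uint = 1;

    let from_c = CString::new(from.as_os_str().as_bytes())?;
    let to_c = CString::new(to.as_os_str().as_bytes())?;

    // SAFETY: both pointers refer to valid NUL-terminated strings that
    // outlive the call.
    let rc = unsafe {
        syscall(
            SYS_RENAMEAT2,
            AT_FDCWD,
            from_c.as_ptr(),
            AT_FDCWD,
            to_c.as_ptr(),
            RENAME_NOREPLACE,
        )
    };
    if rc == 0 {
        Ok(())
    } else {
        // EEXIST if the target exists; EINVAL or ENOSYS where the flag
        // or the syscall is not supported.
        Err(io::Error::last_os_error())
    }
}

/// Every other platform: report that this sketch has no implementation.
#[cfg(not(all(target_os = "linux", any(target_arch = "x86_64", target_arch = "aarch64"))))]
fn rename_noreplace(_from: &Path, _to: &Path) -> io::Result<()> {
    Err(io::Error::new(
        io::ErrorKind::Unsupported,
        "rename_noreplace is not implemented for this platform in this sketch",
    ))
}
```

Whatever error comes back, an existing target is left untouched, which is the property that matters here.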

On macOS, you can call renameatx_np with RENAME_EXCL flag.

On Windows, it seems not overwriting is the default behavior, and you need to call MoveFileEx with MOVEFILE_REPLACE_EXISTING flag to overwrite.

Overall, it seems very reasonable to provide this in the standard library. It also could be created as a crate.

What is the process to get this into the standard library? I'd prefer that over a separate crate

Getting it into a crate can show that your impl works and can be a proving ground for the API (not that this particularly has a lot of API surface to quibble over).

The less common unixes may be a problem, e.g. freebsd has no renameat2

Making a crate just to get it into the stdlib honestly sounds like an unnecessary detour, at least for such a simple and straightforward API (literally a single function addition).

Are there any precedents/examples for handling functionality that is supported on most major platforms but not all? I know that for structs there are platform specific extension traits, but what for standalone methods?

This is an addition of a new API to the standard library, so this section of std-dev-guide applies: API Change Proposals - Standard library developers Guide.

Do mv -i implementations avoid TOCTTOU somehow, or is the flag best effort only?

At least for OpenBSD it is checked once to know if a question should be asked, but it doesn't affect the actual rename in any way, so there is a race condition. src/mv.c at 1b4a394fde7ee125bb7af82662ad3ea96d51265f · openbsd/src · GitHub

The GNU coreutils call renameatu with RENAME_NOREPLACE, while renameatu uses renameat2 or renameatx_np if available: gnulib/renameatu.c at master · digitalocean/gnulib · GitHub

I'm not 100% sure, but it looks like it falls back to a non-atomic emulation of the desired behavior if none of the syscalls are available.

Frustrated by the same problem and inspired by the comments in this thread, I published a crate! In the process of developing this crate, I realised that this sort of thing probably doesn't belong in the standard library. It's not supported everywhere so you need to check for support before using it. Although it is supported by 5-year-old versions of the three most common operating systems so that's probably good enough for most people.

I'll note that the Linux support is not just "3.15". The syscall has been around since 2008 and in glibc 2.10 it seems, so it'll be there in anything that works. However, the filesystem support is spread out across time:

    ext4 (Linux 3.15);
    btrfs, tmpfs, and cifs (Linux 3.17);
    xfs (Linux 4.0);
    Support for many other filesystems was added in Linux 4.9, including ext2, minix, reiserfs, jfs, vfat, and bpf.

It really feels to me like one of those "better to ask forgiveness than permission" situations: try the call and fall back to less robust strategies if and when that fails (with something like -ENOSYS, -EOPNOTSUPP, -EINVAL, etc.).

This seems like a better way to go. It makes the API a bit less clunky. Version 0.2.0 has been published with this change.

This mess is why the C++ filesystem API declares that it has UB if two programs interleave access and modification to the same, er, filesystem objects.

This would be a massive footgun for security-critical applications. If atomic renaming was requested and is impossible just return an appropriate io::Error.

Yes, I should have been clearer that the caller of the "move without replace" API is the one that gets to decide the fallback strategy (mechanism over policy).
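
One way this could look, as a sketch only: the atomic primitive is passed in as a closure, and the caller states explicitly what should happen when that primitive reports "not supported". All names here (Fallback, is_unsupported, rename_noreplace_with) are hypothetical, and the errno values are the Linux ones, hard-coded as an assumption to keep the example dependency-free.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// What the caller wants when the atomic primitive is unavailable.
enum Fallback {
    /// Return the error; never do anything racy.
    Fail,
    /// Fall back to a racy check-then-rename.
    BestEffort,
}

/// True if the error looks like "this operation is not supported here".
fn is_unsupported(err: &io::Error) -> bool {
    // Linux errno values (assumed; other platforms differ).
    const EINVAL: i32 = 22;
    const ENOSYS: i32 = 38;
    const EOPNOTSUPP: i32 = 95;
    err.kind() == io::ErrorKind::Unsupported
        || matches!(err.raw_os_error(), Some(EINVAL) | Some(ENOSYS) | Some(EOPNOTSUPP))
}

/// Try `atomic` first; only if it is unsupported, apply the caller's policy.
fn rename_noreplace_with<F>(
    from: &Path,
    to: &Path,
    atomic: F,
    fallback: Fallback,
) -> io::Result<()>
where
    F: FnOnce(&Path, &Path) -> io::Result<()>,
{
    match atomic(from, to) {
        Err(err) if is_unsupported(&err) => match fallback {
            Fallback::Fail => Err(err),
            Fallback::BestEffort => {
                // Racy by construction; the caller opted in.
                if to.exists() {
                    return Err(io::Error::new(
                        io::ErrorKind::AlreadyExists,
                        "target already exists",
                    ));
                }
                fs::rename(from, to)
            }
        },
        // Success, or a real failure such as AlreadyExists: pass through.
        other => other,
    }
}
```

A security-sensitive caller would pass Fallback::Fail and get the io::Error back, while a tool like mv -i could reasonably choose Fallback::BestEffort.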
