[Pre-RFC] A tiny internal change to `reverse`, `copy_from_slice`, `swap_with_slice`

tl;dr: I propose to rewrite, for example, swap_with_slice from this current version (source)

// Before (minimized)
pub const fn swap_with_slice(&mut self, other: &mut [T]) {
    unsafe {
        ptr::swap_nonoverlapping(self.as_mut_ptr(), other.as_mut_ptr(), self.len());
    }
}

to lift self.len() to the front as follows:

// After (minimized)
pub const fn swap_with_slice(&mut self, other: &mut [T]) {
    let len = self.len();
    unsafe {
        ptr::swap_nonoverlapping(self.as_mut_ptr(), other.as_mut_ptr(), len);
    }
}

The same idea applies to the implementation of reverse and copy_from_slice.

Are there any downsides to such a rewrite?


Motivation

I'm working on formally verifying slice functions in core using Creusot (see also the Verify Rust std challenge).

Part of the approach is to annotate functions with "ghost code" to enable formal reasoning using our tool Creusot with no impact to run-time. The goal being to verify fragments of core, there is an automated "erasure check" that if you remove this ghost code, the remaining code must match the original code in core. This check works for most functions, except the three in the title.

In the swap_with_slice example above, the method as_mut_ptr returns a raw pointer, and we extend its result with some "ghost data" that lets us record what the pointer is pointing to. Unfortunately for us, this "ghost data" carries the same lifetime as the input borrow, whereas previously as_mut_ptr only returned a raw pointer so there were no lifetimes in the result. Because of this, self is no longer available to call self.len().

Hence our current solution is to rewrite this function to get self.len() first, and I'm wondering if it would make sense to upstream this single change to core, instead of extending our "erasure check" with an ad hoc rule for this case. (Our verified version of swap_with_slice for the curious.)

Indeed, even outside of formal verification, one could argue that self.as_mut_ptr() "morally borrows" from self, so calling self.len() while the raw pointer self.as_mut_ptr() is still in use is kinda smelly.

I'm curious to hear people's thoughts on this.

The core thing to me with this is that if you did this, I expect someone will change it back to the other way since it's "obviously" unnecessary.

What's your plan to have a test for it to keep it from regressing?


But at a broader level, note that .len() is special. We do things like rust/compiler/rustc_mir_transform/src/lower_slice_len.rs at main · rust-lang/rust · GitHub that special-case it already. It's special in the ABI since .len() is actually just looking at a function parameter, etc.

So I think you should just support this. Could probably even enable that particular mir pass if you need it.

1 Like

What's your plan to have a test for it to keep it from regressing?

I think a comment explaining why those functions would be written that way should be enough to dissuade Rust maintainers from rewriting it without a stronger reason.

But at a broader level, note that .len() is special. (...)

As far as I understand, these special cases are motivated by performance, or are there any that affect the functional behavior of .len()?

I can recognize that what I'm proposing is a hard sell. Even if my proposal doesn't win, what I'm hoping to get out of this discussion is also a clearer understanding of what makes self.len() special in this context, if it is indeed special.

I don't know if I'd say this is a good way to think of it, but a slice reference is kind of a pointer/metadata pair (where the metadata is the length), so as_mut_ptr could be seen as exclusively borrowing the pointer part but not the metadata part, similar to what you can do with separate fields of tuples (or structs). It wouldn't have to only be an optimization thing. (The special bit would then be that you're doing it using methods instead of direct field access, but at least that's instantaneous special behavior rather than ongoing, and something at least some people would like to support for arbitrary methods some day.)