My understanding is that the length of a slice is limited to at most isize::MAX; slices longer than that would be unsound. Therefore, e.g. adding up the lengths of two different slices, or adding a small constant to a length, should never overflow/wrap.
Still, looking at the compiler output for such operations, this guarantee does not seem to be picked up: [godbolt]
pub fn add_one<T>(slice: &[T]) -> usize {
    match slice.len().checked_add(1) {
        Some(n) => n,
        None => unreachable!(),
    }
}

pub fn add_one_assume<T>(slice: &[T]) -> usize {
    // Tell the optimizer explicitly what the slice invariant already implies.
    unsafe { assert_unchecked(slice.len() <= isize::MAX as usize) };
    match slice.len().checked_add(1) {
        Some(n) => n,
        None => unreachable!(),
    }
}

pub const unsafe fn assert_unchecked(expr: bool) {
    if !expr {
        std::hint::unreachable_unchecked();
    }
}
Here, the second function generates panic-free code, as expected, while the first function does not.
In practice, this has negative performance implications for e.g. array-vectors: for a function like copy_from_slice, the compiler cannot prove that self.len() + slice.len() >= self.len(), which leads to unnecessary panicking branches.
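For concreteness, here is a minimal sketch of the kind of array-vector code I mean. The ArrayVec type and method are made up for illustration; the length is stored as a u32 so that the compiler already has a bound on self.len and the only missing fact is the bound on slice.len():

pub struct ArrayVec<T, const N: usize> {
    buf: [T; N],
    len: u32,
}

impl<T: Copy, const N: usize> ArrayVec<T, N> {
    /// Appends `slice`, panicking if the remaining capacity is too small.
    pub fn extend_from_slice(&mut self, slice: &[T]) {
        let old_len = self.len as usize;
        let new_len = old_len + slice.len();
        assert!(new_len <= N, "capacity exceeded");
        // The compiler knows old_len <= u32::MAX, but without a bound on
        // slice.len() it cannot rule out that the addition above wrapped, so
        // it cannot prove old_len <= new_len, and the start <= end check of
        // this range indexing keeps a panic path that never actually fires.
        self.buf[old_len..new_len].copy_from_slice(slice);
        self.len = new_len as u32;
    }
}

With the isize::MAX bound known (whether baked into slice len or applied by hand as in add_one_assume above), old_len + slice.len() can no longer wrap on a 64-bit target, so that extra panic path could be folded away.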
A really simple solution would be to just add an assert_unchecked statement to the slice len function. Would a PR for that be accepted?
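To sketch what that would buy, here is the same hint applied by hand at the user level (this is not the actual libcore change, just an illustration; shown for &[u8] for simplicity):

#[inline]
pub fn len_with_hint(slice: &[u8]) -> usize {
    let len = slice.len();
    // SAFETY: no valid &[u8] is longer than isize::MAX bytes, so this branch
    // can never be taken; it only informs the optimizer.
    if len > isize::MAX as usize {
        unsafe { std::hint::unreachable_unchecked() };
    }
    len
}

With a hint like this inside the real len, the add_one example above would compile to the same panic-free code as add_one_assume.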
As one detail, only slices whose T is not zero-sized have the guarantee len <= isize::MAX as usize. More precisely, the guarantee is that len * size_of::<T>() <= isize::MAX as usize, which is an even stronger guarantee when size_of::<T>() > 1. (I'm not sure whether this stronger guarantee needs to be rewritten into something of a len <= calculated_max_len::<T>() form for LLVM to make full use of it; a sketch of what I mean follows.)
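Spelled out, the rewritten form would look roughly like this (calculated_max_len is just the made-up name from the sentence above, not an existing API):

pub const fn calculated_max_len<T>() -> usize {
    match std::mem::size_of::<T>() {
        // Zero-sized elements: the length is only bounded by usize::MAX.
        0 => usize::MAX,
        // Otherwise len * size_of::<T>() <= isize::MAX, i.e.
        // len <= isize::MAX / size_of::<T>().
        size => isize::MAX as usize / size,
    }
}

pub fn len_with_strong_hint<T>(slice: &[T]) -> usize {
    let len = slice.len();
    // SAFETY: every valid slice satisfies len <= calculated_max_len::<T>(),
    // so this branch is unreachable and only informs the optimizer.
    if len > calculated_max_len::<T>() {
        unsafe { std::hint::unreachable_unchecked() };
    }
    len
}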
Adding validity ranges to the &[T] metadata is on my todo list, but it requires changes to the compiler. Adding assumes in the library won't get us the niches, but it should be easier to implement.
Is it actually always true? It's true for 64-bit systems, and probably for 32-bit ones, but what about 16-bit targets, or any other pointer width? I would assume such users would not be happy to give up half of the possible address space.
They can use the full address space, but not in a single object. Pointer offsets are signed, all the way down to LLVM getelementptr, so must not exceed isize::MAX.
So long as we're using LLVM and interoperating with C, it'll be the rule regardless of pointer size. So we've just adopted it into Rust as a fundamental restriction on objects.
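As a small illustration of where the signed bound already shows up in the stable pointer APIs (just an example, not new API): offsets and in-object distances are isize, so an object bigger than isize::MAX bytes could not even describe its own extent.

fn byte_span(slice: &[u8]) -> isize {
    let first = slice.as_ptr();
    // One past the end is still part of the same allocated object.
    let end = unsafe { first.add(slice.len()) };
    // offset_from returns a signed element count; for &[u8] that is the size
    // of the object in bytes, which therefore must fit in isize.
    unsafe { end.offset_from(first) }
}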