FFI-proof slice::from_raw_parts

slice::from_raw_parts has a few requirements that must not be violated, but when writing FFI code:

  • Adding if !ptr.is_null() before every slice::from_raw_parts gets tedious, and is easily forgotten. The C side of the FFI may be promising the pointer is never null, but Rust wouldn't exist if C was good at guaranteeing such things.
  • Approximately nobody checks for isize::MAX length or the pointer alignment.
  • In C, some libraries could declare that if the length is 0, then the pointer doesn't matter and can be any garbage, because it will never be dereferenced, but slice::from_raw_parts(ptr, 0) still has some opinions about validity of the pointer [1].

I think it would be both more convenient and safer if there was a version of slice::from_raw_parts that checks basic validity of the inputs to leave fewer opportunities for UB (non-null, aligned, byte_len < isize::MAX with checked arithmetic). It might be controversial, but when len==0 it would be even safer to only check if ptr is non-null, but return &[] instead of the actual ptr value.

I hope users writing FFI won't expect miracles from the "checked" version of this function, as there's no non-hacky way to check for dangling pointers and buffer overruns that aren't obvious integer overflows. Maybe instead of having a potentially overpromising name like slice::from_raw_parts_checked, the checked function could be std::ffi::slice.

std::ffi::slice(ptr, len).ok_or(Error::Oops)?

That would make the more-convenient-to-write code also the safer option. The checks can be pretty cheap: checked_slice(ptr, len).unwrap_or(&[]) costs about 3 cmoves. The slice::from_raw_parts is always going to be there for the cases where checks are definitely not needed.

Bonus points for slice(ptr: *const T, len: impl TryInto<usize>) so that users won't be gambling with wrapping/truncating in c_len as usize across platforms.


  1. Rust needs such pointers to be at least non-null and aligned. There are exceptions for ZSTs, but I'm not sure if they apply to empty slices of non-ZST types. I don't know how provenance is supposed to work with garbage pointers either ↩︎

8 Likes

Adding a null check to from_raw_parts is approved -- someone just needs to do the work.


I'm not sure how relevant the isize::MAX limit is here, since that's also a restriction in C. Do we have some sample cases of people violating that from FFI?

One thing that would be a nice way to address it more directly (and without checking overhead in from_raw_parts) would be if we had some sort of struct slice::Len<T>(usize, PhantomData<fn(T)->T>); with at least a safety requirement that its value is at most if T::IS_ZST ? usize::MAX ? isize::MAX / sizeof(T).

That plus having it take NonNull instead *mut/*const would move everything but the "well is that allocation even there" into the type system instead, which I think is always preferable. (And nicely separates out the safety preconditions.)

1 Like

modulo perf impact.

Personally I'd like separating the C semantics from the Rust semantics, then one function won't have to please both masters.

3 Likes

I think having a checked version of these functions for FFI use is a good idea. There's little downside to having the checks. And if you do need every to save every cycle of performance then the uncheck version still exists and can be used in that specific situation.

2 Likes

That doesn't work for me, and seems undesirable for every use-case I have.

I can't use it to avoid the if !ptr.is_null() boilerplate, because typically I don't just have to prevent instant UB, but to handle errors at the application level. C APIs use null to signal failures, and even if they're supposed to signal failures via some other error code or have infallibily non-null pointers, getting a surprise null is a sign of things going wrong anyway.

If I just let the null pointers become empty slices, it will likely cause logic errors and data loss down the line. Almost always I need the explicit check to report errors and distinguish between intentionally returned empty slices and failures to return any slice.

In a few cases where I don't need to distinguish null from empty, a function returning Option<&[T]> could easily be combined with .unwrap_or_default() to get the null-means-empty behavior. OTOH a function that always converts null into len==0 needs exactly the same null check as before in the majority of cases where I need the check.

UB could be worse than getting an empty slice, but that is relevant only for code that was already lacking the necessary error handling. Such solution is more like less-lethal ammunition for a footgun than an improved API.

slice_from_raw_nonnull(NonNull::new(ptr)?, len) could get the job done, but:

  • NonNull is not just not null, it's covariant.
  • It's *mut, which is fine for the &mut slice, but not ideal for the &[]. I know it doesn't really matter to Rust, and C APIs often don't use const either, but having separate *const types can still help when writing bindings. Mutable and non-mutable pointers must be handled separately, because making aliased &mut is UB too (*const has saved me a few times when I badly copypasted between _mut and non-_mut versions of methods).
  • having to import another type and call another function is more work, so it's not in the pit of success compared to sticking with the from_raw_parts, and I don't think it would get broad adoption. Personally, I prefer having one function that takes a nullable ptr and returns an Option, so I would use my own wrapper and not std's NonNull version.

Having the check in the slice::from_raw_parts would be annoying, because I also have cases where I really don't want that check, like iterators that use a raw pointer (due to aliasing or to hold start-end instead of start-len) and create a slice for every element. It's not the end of the world – can add unreachable_unchecked hint where the check doesn't get optimized, but it's just another way this change doesn't solve any of my problems and creates new ones.

3 Likes

I'm concerned about:

  1. len == -1 due to off-by-one bugs or accidentally returning -1 used as a sentinel value in length meaning absent/uninitialized slice where 0 could be a valid length.
  2. u64 lengths mixed with 32-bit APIs. 2GB is not much these days, and even 64-bit platforms have to worry about 32-bit ints being everywhere. Integer promotion can sign-extend them to 64 bits too, so size_t lengths can't be trusted either.
  3. Unix-centric code forgetting that Windows is an LLP64 platform, and having more of these bugs in longs on Windows.
  4. Deliberate exploits of numeric overflows caused by malicious inputs (even if all code is 64-bit). Chances are these will pwn the C code anyway, but in case they don't, I'd like these to stop in Rust, not pwn Rust too.

Having a safer type for lengths would be nice, especially if it didn't allow as casts. They're doubly dangerous in FFI and when bindgen-generated bindings may have different integer sizes depending on platform or version of headers it finds.

3 Likes

If you want to guarantee that you don't get the check, you can spell it as one of

NonNull::slice_from_raw_parts(NonNull::new_unchecked(ptr), len).as_ref()

or

NonNull::slice_from_raw_parts(NonNull::new_unchecked(ptr), len).as_mut()

depending whether you want &[_] or &mut [_] respectively, all of which is available in stable.

Though as the8472 noted, the change to slice::from_raw_parts is only approved modulo the check mostly optimizing out anyway so you probably don't need to do that.


I'm certainly not opposed to an ffi::something function for this stuff, but I just overall wonder how much the checking can realistically do. It can't catch overflow like CVE-2018-1000810: std: Buffer overflow vulnerability in str::repeat() › RustSec Advisory Database, for example, and adversarial buffer sizes need not get anywhere near isize::MAX.

I know there are plenty of failure modes it can't check for, but it could check as much as a diligent-slightly-paranoid programmer can check for, while having an API that is nicer to use than the status quo with the bare-minimum !is_null check.

I don't think of prevention of deliberate attacks as the primary goal. Mainly it's about making regular error checking (that returns Result or Option) more convenient and harder to forget/cut corners on in typical cases, hopefully raising the baseline across the ecosystem. Ability to catch the less common problems like misaligned pointers and -1 lengths that authors usually don't bother to check for is a bonus. And the fact that the checks can also catch a subset of other issues like ABI mismatch or some exploits[1] is a cherry on top.


  1. attackers may not be able to overflow lengths exactly, e.g. when type confusion puts a pointer where a length was expected, or where lengths have upper limits, but the application used max-min without enforcing max>min ↩︎

1 Like

As a data point, just this week git2 (a crate that provides libgit2 bindings) fixed a null pointer dereference when using slice::from_raw_parts by adding a null check: Potential null pointer dereference with `Deref` implementation on `Buf` · Issue #1211 · rust-lang/git2-rs · GitHub

2 Likes

+1 for “{NULL, 0} is a valid zero-length ‘slice’ for a lot of C APIs and checking specially for it feels like boilerplate”

2 Likes

dav1d bindings also had to fix a missing null check after it has already shipped to production and caused UB: Don't pass NULL to `slice::from_raw_parts()` by sdroege · Pull Request #121 · rust-av/dav1d-rs · GitHub

so +1 for a safer method that just handles that by default

1 Like

On the isize::MAX check: the case that you're protecting against is exclusively that ffi code actually allocated a single allocation of over isize::MAX bytes. If the FFI code gives Rust an incorrect length, the Rust will still have instant UB from creating a reference without valid provenance to access the full slice.

Rust doesn't currently enable LLVM to exploit said UB, since the dereferencable(N)LLVM attribute can only note the static prefix (IIUC), but we reserve the ability to exploit this in the future. All current discussion on Rust's dynamic memory model (eg Stacked/Tree Borrows) presumes[1] that we retag slice and dyn Trait tails, thus requiring those bytes to be valid to access.

(I am a member of T-opsem, but this represents my own belief only; it is not intended to represent official team position.)


I think the stable null-checked version of slice::from_raw_parts would currently be spelled

ptr::slice_from_raw_parts(ptr, len).as_ref()

since pointer::as_ref is null-checked. (Although as_ref/as_mut being checked when all the other pointer methods are unchecked throws me off all the time.)

If ffi::slice is to exist, producing Some(&[]) given (null, 0) provides rationale separating it from the above spelling. But I've never actually seen a C ABI interface ever use/handle an empty nonnull slice differently than a null slice. Rather, what I've seen is more along the lines of handling length "supporting" a null-empty slice more by accident in an otherwise nonnull expecting function.


This feels like a useful helper to have to me, but the checks other than the null check feel more designed to mitigate UB than to prevent it. This isn't a bad design in resilient C-ABI APIs, but it bears repeating that any UB results in a whole-program removal of all guarantees, so this doesn't really protect you, it just might somewhat by chance maintain a little bit more behavioral sanity than without. Which again, is a useful property. Just not a strong one.


  1. It is theoretically possible to take however the model supports extern type (tail size not known to the compiler) and use that for all unsized types. But that doesn't seem desirable for the unsized types that are known to the compiler. ↩︎

2 Likes

We exploit it as soon as you pass it as a function argument, actually, as of 1.93 thanks to Add LLVM range attributes to slice length parameters by scottmcm · Pull Request #148350 · rust-lang/rust · GitHub

It's also been exploited since 1.86 if you use size_of_val: Set both `nuw` and `nsw` in slice size calculation by scottmcm · Pull Request #136575 · rust-lang/rust · GitHub

1 Like

Technically both only exploit the size limit and don't extend dereferencable (of which I was technically only meaning the dereferencable provenance there), but this is a nice improvement to see as well!

(Aside: size_of_val_raw calls that same intrinsic. It requires len <= usize::MAX so is fine, despite my temporary worry that it not requiring ptr + len to be nowrap could've been problematic with this patch.)