Getting raw unchecked access to a buffer is an important use case that Rust, as a systems language, needs to be able to provide. Obviously, it should be discouraged, especially when a developer is prototyping out an implementation. However, it still needs to be something that can be done when performance matters.
Currently the way we provide raw access is sporadically offered via some unsafe_*
methods on slices, and by some utilities provided by the ptr
module. Everything you need is there, but it’s confusing and often cumbersome to do. Making unsafe code complicated and hard to write is a recipe for disaster. This is code that needs to have the most care taken while written.
To alleviate these issues, this RFC proposes to introduce raw slice types in the form of *const[T]
and *mut[T]
. These types would consist of a rawptr, and a length. They would expose the following APIs:
pub Trait RawSlice<T> {
/// Converts the rawslice into a slice
unsafe fn as_slice(&self) -> &[T];
/// Converts the rawslice into a rawptr
fn as_ptr(&self) -> *const T;
/// Reads the data at the given index and interprets it as a value of T.
/// This does not move the value out.
unsafe fn read(&self, index: uint) -> T;
}
pub Trait RawMutSlice<T> : RawSlice<T> {
//// Converts the rawslice into a mutable slice
unsafe fn as_mut_slice(&self) -> &mut[T];
/// Converts the rawslice into a mutable rawptr
fn as_mut_ptr(&self) -> *mut T;
/// Writes a value to the given index without reading or destroying whatever
/// data might exist at that index. Appropriate for initializing unitialized
/// data.
unsafe fn write(&mut self, index: uint, val: T);
/// Reads the data at the given index and interprets it as a value of T.
/// Then zeros out the memory without calling any destructors.
unsafe fn read_and_zero(&mut self, index: uint) -> T;
/// Swaps the value at the given index.
unsafe fn replace(&mut self, index: uint, val: T) -> T;
/// Copies the contents of the given rawslice into this one, assuming that they
/// might have overlapping regions of memory.
unsafe fn copy(&mut self, from: *const[T]);
/// Copies the contents of the given rawslice into this one,
/// assuming they don't have any overlapping memory.
unsafe fn copy_nonoverlapping(&mut self, from: *const[T]);
}
These types would also provide unchecked indexing and subslicing via the []
operator.
In addition, existing extension traits would be modified accordingly:
pub trait ImmutableSlice<'a, T> {
...
/// Converts the slice into a rawslice for unsafe manipulation
fn as_raw(&self) -> *const[T];
}
pub trait MutableSlice<'a, T> {
...
/// Converts the mutable slice into a mutable rawslice for unsafe manipulation
fn as_mut_raw(self) -> &mut[T];
}
pub trait RawPtr<T> {
...
/// Converts the ptr into a rawslice with the given length
fn as_raw_slice(&self, len: uint) -> *const[T];
}
pub trait RawMutPtr<'a, T> {
...
/// Converts the mutable ptr into a mutable rawslice with the given length
fn as_mut_raw_slice(&self, len: uint) -> &mut[T];
}
All unsafe_*
methods would then be removed from regular slices, in favour of coercing them to raw slices first.
Case study: Porting insertion sort from rawptrs to rawslices
Old code:
fn insertion_sort<T>(v: &mut [T], compare: |&T, &T| -> Ordering) {
let len = v.len() as int;
let buf_v = v.as_mut_ptr();
// 1 <= i < len;
for i in range(1, len) {
// j satisfies: 0 <= j <= i;
let mut j = i;
unsafe {
// `i` is in bounds.
let read_ptr = buf_v.offset(i) as *const T;
// find where to insert, we need to do strict <,
// rather than <=, to maintain stability.
// 0 <= j - 1 < len, so .offset(j - 1) is in bounds.
while j > 0 &&
compare(&*read_ptr, &*buf_v.offset(j - 1)) == Less {
j -= 1;
}
// shift everything to the right, to make space to
// insert this value.
// j + 1 could be `len` (for the last `i`), but in
// that case, `i == j` so we don't copy. The
// `.offset(j)` is always in bounds.
if i != j {
let tmp = ptr::read(read_ptr);
ptr::copy_memory(buf_v.offset(j + 1),
&*buf_v.offset(j),
(i - j) as uint);
ptr::copy_nonoverlapping_memory(buf_v.offset(j),
&tmp as *const T,
1);
mem::forget(tmp);
}
}
}
}
New code:
fn insertion_sort<T>(v: &mut [T], compare: |&T, &T| -> Ordering) {
let len = v.len() as int;
let buf_v = v.as_mut_slice();
// 1 <= i < len;
for i in range(1, len) {
// j satisfies: 0 <= j <= i;
let mut j = i;
unsafe {
// `i` is in bounds.
let read_ptr = &buf_v[i] as *const T;
// find where to insert, we need to do strict <,
// rather than <=, to maintain stability.
// 0 <= j - 1 < len, so j - 1 is in bounds.
while j > 0 &&
compare(&*read_ptr, &buf_v[j - 1])) == Less {
j -= 1;
}
// shift everything to the right, to make space to
// insert this value.
// j + 1 could be `len` (for the last `i`), but in
// that case, `i == j` so we don't copy. The
// offset by j is always in bounds.
if i != j {
let tmp = ptr::read(read_ptr);
buf_v[j + 1, ..].copy(buf_v[j, i]);
buf_v.write(j, tmp);
mem::forget(tmp);
}
}
}
}
The resultant code is more concise, and the use of slicing syntax makes it more clear what exactly is happening. Note that an explicit length computation is completely dropped, as it falls naturally out of the range that the slice is over.
In general, this API creates a much nicer migration path from safe slice code to unsafe slice code. Just coerce the slice to a rawslice, shadow the variable, and the rest of the code should stay the same. This minimizes the chances of a translation error causing a bug. If bugs do occur, commenting out the shadow instantly restores safe code for debugging purposes. Ideally, developers should be more likely to start with safe code if they know that the migration should be painless.