Deprecate or Lint CString::as_ptr


#1

Everything seems fine, the docs are clear:

fn as_ptr(&self) -> *const c_char

Returns the inner pointer to this C string.

The returned pointer will be valid for as long as self is and points to a contiguous region of memory terminated with a 0 byte to represent the end of the string.

Dereferencing it or passing it to a foreign function requires unsafe.

Still, CString::as_ptr() is a common source of use-after-free errors in Rust.

It is a recurring gotcha, and we can’t ignore it just because it takes an unsafe { } block to trigger.
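For anyone who hasn’t run into this yet, here is a minimal sketch of the gotcha. The function name is made up for illustration, and CStr::from_ptr stands in for whatever foreign function would actually read the pointer. The broken line is kept as a comment so the example itself stays sound:

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper: reads a string back through its raw pointer,
/// the way a C function such as strlen would.
pub fn len_via_pointer(s: &str) -> usize {
    // The gotcha looks like this:
    //     let p = CString::new(s).unwrap().as_ptr();
    // The temporary CString is dropped at the end of that statement, so `p`
    // dangles and any later read through it is a use after free.

    // The sound version keeps the CString in a named binding that outlives
    // every use of the pointer.
    let owned = CString::new(s).expect("input must not contain NUL bytes");
    let p: *const c_char = owned.as_ptr();

    // SAFETY: `owned` is still alive here, so `p` points at a valid,
    // NUL-terminated buffer.
    unsafe { CStr::from_ptr(p).to_bytes().len() }
}
```

Calling len_via_pointer("hello") goes through the raw pointer and back. The only thing keeping it sound is that `owned` lives until the end of the function.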


#2

Maybe we can develop a lint that’s smart enough to catch the dangerous uses of as_ptr(). It would certainly help many people.


#3

Interop with FFI in general seems to be an increasingly common stumbling block. There’s little out there about how to, for example, return a generated C string or deal with arrays.

I’m trying to whip an FFI binding generator into shape so we can just tell people “run this on your crate and it will generate the correct .h/.py/.rb binding for you”, though that doesn’t help with the above.

Maybe we also need a standard (or at least, blessed) FFI module/crate that has helpers for the less trivial but very common use cases, and have the book and docs point to that. At least then we’ll have something to point people to on SO other than saying “uh… you’re kinda on your own, good luck.”


#4

I can see two reasonable models. One is to call CString::into_ptr, then call CString::from_ptr to free the string. The other is to encourage people to write bindings like extern { pub fn strlen(s: &CStr) -> libc::size_t; } instead of using raw pointers.

&CStr doesn’t actually work correctly in extern declarations at the moment, but it should be easy to fix. Some sort of CSlice would probably be a good companion to CStr.

I think the "construct a CString, then call as_ptr()" pattern is floating around simply because there isn’t a better alternative at the moment; neither of the above models is usable with stable Rust. If you’re familiar with C++, it isn’t such a big deal: std::string has a data() method which gets used all the time. That said, only a C++ programmer would consider this pattern a good idea.
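A rough sketch of the first model, for anyone who wants to experiment. It assumes the stable names CString::into_raw and CString::from_raw, which as far as I know are what into_ptr/from_ptr ended up being called. make_greeting and free_greeting are hypothetical names, and a real export would also need a no-mangle attribute so C can find the symbols:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Generates a string on the Rust side and hands ownership to the caller.
/// The caller must give the pointer back to `free_greeting` exactly once.
pub extern "C" fn make_greeting(id: u32) -> *mut c_char {
    let text = format!("hello #{id}");
    // A formatted integer cannot contain an interior NUL, so this unwrap
    // cannot fail for this particular string.
    CString::new(text).unwrap().into_raw()
}

/// Takes ownership back and frees the string.
///
/// # Safety
/// `p` must be null or a pointer returned by `make_greeting` that has not
/// been freed yet.
pub unsafe extern "C" fn free_greeting(p: *mut c_char) {
    if !p.is_null() {
        // SAFETY: per the contract above, `p` came from `CString::into_raw`.
        drop(unsafe { CString::from_raw(p) });
    }
}
```

The point of the pairing is that the memory is always freed by the same allocator that created it. Calling C’s free() on the pointer instead would not be correct.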


#5

So actually, it turns out fixing CStr is harder than I thought… but if anyone wants to experiment, it’s possible to approximate a fixed &CStr with something like the following:

use std::ffi::CString;
use std::marker::PhantomData;

// Pointer-sized stand-in for &'a CStr; the lifetime ties it to the owning CString.
#[repr(C)]
pub struct CStrRef<'a> {
    p: *const libc::c_char,
    phantom: PhantomData<&'a libc::c_char>,
}
pub fn ref_from_cstring(s: &CString) -> CStrRef<'_> {
    CStrRef { p: s.as_ptr(), phantom: PhantomData }
}
extern "C" {
    pub fn strlen(s: CStrRef) -> libc::size_t;
}
fn main() {
    let c = CString::new("asdf").unwrap();
    let r = ref_from_cstring(&c);
    println!("{}", unsafe { strlen(r) });
}

This pattern allows using FFI without directly screwing around with raw pointers.


#6

@eefriedman: The primary purpose of CStr is to be able to write efficient safe wrapper functions around FFI functions.

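use std::ffi::CStr;

mod ffi {
  extern "C" {
    // pub so the wrapper below can call it; strlen takes a const pointer.
    pub fn strlen(s: *const libc::c_char) -> libc::size_t;
  }
}
// Safe wrapper: a &CStr is always NUL-terminated, so the call is sound.
pub fn strlen(s: &CStr) -> libc::size_t {
  unsafe { ffi::strlen(s.as_ptr()) }
}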

Being able to directly use &CStr in FFI declarations would be nice. The idea was that CStr should be an unsized type:

struct CStr { data: libc::c_char }
impl !Sized for CStr {}

With an unsized type, &CStr would be a thin pointer that can be used directly in FFI, unlike a reference to the current DST CStr, which is a fat pointer.
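The thin versus fat distinction is easy to observe with std::mem::size_of. A quick sketch (the function names are made up for illustration). The two-word size of &CStr reflects the current representation only, and is exactly what would change if CStr became a truly unsized type:

```rust
use std::ffi::CStr;
use std::mem::size_of;
use std::os::raw::c_char;

/// Size in bytes of a `&CStr`. With the current DST representation this is
/// a fat pointer: data pointer plus length.
pub fn cstr_ref_size() -> usize {
    size_of::<&CStr>()
}

/// Size in bytes of the raw pointer a C function actually expects.
pub fn raw_ptr_size() -> usize {
    size_of::<*const c_char>()
}
```

As long as the two sizes differ, a &CStr in an extern declaration cannot line up with the single char pointer on the C side.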

However, Rust currently doesn’t support unsized types – it only has statically sized types (Sized) and dynamically sized types (DST). Since removing Sized from a type is a breaking change, CStr was made dynamically sized so that it can be turned unsized in the future.

Using a ...Ref<'a> instead of a real reference avoids the need for unsized types in the language, but it shouldn’t be added to the standard library as long as there’s a chance we’ll get real unsized types.