CStr::in_bytes method

For now, CStr only has one safe constructor: CStr::from_bytes_with_nul. In my experience, I generally receive C strings without knowing their length, so I have to find the terminating nul myself beforehand:

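let mut buffer = vec![0u8; 128];
create_cstr(&mut buffer);
// First scan: locate the terminating nul byte.
let nul = buffer.iter().position(|&b| b == 0).ok_or(())?;
// Second scan: from_bytes_with_nul re-validates the same bytes.
let cstr = CStr::from_bytes_with_nul(&buffer[..=nul]).map_err(|_| ())?;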

This means we're checking the contents of buffer twice (and the API sucks to call!)

CStr::in_bytes could implement this common pattern, and perform a single scan through the input:

// Faster, *and* easier to call :)
let cstr = CStr::in_bytes(&buffer).map_err(|_| ())?;

/// Finds a C string in a byte slice.
///
/// The string will contain the range from the start of the bytes to
/// the first nul byte.
/// If the slice does not contain any nul bytes, an error is
/// returned.
pub fn in_bytes(bytes: &[u8]) -> Result<&CStr, InBytesError>
// Or
pub fn in_bytes(bytes: &[u8]) -> Option<&CStr>
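To make the proposal concrete, here is a minimal sketch of the Option-returning variant, written as a free function because in_bytes does not exist on CStr. The function name and the choice to skip the second scan with from_bytes_with_nul_unchecked are my assumptions, not a settled design.

```rust
use std::ffi::CStr;

/// Hypothetical stand-in for the proposed `CStr::in_bytes` (not part of std).
/// Returns the C string that ends at the first nul byte of `bytes`,
/// or `None` if `bytes` contains no nul byte at all.
fn in_bytes(bytes: &[u8]) -> Option<&CStr> {
    // The only scan over the input: find the first nul byte.
    let nul = bytes.iter().position(|&b| b == 0)?;
    // SAFETY: `bytes[..=nul]` ends with a nul byte, and because `nul` is the
    // index of the *first* nul, the slice has no interior nul bytes.
    Some(unsafe { CStr::from_bytes_with_nul_unchecked(&bytes[..=nul]) })
}
```

With this, a zero-padded buffer such as b"hello\0\0\0" yields the string "hello", and whatever follows the first nul is ignored.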

My sense is that this could just go through our standard FCP process. At first blush at least, the idea sounds good to me.

This method could be viewed as a constructor accepting truncation w.r.t. the given byte slice, so an interesting approach would be to bundle this information within the FromBytesWithNulError, much like with Mutex un-poisoning:

    .or_else(|err| err.to_truncated_cstr())?
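A rough sketch of what that error-bundling approach might look like. Everything here is hypothetical: StrictError, strict_from_bytes and to_truncated_cstr are illustrative names, and the real FromBytesWithNulError does not carry a reference to the input, so this uses a separate error type to show the shape of the idea.

```rust
use std::ffi::CStr;

/// Hypothetical error type (not part of std) that keeps the truncated string.
#[derive(Debug)]
enum StrictError<'a> {
    /// A nul byte was found before the end of the input; holds the C string
    /// that ends at that first nul.
    Truncated(&'a CStr),
    /// The input contains no nul byte at all.
    NotTerminated,
}

impl<'a> StrictError<'a> {
    /// Opt in to truncation, in the spirit of `PoisonError::into_inner`.
    fn to_truncated_cstr(self) -> Result<&'a CStr, StrictError<'a>> {
        match self {
            StrictError::Truncated(s) => Ok(s),
            other => Err(other),
        }
    }
}

/// Strict constructor: succeeds only if the first nul is the last byte.
fn strict_from_bytes(bytes: &[u8]) -> Result<&CStr, StrictError<'_>> {
    let nul = bytes
        .iter()
        .position(|&b| b == 0)
        .ok_or(StrictError::NotTerminated)?;
    // SAFETY: the slice ends at the first nul byte, so it is nul-terminated
    // and has no interior nul bytes.
    let s = unsafe { CStr::from_bytes_with_nul_unchecked(&bytes[..=nul]) };
    if nul + 1 == bytes.len() {
        Ok(s)
    } else {
        Err(StrictError::Truncated(s))
    }
}
```

A caller who is happy with truncation would then write strict_from_bytes(&buffer).or_else(|err| err.to_truncated_cstr()), while a caller who is not simply propagates the error.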

This duality between the two methods goes the other way too: any call to from_bytes_with_nul that returns Ok is functionally equivalent to an in_bytes call. from_bytes_with_nul just adds an assertion that bytes.len() == s.len(), counting the nul terminator in s.
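The duality can be sketched by rebuilding the strict constructor on top of the scanning one. This assumes a toolchain where CStr::from_bytes_until_nul is available (Rust 1.69 or later), which has the scan-to-first-nul semantics described for in_bytes; strict_via_scan is a hypothetical helper name.

```rust
use std::ffi::CStr;

/// Hypothetical helper: emulates `CStr::from_bytes_with_nul` using the
/// scanning constructor plus one length check.
fn strict_via_scan(bytes: &[u8]) -> Option<&CStr> {
    // Scan up to the first nul byte (the `in_bytes` behaviour).
    let s = CStr::from_bytes_until_nul(bytes).ok()?;
    // The extra assertion: the string, nul included, must span the whole input.
    (s.to_bytes_with_nul().len() == bytes.len()).then_some(s)
}
```

If the reasoning above holds, strict_via_scan(bytes) and CStr::from_bytes_with_nul(bytes).ok() should agree for every input.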

I'd say the difference is that from_bytes_with_nul is appropriate for inputs to FFI from Rust code, and in_bytes is best for outputs from C code (which treats the position of the nul as an output!). In this sense, I think having separate methods is actually the most ergonomic. Maybe we could document the duality on CStr.

I'll have a look at the FCP process ^^ hopefully I can write it up fairly easily
