When converting from Rust strings to C strings, we have to ensure that there are no interior nulls, which cannot be represented in C strings.
Currently, we do this via the CString type in methods like from_slice, which panic if an interior null is found.
Since interior nulls are relatively rare, that means there are rare panics lurking in anything that winds up calling system APIs, such as IO.
For things like IO, however, we could easily propagate such errors using the InvalidInput variant for std::io errors. That would have essentially zero ergonomic impact on users of IO and similar APIs, but might offer somewhat more graceful handling of the situation.
There are a couple of questions to ask:
Is there a reasonable way for clients to handle InvalidInput errors other than simply panicking themselves? That is, would this change actually benefit code in practice?
Would the ergonomics of CString::from_slice take too much of a hit for code that uses it directly?
Thinking a bit more, it’s probably possible to change the behavior within std from panicking to producing an error variant in the future (this is broadly considered a backwards-compatible change). So this may just be a question about the right “default” behavior for CString.
I favor a domain specific error with a FromError impl for converting into std::io::Error.
A NulError can provide programmatic access to the position of the first NUL encountered.
CString::from_vec also needs a move-friendly Err value with the original vector.
@aturon, I don’t see how can CString::from_slice be changed to return Result without breaking backward compatibility?
Are there cases where one would get NUL in a C string, other than data corruption or an attack?
I’ve encountered the problem of internal NULs vs C strings only in PHP filesystem APIs, but there it’s pretty much guaranteed to be an attack attempt — http://cwe.mitre.org/data/definitions/626.html
Even if Rust’s stdlib handled such strings safely and gracefully, my app probably wouldn’t, so to me a panic! in such case seems fine.
The hazard here is that NUL is so rare in real-world string data that it is easy to get overlooked during development, unless the developers consciously test invalid inputs.
Imagine you’ve got this library interface to work with:
fn set_text(text: &str) {
let c_text = CString::from_slice(text); // panic lurks here
unsafe { ffi::set_text(c_text.as_ptr()) };
}
Unless you look at the code, you never know that the input string has some additional constraint, which is never enforced in pure Rust code because it is only imposed by C libraries. The creator of that code might not even have been aware of the problem: the from_slice call compiles, and it works with the “normal” strings they tested it with. A Result would provide enough of a speed bump to consider the appropriate failure mode, or at least there is an unwrap() in the code to indicate the potential panic.
As I understand the prevailing convention, panics happen only as a result of programmer errors. The way to inform the client about a potential failure when working with external data is is to return a Result.
I’ve actually had to re-implement the null checking part of from_slice and then call from_slice_unchecked to prevent this panic so I’d really appreciate this change.