work in this way so that it can be reliably passed through FFI to an equivalent C struct. I am aware you can just turn the string into a C string, then map it in C, but I would rather avoid null-terminated strings.
Is this thought silly? Does it feel pointless? Would it be a nightmare to implement? Would it also be worth doing for array slices? Would array slices come for free?
IIRC strings in Rust are UTF-8 and the char type is a 4-byte Unicode scalar value rather than a byte, whereas the encoding of C strings is not specified and a C char is always a byte. There's a lot more that str does under the hood than provide a length-specified string.
You can already get a *const u8 out of an &str, which is an FFI-compatible type, so I don’t think the hypothetical freedom of implementation for C’s char is really an obstacle here. In C23 there’s a named type char8_t for “UTF-8 code unit”, but it’s just defined as a standard alias for unsigned char. That would be close enough if we wanted to make &str FFI-compatible, as only a slight step beyond making &[u8] FFI-compatible. (I agree that plain char, which might be signed, doesn’t quite match up, however.)
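For illustration, here is a minimal sketch of what can already be done by hand today. The `StrView` name and its field names are made up for this example and are not an existing std or libc type; the matching C declaration would be something like `struct StrView { const uint8_t *ptr; size_t len; };`.

```rust
/// Hypothetical FFI-friendly view of a `&str`: a pointer to UTF-8 code units
/// plus a length in bytes. No NUL terminator is involved.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct StrView {
    pub ptr: *const u8,
    pub len: usize,
}

impl StrView {
    /// Borrow the bytes of `s`. The view must not outlive `s`.
    pub fn from_str(s: &str) -> StrView {
        StrView { ptr: s.as_ptr(), len: s.len() }
    }

    /// Rebuild a `&str`, validating UTF-8 because C code may have produced the bytes.
    ///
    /// # Safety
    /// `ptr` must point to `len` readable bytes that stay alive and unmodified for `'a`.
    pub unsafe fn to_str<'a>(self) -> Result<&'a str, std::str::Utf8Error> {
        let bytes = unsafe { std::slice::from_raw_parts(self.ptr, self.len) };
        std::str::from_utf8(bytes)
    }
}
```

Note that `len` counts bytes, not characters, so `StrView::from_str("héllo").len` is 6.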
The standard library can generally assume this kind of stuff because it's shipped together with the compiler, so if the compiler ever changed the representation the stdlib would be updated at the same time.
I'm aware, but ironically enough the locations where I've encountered it have no local checks to ensure the layout is correct. I can't imagine it wouldn't fail CI, but that's not the best-case scenario, where it would merely fail to compile.
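A local check of the kind I mean could look roughly like the sketch below. `RawStr` is a hypothetical mirror struct, and the assertions only cover size and alignment. They say nothing about which word holds the pointer and which holds the length, since Rust does not guarantee that for `&str`.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical `#[repr(C)]` mirror that some code assumes `&str` looks like.
#[repr(C)]
pub struct RawStr {
    pub ptr: *const u8,
    pub len: usize,
}

// Evaluated at compile time: if the compiler ever changed the size or alignment
// of `&str`, the build would fail here instead of misbehaving at run time.
const _: () = assert!(size_of::<&str>() == size_of::<RawStr>());
const _: () = assert!(align_of::<&str>() == align_of::<RawStr>());
```

This at least turns a silent layout change into a compile error, though it cannot catch the two fields being swapped.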
I don't think Rust should make str FFI safe, because it would be a footgun for FFI novices, making them think that just using str in an extern "C" function would magically convert it into/from a nul-terminated string, instead of realizing they would need that struct on the C side.
I guess it may be fine to have str as FFI safe but with a lint that explains this possible pitfall enabled by default.
AFAIK Rust generally avoids lints that are basically "yes I have read the rules" lints, as opposed to ones that flag actual problems in your code and apply no matter your experience level.
The general idea is that you shouldn't need to allow any lints every time you use some feature (unless using the feature is basically always a bug, like std::mem::uninitialized() is).
Isn't this subject to ABI rules? For example, a &str argument may be passed in a wide register, as a pointer, two registers, or passed on the stack depending on things like argument position and such which may be different than a bare sequence of int and const char* arguments.
Also, for the specific example given, %.*s requires the precision be an int, so truncation may occur.
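To make the truncation concern concrete, here is a small sketch of the check a caller would need before handing a length to `%.*s`. The helper name `printf_precision` is made up for this example.

```rust
use std::os::raw::c_int;

/// Returns the byte length as a C `int` suitable for the `*` in `%.*s`,
/// or `None` if it does not fit and would otherwise be truncated.
pub fn printf_precision(len: usize) -> Option<c_int> {
    if len <= c_int::MAX as usize {
        Some(len as c_int)
    } else {
        None
    }
}
```

A caller would pass `s.len()` and decide what to do on `None` (split the output, or fall back to something like `fwrite`), rather than casting with `as` and hoping the string is short.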
Part of this effort would be deciding on a lowering for &str specifically in extern "C" functions*, and not just embedded in structs, without necessarily affecting the representation in Rust-ABI functions. If we wanted to say “this is always passed the same way as struct { char8_t *start; size_t length; } would be”, we could do that (and by “we” I mean “the FFI working group”).
* and C-unwind, and any other ABI expected to match up with some kind of C or C++ header declaration.
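As a rough sketch of what that lowering is equivalent to today, you can spell the struct out yourself and pass it by value, and the C ABI rules for a two-word struct then apply. `Utf8Slice` and `count_spaces` are hypothetical names for this example. Exporting the function to C would additionally need a no-mangle attribute, which is left out here.

```rust
/// Hand-written stand-in for `struct { char8_t *start; size_t length; }`.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Utf8Slice {
    pub start: *const u8,
    pub length: usize,
}

/// C prototype would be: `size_t count_spaces(struct Utf8Slice s);`
///
/// # Safety
/// `s.start` must be null or point to `s.length` readable bytes.
pub unsafe extern "C" fn count_spaces(s: Utf8Slice) -> usize {
    if s.start.is_null() || s.length == 0 {
        return 0;
    }
    // The argument arrives however the platform's C ABI passes this struct.
    let bytes = unsafe { std::slice::from_raw_parts(s.start, s.length) };
    bytes.iter().filter(|&&b| b == b' ').count()
}
```

The proposal would let the signature say `s: &str` instead, with the compiler doing this lowering only for C-compatible ABIs.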