TL;DR: add std::marker::PhantomUnsized, an "unsized ZST" marker type.
This allows for custom thin pointers to data. For example:
struct CStr {
    raw: c_char,
    unsize: PhantomUnsized,
} // and CStr things

struct Utf8Codepoint {
    raw: u8,
    unsize: PhantomUnsized,
}

impl Deref for Utf8Codepoint {
    type Target = str;
    fn deref(&self) -> &str {
        // Byte length of the encoded scalar value, derived from its leading byte.
        let len = utf8_length(self.raw);
        // SAFETY: relies on the type's invariant that `raw` is the first of
        // `len` bytes of valid UTF-8.
        unsafe {
            let slice = slice::from_raw_parts(&self.raw, len);
            str::from_utf8_unchecked(slice)
        }
    }
}

impl Utf8Codepoint {
    fn as_char(&self) -> char {
        self.chars()
            .next()
            .unwrap_or_else(|| unsafe { debug_unreachable!() })
    }
}
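For comparison, here is a rough sketch of how the same thin-pointer idea can be approximated on stable Rust today, without PhantomUnsized. Everything in it is an assumption made for illustration: CodepointRef is a hypothetical wrapper standing in for &Utf8Codepoint, and utf8_length is a guess at what the helper above does. It carries a raw pointer instead of a reference so that the pointer keeps access to all bytes of the encoding rather than just the first.

```rust
use std::marker::PhantomData;
use std::{slice, str};

/// Byte length of a UTF-8 encoded scalar value, given its leading byte.
/// Assumes `first` really is a leading byte (never a continuation byte).
fn utf8_length(first: u8) -> usize {
    match first {
        b if b < 0x80 => 1,
        b if b < 0xE0 => 2,
        b if b < 0xF0 => 3,
        _ => 4,
    }
}

/// Hypothetical stand-in for `&'a Utf8Codepoint`: one machine word, no length.
#[derive(Clone, Copy)]
pub struct CodepointRef<'a> {
    first: *const u8,
    _borrow: PhantomData<&'a str>,
}

impl<'a> CodepointRef<'a> {
    /// Points at the first scalar value of `s`; `None` if `s` is empty.
    pub fn first_of(s: &'a str) -> Option<Self> {
        if s.is_empty() {
            return None;
        }
        // `as_ptr` is derived from the whole string, not from a single byte.
        Some(CodepointRef { first: s.as_ptr(), _borrow: PhantomData })
    }

    /// Recovers the length from the data itself and rebuilds a `&str`.
    pub fn as_str(self) -> &'a str {
        // SAFETY: `first` points at a char boundary inside a live `&'a str`,
        // so the leading byte and the bytes it announces are valid UTF-8.
        unsafe {
            let len = utf8_length(*self.first);
            str::from_utf8_unchecked(slice::from_raw_parts(self.first, len))
        }
    }

    pub fn as_char(self) -> char {
        // `as_str` is never empty, so `next` always yields a value.
        self.as_str().chars().next().unwrap()
    }
}
```

The difference from the proposal is that this is a separate pointer type rather than a reference to an unsized type, so it does not compose with Deref or with generic code written against &T.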
How is this different from extern type?
I'm not really certain. Mostly, I see it as the difference between extern type being the void in void* (i.e. "something I know nothing about") and PhantomUnsized being for things like turning *const [T] into *const (T, PhantomUnsized) where it is more cleanly "*const T but unsized". Also, PhantomUnsized is a smaller change that could be pushed through quickly (in theory).
[I currently have a struct Character { raw: str } in windex, but have been considering if making &Character a thin pointer would be better. Probably not, thinking about it after writing this, but PhantomUnsized is still an interesting minimal addition.]
I think it would be better if Custom DSTs were introduced into the language, because this can only handle the case where there is no metadata (thin pointers). But this may have its uses as a short-term solution.
As far as fat pointers go, I really liked the idea of supporting const generic erasure.
e.g. you have &MatrixSlice<4, 4> for a 4x4 matrix, and &MatrixSlice<dyn, dyn> for a matrix where the dimension data is "hoisted" from static to the fat pointer metadata. (So [T] is [T; dyn].)
Ignoring syntax issues for now (when I saw this suggestion, it was brought up that e.g. &&[T; dyn] is ambiguous as to which reference should be fat), are there DST use cases that wouldn't be covered by one of these?
I think extern type, PhantomUnsized, and "dyn const" cover the three types of custom DST (unknowable, inline/thin, and fat, respectively). And each of these is (in theory) fairly simple in comparison to full-fledged custom DSTs.
I wouldn't call dyn const simple; its semantics can be confusing and rather arbitrary. I would not like to see that even if we never get Custom DSTs in any other form.
So this type is unsized but it also has size "at least 1" (similar to e.g. (c_char, [c_char])). Does that mean, if we apply our rules for references being dereferenceable etc., that an &CStr must point to at least 1 byte of valid memory? That the compiler is allowed to insert spurious loads of that byte? That it is UB to mutate that byte because it is pointed to by a shared reference?
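The "[T] is [T; dyn]" analogy can already be observed with arrays and slices in today's Rust, which is the one place where a length is erased from the type into the pointer. A small sketch (the function names are made up for illustration):

```rust
use std::mem::size_of;

/// Sums through a thin reference; the length 4 is part of the type.
pub fn sum_fixed(xs: &[i32; 4]) -> i32 {
    xs.iter().sum()
}

/// Sums through a fat reference; the length travels in the pointer metadata.
pub fn sum_erased(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

/// Sizes, in machine words, of the thin and the fat reference.
pub fn pointer_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<&[i32; 4]>() / word, size_of::<&[i32]>() / word)
}
```

Passing a &[i32; 4] to sum_erased works through the built-in unsizing coercion; the hypothetical &MatrixSlice<dyn, dyn> would presumably need the same kind of coercion, with two lengths in the metadata instead of one.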
I think all of that is the right semantics. (It's looser than the current fat pointer CStr as well.)
And now that I'm thinking about it, ([T; 0], PhantomUnsized) would be an interesting translation for VLA. It basically says the same thing as the VLA "trick" in C: align this to T, and the struct may contain data after its "main" sized part that you have to handle unsafely.
For the most part, I think the obvious semantics of "make this ?Sized, use unsafe to track whatever data is in the unsized portion" works correctly, which is why I think this is a fairly minimal addition.
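To make the ([T; 0], PhantomUnsized) comparison concrete, this is roughly how the C-style trailing-array trick tends to be written in stable Rust today: a sized header ending in a zero-length array, with the tail reached only through raw pointers. Header and Packet are hypothetical names, and the sketch deliberately never forms a &Header, since whether a reference to the sized prefix may be used to reach the tail is exactly the question raised above.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;

/// Sized prefix of the allocation; `tail` only contributes alignment.
#[repr(C)]
struct Header {
    len: usize,
    tail: [u16; 0],
}

/// Thin owning pointer to a `Header` followed by `len` trailing `u16`s.
pub struct Packet {
    ptr: *mut Header,
}

impl Packet {
    /// Layout of the header plus `len` trailing elements.
    fn layout(len: usize) -> Layout {
        let (layout, _tail_offset) = Layout::new::<Header>()
            .extend(Layout::array::<u16>(len).unwrap())
            .unwrap();
        layout.pad_to_align()
    }

    pub fn new(items: &[u16]) -> Packet {
        let layout = Self::layout(items.len());
        // SAFETY: the layout is never zero-sized (the header holds a usize),
        // and all writes stay inside the fresh allocation.
        unsafe {
            let ptr = alloc(layout) as *mut Header;
            if ptr.is_null() {
                handle_alloc_error(layout);
            }
            ptr::addr_of_mut!((*ptr).len).write(items.len());
            let tail = ptr::addr_of_mut!((*ptr).tail) as *mut u16;
            ptr::copy_nonoverlapping(items.as_ptr(), tail, items.len());
            Packet { ptr }
        }
    }

    pub fn items(&self) -> &[u16] {
        // SAFETY: `new` initialised `len` elements directly after the header.
        unsafe {
            let len = (*self.ptr).len;
            let tail = ptr::addr_of!((*self.ptr).tail) as *const u16;
            std::slice::from_raw_parts(tail, len)
        }
    }
}

impl Drop for Packet {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `alloc` with this same layout.
        unsafe {
            let len = (*self.ptr).len;
            dealloc(self.ptr as *mut u8, Self::layout(len));
        }
    }
}
```

With PhantomUnsized the header type itself would be ?Sized, so it could not be moved or copied by value without its tail, which is something the [u16; 0] version does not prevent.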