pub struct ArrayStr<const N: usize> {
    bytes: [u8; N],
}

impl<const N: usize> core::ops::Deref for ArrayStr<N> {
    type Target = str;
    fn deref(&self) -> &Self::Target {
        // SAFETY: bytes is private and only ever filled from a valid &str (see try_from).
        unsafe { core::str::from_utf8_unchecked(&self.bytes) }
    }
}

impl<const N: usize> core::ops::DerefMut for ArrayStr<N> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        // SAFETY: same invariant as in deref; &mut str cannot be made invalid in safe code.
        unsafe { core::str::from_utf8_unchecked_mut(&mut self.bytes) }
    }
}

impl<const N: usize> TryFrom<&str> for ArrayStr<N> {
    type Error = core::array::TryFromSliceError;
    fn try_from(value: &str) -> Result<Self, Self::Error> {
        // Fails unless value is exactly N bytes long (bytes, not characters).
        Ok(ArrayStr { bytes: value.as_bytes().try_into()? })
    }
}
Rust strs are encoded as UTF-8, so one "character" (Unicode Scalar Value) can span a variable number of bytes. This would greatly limit the usefulness of ArrayStr. But what use-case do you have in mind?
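To make the variable-width point concrete, here is a minimal sketch using only standard library calls. The helper names (byte_and_char_counts, utf8_width) are made up for illustration and are not part of any proposal.

```rust
/// Hypothetical helper: returns (length in bytes, number of Unicode scalar values).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    // str::len counts bytes; chars() iterates over Unicode scalar values.
    (s.len(), s.chars().count())
}

/// Hypothetical helper: how many bytes one scalar value needs in UTF-8 (1 to 4).
fn utf8_width(c: char) -> usize {
    c.len_utf8()
}
```

So an ArrayStr<5> could hold "hello" but not "h\u{e9}llo" (five characters, six bytes), which is the limitation described above.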
There are several implementations of this in third-party crates: heapless::String, for example.
Bringing it into core/std has many repercussions, including: does it really make sense to use a specialized data structure instead of String? What if you could use a byte array as the backing storage for a String instead, since String is generic around an Allocator?
This is where things like a proposed storages API come into play: Pre-RFC: Storage API
It would sure be nice for whatever solution emerges to work with core and be able to replace things like heapless::String.
There is a hole, it seems, since [T] is to [T; N] as str is to what? However, most of the time I work with small arrays I want exactly N elements, but with short strings I typically want the length to vary. There are lots of useful functions that could take an array of three floats as an argument. What are some useful functions that would take as arguments Unicode strings that required exactly 3 bytes to store?
An observation: it would be possible to make use of a pure (no extra length value) ArrayStr as seen in the original post, by leaving it to the application to either have a compile-time-known length or to keep the length separate and fill the unused space with any ASCII value of its choice (such as '\0' or ' ' or '.', depending on the nature of the application).
For example, UIs often have uses for truncated strings (fitting as much text as possible into a fixed label area), and it's always possible to truncate a UTF-8 string to any maximum byte length; you just won't then end up necessarily filling the buffer (which can be handled by appending non-visible filler characters). Truncating by byte length is certainly not the same thing as truncating by visually rendered length, but, if you are in a situation where you care about avoiding heap allocations — or avoiding DoS attacks by arbitrary length strings — you are likely willing to take the caveat that the string might be truncated to shorter than the available visual space (and you can minimize that by choosing a larger maximum byte length than would be necessary when the string is pure printable-ASCII).
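A rough sketch of that truncate-then-pad idea, assuming a free function rather than any particular ArrayStr API. The name truncate_and_pad and its signature are hypothetical; only str::is_char_boundary and slice copying from the standard library are used.

```rust
/// Hypothetical helper: copy as much of `s` as fits into N bytes without
/// splitting a UTF-8 sequence, then fill the remainder with an ASCII filler.
fn truncate_and_pad<const N: usize>(s: &str, filler: u8) -> [u8; N] {
    // The filler must be ASCII so the padded buffer stays valid UTF-8.
    assert!(filler.is_ascii(), "filler must be an ASCII byte");

    // Start from the largest byte length that could fit, then back off
    // to the nearest char boundary (index 0 is always a boundary).
    let mut end = s.len().min(N);
    while !s.is_char_boundary(end) {
        end -= 1;
    }

    let mut buf = [filler; N];
    buf[..end].copy_from_slice(&s.as_bytes()[..end]);
    buf
}
```

With N = 4 and an input of three 3-byte characters, only the first character fits and the fourth byte becomes filler, which is the "truncated to shorter than the available space" caveat mentioned above.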
I'm not proposing any specific change to std; I just wanted to point out that “UTF-8 is variable length” does not logically imply that “array-strings are useless”.
I've previously wanted to see such a type in core that impls Unsize<str>, since one can't manually impl Unsize on such a type in a downstream library. But unsize is an unstable feature (for now), so it remains a minor nit.
I have read this discussion and would like to share my opinion about using ArrayStr to represent str as an array in the standard Rust library. Personally, I see some potential advantages and scenarios for using this type. First, using ArrayStr could be useful when we need to work with fixed length strings, especially in the context of working with protocols or data formats that require an exact string length. This would allow us to have static guarantees about string length at compile time, which contributes to security and prevents runtime errors. In addition, representing str as an array can improve performance in some cases, especially when dealing with a large number of short strings. This is due to reduced memory management overhead and reduced fragmentation. However, I also understand that adding ArrayStr to the standard library may cause some difficulties and require additional analysis and testing. It is important to keep in mind that adding a new type may entail increasing the size of the standard library and increasing the complexity of support and development. In general, I believe that the proposal to use ArrayStr to represent str as an array has its advantages and is worth discussion and research. It would be interesting to hear the opinions of others on this.
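As an illustration of the fixed-length protocol field scenario, here is a self-contained sketch based on the ArrayStr from the original post. The CurrencyCode alias, the parse_currency function and the as_str method are assumptions added for the example, not part of the original code.

```rust
use core::convert::{TryFrom, TryInto};

pub struct ArrayStr<const N: usize> {
    bytes: [u8; N],
}

impl<const N: usize> ArrayStr<N> {
    /// Assumed accessor for this sketch (the original post uses Deref instead).
    pub fn as_str(&self) -> &str {
        // SAFETY: `bytes` is only ever copied from a valid `&str` in `try_from`.
        unsafe { core::str::from_utf8_unchecked(&self.bytes) }
    }
}

impl<const N: usize> TryFrom<&str> for ArrayStr<N> {
    type Error = core::array::TryFromSliceError;
    fn try_from(value: &str) -> Result<Self, Self::Error> {
        // Succeeds only when `value` is exactly N bytes long.
        Ok(ArrayStr { bytes: value.as_bytes().try_into()? })
    }
}

/// Hypothetical field type: a code that must occupy exactly three bytes.
pub type CurrencyCode = ArrayStr<3>;

/// Hypothetical parser: returns None when the input is not exactly 3 bytes.
pub fn parse_currency(s: &str) -> Option<CurrencyCode> {
    CurrencyCode::try_from(s).ok()
}
```

Note that the length check is in bytes: a single 3-byte character such as "\u{20ac}" is accepted just like "USD", so any protocol-level rule such as "ASCII letters only" would still need its own validation.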
It's weird that b"hello" is a reference to a sized type, but "hello" is not.
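The asymmetry can be seen directly in the types. This is a small sketch with made-up function names; the only assumption is the usual layout where a reference to an unsized type carries a pointer plus a length.

```rust
use core::mem::size_of;

/// Hypothetical helper: the two literal forms with their types spelled out.
fn literal_types() -> (&'static [u8; 5], &'static str) {
    // A byte string literal is a reference to a fixed-size array...
    let bytes: &'static [u8; 5] = b"hello";
    // ...while a string literal is a reference to the unsized type str.
    let text: &'static str = "hello";
    (bytes, text)
}

/// Hypothetical helper: sizes of the two reference types, in bytes.
fn pointer_widths() -> (usize, usize) {
    // &[u8; 5] is a thin pointer; &str is a fat pointer (address + length).
    (size_of::<&[u8; 5]>(), size_of::<&str>())
}
```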
Indeed. It wasn't that way before Rust 1.0, but this RFC changed it: 0339-statically-sized-literals - The Rust RFC Book
See also: B in bstr - Rust
With the B macro, could you write &*b"ab" to get the same effect?
I don't understand your question. let xs = vec![&*b"a", &*b"ab"]; does not compile.
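The vec! above fails because &*b"a" is still a &[u8; 1] and &*b"ab" is a &[u8; 2], so the elements have different types. A sketch of two ways to get a common &[u8] element type follows; to_slice is a local stand-in written for this example (it is not the bstr function itself), and mixed_lengths is likewise hypothetical.

```rust
/// Local stand-in: turn anything viewable as bytes into a plain byte slice.
fn to_slice<T: ?Sized + AsRef<[u8]>>(bytes: &T) -> &[u8] {
    bytes.as_ref()
}

/// Hypothetical helper: a Vec holding byte strings of different lengths.
fn mixed_lengths() -> Vec<&'static [u8]> {
    vec![
        // Explicit full-range slicing erases the array length from the type.
        &b"a"[..],
        // Going through a function that returns &[u8] does the same.
        to_slice(b"ab"),
        // str also implements AsRef<[u8]>, so string literals work too.
        to_slice("abc"),
    ]
}
```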
Also, bstr::B isn't a macro. It's a function.
I misspoke but I see the answer is no.