@newpavlov String's size with and without the SSO can be the same, so that in both cases the same amount of work is done when the whole struct is moved via a memcpy, e.g.:
union String {
heap: (usize, usize, *mut u8), // (size, capacity, ptr)
sso: [u8; size_of::<usize>() * 3], // [sso buffer..., sso size, sso bit]
}
where one:
- uses 1 bit to signal whether the SSO is active (e.g. the highest pointer bit),
- some bits to store the size (either the second highest pointer byte, or one crams this into the highest pointer byte along with the SSO bit),
- the rest of the bits for the SSO buffer.
On 64-bit targets the struct is 24 bytes, so if you use 1 byte for the SSO signaling bit and 1 byte for the size, you get a 22-byte buffer. If you store the size and the SSO signaling bit in the same byte, you get a 23-byte buffer, but you need to mask out the signaling bit every time you want to extract the size.
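A minimal sketch of the two options, to make the arithmetic concrete. The names (`TOTAL`, `CAP_SEPARATE`, `CAP_SHARED`, `SSO_FLAG`, `len_separate`, `len_shared`) are made up for illustration, and putting the flag in the top bit of the last byte is just one possible choice:

```rust
use std::mem::size_of;

/// Total object size: three pointer-sized words (24 bytes on 64-bit targets).
const TOTAL: usize = size_of::<usize>() * 3;
/// Option A: one byte for the SSO flag, one byte for the size.
const CAP_SEPARATE: usize = TOTAL - 2;
/// Option B: the SSO flag and the size share the last byte.
const CAP_SHARED: usize = TOTAL - 1;
/// Hypothetical flag position: top bit of the last byte.
const SSO_FLAG: u8 = 0x80;

/// Option A: the size has a byte of its own, so no masking is needed.
fn len_separate(bytes: &[u8; TOTAL]) -> usize {
    bytes[TOTAL - 2] as usize
}

/// Option B: the flag bit has to be masked out on every size read.
fn len_shared(bytes: &[u8; TOTAL]) -> usize {
    (bytes[TOTAL - 1] & !SSO_FLAG) as usize
}

fn main() {
    // Option A: flag in the last byte, size in the byte before it.
    let mut a = [0u8; TOTAL];
    a[TOTAL - 1] = SSO_FLAG;
    a[TOTAL - 2] = 5;

    // Option B: flag and size packed into the last byte.
    let mut b = [0u8; TOTAL];
    b[TOTAL - 1] = SSO_FLAG | 5;

    assert_eq!(len_separate(&a), 5);
    assert_eq!(len_shared(&b), 5);
    println!("total={} separate={} shared={}", TOTAL, CAP_SEPARATE, CAP_SHARED);
}
```

The extra buffer byte in option B costs one `&` on each size read.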
With move constructors, moving the String could be less work when SSO is active than when the buffer is heap allocated (branch to detect SSO, branch on the small string length, then move only the relevant part of the small string buffer). Without move constructors, moving the String when SSO is active is never more work than when the string is heap allocated, unless one makes the String larger than 3 pointers to get a larger small buffer (AFAIK most C++ string implementations don’t do this).
It is unlikely that all the logic for deciding what to move and how to move it is more performant than just memcpying 3 pointers. This only becomes relevant if you use small strings with larger buffers, but the “beauty” of C++ SSO is that one can perform it without increasing the actual size of the std::string object.
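To illustrate the size argument, here is a rough sketch of such a union in Rust. All names (`SsoString`, `Heap`, `Inline`, `from_small`, `as_small`) are hypothetical, only the inline case is implemented, and the overlap of the tag byte with the highest pointer byte assumes a little-endian target:

```rust
use std::mem::size_of;

/// Inline capacity when the flag and the size share the last byte.
const INLINE_CAP: usize = size_of::<usize>() * 3 - 1;
/// Hypothetical flag: top bit of the last byte.
const SSO_FLAG: u8 = 0x80;

/// Heap representation: (size, capacity, ptr). Unused in this sketch.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C)]
struct Heap {
    len: usize,
    cap: usize,
    ptr: *mut u8,
}

/// Inline representation: [sso buffer..., size + SSO bit].
#[derive(Clone, Copy)]
#[repr(C)]
struct Inline {
    buf: [u8; INLINE_CAP],
    tag: u8,
}

#[allow(dead_code)]
#[repr(C)]
union SsoString {
    heap: Heap,
    inline: Inline,
}

impl SsoString {
    /// Stores `s` inline, or returns `None` if it does not fit.
    fn from_small(s: &str) -> Option<SsoString> {
        if s.len() > INLINE_CAP {
            return None;
        }
        let mut buf = [0u8; INLINE_CAP];
        buf[..s.len()].copy_from_slice(s.as_bytes());
        let tag = SSO_FLAG | s.len() as u8;
        Some(SsoString { inline: Inline { buf, tag } })
    }

    /// Returns the inline contents if the SSO flag is set.
    fn as_small(&self) -> Option<&str> {
        // SAFETY: this sketch only ever constructs the `inline` variant,
        // so every byte read here has been initialized.
        let inline = unsafe { &self.inline };
        if inline.tag & SSO_FLAG == 0 {
            return None;
        }
        let len = (inline.tag & !SSO_FLAG) as usize;
        std::str::from_utf8(&inline.buf[..len]).ok()
    }
}

fn main() {
    // Same size as three pointers, with or without SSO.
    assert_eq!(size_of::<SsoString>(), size_of::<usize>() * 3);

    let a = SsoString::from_small("hello").unwrap();
    let b = a; // a move: a plain bitwise copy of the whole union
    assert_eq!(b.as_small(), Some("hello"));
}
```

Since the move is a bitwise copy of the same three words in both representations, it does not need to branch on the SSO flag at all.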