Short String optimization

scottmcm · September 24, 2018, 8:46pm

I'm not sure that's true. I also wouldn't want to be in a situation where people are writing their own strings around Vec<u8> because they can't trust that the standard one is no-fail in the places they care about.

withoutboats · September 24, 2018, 9:02pm

My recollection was that we decided we would never do this because we wanted String and Vec to avoid complex optimizations that make their semantics more difficult to understand (preferring to keep them “canonical implementations”). Apparently this decision never translated into documentation guarantees.

mcy · September 24, 2018, 9:30pm

Can you elaborate? I imagine an SSO String would have an explicit way to turn a &str or whatever into something on the heap. If they really, really care, a user could do this:

struct HeapString(String);
impl HeapString {
  fn new(str: &str) -> Self {
    let string = String::in_heap(str);
    Self(string)
  }
  fn into_inner(self) -> String { .. }
  unsafe fn new_unchecked(str: String) -> Self { .. }
}
impl Deref for HeapString {
  type Target = String; // or str or whatever
  // ..
}

I imagine that most branches involving is_heap would predict true anyway...

Sure, I can imagine wanting that. I think Rust tends to treat String a lot more like e.g. Java StringBuffer than std::string anyways. I wonder if it's possible to measure whether this would even be a notable improvement...

notriddle · September 25, 2018, 12:51am

Then all of the String functions would have to check whether the String is heap-allocated or inlined, despite the HeapString guaranteeing that the branch only goes one way. It's an anti-pattern called an abstraction inversion.
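To make the objection concrete, here is a minimal sketch of what such a design might look like. Everything in it is hypothetical (`SsoString`, `INLINE_CAP`, the enum layout, and this version of `HeapString` are illustrative names, not a proposed std API). The point is only that `as_str` has to match on the representation even when the wrapper has already promised heap storage.

```rust
/// Hypothetical inline capacity; 22 is just an illustrative number.
const INLINE_CAP: usize = 22;

/// Hypothetical SSO string: short contents live inline, longer ones on the heap.
enum SsoString {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(String),
}

impl SsoString {
    fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SsoString::Inline { len: s.len() as u8, buf }
        } else {
            SsoString::Heap(s.to_owned())
        }
    }

    fn is_heap(&self) -> bool {
        matches!(self, SsoString::Heap(_))
    }

    /// Every accessor has to branch on the representation.
    fn as_str(&self) -> &str {
        match self {
            SsoString::Inline { len, buf } => {
                // The bytes were copied whole from a valid &str, so this cannot fail.
                std::str::from_utf8(&buf[..*len as usize]).expect("inline bytes are valid UTF-8")
            }
            SsoString::Heap(s) => s.as_str(),
        }
    }
}

/// Wrapper that promises heap storage. The promise is invisible to
/// `SsoString::as_str`, which still performs the match above.
struct HeapString(SsoString);

impl HeapString {
    fn new(s: &str) -> Self {
        HeapString(SsoString::Heap(s.to_owned()))
    }

    fn as_str(&self) -> &str {
        self.0.as_str()
    }
}
```

Calling `HeapString::new("hi").as_str()` goes through the same `match` as any other `SsoString`, which is the inversion being described: the guarantee lives in the wrapper, but the check lives in the wrapped type.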

pepp · September 25, 2018, 8:29am

You can turn this optimization on case by case by using SmallString, which is based on SmallVec. I would argue that this is better than turning SSO on for everything. In many cases SSO is a pessimization. It is in many ways similar to choosing between HashMap and BTreeMap.

The only argument that would have any solid ground here is a measurement showing that SSO would improve the performance of the majority of existing Rust code.

pepp · September 25, 2018, 8:33am

Probably three years too late. I joined the discussion then myself arguing that it is safer and more future-proof to not define String as a wrapper around Vec and leave it as an implementation detail.

newpavlov · September 25, 2018, 9:02am

I think that without custom move constructors it will be a performance regression, not only because of the additional branching, but also because String moves will copy the whole struct, including the often unused [u8; N]. Plus, I don’t think that SSO will result in more cache-friendly code for most Rust codebases, because thanks to the borrow checker it’s idiomatic in Rust to pass around &str instead of pervasively cloning substrings as you would in C++.
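The size concern can be checked with `size_of`. The sketch below assumes a hypothetical `BigInlineString<N>` with an inline buffer of `N` bytes (the type and helper names are made up for illustration). Since a Rust move is a bitwise copy of the whole value, the struct size is what every move pays for, regardless of how much of the buffer is in use.

```rust
use std::mem::size_of;

/// Hypothetical string with an `N`-byte inline buffer (illustrative only).
#[allow(dead_code)]
enum BigInlineString<const N: usize> {
    Inline { len: u8, buf: [u8; N] },
    Heap(String),
}

/// Size of the standard String. Three machine words on current
/// implementations, which is an implementation detail, not a documented guarantee.
fn string_size() -> usize {
    size_of::<String>()
}

/// Size of the hypothetical inline string, i.e. the bytes copied on each move.
fn inline_size<const N: usize>() -> usize {
    size_of::<BigInlineString<N>>()
}
```

With `N = 64`, `inline_size::<64>()` has to be at least 65 bytes (buffer plus length), compared with three words for `String`, so every move copies noticeably more even when the string is short or lives on the heap.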

mcy · September 25, 2018, 2:43pm

Yeah, there's no particularly good way to optimize those calls in today's compiler. I mumbled something about how is_heap should be annotated to predict true (Does Rust even have a way to do this? Clang has some kind of non-standard attr I think.) or whatever, but I have nothing concrete to say as to the performance hit.

Yeah, that's what I've been saying. Absent that, I don't think making std::string::String an SSO string is worth it.

I'm glad to see the discussion this has generated. I think the issue is cleared up for me!

matklad · September 25, 2018, 4:01pm

Yep, as an unstable intrinsic: rust/src/libcore/intrinsics.rs at ae366637fedf6f34185e54fc7b2d725b1a458ff6 · rust-lang/rust · GitHub
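For stable code, one workaround (a different technique from the intrinsic, swapped in here because the intrinsic is unstable) is to move the rare path into a function marked `#[cold]`, which hints that the function is unlikely to be called. A minimal sketch, with hypothetical names `inline_len` and `len_of`, and a plain `bool` standing in for an `is_heap` check:

```rust
/// Rare path: the string is stored inline. `#[cold]` hints that calls to
/// this function are unlikely, so the branch leading here is treated as
/// the unexpected one.
#[cold]
#[inline(never)]
fn inline_len(inline: &[u8]) -> usize {
    inline.len()
}

/// Hypothetical length accessor: `is_heap` is expected to be true most of the time.
fn len_of(is_heap: bool, heap_len: usize, inline: &[u8]) -> usize {
    if is_heap {
        heap_len
    } else {
        inline_len(inline)
    }
}
```

This is only a hint to the optimizer; whether it changes the generated code or the measured performance would have to be checked case by case.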

gnzlbg · October 1, 2018, 9:37am

@newpavlov String's size with and without SSO can be the same, so that in both cases, when the whole struct is moved via a memcpy, the same amount of work is done, e.g.,

union String {
    heap: (usize, usize, *mut u8),      // (size, capacity, ptr)
    sso: [u8; size_of::<usize>() * 3],  // [sso buffer..., sso size, sso bit]
}

where one:

uses 1 bit to signal whether the SSO is active (e.g. the highest pointer bit),
some bits to store the size (either the second highest pointer byte, or one crams this into the highest pointer byte along with the SSO bit),
the rest of the bits for the SSO buffer.

On 64-bit targets, if you use 1 byte for the SSO signaling bit and 1 byte for the size, you get a 22-byte-wide buffer. If you store the size and the SSO signaling bit in the same byte, then you get a 23-byte-wide buffer, but you need to mask out the signaling bit every time you want to extract the size.
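The arithmetic and the masking can be sketched as follows. The constant and function names are hypothetical, and the choice of the top bit of the tag byte as the SSO flag is just one possible encoding. A real implementation would also need the heap representation to keep that bit clear, which this sketch does not address.

```rust
/// Total size of the representation: three machine words (24 bytes on 64-bit).
const TOTAL: usize = 3 * std::mem::size_of::<usize>();

/// Inline capacity with one byte for the flag and one byte for the size.
const CAP_SEPARATE: usize = TOTAL - 2;

/// Inline capacity when flag and size share a single tag byte.
const CAP_PACKED: usize = TOTAL - 1;

/// Top bit of the tag byte signals that SSO is active.
const SSO_BIT: u8 = 0b1000_0000;

/// Remaining seven bits hold the inline length.
const LEN_MASK: u8 = !SSO_BIT;

/// Builds the shared tag byte for an inline string of length `len`.
fn pack(len: usize) -> u8 {
    assert!(len <= CAP_PACKED);
    SSO_BIT | len as u8
}

/// Checks the signaling bit.
fn is_sso(tag: u8) -> bool {
    tag & SSO_BIT != 0
}

/// Extracts the length; the mask is the extra work paid for the 23rd byte.
fn unpack_len(tag: u8) -> usize {
    (tag & LEN_MASK) as usize
}
```

On a 64-bit target `CAP_SEPARATE` is 22 and `CAP_PACKED` is 23, matching the numbers above; seven bits are more than enough to hold a length of at most 23.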

With move constructors, moving the String could be less work when SSO is active (branch to detect SSO, branch on the SSB length, plus only moving the relevant part of the small string buffer) than when the buffer is heap allocated. Without move constructors, moving the String when SSO is active is never more work than when the string is heap allocated, unless one makes it larger than 3 pointers to get a larger small buffer size (AFAIK most C++ string implementations don’t do this).

It is unlikely that all the logic for deciding what to move and how to do so is more performant than just memcpying 3 pointers. This only becomes relevant if you use small strings with larger buffers, but the “beauty” of C++ SSO is that one can perform it without increasing the actual size of the std::string object.

system · March 25, 2019, 8:30am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
