Needs more Cow-bell?

Recently I had the good fortune to get a good look at the Rust codebase. Fun times! :smile:

One thing I noticed is that there are many places where owned Strings are used, even though in the majority of cases, a &'static str would suffice. Many of those strings are just allocated to be used in format!ting or to be written out somewhere.

Since I am a fan of Cows, I wondered why we don’t use them in the compiler more often, so I wrote a microbenchmark to see how the performance differs. The results? Allocating is very fast, but just using a Cow is even faster: roughly 2ns per Cow vs. 33ns per allocation. The benchmark does not include e.g. cache effects, so the actual effect is likely underreported.

So, Rust developers, next time you reach for a .to_owned() or worse, .to_string(), consider using Cow.
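To illustrate the pattern, here is a minimal sketch (not code from rustc; `describe_owned` and `describe_cow` are hypothetical names made up for this example). It assumes a function that mostly returns string literals and only occasionally needs to build a string at runtime:

```rust
use std::borrow::Cow;

// Hypothetical helper: allocates a String even when the message is a literal.
fn describe_owned(code: u32) -> String {
    match code {
        0 => "success".to_owned(),
        1 => "not found".to_owned(),
        other => format!("unknown error code {}", other),
    }
}

// Same logic, but literals stay borrowed; only the formatted case allocates.
fn describe_cow(code: u32) -> Cow<'static, str> {
    match code {
        0 => Cow::Borrowed("success"),
        1 => Cow::Borrowed("not found"),
        other => Cow::Owned(format!("unknown error code {}", other)),
    }
}

fn main() {
    for code in [0, 1, 7] {
        let msg = describe_cow(code);
        // Both variants produce the same text.
        assert_eq!(describe_owned(code).as_str(), &*msg);
        // Cow<str> implements Display and derefs to &str, so it can be
        // passed to println!/format! like any other string.
        println!("{} (borrowed: {})", msg, matches!(msg, Cow::Borrowed(_)));
    }
}
```

Callers that only format or write out the result should not need to change, since `Cow<'static, str>` derefs to `&str`.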

(Of course, this likely has little effect on the actual runtime of the compiler, because those strings are usually not on hot code paths. Still, with allocation you never know if you tread on other code’s toes).

Are you planning on sending a PR for changing the Strings to Cows?

Not yet. That’d be potentially a lot of work. I currently wonder if I can write a lint that identifies places where Cow might be beneficial. That would automate the hardest part of it.

Nice. If I understand correctly, the pattern exemplified in your microbenchmark works because string literals in the code are already implicitly &'static str before to_owned() receives them. It adds some noise to the code, but if you’re used to seeing Perl-like symbol spew, you can get used to it.

Is using to_owned or allocations a bottleneck in rustc?

No. It’s not a bottleneck. Just noise in the profile (but think of the caches!). And yes, it’s the allocation, to_owned() just allocates and makes a copy. Also your understanding is only half of the truth: In fact Cow works whenever there is a lifetime bound that all borrowed instances can comply with.
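As a sketch of the non-'static case (again, `expand_tabs` is a hypothetical helper invented for this example, not something from rustc): the borrowed variant is tied to the lifetime of the input, so it works for strings that are not literals.

```rust
use std::borrow::Cow;

// Hypothetical helper: returns the input unchanged (borrowed) unless it
// contains a tab, in which case a new String is allocated.
fn expand_tabs<'a>(input: &'a str) -> Cow<'a, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    // A heap-allocated String, so the borrow below is not 'static.
    let line = String::from("no tabs here");
    let clean = expand_tabs(&line);
    assert!(matches!(clean, Cow::Borrowed(_)));

    let tabbed = expand_tabs("a\tb");
    assert!(matches!(tabbed, Cow::Owned(_)));

    println!("{} / {}", clean, tabbed);
}
```

Here the only requirement is that every borrowed value outlives `'a`, which is the lifetime bound mentioned above.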

However, for the lint, to keep complexity low and still get a lot of benefit, I’m going to concentrate on 'static first. Otherwise the lint would have to infer a possible lifetime bound. I might be able to do that some day if I think long and hard about it, but I think going for the easy targets first will give us more bang per buck for now.

I just profiled rustc when building a package (rust-cpython) using valgrind --tool=callgrind and found that rustc spends about 12% of its time in each of je_mallocx and je_sdalloc, with about 6 million calls apiece. So I think there might be something to squeeze out of here.

Thank you for actually measuring.

Let’s also assume that adding or removing allocation will have non-linear effects on performance (where removing some allocation can give you disproportionate performance benefits).

rustc’s malloc problem is because of the terrible implementation and layout of Substs. This is annoying to fix because of API problems, but someone should do it.

I agree that owned copies of static strings probably account only for a small part of allocations. Still, it’s an easy target for a lint, and a pattern I see over and over in Rust code, so this will not only benefit rustc, but everyone who uses the lint.
