Inlining policy for functions in std?

In std/core/alloc there are many trivial functions, but not all of them have #[inline]. Is that just an effect of adding the attribute whenever authors felt like it, or are there specific concerns or trade-offs that dictate what should have #[inline]?

The functions I'm talking about are like:

fn foo(&self) {
    self.inner.foo()
}
fn into(self) -> Foo {
    Foo { inner: self }
}
fn as_ref(&self) -> &Foo {
    &self
}
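fn to_boxed_foo(self: Box<Self>) -> Box<Foo> {
    // Box::from_raw is unsafe; this is only sound if Self and Foo share a layout.
    unsafe { Box::from_raw(Box::into_raw(self).cast()) }
}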
fn eq(self, other: X) -> bool {
    &self.0 == &other.0
}

I assume that for functions like these inlining has no downsides from either a performance or a code-size perspective, since they optimize to nothing, or at worst to the same call with just a field offset.
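To make the shape of these functions concrete, here is a minimal sketch of a newtype with forwarding methods that carry #[inline]. The names Inner and Wrapper are hypothetical and not taken from std; the point is only that each body is a field access or a single forwarded call.

```rust
/// Hypothetical inner type that the wrapper forwards to.
pub struct Inner {
    value: u32,
}

impl Inner {
    #[inline]
    pub fn new(value: u32) -> Self {
        Inner { value }
    }

    #[inline]
    pub fn get(&self) -> u32 {
        self.value
    }
}

/// Hypothetical newtype wrapper around `Inner`.
pub struct Wrapper {
    inner: Inner,
}

impl Wrapper {
    /// Pure forwarding: after inlining this is just a field load.
    #[inline]
    pub fn get(&self) -> u32 {
        self.inner.get()
    }
}

impl From<Inner> for Wrapper {
    /// Wrapping a value: no work beyond moving `inner` into place.
    #[inline]
    fn from(inner: Inner) -> Self {
        Wrapper { inner }
    }
}

impl AsRef<Inner> for Wrapper {
    /// Borrowing the field: at most a pointer offset.
    #[inline]
    fn as_ref(&self) -> &Inner {
        &self.inner
    }
}
```

All of these are non-generic, so whether a downstream crate can inline them is exactly the question being asked here.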

I'm not sure how much cost there is in compilation time or libstd size. Does that matter? Does that cost anything for code that doesn't call these functions?

Would it be OK to make a PR that just sprinkles #[inline] on all such functions? Can it include functions that are potentially rarely used or for less popular platforms?

IIRC, @alexcrichton has been a hawk about applying inline annotations because of their deleterious effects. But I can't quite remember the details. It's possible it was only about inline(always). @alexcrichton, might you chime in here?

I think the main guideline is that non-generic functions need to be marked #[inline] to allow cross-crate inlining at all? I.e., most of the time we just try to avoid making the function impossible to inline.

As an aside, I wish we had a better story here: cross-crate absence of inlining for non-generic functions is a giant perf pitfall for novice/intermediate users, and a source of busy-work for advanced users.
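A small sketch of the distinction, as I understand it from this discussion. The function names are made up for illustration, and the comments describe the behaviour claimed in this thread rather than anything rustc guarantees.

```rust
// Non-generic and unannotated: compiled once in the defining crate.
// Per this thread, callers in other crates get a plain call to it
// unless LTO across the whole crate graph is enabled.
pub fn len_plain(bytes: &[u8]) -> usize {
    bytes.len()
}

// Non-generic with #[inline]: the body is made available to downstream
// crates, so their optimizer can inline it (it is a hint, not a promise).
#[inline]
pub fn len_hinted(bytes: &[u8]) -> usize {
    bytes.len()
}

// Generic: monomorphized in the crate that instantiates it, so the body
// is already available there and is an inlining candidate without any
// attribute.
pub fn len_generic<T>(items: &[T]) -> usize {
    items.len()
}
```

All three return the same value; the only difference is where the body ends up being compiled, which is why the missing attribute is so easy to overlook.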

Some more info about #[inline] is in the users forum thread "Enable cross-crate inlining without suggesting inlining" (post #6 by bluss), especially the details about copying the function to every codegen unit, which is quite costly.

Now I stupidly repeat that we need to share this knowledge about how inlining works centrally (stupid because I keep saying it and not doing anything about it beyond talking on the forum). So where do we put this knowledge? Without writing a reference, because the non-guaranteed behaviour is important too.

That's my understanding as well -- that for generic functions that end up monomorphized and thus likely are available in the codegen unit anyway, the library tends not to bother.

The whole thing reminds me of the must_use conversation -- if there's an attribute that would be put pervasively on most things in libraries, then the answer is probably not to do that, but to find a way to change rustc to have better defaults.

What to replace the #[inline] attribute with is probably a bigger topic. But from the discussion so far it seems that for now it's a good idea to add it wherever it could be useful.

The Rust API guidelines feel like an appropriate place.

Based on that thread, it sounds like "codegen into a singular referencing CGU" (with or without inlinehint) would be a good policy for many of these functions... if only rustc offered it.

(This whole thing is such a hack. If only LLVM supported sharing a single module between multiple threads, there would be no need for codegen-units at all. But it doesn't.)

Has there been any discussion of what it would take to fix that in LLVM? Eliminating CGUs would be a massive win for configuration simplicity and optimization.

Lots has already been said here (and before, and before that), so no need to add too much here.

When faced with the question of whether or not to #[inline] I typically think of "what will happen if I don't do this". If it's a non-generic function, that basically means it won't get inlined across crates unless full crate graph LTO is enabled. If it's a generic function then it will already be monomorphized on use and is a candidate for inlining so long as crate-local ThinLTO is enabled (which it is by default).

So for the cases originally brought up in this thread, the consequence of not using #[inline] is that they typically won't get inlined in cross-crate scenarios. Most of the time this isn't really an issue: most people aren't writing the equivalent of <[T]>::len, which is practically required to be inlined.

Of course, it's also worth considering the cost of using #[inline], which is a compile-time cost because more code is translated elsewhere. A single function using #[inline] has practically zero compile-time cost, but if everything is marked #[inline] then it adds up quite fast.
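As a rough illustration of applying that trade-off selectively, here is a sketch with a hypothetical Buffer type (not from std). The tiny accessors, which are the kind of thing callers use in hot loops, get #[inline]; the larger routine is left alone so it is translated once in the defining crate. Where exactly to draw the line is a judgment call, and this is only one possible split.

```rust
/// Hypothetical byte buffer, used only to illustrate the trade-off.
pub struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    pub fn new(data: Vec<u8>) -> Self {
        Buffer { data }
    }

    /// Tiny and likely to sit in hot loops: worth making inlinable
    /// across crates, in the spirit of `<[T]>::len`.
    #[inline]
    pub fn len(&self) -> usize {
        self.data.len()
    }

    #[inline]
    pub fn is_empty(&self) -> bool {
        self.data.is_empty()
    }

    /// Larger body where call overhead is negligible next to the loop:
    /// left unannotated so it is not re-translated in downstream crates.
    pub fn checksum(&self) -> u32 {
        let mut sum: u32 = 0;
        for &byte in &self.data {
            sum = sum.rotate_left(5) ^ u32::from(byte);
        }
        sum
    }
}
```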


Unfortunately in my experience the optimal way to evaluate #[inline] is to intimately understand the compiler, codegen units, LLVM, and general compile-time cost. Naturally that's an extremely tall order and means that in practice #[inline] is both used too much and too little. Ideally the compiler would do better here, but to me at least it's not clear what could be done that wouldn't just be more expensive compiler analysis.
