Register attribute

Rustc has a MIR inliner that runs before generic functions are monomorphized, but it still needs MIR, which is only available for generic and #[inline] functions for the reasons I mentioned.

But that does not work across crates.

The MIR inliner works just fine across crates. It is just limited to generic and #[inline] functions because we don't encode MIR for other functions for performance reasons.


Are we talking past each other? My dream is that crate A drops some MIR on the disk, then crate B reads that MIR and uses it for optimisations. AFAIK the MIR inliner only inlines within a crate.

The MIR inliner works between crates too. For example, on the Rust Playground, if you view the MIR in debug mode it shows an explicit call, while in release mode there is no call and most debuginfo scopes point into the standard library.
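
To make the cross-crate case concrete, here is a minimal sketch (the crate and function names are made up for illustration):

```rust
// crate_a/src/lib.rs
//
// Because of #[inline], rustc encodes this function's MIR into
// crate_a's metadata, so downstream crates can inline it at the
// MIR level, before any LLVM IR is generated.
#[inline]
pub fn add_one(x: u32) -> u32 {
    x + 1
}

// crate_b/src/main.rs
//
// In a release build the MIR inliner can replace this call with the
// body of add_one; in a debug build the MIR keeps an explicit call.
fn main() {
    println!("{}", crate_a::add_one(41));
}
```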

In my experience, humans almost never do better than compilers. And when they do, it's generally in a hand-written assembly routine, not in a language like Rust.

And yes, LLVM will prefer registers if it can at all help it, and it will even spill old registers to the stack to make room. Using an existing register costs zero cycles, while even a load from L1 cache costs several. The only case where I've seen myself do much better is when compilers, in their effort to minimize the number of registers used, insert moves instead of filling out the scratch registers first (ultimately generating suboptimal code), but admittedly, even that is rare.
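
As a tiny illustration of that register preference (exact codegen of course varies by target, CPU features, and compiler version):

```rust
// With only a handful of live values, LLVM keeps everything in
// registers: on x86-64 (System V ABI) the six f64 arguments arrive
// in xmm0..xmm5, and this typically compiles to a few mulsd/addsd
// instructions and a ret, with no stack traffic at all.
pub fn dot3(ax: f64, ay: f64, az: f64, bx: f64, by: f64, bz: f64) -> f64 {
    ax * bx + ay * by + az * bz
}
```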


I am starting to see what the MIR inliner is doing.

I also agree that it is hard to beat compilers. The only chance I have had was to help it along.

Just as a data point against adding such an annotation: the register keyword was deprecated in C++11 (IIRC) and removed entirely in C++17.


When topics like this arise, this phrase always comes to mind: "Use a profiler to find bottlenecks, not your sixth sense". For the past six weeks, I've been improving my risky library for encoding RISC-V instructions, and I spent a few hours optimising my own tailored bit-field implementation. Sure, I got great performance, but the main reason I started optimising it in the first place was to speed up code generation in my toy compiler. It turned out to have almost no effect, because most of the code generation time was spent on manipulating SSA, not on generating the instructions. Not a totally wasted effort in this case, but still completely misguided and, I would argue, unnecessary.
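
In that spirit, even a crude measurement beats guessing. A minimal sketch of what I mean (the two phases are hypothetical stand-ins for a real compiler pipeline):

```rust
use std::time::Instant;

// Hypothetical stand-ins for real compiler phases.
fn build_ssa(source: &str) -> Vec<String> {
    source.split_whitespace().map(str::to_owned).collect()
}

fn encode_instructions(ssa: &[String]) -> Vec<u32> {
    ssa.iter().map(|op| op.len() as u32).collect()
}

fn main() {
    let source = "add x1, x2, x3";

    // Wall-clock each phase before optimising any of them: it is the
    // cheapest way to find out where the time actually goes.
    let t = Instant::now();
    let ssa = build_ssa(source);
    eprintln!("SSA construction: {:?}", t.elapsed());

    let t = Instant::now();
    let _code = encode_instructions(&ssa);
    eprintln!("instruction encoding: {:?}", t.elapsed());
}
```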

As for PGO, though, I'm not sure it's the best solution. It usually improves performance, but it sometimes makes it worse, and then it's very hard to find out why. And, if I understand correctly, its effectiveness depends on how representative the profiling workload is. Manual optimisation, on the other hand, 99% of the time depends only on the fragments of code you touch, so it's more robust, even if labour-intensive.
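
For reference, the usual rustc PGO loop looks roughly like this (flags are from the rustc book's PGO chapter; the binary name, input, and paths are placeholders). Step 2 is exactly where the workload-dependence comes from:

```sh
# 1. Build with instrumentation.
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release

# 2. Run the binary on a *representative* workload.
./target/release/my-app typical-input.dat

# 3. Merge the raw profiles (llvm-profdata ships with the
#    llvm-tools-preview rustup component).
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data

# 4. Rebuild, letting the profile guide optimisation.
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
```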

My overall opinion is that it's better not to interfere manually with compilers' optimisers: they are much too complex for such an effort to yield any meaningful result. The best course of action is either to leave them alone or, because optimisers want to be happy too, to give them some love and try to eliminate the pathological edge cases. The latter obviously requires the highest expertise in compiler development, but there's no way around it. Or, alternatively, teach a huge AI model to optimise code: I hear it works like magic most of the time, and only occasionally produces totally useless and harmful garbage :grin:

