Debug info for copy/move operations

jsgf · September 22, 2025, 8:49pm

At the moment, it can be very hard to identify compiler-generated copy/move operations in the output code. You can see a sequence of load/store operations, or calls to memcpy, but you can't tell it's a copy/move of a given type without deep analysis. In particular, it's very difficult to work out how much CPU time is spent on moves/copies, and where they are happening.

I would love it if the compiler generated debug info for the operations it generates. This would identify the specific instructions involved in the copy/move, and what type is being moved. For example, it could generate calls to an artificial inline function called something like core::intrinsics::compiler_move::<T>.

Of course, if the move/copy is completely elided then nothing should be generated. It should probably also skip (or have the option to skip) very small moves/copies, especially for small Copy primitive types (I imagine the debug info would explode if every usize/u32/etc. copy were marked).
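To make the problem concrete, here is a minimal sketch of the kind of code where such moves show up. The names `Big`, `consume` and `make_and_move` are made up for illustration. By-value moves of a type this large are typically lowered to memcpy calls or load/store sequences, and nothing in the output says they are moves of `Big`.

```rust
/// A deliberately large type (hypothetical, for illustration only).
/// Moving it by value usually cannot be done in registers.
pub struct Big {
    pub payload: [u8; 4096],
    pub len: usize,
}

/// Takes `Big` by value, so the caller has to move it into the argument.
#[inline(never)]
pub fn consume(b: Big) -> usize {
    b.len + b.payload[0] as usize
}

pub fn make_and_move() -> usize {
    let big = Big { payload: [1; 4096], len: 10 };
    // Move of `Big` into a heap allocation.
    let boxed = Box::new(big);
    // Move of `Big` back out of the box and into the argument of `consume`.
    consume(*boxed)
}
```

Whether each of these moves survives optimisation depends on the compiler version and flags, which is exactly why it is hard to reason about them from the source alone.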

So:

1. Is this something which does actually exist, and I've just missed it?
2. If not, is there an existing issue asking for it?

I'm poking around in the rustc internals right now, and I think I have some idea how to implement it (and I'm not seeing anything obviously already there for it, so I think my answer to 1 is "no").

Vorpal · September 22, 2025, 10:32pm

I saw someone mention trying something like this a few weeks ago, maybe on zulip? I don't know if that was you, but otherwise it would perhaps be good to coordinate your efforts.

I think this sounds like a good idea. I'm generally in favour of anything that gives the developer more insight into the software unless it has unacceptable overhead.

Will this "fake" frame survive optimisation and LTO though?

ais523 · September 23, 2025, 8:11am

I think a good heuristic here would be "mark any move/copy that touches memory". When the optimisation level is 1 or higher, these moves and copies of trivial small things are normally done entirely in registers and don't affect memory at all. (Meanwhile, if a small thing does get moved or copied from memory, it is nice to have it marked to make it easy to distinguish the move from a spill – the two are difficult to distinguish by looking at the generated assembly.)

You would probably have to turn off the feature at optimisation level 0, which stores pretty much everything in memory pretty much all the time, and thus materialises all the moves/copies. It would add a lot of overhead, and the only benefit of optimisation level 0 is that it compiles quickly, so it would probably prefer not to have the overhead involved.

This conceptually feels like it wants to be indicated in debug info as a code location, rather than as a fake stack frame (e.g. you set the source file location for the instructions doing the moves to core/intrinsics/compiler_move.rs) – debugging, profiling, etc. tools tend to work with that information already, and compilers are already used to annotating inlined code like that. But ideally the debug info would list both the synthesized file path that indicates that it's a move, and the actual location in the source code that caused the move, and I don't think existing debug info formats have good support for doing that.

jsgf · September 23, 2025, 5:33pm

Vorpal:

I saw someone mention trying something like this a few weeks ago, maybe on zulip? I don't know if that was you, but otherwise it would perhaps be good to coordinate your efforts.

No, not me. Do you have a pointer?

jsgf · September 23, 2025, 5:43pm

Yeah that would be nice, but it seems like it would be hard to implement - the decision to use memory or not would be the optimizing backend's, whereas I imagine the bulk of the implementation of this feature would be in the middle of the compiler. I think we'd just have to use the size from layout to key off.
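A size-only heuristic of that kind could look roughly like the sketch below. It uses `size_of` as a stand-in for the layout query inside the compiler, and the threshold value and the name `should_annotate` are assumptions made up for illustration, not anything decided here.

```rust
use std::mem::size_of;

/// Hypothetical cut-off in bytes; the right value is an open question.
pub const ANNOTATE_THRESHOLD_BYTES: usize = 64;

/// Returns true if a move/copy of `T` would be annotated under a
/// purely size-based rule. No knowledge of register allocation is needed,
/// so this could be decided before the optimizing backend runs.
pub fn should_annotate<T>() -> bool {
    size_of::<T>() > ANNOTATE_THRESHOLD_BYTES
}
```

With this rule, `should_annotate::<usize>()` is false and `should_annotate::<[u8; 4096]>()` is true, which avoids marking every primitive copy at the cost of missing small moves that do end up going through memory.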

I don't think I follow what you mean, but I suspect we're in agreement. I'd imagine the logical backtrace would be something like:

0: memcpy (or not, if it's just inline load/stores)
1: core::intrinsics::compiler_move::<Foo> (???:???) (inlined)
2: where_I_move_Foo (movefoo.rs:123)

Originally I had been thinking that compiler_move/compiler_copy would be entirely synthetic, but it would probably be easier to define real functions in core/src/intrinsics.rs as placeholders (they would never actually be called), so there's some real source file/line to reference.

(All names up for discussion, of course.)

Vorpal · September 23, 2025, 8:21pm

Took a minute to find:

Vorpal · September 23, 2025, 8:27pm

jsgf:

I don't think I follow what you mean, but I suspect we're in agreement. I'd imagine the logical backtrace would be something like:
0: memcpy (or not, if it's just inline load/stores)
1: core::intrinsics::compiler_move::<Foo> (???:???) (inlined)
2: where_I_move_Foo (movefoo.rs:123)
Originally I had been thinking that compiler_move/compiler_copy would be entirely synthetic, but it would probably be easier to define real functions in core/src/intrinsics.rs as placeholders (they would never actually be called), so there's some real source file/line to reference.

(All names up for discussion, of course.)

It would be nice to attribute frame 1 to the right line in movefoo.rs, as there could be multiple copies of the same type in that function that starts at line 123. Otherwise it will be hard to tell in sampling profilers like perf what is going on.

jsgf · September 23, 2025, 9:10pm

I think that amounts to the same thing. movefoo.rs:123 would be the "call" site of the move, not where it's defined.
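As a user-space analogy (not how the compiler would do it), `#[track_caller]` gives the same "call site, not definition site" attribution. In this sketch `annotated_move` is a hypothetical stand-in for the inlined move frame, and `Foo` and `where_i_move_foo` are made-up names. Two moves of the same type in one function report two different lines.

```rust
use std::panic::Location;

pub struct Foo {
    pub data: [u64; 32],
}

/// Hypothetical stand-in for an annotated move: returns the value together
/// with the source line of the caller, not the line of this function.
#[track_caller]
#[inline(always)]
pub fn annotated_move<T>(value: T) -> (T, u32) {
    (value, Location::caller().line())
}

/// Moves the same `Foo` twice and returns the line recorded for each move.
pub fn where_i_move_foo() -> (u32, u32) {
    let a = Foo { data: [0; 32] };
    let (b, first_line) = annotated_move(a);
    let (_c, second_line) = annotated_move(b);
    (first_line, second_line)
}
```

The two returned line numbers differ, so a tool that looks at line-level attribution can tell the moves apart even though both involve `Foo` in the same function.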

Vorpal · September 23, 2025, 9:27pm

I think you are right. I was mostly thinking about the large crowd of people (colleagues or otherwise...) who seem to think that (cycle-based) flamegraphs are the be-all and end-all of profiling, rather than a starting point (as is actually the case).

Those people will have a hard time telling which copy of the same type in a function is the issue.

jsgf · October 1, 2025, 2:23am

I spent some time hacking on this, and have a PR to try out: "Implement profiling for compiler-generated move/copy operations" (rust-lang/rust pull request #147206).

It's implemented as a MIR transform. But instead of fiddling with filenames, it introduces a couple of new intrinsics, compiler_move<T, const SIZE: usize>(...) and compiler_copy, and makes the Operand::Move/Copy look like they've been inlined from there. So you can tell from a backtrace whether you're looking at a compiler-generated copy, for what type, and how big.
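For readers who want a feel for the shape of such a placeholder, here is a rough user-space sketch. Only the name and the generic parameters `<T, const SIZE: usize>` come from the description above. The pointer arguments, the body, and the helper `move_foo` are assumptions for illustration and are not taken from the PR.

```rust
use std::mem::{size_of, ManuallyDrop, MaybeUninit};
use std::ptr;

/// Hypothetical placeholder: the type and size appear in the monomorphized
/// symbol name, which is what a backtrace or profiler would show.
///
/// Safety: `src` must be valid for reads and `dst` for writes of one `T`,
/// and the two must not overlap.
#[inline(always)]
pub unsafe fn compiler_move<T, const SIZE: usize>(src: *const T, dst: *mut T) {
    debug_assert_eq!(SIZE, size_of::<T>());
    unsafe { ptr::copy_nonoverlapping(src, dst, 1) }
}

pub struct Foo {
    pub data: [u64; 16],
}

/// Moves a `Foo` by hand through the placeholder (illustration only).
pub fn move_foo(src: Foo) -> Foo {
    // Prevent the source from being used again after the bitwise move.
    let src = ManuallyDrop::new(src);
    let mut dst = MaybeUninit::<Foo>::uninit();
    unsafe {
        compiler_move::<Foo, { size_of::<Foo>() }>(&*src as *const Foo, dst.as_mut_ptr());
        dst.assume_init()
    }
}
```

In the real feature no call would exist at runtime; the point is only that an inlined frame named after the instantiated function carries both the type and its size.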

I haven't tested this very much yet (for example, I haven't actually tried profiling with it), but I built rustc with it, and it has minimal impact on debuginfo size.

jsgf · October 17, 2025, 6:42pm

Update: I have a pretty solid implementation in "Add -Zannotate-moves for profiler visibility of move/copy operations (codegen)" (rust-lang/rust pull request #147803). I went back to a codegen-based approach because handling a bunch of edge cases was making the MIR-based one too complex.

I also submitted an MCP, since I hope this will ultimately become a user-facing stable feature: "Move annotation for profiling compiler-generated moves and copies." (rust-lang/compiler-team issue #928).

Here's a screenshot from a rustc flame-graph I generated, since it makes the point quite well.
