I know that a move can introduce a memcpy. For example,
pub fn loop_clone_string<'a>() -> Vec<String> {
    let mut vector_string = vec![];
    let mut origin = String::from("a");
    repeat_outlined(&mut origin);
    let copied = origin; // memcpy introduced without inlining
    push_outlined(&mut vector_string, copied);
    vector_string
}

#[inline(never)]
pub fn repeat_outlined(s: &mut String) {
    *s = s.repeat(42);
}

#[inline(never)]
pub fn push_outlined(v: &mut Vec<String>, a: String) {
    v.push(a);
}
compiles to LLVM IR like the following:
...
define void @example::loop_clone_string(ptr noalias nocapture noundef writeonly sret(%"alloc::vec::Vec<alloc::string::String>") dereferenceable(24) %_0) unnamed_addr #1 personality ptr @rust_eh_personality !dbg !284 {
start:
%copied = alloca %"alloc::string::String", align 8
%origin = alloca %"alloc::string::String", align 8
%vector_string = alloca %"alloc::vec::Vec<alloc::string::String>", align 8
...
bb1: ; preds = %bb7
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 8 dereferenceable(24) %copied, ptr noundef nonnull align 8 dereferenceable(24) %origin, i64 24, i1 false), !dbg !366
...
when built with rustc -Copt-level=3 --emit=llvm-ir (see Compiler Explorer).
If the mutation before the assignment and the borrow after the assignment are inlined, the memcpy and the alloca for copied are removed by rustc's optimizations.
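To make the inlined case concrete, here is a minimal sketch of what it might look like when the two helper bodies are written directly at their call sites. The function name `loop_clone_string_inlined` is hypothetical and not part of the original code, and hand-inlining like this is only an assumption about the shape the inliner produces. The sketch shows the source form only and does not by itself demonstrate what the emitted IR contains.

```rust
// Hypothetical hand-inlined variant of `loop_clone_string`.
// The name is illustrative and does not appear in the original post.
pub fn loop_clone_string_inlined() -> Vec<String> {
    let mut vector_string = vec![];
    let mut origin = String::from("a");
    // Body of `repeat_outlined`, written at the call site.
    origin = origin.repeat(42);
    // The move under discussion: `origin` is moved into `copied`.
    let copied = origin;
    // Body of `push_outlined`, written at the call site.
    vector_string.push(copied);
    vector_string
}
```

Building this with the same flags as above and comparing the IR against the outlined version should show whether the alloca for copied is still present.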
I believe LLVM is not what performs this optimization, because when I take the inlined LLVM IR emitted at opt-level=0 and run it through opt (trunk), the memcpy and alloca are not removed.
The Compiler Explorer link shows the O2 result from opt; running O3 locally also fails to remove the alloca and memcpy for copied.
Where is this optimization performed? Is there any way to track it? I would appreciate any help you can provide.
(Edit: removed redundant information and embedded the code for clarity.)
I also asked on the Rust forum: "The way to see how memcpy and alloca introduced by move is optimized?" (#3 by khei4, help category, The Rust Programming Language Forum).