I have AMD FX 8320. Every Rust program runs about 1.5-2x slower when I compile with target-cpu=native
, both on stable 1.65
and nightly
. Command rustc --print=target-cpus
says:
Available CPUs for this target:
native - Select the CPU of the current host (currently bdver2).
...
Did rustc
detect wrong CPU? Nope, cat /proc/cpuinfo
confirms bdver2
:
Summary
processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 2
model name : AMD FX(tm)-8320 Eight-Core Processor
stepping : 0
microcode : 0x6000852
cpu MHz : 3296.080
cache size : 2048 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 16
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd ibpb vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass retbleed
bogomips : 7032.14
TLB size : 1536 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro
CPU family 21 is Piledriver, which is indeed Bulldozer version 2. Is something's wrong with the set of target features that rustc
gives to its LLVM backend? How do I debug this?