can generate the same assembly only with -ffast-math (-O3 -ffast-math -march=haswell).

fast-math is a heavily unsafe option, which is why it is not the default in clang and gcc.
So I wonder: does rustc pass some kind of fast-math option to LLVM that allows it to generate much better code than clang?

No optimization fanciness going on here, just powi being a special function that's different from C's powf. It calls llvm.powi.f32.i32 which is defined to use "an unspecified sequence of rounding operations".
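For anyone who wants to look at this in Compiler Explorer, here is a minimal sketch of three ways to cube an f32. The function names are made up for illustration, and the exact assembly you get will depend on the rustc version and flags, so treat it as a starting point rather than a reference result.

```rust
// Hypothetical helper names, for illustration only.

/// Integer exponent: goes through the llvm.powi intrinsic, whose rounding
/// sequence is unspecified, so LLVM may expand it into plain multiplications.
pub fn cube_powi(x: f32) -> f32 {
    x.powi(3)
}

/// Floating-point exponent: goes through the general pow routine instead,
/// which the optimizer is not free to replace with multiplications by default.
pub fn cube_powf(x: f32) -> f32 {
    x.powf(3.0)
}

/// Hand-written multiplications, for comparison with the two above.
pub fn cube_mul(x: f32) -> f32 {
    x * x * x
}
```

Compiling these with optimizations on and comparing the three function bodies should make the difference between powi and powf visible without any fast-math flag involved.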

I have no idea about the original topic, but C++ std::pow is an overloaded function (or a function template, depending on the standard), see std::pow, std::powf, std::powl - cppreference.com. And it has a variant with int as the second argument.

But does it actually call something different? From the page you linked,

A set of overloads or a function template for all combinations of arguments of arithmetic type not covered by 1-3). If any argument has integral type, it is cast to double. If any argument is long double, then the return type Promoted is also long double, otherwise the return type is always double.

So that sounds to me like the pow(double, int) version just casts the int to a double and calls the double version.

No, the std::pow(double, int) overload was removed in C++11 and replaced with

A set of overloads or a function template for all combinations of arguments of arithmetic type not covered by 1-3). If any argument has integral type, it is cast to double. If any argument is long double, then the return type Promoted is also long double, otherwise the return type is always double.

I believe it comes down to the fact that std::pow(double, double) and f64::powi(f64, i32) have different rounding behavior, though. It's worth noting that even x.powi(3) and x.powf(3.0) in Rust generate different ASM, despite having "the same" information to optimize with, because powf is required to be ±½ULP whereas powi is allowed to have intermediate rounding error (i.e. it can be expanded into a chain of vmul instructions).
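To make "intermediate rounding error" a bit more concrete, here is a rough sketch of an integer power written as repeated multiplication (exponentiation by squaring). The name powi_by_squaring is hypothetical, and this is not claimed to be what LLVM actually emits for llvm.powi; it only shows the kind of multiplication sequence that an "unspecified sequence of rounding operations" permits.

```rust
/// Illustrative only: one possible sequence of multiplications for an
/// integer power. Every `*=` below rounds its result, so the final value
/// can differ slightly from a single correctly rounded pow.
pub fn powi_by_squaring(mut base: f64, exp: i32) -> f64 {
    // unsigned_abs avoids overflow when exp == i32::MIN.
    let mut n = exp.unsigned_abs();
    let mut acc = 1.0;
    while n > 0 {
        if n & 1 == 1 {
            acc *= base; // multiply in the current power of the base
        }
        n >>= 1;
        if n > 0 {
            base *= base; // square the base for the next bit
        }
    }
    // A negative exponent is handled by taking the reciprocal at the end.
    if exp < 0 { 1.0 / acc } else { acc }
}
```

For an exponent of 3 this boils down to two multiplications, which is the shape of code you would expect from x.powi(3).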

(EDIT: gah, ninja'd again by @scottmcm for doing the extra legwork and checking f64::powf's codegen)

I tried building with -std=c++98, where there is an overload with int as the second argument, and clang / gcc produce the same code as rustc without enabling fast-math.