`Arguments::as_str` is a similar "just for the purpose of optimization" API, so there is some precedent for adding shortcuts for the "essentially just a string" case.
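For reference, a minimal sketch of how that shortcut tends to get used. The `render` helper is a hypothetical name for illustration, not a std API; only `Arguments::as_str` and `ToString::to_string` are real calls here.

```rust
use std::borrow::Cow;
use std::fmt;

/// Hypothetical helper: take the cheap path when the arguments are
/// "essentially just a string", otherwise run the full fmt machinery.
fn render(args: fmt::Arguments<'_>) -> Cow<'static, str> {
    match args.as_str() {
        // No formatting needed: hand back the static literal.
        Some(s) => Cow::Borrowed(s),
        // Fall back to the general formatting path, which allocates.
        None => Cow::Owned(args.to_string()),
    }
}
```

Calling `render(format_args!("hello"))` borrows the literal. Whether calls with arguments take the fast path depends on the compiler, so nothing should rely on it beyond optimization.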
The main reason this is difficult, AIUI, is the dynamism involved. Even if we ignore potential difficulties around reference validity guarantees[1], while it could be somewhat straightforward to replace `fn fmt(&self, &mut Formatter<'_>) -> Result` with `fn fmt(&self, &mut (dyn Write + '_)) -> Result`, the devirtualization to remove the `dyn` dispatch (required to actually DCE the `fmt` machinery) is much less straightforward.
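To make the signature change concrete, here is a rough sketch using a made-up trait. `DynDisplay` and `to_string_dyn` are hypothetical names; this is not how std's `Display` is defined, just the shape being discussed.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for a `Display` whose sink is `dyn Write`
/// instead of `Formatter`. Not a std trait.
trait DynDisplay {
    fn fmt(&self, out: &mut (dyn Write + '_)) -> fmt::Result;
}

impl DynDisplay for str {
    fn fmt(&self, out: &mut (dyn Write + '_)) -> fmt::Result {
        // This is a virtual call through the `dyn Write` vtable.
        out.write_str(self)
    }
}

// Forwarding impl, analogous to `<&T as Display>::fmt`.
impl<T: DynDisplay + ?Sized> DynDisplay for &T {
    fn fmt(&self, out: &mut (dyn Write + '_)) -> fmt::Result {
        T::fmt(*self, out)
    }
}

/// Hypothetical `to_string` analogue. The sink is statically a `String`,
/// but `fmt` only sees `dyn Write`, so the optimizer has to devirtualize
/// before it can drop the dynamic dispatch.
fn to_string_dyn<T: DynDisplay + ?Sized>(value: &T) -> String {
    let mut buf = String::new();
    DynDisplay::fmt(value, &mut buf).expect("writing to a String cannot fail");
    buf
}
```

The trait swap itself is the easy part. The hard part is getting the compiler to see through `out` in `to_string_dyn` so that the `str` case collapses into a plain copy.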
That said, an MIR pass which attempts some amount of devirtualization would be an interesting project. AIUI most MIR opts have been focused on reducing the amount of IR passed to LLVM, and devirtualization would usually move in the other direction, but perhaps there's a heuristic that rustc could use that could remain a net positive?
After current MIR inlining, calls to `<&String as Display>::fmt(s, f)` and `<str as Display>::fmt(s, f)` with `s: &&String, f: &mut Formatter` look the same, modulo debug information. A call to `<&String as ToString>::to_string` is just a call, whereas `<str as ToString>::to_string` is fully inlined. (playground links)
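For anyone wanting to reproduce the comparison, something along these lines should work as a starting point for dumping MIR. The wrapper names (`ViaRef`, `ViaStr`, `via_ref`, `via_str`) are made up for illustration and are not from the linked playgrounds.

```rust
use std::fmt::{self, Display, Formatter};

// Hypothetical wrappers, only there to get hold of a `Formatter`.
struct ViaRef<'a>(&'a String);
struct ViaStr<'a>(&'a String);

impl Display for ViaRef<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        let s: &&String = &self.0;
        // Goes through the forwarding impl for `&T`.
        <&String as Display>::fmt(s, f)
    }
}

impl Display for ViaStr<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        let s: &&String = &self.0;
        // Deref coercion turns `&&String` into `&str` at the call site.
        <str as Display>::fmt(s, f)
    }
}

// The two `to_string` spellings being compared.
fn via_ref(s: &String) -> String {
    <&String as ToString>::to_string(&s)
}

fn via_str(s: &String) -> String {
    <str as ToString>::to_string(s)
}
```

Both spellings produce the same string at runtime; the difference discussed here only shows up in the emitted MIR.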
`ToString::to_string` is already marked `#[inline]`, with a note that while unconventional for a generic impl, it has significant perf impact (ref: #74852). `<&_ as Display>::fmt` does not have such an `#[inline]` annotation; perhaps adding it would enable `<&String as ToString>::to_string` to be inlined?
I recall seeing that the heuristic for auto-`#[inline]` is roughly that no MIR call statements exist in the optimized MIR. `<str as ToString>::to_string` obviously does include call ops (into allocation, as well as `Vec::deref`, interestingly[2]).
Subobservation: `Vec::deref` isn't known to not unwind. I would have hoped this was just because MIR always includes unwind edges, but `Vec::deref` has `-> [unwind continue]` where a `#[rustc_nounwind]` call has `-> [unwind unreachable]`. An MIR pass/opt to record cross-crate functions known to never unwind could potentially unblock some hidden optimizations, if not at the LLVM level, then at least at the MIR level.
Edit to add: reported the `Vec::deref` MIR inlining regression as an issue.
[1] I'm not sure exactly how relevant it is here, but it can be difficult to automatically optimize `fn(&Scalar)` into `fn(Scalar)` because, while it's a validity requirement for the reference to be dereferenceable to sufficient bytes, there's no validity requirement for the bytes to be a valid instance of the scalar (currently; disclaimer: undecided, my own non-normative recollection, etc.), even if we derive proof that the address is irrelevant.
[2] And this is despite the function being marked as `#[inline]`. Here it is open coded (shows there aren't any reachable unwinding edges). Gut guess: the call to `std::slice::from_raw_parts::precondition_check` is blocking inlining. Justification: on stable, it inlines and doesn't include that call, making it a single straight-line basic block, whereas it does include the UB check on beta and doesn't inline there.