[pre-RFC] Generate "headers" for greater parallelism

This is similar to ideas we’ve had in the past to make rlibs contain only MIR and delay code generation until later. The later code generation is delayed, the more opportunities there are to merge monomorphizations. Any changes here would be heavily affected by incremental compilation.
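To make the monomorphization point concrete, here is a rough sketch. The modules `generic_lib`, `crate_a`, and `crate_b` and all function names are made up for illustration, and the modules only stand in for separate crates. Assuming a model where each crate generates code for the generic instances it uses, both downstream "crates" would produce their own copy of `largest::<u32>`; if code generation were delayed until everything is visible, one copy could serve both.

```rust
// Hypothetical illustration: each module stands in for a separate crate.
pub mod generic_lib {
    /// Generic function; machine code only exists per concrete `T`.
    pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
        let mut iter = items.iter().copied();
        let first = iter.next()?;
        Some(iter.fold(first, |best, x| if x > best { x } else { best }))
    }
}

pub mod crate_a {
    /// Instantiates `largest::<u32>`.
    pub fn max_id(ids: &[u32]) -> Option<u32> {
        super::generic_lib::largest(ids)
    }
}

pub mod crate_b {
    /// Also instantiates `largest::<u32>`: the same instance as in `crate_a`.
    pub fn max_score(scores: &[u32]) -> Option<u32> {
        super::generic_lib::largest(scores)
    }
}
```

Both wrappers need exactly the same instance, which is the kind of duplicate that later code generation would have a chance to merge.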

cc @michaelwoerister