While trying to optimize build performance for our Rust project, we found that any change to an internal transitive dependency (one that does not touch the exposed types, say just adding a comment) invalidates the whole build cache.
It turns out that the generated rmeta for the crate differs after any code change, because it embeds source file information:
/// Holds information about a rustc_span::SourceFile imported from another crate.
/// See `imported_source_file()` for more information.
The same applies to internal changes that don't affect the exposed types. Given all that, rmeta has little advantage over rlib as far as caching goes: any change still propagates through the entire dependency graph, and every downstream rlib is still regenerated. The only benefit of rmeta is that downstream crates can start building earlier.
My understanding is that to produce the final binary/dylib, we always need all transitive rlibs to be present. But for the intermediate build steps that generate rlibs, we only need type info (and perhaps some other details) to allow codegen, and everything else could be reassembled from the rlibs at the final step.
A similar design is found in Go's compiler (see "Go at Google: Language Design in the Service of Software Engineering" on The Go Programming Language site). It avoids putting internal details into the intermediate build artifacts, so that changes that are not visible from outside do not cause transitive cache invalidation:
The process is more automatic and even more efficient than in Plan 9 C, though: the data being read when evaluating the import is just "exported" data, not general program source code. The effect on overall compilation time can be huge, and scales well as the code base grows.
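To make the idea concrete, here is a toy sketch of the difference between fingerprinting a whole source file and fingerprinting only its "exported" surface. This is not how rustc or the Go compiler actually compute anything: `full_fingerprint` and `interface_fingerprint` are hypothetical helpers, and the line-based "signature extraction" is a deliberate oversimplification (real rmeta also has to carry things like bodies of generic and inline functions).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hash the entire source text.
/// Any edit, even adding a comment, changes the result.
fn full_fingerprint(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical helper: hash a toy version of "export data".
/// Only lines starting with `pub ` are considered, and only the part
/// before the opening brace (the signature) is hashed. Comments,
/// bodies and line positions are ignored.
fn interface_fingerprint(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    for line in source.lines() {
        let line = line.trim();
        if line.starts_with("pub ") {
            let signature = line.split('{').next().unwrap_or(line).trim();
            signature.hash(&mut hasher);
        }
    }
    hasher.finish()
}

fn main() {
    let before = "pub fn add(a: i32, b: i32) -> i32 {\n    a + b\n}\n";
    // Same public API, but a comment was added above the function.
    let after = "// explain the sum\npub fn add(a: i32, b: i32) -> i32 {\n    a + b\n}\n";

    println!(
        "full source changed: {}",
        full_fingerprint(before) != full_fingerprint(after)
    );
    println!(
        "interface changed: {}",
        interface_fingerprint(before) != interface_fingerprint(after)
    );
}
```

In this sketch a dependent that is keyed on `interface_fingerprint` would not need to rebuild after the comment-only edit, while one keyed on `full_fingerprint` would. That is the behavior I am hoping to get from rmeta.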
I'm wondering whether there are any toggles or hacks I can use to keep internal details from leaking into the generated rmeta files.