Proposal: make cargo output dep-info


#1

In Fuchsia, we are integrating Rust into a larger (ninja-based) build system by having that system invoke cargo when necessary. There are other ways to do it; for example, I believe Bazel addresses the problem by invoking rustc directly and not relying on cargo.

A major challenge is defining “when necessary.” We could invoke cargo every time, but cargo null builds are not exactly lightning fast, and slowing down the null build for the larger project would be bad. Especially if the number of Rust binaries were to increase, the overhead could easily become unacceptable.

One time-tested way to solve this issue is a depfile. Basically, when a tool writes an artifact (say, a binary), it also writes a file (traditionally with a .d suffix) that has the path to the artifact, a colon, and a list of source files that were used as inputs. The build system only runs the build rule when any of these files have changed since the last build. Note that rustc has long supported --emit dep-info, which works in exactly this way, and this is how cargo knows what it needs to rebuild.
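
To make the format concrete, here is a minimal sketch of how a build tool might read one depfile rule and decide whether to rerun. The function names `parse_depfile_rule` and `needs_rebuild` are hypothetical, the parser assumes paths without spaces or backslash escapes, and comparing modification times is only one simple way to detect changes; real build systems may track more state.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Split one depfile rule of the form `artifact: input1 input2 ...`.
/// Assumption: paths contain no spaces or escape sequences.
fn parse_depfile_rule(line: &str) -> Option<(PathBuf, Vec<PathBuf>)> {
    let line = line.trim();
    // Split on ": " so a drive-letter colon inside a path is less likely to confuse us.
    let (target, deps) = match line.split_once(": ") {
        Some((t, d)) => (t, d),
        // A rule with no inputs at all, such as `artifact:`.
        None => (line.strip_suffix(':')?, ""),
    };
    if target.is_empty() {
        return None;
    }
    let deps = deps.split_whitespace().map(PathBuf::from).collect();
    Some((PathBuf::from(target), deps))
}

/// Report whether the artifact is missing or older than any of its inputs.
fn needs_rebuild(target: &Path, deps: &[PathBuf]) -> bool {
    let target_mtime = match fs::metadata(target).and_then(|m| m.modified()) {
        Ok(t) => t,
        // No artifact (or no readable timestamp): rebuild.
        Err(_) => return true,
    };
    deps.iter().any(|dep| match fs::metadata(dep).and_then(|m| m.modified()) {
        Ok(t) => t > target_mtime,
        // A missing input is treated conservatively as a change.
        Err(_) => true,
    })
}
```

Given the line `target/debug/foo: src/main.rs src/lib.rs`, the parser yields the artifact path and the two source paths, and `needs_rebuild` can then be called with those values.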

Ninja has explicit support for depfiles (linked above), and certainly GN lets individual build rules use this mechanism (doc). Without having done a lot of research, I believe it’s a fairly general mechanism and likely to be supported broadly. We also use this mechanism to integrate with Go’s build tool, using a purpose-built tool called godepfile.

I’ve written a prototype, using cargo as a library, that aggregates the depfiles internally produced as part of a cargo build (these are stored in the .fingerprint directory as dep-* files). However, none of this is part of cargo’s public interface, and indeed, the tool that I developed for nightly doesn’t run on stable because various details have changed. The prototype has some other limitations (like not incorporating build script inputs), but those can be fixed.
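
As a rough illustration of the aggregation step (not the prototype's actual code), the sketch below merges several already-parsed input lists into a single rule for the final artifact. The name `merge_deps` is hypothetical, and the sketch deliberately does not read the dep-* files themselves, since their layout is an internal detail of cargo.

```rust
use std::collections::BTreeSet;

/// Merge the input lists of several compilation units into one depfile rule.
/// Duplicates are dropped, and the output is sorted so it is stable across runs.
fn merge_deps(artifact: &str, per_unit_deps: &[Vec<String>]) -> String {
    let unique: BTreeSet<&str> = per_unit_deps
        .iter()
        .flatten()
        .map(String::as_str)
        .collect();
    let mut rule = format!("{}:", artifact);
    for dep in unique {
        rule.push(' ');
        rule.push_str(dep);
    }
    rule
}
```

Sorting is a design choice here: a deterministic depfile avoids spurious differences between otherwise identical builds.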

My first preference would be for such a mechanism to be included in cargo, so the depfile could be produced at the same time as the actual build artifact. This would prevent the dependency analysis from getting out of sync with the actual artifact. However, adding a feature to cargo is an additional burden.

I’m definitely willing to patch cargo, based on my initial prototype. But I’d like to gauge community interest first. It seems much more worth doing if there are other build system integrations that will benefit. The first cut at the patch will have some limitations, so maybe it should be considered an experimental feature.

One design question is the scope of how far the dependencies reach - should they extend into dependent libraries at all? The decision I’ve made so far is that they cover files in the crate, but not external crates. For crates.io and git dependencies, we would count Cargo.lock as an input to the build rule. For libraries specified using path dependencies, those would need to be added explicitly in the parent build language. This seems to me a good compromise between simplicity and ergonomics, but there are other ways to think about it.
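
One way to picture this scoping rule is as a filter over the full input list. The sketch below is only an illustration under assumptions: `scope_deps` is a hypothetical name, and it assumes Cargo.lock sits in the crate root, which may not hold for every project layout.

```rust
use std::path::{Path, PathBuf};

/// Keep only inputs that live inside the crate, then add Cargo.lock as a
/// stand-in for all crates.io and git dependencies.
/// Assumption: Cargo.lock is located directly in `crate_root`.
fn scope_deps(crate_root: &Path, deps: &[PathBuf]) -> Vec<PathBuf> {
    let mut scoped: Vec<PathBuf> = deps
        .iter()
        // `Path::starts_with` compares whole components, so `/work/foobar`
        // is not considered to be inside `/work/foo`.
        .filter(|d| d.starts_with(crate_root))
        .cloned()
        .collect();
    scoped.push(crate_root.join("Cargo.lock"));
    scoped
}
```

Path dependencies outside the crate root are dropped by this filter, which matches the idea that they would be listed explicitly in the parent build language.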

Comments and suggestions welcome.


#2

I already said this on IRC, but we’d like this for the Firefox build as well. We currently invoke cargo unconditionally for every Firefox build, which is not great. It’s not a long pole by any means, but as we make our builds faster the overhead will show up.

In general, I think better cargo interop with other build systems is important and will help drive Rust adoption. The easier it is to integrate Rust code into an existing C project, the more likely it is that people will do it.


#3

Thanks for the post @raphlinus! I’m definitely in favor of such a feature in Cargo as I think it’s crucial for many build system integration scenarios.

Are you thinking that basically alongside target/debug/foo we’ll just emit a target/debug/foo.d to list all dependencies? Similarly for other rlib/dylib/etc artifacts. If so, I do think that we’ll want to include upstream dependencies in that dependency file so build systems can track everything. We’d probably omit any dependencies from the registry or git repos though as they’re “read only”, but all path dependencies should likely be included.

I personally take the performance of a noop build in Cargo quite seriously, and if it’s slow then that’s a serious bug in Cargo! If you can isolate performance problems or have examples I’d love to dig into them and see what we can optimize.


#4

Additionally, cargo should output where it put the actual artifacts.


#5

@jethrogb Better for cargo to output that, or to be able to specify where it should go? I think ultimately the goal in integrating with a build system is the latter; in our integration, we’ll guess the path, then copy it into the final out location. But perhaps that’s a separate issue.

@alexcrichton I’ll seriously consider adding path deps. I can see how it would make sense. But the first version I upload (I’m working on the patch) won’t, just because of the implementation complexity.

Preparing my CL, I’m thinking there should be a tracking issue on cargo. Make sense?


#6

Sounds good to me!


#7

Cargo creating a build plan which another tool executes is exactly what https://github.com/rust-lang/cargo/issues/1997 needs too.


#8

To update the thread, this was merged as https://github.com/rust-lang/cargo/pull/3557. There are some loose ends mentioned in the discussion for that PR, which I should make sure are properly documented in an issue.