Build script capabilities

Thanks for starting this discussion! We’ve run into this issue in a few ways in the past.

One place where it’s currently a problem is building Firefox with tup. To build a correct dependency graph, tup needs to know about outputs from build commands that are inputs to other build commands (such as generated source files, or C sources compiled via cc in build scripts). Our current workaround is a hardcoded list of build script outputs, which is not very maintainable. Ideally we’d have a way to express that in the crate definition so tup would have enough information to do the right thing. (I suspect that Bazel and other sufficiently opinionated build systems have this exact same problem.)
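To make the dependency-graph point concrete, here is a rough sketch of what an external build system could do if each build step declared its inputs and outputs up front. Everything in it is hypothetical: the `Step` type, its fields, and `dependency_edges` are made up for illustration and are not anything cargo or tup provides today.

```rust
use std::collections::HashMap;

/// Hypothetical declarative description of one build step.
/// None of this exists in cargo; it only illustrates the idea.
#[derive(Debug, Clone)]
pub struct Step {
    pub name: String,
    pub inputs: Vec<String>,
    pub outputs: Vec<String>,
}

/// Returns sorted, de-duplicated (producer, consumer) pairs. An edge exists
/// when one step's declared output is another step's declared input.
pub fn dependency_edges(steps: &[Step]) -> Vec<(String, String)> {
    // Map each declared output path to the step that produces it.
    let mut producer_of: HashMap<&str, &str> = HashMap::new();
    for step in steps {
        for out in &step.outputs {
            producer_of.insert(out.as_str(), step.name.as_str());
        }
    }

    let mut edges = Vec::new();
    for step in steps {
        for input in &step.inputs {
            if let Some(producer) = producer_of.get(input.as_str()) {
                // Ignore a step that lists its own output as an input.
                if *producer != step.name.as_str() {
                    edges.push((producer.to_string(), step.name.clone()));
                }
            }
        }
    }
    edges.sort();
    edges.dedup();
    edges
}
```

With a build script step that declares `bindings.rs` as an output and a rustc step that declares it as an input, this yields a single edge from the build script to rustc, which is exactly the information the hardcoded list stands in for.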

The other place where I’ve run into this is building with sccache. There’s a lower bound on how fast sccache can make a cargo build, even with 100% cache hits, because we have to build and run every build script. (sccache doesn’t currently cache the compilation of the build script itself because it doesn’t know how to properly cache linker invocations, but we could presumably fix that.) While implementing the original support for caching Rust compilation I did a lot of testing building Servo, and I noticed that we spent a lot of time running build scripts that would do things like compile a bunch of C code into a static library, only for the output to be completely irrelevant because sccache was able to fetch the rlib that depended on that static library from cache! If we had knowledge of the inputs and outputs of the build script, we could feasibly avoid compiling it at all and simply produce the outputs from the cache.
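As a minimal sketch of that last idea, assuming the build script's inputs were declared somewhere a tool could read them: hash the inputs, look the key up in a cache, and only run the script on a miss. `OutputCache`, `cache_key`, and `get_or_build` are hypothetical names for illustration, not sccache's actual design, and the in-memory map and `DefaultHasher` stand in for real storage and a stable content hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory stand-in for a cache of build script outputs.
#[derive(Default)]
pub struct OutputCache {
    entries: HashMap<u64, Vec<u8>>,
}

/// Derives a cache key from the declared inputs (path and contents pairs).
/// DefaultHasher keeps the sketch dependency-free; a real cache would want
/// a stable cryptographic hash instead.
pub fn cache_key(inputs: &[(&str, &[u8])]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for (path, contents) in inputs {
        path.hash(&mut hasher);
        contents.hash(&mut hasher);
    }
    hasher.finish()
}

impl OutputCache {
    /// Returns the output for these inputs, calling `build` only on a miss.
    /// The bool is true when the output came from the cache.
    pub fn get_or_build<F>(&mut self, inputs: &[(&str, &[u8])], build: F) -> (Vec<u8>, bool)
    where
        F: FnOnce() -> Vec<u8>,
    {
        let key = cache_key(inputs);
        if let Some(hit) = self.entries.get(&key) {
            return (hit.clone(), true);
        }
        // Miss: this is the expensive part (compiling and running the script).
        let output = build();
        self.entries.insert(key, output.clone());
        (output, false)
    }
}
```

The point of the sketch is that `build` is never called on a hit, so neither compiling nor running the build script would be needed when the declared inputs are unchanged.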

I do worry that trying to cram the various things that build scripts currently do into a declarative manifest format will be really hard. I’ve wanted to do a survey of extant build scripts in published crates to determine what the common patterns are for a while now, but I hadn’t been able to figure out a methodology that didn’t involve either a lot of manual work or writing a ton of code that duplicated something like crater, so I never quite got there. I think that longer-term cargo might need to provide an escape hatch in the form of something like Bazel’s Starlark configuration language, to allow crates to express somewhat complex requirements without the need to compile and run build scripts first (which produces a chicken-and-egg problem with external build systems).
