Relative paths in `cargo:rerun-if-changed` are not properly resolved in depfiles

TL;DR: Paths that build scripts output using cargo:rerun-if-changed=... appear unchanged in the depfiles of dependent crates, so external build tools either cannot resolve relative paths at all or resolve them wrongly. When a relative path resolves to a non-existent file, for example, the external build tool treats the output of the cargo build as always dirty, possibly triggering cascading unnecessary rebuilds in the external build process. This includes the officially recommended cargo:rerun-if-changed=build.rs pattern that is used by many crates (e.g., on crates.io and GitHub).

Edit: It seems that this is most likely a bug in cargo's depfile generation, as cargo's internal change detection interprets relative paths output by build scripts as relative to the build script's package root directory.

Edit 2: Submitted a bug report for cargo: Relative cargo:rerun-if-changed paths are not resolved in dep-info files · Issue #9445 · rust-lang/cargo · GitHub

Detailed example and explanation of the issue


Consider a project that consists of two crates: dep (a library), and app (an application). dep has a build.rs, and app has a dependency on dep. The project also uses an external build tool that triggers the cargo build of app, and possibly uses the resulting binary; for this example we use ninja with a build.ninja file (we describe this file when it becomes relevant further down).

(Note: I put a bash script that reproduces the setup and output of this example later in this post; see below if you want to reproduce the issue on your system.)

Structure of our project:

.
├── app
│   ├── src
│   │   └── main.rs
│   └── Cargo.toml
├── dep
│   ├── src
│   │   └── lib.rs
│   ├── build.rs
│   └── Cargo.toml
└── build.ninja

app and dep have mostly the same contents as crates freshly created with cargo new, with two additions.

app/Cargo.toml has a dependency on dep added:

app/Cargo.toml:

...

[dependencies]
dep = { path = "../dep" }

dep/build.rs contains a small build script:

dep/build.rs:

fn main() {
  println!("cargo:rerun-if-changed=build.rs");
}

This pattern is officially recommended, and is usually used to only re-run the build script when the build script itself changes, as otherwise it is re-run whenever any source file in the project changes.

Finally, the project also contains a simple build.ninja, which sets up a basic ninja build:

build.ninja:

rule run-cargo
  depfile = $out.d
  command = cd $in && cargo build

build /cur/abs/dir/app/target/debug/app: run-cargo app

default /cur/abs/dir/app/target/debug/app

Quick explanation of this file:

This file first sets up a rule named run-cargo that uses the depfile created by cargo ($out.d appends .d to the target, i.e., turns target/debug/app into target/debug/app.d) to know when the project needs to be rebuilt. This "depfile" is generated by cargo as an output of the build process, made for consumption by external build tools. We then use this rule to build target/debug/app, and set this as the default target to build.

First build:


Let's see what happens when we invoke the build:

$ ninja --verbose -d explain
ninja explain: depfile '/cur/abs/dir/app/target/debug/app.d' is missing
[1/1] cd app && cargo build
   Compiling dep v0.1.0 (/cur/abs/dir/dep)
   Compiling app v0.1.0 (/cur/abs/dir/app)
    Finished dev [unoptimized + debuginfo] target(s) in 0.64s

(We told ninja to be --verbose about the commands it executes, and to -d explain why it (re-)builds certain outputs.)

What happened here is that ninja triggered the build of app because its depfile was missing. It then called cargo to build app, which in turn built dep, because it is specified as a dependency of app. This is correct and as expected.

Problem begins here

Now we come to the central problem outlined at the beginning. If we run ninja again at this point without changing anything, it should not run cargo again. This is the reason cargo outputs a depfile for external build tools. The depfile created in this case is app/target/debug/app.d, with the following content:

app/target/debug/app.d:

/cur/abs/dir/app/target/debug/app: /cur/abs/dir/app/src/main.rs /cur/abs/dir/dep/build.rs /cur/abs/dir/dep/src/lib.rs build.rs

This file basically says that the build /cur/abs/dir/app/target/debug/app is dependent on the given files after the :, namely /cur/abs/dir/app/src/main.rs, /cur/abs/dir/dep/build.rs, /cur/abs/dir/dep/src/lib.rs and, notably, build.rs. The absolute paths were added by cargo, but the last entry, which is just build.rs, originates from the output of dep's build script, which wrote cargo:rerun-if-changed=build.rs.

Assume for a second that our project has multiple, deep, transitive dependencies: Would you know which dependency this build.rs is relative to? The same goes for all relative paths output by build scripts.

This is the central problem with cargo:rerun-if-changed= with relative paths.
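To make this concrete, here is a minimal sketch of what an external tool sees when it reads the depfile above. It is not how ninja or cargo parse these files; the function names `parse_dep_line` and `relative_deps` are made up for illustration, and the sketch assumes a single-line depfile with spaces in paths escaped as `\ `. Note that `Path::is_relative` is platform-dependent, so the example paths only behave as shown on Unix-like systems.

```rust
use std::path::Path;

/// Splits one dep-info line of the form `target: dep dep ...` into its parts.
/// Assumption: spaces inside dependency paths are escaped as `\ `.
fn parse_dep_line(line: &str) -> Option<(String, Vec<String>)> {
    let (target, rest) = line.split_once(": ")?;
    let mut deps = Vec::new();
    let mut current = String::new();
    let mut chars = rest.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&' ') {
            // An escaped space belongs to the current path.
            chars.next();
            current.push(' ');
        } else if c.is_whitespace() {
            // Unescaped whitespace ends the current path.
            if !current.is_empty() {
                deps.push(std::mem::take(&mut current));
            }
        } else {
            current.push(c);
        }
    }
    if !current.is_empty() {
        deps.push(current);
    }
    Some((target.to_string(), deps))
}

/// Returns the entries that carry no information about their base directory.
fn relative_deps(deps: &[String]) -> Vec<&str> {
    deps.iter()
        .map(String::as_str)
        .filter(|dep| Path::new(dep).is_relative())
        .collect()
}

fn main() {
    let line = "/cur/abs/dir/app/target/debug/app: /cur/abs/dir/app/src/main.rs \
                /cur/abs/dir/dep/build.rs /cur/abs/dir/dep/src/lib.rs build.rs";
    if let Some((target, deps)) = parse_dep_line(line) {
        for dep in relative_deps(&deps) {
            println!("{}: cannot tell which package `{}` belongs to", target, dep);
        }
    }
}
```

The only thing such a tool can do with the flagged entry is guess a base directory, which is exactly what goes wrong in the rebuild below.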

Let's see what happens when we perform a rebuild at this point, to see whether this is a problem in practice:

Second build (no changes):

$ ninja --verbose -d explain
ninja explain: output build.rs of phony edge with no inputs doesn't exist
ninja explain: build.rs is dirty
[1/1] cd app && cargo build
    Finished dev [unoptimized + debuginfo] target(s) in 0.00

ninja has no chance to know which build.rs is meant either, and resolves the path in relation to the current directory, so checks for a /cur/abs/dir/build.rs, which it doesn't find. As it thus can't check whether the build of app is up-to-date, it has to rerun it to make sure the build is not out-of-date.

If the external build is more complex, this will trigger a cascading rebuild of everything that depends on the output of app on every invocation of the build, no matter whether it changed or not.

Bash script to reproduce structure and output locally:

#!/usr/bin/env bash
set -euo pipefail

cargo new --lib dep
cat << EOF > "dep/build.rs"
fn main() {
  println!("cargo:rerun-if-changed=build.rs");
} 
EOF

cargo new --bin app > /dev/null
echo 'dep = { path = "../dep" }' >> app/Cargo.toml

cat << EOF > "build.ninja"
rule run-cargo
  depfile = \$out.d
  command = cd \$in && cargo build

build $(pwd)/app/target/debug/app: run-cargo app

default $(pwd)/app/target/debug/app
EOF

echo
echo "first build:"
ninja --verbose -d explain 
echo

echo "created depfile app/target/debug/app.d:"
cat app/target/debug/app.d
echo

echo "second build:"
ninja --verbose -d explain 

Real world occurrences and related issues


While investigating this interaction I discovered some related findings of it "in the wild"/on github:

  • rustfmt: This PR solves a related problem where the build script wants to be rerun if .git/HEAD changes. However, in this case the build is done purely through cargo (so without an external build tool), and, interestingly, cargo seems to interpret the path relative to the build script's package in its own rebuild mechanic. So it only outputs the non-reproducible version of the relative path to the depfile but uses a different one internally.
  • cxx (@dtolnay): The build.rs of cxx outputs two paths that refer to its own source files as rerun-if-changed conditions (link). If we add a dependency on cxx to our example project (i.e., to app/Cargo.toml), we get the additional output:
    $ ninja --verbose -d explain
    ...
    ninja explain: output include/cxx.h of phony edge with no inputs doesn't exist
    ninja explain: include/cxx.h is dirty
    ninja explain: output src/cxx.cc of phony edge with no inputs doesn't exist
    ninja explain: src/cxx.cc is dirty
    ...
    $ cat app/target/debug/app.d
    /cur/abs/dir/app/target/debug/app: ... include/cxx.h src/cxx.cc
    
    As, again, ninja cannot correctly resolve these relative paths. So this problem definitely occurs with existing crates on github.
  • A search for cargo:rerun-if-changed=build.rs on github yields around 2630 commits, showing that this might be impacting a lot of existing crates. This is just a very rough estimate and probably quite inaccurate (I don't think it's easily possible to crawl all rust projects on github, which is why I don't have more concrete numbers for this).
  • The official documentation that recommends using cargo:rerun-if-changed=build.rs also is an instance of this problem, and probably also an indicator of the spread of this issue. It also shows how subtle this interaction is, as it made its way into the official documentation and stayed there unnoticed for quite some time (it's still present in the nightly version of the documentation at the time of writing).

Side notes

  • While dependencies that are specified from git or from crates.io do not cause cargo to automatically add /cur/abs/dir/dep/*.rs-dependencies to depfiles, it still adds all paths output by build scripts to the depfile(s) of the dependent crate being built. So this issue does not only exist with path-dependencies (as can also be seen from the example using cxx).
  • cargo can be told to make the paths in depfiles relative, for external build tools that want relative paths. This strips a given prefix from all paths written to the depfile (i.e., also those output by build-scripts), but doesn't add anything to paths that don't start with the given prefix. I.e., it can't help with making relative paths output by build scripts absolute (or relative to the project root, which is not even necessarily correct).
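The prefix-stripping behavior described in the last note can be modeled with a few lines of Rust. This is only a sketch of the behavior as described above, not cargo's actual implementation; `apply_basedir` is a hypothetical name.

```rust
use std::path::{Path, PathBuf};

/// Models the described behavior: strip `basedir` from entries that start
/// with it, and pass every other entry through unchanged.
fn apply_basedir(basedir: &Path, entry: &Path) -> PathBuf {
    match entry.strip_prefix(basedir) {
        Ok(stripped) => stripped.to_path_buf(),
        // Entries outside `basedir`, including already-relative ones,
        // are left exactly as they were.
        Err(_) => entry.to_path_buf(),
    }
}

fn main() {
    let basedir = Path::new("/cur/abs/dir");
    for entry in ["/cur/abs/dir/dep/build.rs", "build.rs"] {
        let result = apply_basedir(basedir, Path::new(entry));
        println!("{} -> {}", entry, result.display());
    }
}
```

Under this model a bare build.rs from a build script stays a bare build.rs, so the option cannot recover the directory the path was meant to be relative to.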

Generality of the issue


One important question is whether this is only an issue with specifically ninja, or whether it is more general. The fact that it seems impossible to know which directory a relative path such as build.rs refers to makes this a general issue in my opinion, as the depfile format is made to be consumed by other external build tools as well (e.g., make and friends).

Build tools that rely not on timestamps to know when something should be rebuilt, but on, e.g., hashes of file contents, might be unaffected by this, but I would argue that build tools relying on timestamps are relevant and should be supported if reasonably possible, as they are still the prevalent kind (I think).

Another question is whether supporting external build tools this far is desired. I would argue that supporting external build tooling, including fixing subtle issues such as this one, makes it easier for Rust to spread further and to integrate better with existing projects that consider using Rust.

Possible solutions


To reduce impact on users and support this as easily as possible, an ideal solution would be a change to cargo/other tooling that is transparent for users and fixes the issue without manual intervention. However, as this would change the current behavior and thus be a breaking change, it might not be desired.

  1. One transparent, but breaking, change would be to make all (non-absolute) paths output by build scripts absolute (i.e., by de facto prefixing them with $CARGO_MANIFEST_DIR). This seems like it would also make the handling of such relative paths consistent between cargo's internal change-detection mechanism and what is written to depfiles.

    • This raises the question of "do we want to allow absolute paths that point outside of the source of the package that contains the currently running build script?". Currently, cargo allows such dependencies and will re-run the build script if it depends on, e.g., ~/some-file and ~/some-file is changed. So it seems like this is an unrelated question that does not hinder this solution.

    • Another consequence would be that this also causes these absolute paths to contain parts such as ~/.cargo/registry/src/..., which cargo seems to intentionally not emit to depfiles currently. As a solution relative paths could also simply be dropped from the output of build-scripts when running for git/crates.io dependencies, and would be made absolute (as described above) for path-dependencies.

    • Also, a question is whether the current behavior is (intentionally) being used or worked around by existing projects (relevant xkcd).

    These would all be breaking changes to the behavior of cargo w.r.t. the content of depfiles.

  2. A non-breaking option would be to keep behavior as-is, and raise awareness for the current situation, offering users a new opt-in way to achieve the same behavior but without breaking depfiles. One possible way would be:

    • Add a new option, e.g., cargo:default-rerun=true/false, that allows toggling the default behavior of rerunning the build script for every source change. An option like this is already being desired independent of this interaction, so this might solve other use-cases as well: Rerun-if-changed without disabling other heuristics as a side-effect · Issue #4587 · rust-lang/cargo · GitHub.
    • Changing the documentation to state that absolute paths should be output by build scripts, or to use the new option.
    • Adding a clippy-lint that warns when relative paths are output from build scripts. Clippy doesn't seem to have such a lint yet.
    • A further step would be adding a compiler warning that complains when a relative path is used, and either recommend using the new option or an absolute path (if this is desired).
    • Automatic issue-creation for rust projects on github that currently contain a cargo:rerun-if-changed=build.rs, or something similar (could get arbitrarily smart here probably).

If a breaking change is okay, then this can (simply) be resolved within cargo it seems. If a breaking change is not acceptable, it seems a solution that raises awareness and changes existing projects is better.
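Until something changes in cargo, a build script can emit absolute paths itself. The following is a minimal sketch of such a workaround, not an officially recommended pattern; the helper name `absolutize` is made up for illustration. It relies on CARGO_MANIFEST_DIR, which cargo sets when running build scripts, and falls back to the current directory only so that the file also runs outside of cargo.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Joins a relative path onto the package root; absolute paths pass through.
fn absolutize(manifest_dir: &Path, path: &str) -> PathBuf {
    let path = Path::new(path);
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        manifest_dir.join(path)
    }
}

fn main() {
    // Cargo sets CARGO_MANIFEST_DIR for build scripts. The "." fallback is
    // an assumption made here so the sketch runs standalone.
    let manifest_dir = env::var_os("CARGO_MANIFEST_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("."));

    let build_script = absolutize(&manifest_dir, "build.rs");
    println!("cargo:rerun-if-changed={}", build_script.display());
}
```

This only joins paths and does not canonicalize them. It also has the downside mentioned under option 1: for dependencies from crates.io or git, the emitted path points into cargo's own source cache and ends up in the depfile.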


Cargo documentation shows this instruction used only with relative paths, and explicitly recommends cargo:rerun-if-changed=build.rs, so you have to assume this is correct usage.

It'd be unfair to ask all build.rs users to change it, especially since std::fs::canonicalize causes breakage on Windows, so the most straightforward change that works around a problem in one tool is going to expose problems in other tools.

Cargo itself has no problems interpreting this.

So I think it's a bug in the generation of the dep files. These outputs can't be put in the dep file as-is. They have to be first resolved relative to the cargo manifest dir of the build script that emitted them. Or perhaps the dep file is fine, but the dir it's relative to is lost when it's given to ninja? I haven't looked at the dep files closely.


Thanks for taking a look at the issue!

The problem is that the dep files are generated by cargo. I literally do not control the implementation of the depfile generation. It is an output by cargo explicitly for consumption by external build tools (of which ninja is an example but not the only one; the syntax is also made for tools like make etc.).

Quoting from the official cargo docs:

Next to each compiled artifact is a file called a "dep info" file with a .d suffix. This file is a Makefile-like syntax that indicates all of the file dependencies required to rebuild the artifact. These are intended to be used with external build systems so that they can detect if Cargo needs to be re-executed. The paths in the file are absolute by default. See the build.dep-info-basedir config option to use relative paths.

Note that cargo very explicitly writes absolute paths to the depfile. This sounds to me like cargo wants the file to be usable without reconstructing anything. It also makes sense given that the file is meant to be consumed by external tools, which don't have an intricate understanding of the build process.

The last part (mentioning build.dep-info-basedir) is related but doesn't help, as it only turns absolute paths into relative paths, but doesn't turn relative paths (when they are written by build scripts) into absolute paths. I mention this in the original post here.

But I agree, I also think this is a bug in the generation of the depfiles. Just that this is done by cargo, so I think this is actually wrong behavior by cargo.

Cargo itself uses a different mechanism for determining whether it needs to rebuild something, and it outputs the depfile only for external build tools. strace cargo shows that cargo prints Finished with the build before it even touches the depfile. It literally only outputs the file for other tools to use. I think cargo uses a "fingerprint"-based mechanism for detecting recompilation, probably related to incremental compilation, so it is much more fine-grained. This mechanism that cargo uses internally seems to interpret relative paths "correctly" in relation to the cargo manifest dir.

Yeah, I'm basically saying that using the feature in the recommended, officially documented way causes cargo to output wrong/incomplete depfiles. I think this is a bug in cargo, and cargo never intended for the depfiles it generates to diverge from its internal mechanism. However, I also think that the two diverged, that this is a bug, and that it probably requires a (slightly breaking) change to cargo (or changes to all crates that use relative paths).

Seems just a bug in the cargo depfile generation (not unlikely since almost no one uses external build tools), not sure why you think it would be a breaking change to fix it.


Solely from the title of this thread, it sounds like you are advocating for the "changes to all crates that use relative paths" option. As this is in fact the current recommended usage, you'll have an uphill battle convincing people that it's the right solution.

You might get better discussion if you retitle the thread to more neutrally represent the problem without implying a particular solution. Maybe something along the lines of "Relative paths in build.rs are not reflected in depfiles".


Yeah, now that I consider it this way, it seems to be (just) a bug in cargo. Initially, I assumed that cargo intended to just print relative paths to depfiles without resolving them and that this was intended behavior. But now that cargo seems to already use another mechanism internally that interprets the relative paths in their intended way, it seems that it might just be a bug in cargo.