Cargo support for build-script filters, lints, and link-path providers


#1

From https://github.com/alexcrichton/pkg-config-rs/issues/11#issuecomment-270243763

In order to tackle the following issues:

  • Non-deterministic ordering of link paths provided to rustc (link error reproducibility can be less than 5%)
  • System folders appear in link paths
  • Multiple candidates per artifact found in user link-paths
  • Separation of declaration crates (*-sys) from reusable C artifact providers (so third parties can bring support for new package managers and operating systems). For example, I’d like to make a provider that uses Conan.io to provide C binaries - for all libc/libc++ ABIs, major platforms and operating systems - and for specific library versions.
  • Build script overrides do not appear to address the risk of silently linking against the wrong binaries, and with it future heap or stack corruption (if one misses a target or library in .cargo/config).
  • Build script overrides require ahead-of-time preparation for all potential targets.

I’d like to propose a method for compile-time filtering of build script output.

  • This does not address the need for declaration crates to expose a list of ABI versions (as arbitrary prefixed strings) which they are compatible with. This does not appear to be addressed in current usage. If standardized, more intelligent package management integration would be possible. This would be possible using STDOUT and the existing key-value convention.
  • This does not eliminate duplication of work for those crates that actually invoke compilation during build.rs (is this common?)

Interface

An executable receives build script output via STDIN, and writes the modified results to STDOUT. A non-zero exit code fails the build. All env vars are provided to the executable, including TARGET, OPT_LEVEL, DEBUG, and OUT_DIR.
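As a rough sketch of what such a filter could look like, the program below reads build script output on STDIN, drops `cargo:rustc-link-search` entries that point at system folders, and passes every other line through to STDOUT. The `SYSTEM_DIRS` list and the `filter_line` helper are hypothetical names chosen for illustration; nothing here is an existing Cargo interface.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical list of system folders to reject (an assumption for this sketch).
const SYSTEM_DIRS: &[&str] = &["/usr/lib", "/usr/lib64", "/lib", "/lib64"];

/// Optional KIND prefixes that `rustc-link-search` values may carry.
const KINDS: &[&str] = &["dependency=", "crate=", "native=", "framework=", "all="];

/// Returns `None` to drop a line, or `Some(line)` to pass it through unchanged.
fn filter_line(line: &str) -> Option<String> {
    if let Some(value) = line.strip_prefix("cargo:rustc-link-search=") {
        // Strip the optional KIND prefix to get at the path itself.
        let path = KINDS
            .iter()
            .find_map(|kind| value.strip_prefix(*kind))
            .unwrap_or(value);
        let path = path.trim_end_matches('/');
        if SYSTEM_DIRS.iter().any(|dir| path == *dir) {
            return None;
        }
    }
    Some(line.to_string())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut out = stdout.lock();
    for line in stdin.lock().lines() {
        // An I/O error propagates out of `main`, giving a non-zero exit code.
        if let Some(kept) = filter_line(&line?) {
            writeln!(out, "{}", kept)?;
        }
    }
    Ok(())
}
```

Keeping the per-line decision in a pure function makes the filter easy to unit test without involving cargo at all.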

Multiple filters can be chained, to allow for separation of responsibility.

Regardless of invocation point, cargo should respect the values for rustc-link-lib and rustc-link-search post-filtering, and ignore unfiltered values. It should also output warnings.

If the filter is being invoked at link time, with concatenated output from multiple crates, the CARGO_FILTER_BEFORE_LINK env var should be set to '1', and all non-crate-specific env vars should be made available.

Additionally, the following information should be provided via environment variables in some form:

  • An array of “links” values from all crates
  • An array of crate name and version pairs (for all crates). If an API incompatibility is in the bindings crate (but not the declaration crate), this could be relevant.
  • An array of crate name, version, and links value triples.

If the filter is being invoked before compilation, for a single crate’s output (or lack thereof), then the CARGO_FILTER_BEFORE_COMPILE env var should be set to '1', and all build and compile-time env vars should be provided. This would allow modification of rustc-cfg and other keys.
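A filter that supports both invocation points would need to tell them apart. The sketch below does that with the two proposed env vars; the `Invocation` enum and `invocation_from` helper are hypothetical, and treating CARGO_FILTER_BEFORE_LINK as the winner when both are set is an assumption, since the proposal does not say.

```rust
use std::env;

/// Where in the build the filter was invoked (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Invocation {
    BeforeLink,
    BeforeCompile,
    Unknown,
}

/// Decide the invocation point from the values of the two proposed env vars.
/// Assumption: if both are "1", the link-time mode takes precedence.
fn invocation_from(before_link: Option<&str>, before_compile: Option<&str>) -> Invocation {
    match (before_link, before_compile) {
        (Some("1"), _) => Invocation::BeforeLink,
        (_, Some("1")) => Invocation::BeforeCompile,
        _ => Invocation::Unknown,
    }
}

fn main() {
    let link = env::var("CARGO_FILTER_BEFORE_LINK").ok();
    let compile = env::var("CARGO_FILTER_BEFORE_COMPILE").ok();
    match invocation_from(link.as_deref(), compile.as_deref()) {
        Invocation::BeforeLink => eprintln!("filtering concatenated output before link"),
        Invocation::BeforeCompile => eprintln!("filtering a single crate's output before compile"),
        Invocation::Unknown => eprintln!("not invoked as a cargo filter"),
    }
}
```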

Enables

  • Linting and reporting of link paths
  • Testing of link-order determinism (for potential improvements).
  • Detection of duplicate link candidates
  • On-demand production of artifacts (and their link arguments) for only the required target triples.
  • Code reuse - as a build-dependency.
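To make the duplicate-candidate case concrete, here is a minimal sketch of a link-time filter that passes its input through unchanged and fails when a library can be found in more than one search directory. The `dirs_providing` helper is hypothetical, the Unix-style file names (`lib<name>.a`, `.so`, `.dylib`) are an assumption, and the parsing of the optional `KIND=` prefix is deliberately simplified.

```rust
use std::io::{self, BufRead};
use std::path::PathBuf;

/// Returns the search directories that contain a candidate file for `lib`.
/// Assumption: Unix-style library naming only.
fn dirs_providing(lib: &str, search_dirs: &[PathBuf]) -> Vec<PathBuf> {
    let names = [
        format!("lib{}.a", lib),
        format!("lib{}.so", lib),
        format!("lib{}.dylib", lib),
    ];
    search_dirs
        .iter()
        .filter(|dir| names.iter().any(|name| dir.join(name).is_file()))
        .cloned()
        .collect()
}

fn main() -> io::Result<()> {
    let (mut dirs, mut libs) = (Vec::new(), Vec::new());
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let line = line?;
        // Simplification: skip an optional `KIND=` prefix by taking the text
        // after the last '='. Paths or names containing '=' are not handled.
        if let Some(v) = line.strip_prefix("cargo:rustc-link-search=") {
            dirs.push(PathBuf::from(v.rsplit('=').next().unwrap_or(v)));
        } else if let Some(v) = line.strip_prefix("cargo:rustc-link-lib=") {
            libs.push(v.rsplit('=').next().unwrap_or(v).to_string());
        }
        println!("{}", line); // pass every line through unchanged
    }
    let mut ambiguous = false;
    for lib in &libs {
        let hits = dirs_providing(lib, &dirs);
        if hits.len() > 1 {
            ambiguous = true;
            eprintln!("multiple candidates for -l{}: {:?}", lib, hits);
        }
    }
    if ambiguous {
        std::process::exit(1); // a non-zero exit code fails the build
    }
    Ok(())
}
```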

Configuration

  • It should be possible to disable filters and build scripts for any dependency - per target, or regardless of target triple.
  • Filters on a crate should be an ordered list. Perhaps build scripts and filters could be unified into a single ordered list of “build steps”, e.g. build="build.rs install-openssl lint-link-candidates build/custom-filter.rs"
  • Link-time filters are specified separately - i.e. link-filters=..

Thoughts?


#2

If stdin is problematic, we could provide input as arguments to the filter - paths to the recorded build output for each crate in question?
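A sketch of that argument-based variant, assuming each command-line argument is a path to one crate's recorded build output (the `read_recorded_outputs` helper and the argument layout are hypothetical):

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Reads each recorded build output file, keeping the path alongside its text.
/// Fails on the first file that cannot be read.
fn read_recorded_outputs<P: AsRef<Path>>(paths: &[P]) -> io::Result<Vec<(PathBuf, String)>> {
    paths
        .iter()
        .map(|p| {
            let p = p.as_ref();
            fs::read_to_string(p).map(|text| (p.to_path_buf(), text))
        })
        .collect()
}

fn main() -> io::Result<()> {
    // Assumption: every argument after the program name is a path to one
    // crate's recorded build script output.
    let paths: Vec<PathBuf> = env::args_os().skip(1).map(PathBuf::from).collect();
    for (path, text) in read_recorded_outputs(&paths)? {
        eprintln!("{}: {} line(s)", path.display(), text.lines().count());
        for line in text.lines() {
            println!("{}", line); // filtered results still go to STDOUT
        }
    }
    Ok(())
}
```

One advantage of this shape is that the filter knows which crate each line came from, which concatenated STDIN would not convey on its own.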