Build.rs use cases and stories sought!

I’ve used build.rs to determine my project version by querying git (it then sets a variable that main.rs can incorporate).
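For readers who haven't seen this pattern, here is a minimal sketch of what such a build.rs might look like. It is not the poster's actual script: the variable name `GIT_REVISION` and the fallback value `unknown` are assumptions made for illustration.

```rust
// build.rs (sketch): expose the current git revision to the crate at compile time.
use std::process::Command;

/// Ask git for the short hash of HEAD. Returns None if git is missing or
/// the build is not happening inside a repository.
fn git_revision() -> Option<String> {
    let output = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    let rev = String::from_utf8(output.stdout).ok()?;
    let rev = rev.trim();
    if rev.is_empty() {
        None
    } else {
        Some(rev.to_string())
    }
}

/// Format the directive that Cargo reads from the build script's stdout.
fn rustc_env_directive(name: &str, value: &str) -> String {
    format!("cargo:rustc-env={name}={value}")
}

fn main() {
    // GIT_REVISION is a name chosen for this sketch, not a Cargo convention.
    let rev = git_revision().unwrap_or_else(|| "unknown".to_string());
    println!("{}", rustc_env_directive("GIT_REVISION", &rev));
    // Rerun when HEAD changes. This catches branch switches and detached
    // checkouts; a new commit on the same branch updates a ref file instead.
    println!("cargo:rerun-if-changed=.git/HEAD");
}
```

In main.rs the value can then be read with `env!("GIT_REVISION")`, for example when printing the output of a `--version` flag.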

Also, lalrpop’s canonical codegen process uses the build.rs mechanism.

Why did you need that?

Why did I need which part? Getting git revision via build.rs? I certainly could have used a wrapper script around cargo to gather this information, but I opted to use build.rs because it’s nicely cross platform and I didn’t want to deviate from the normal cargo build command.

Yes, why did you need the git revision for your project? Not just why you used build.rs for it, but the underlying reason why you needed that git revision?

I like being able to run “myApp --version” to be able to see what exact version (git revision) I’m running. I could use the version field in Cargo.toml, but I don’t update that field with every commit.

Have a look at the code needed to get OpenSSL working reliably across platforms:

Making that declarative seems pretty hard. Something has to know about quirks of OpenSSL in Homebrew, something has to know how to find it on FreeBSD, etc.

So where do you put these platform+package specific hacks? If you just tell cargo “I want OpenSSL, do your thing”, then all this mess would have to be built into Cargo: every exception for every package on every platform.

Or you’d have to declaratively state “try pkg-config with this package, unless on macos, then try brew with that package, and fall back to that other package, and on freebsd search that dir, unless the header is busted or the version is wrong, then do…”, etc. You’d end up with a markup-based programming language.
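To make the shape of that logic concrete, here is a rough sketch of the fallback chain written as ordinary Rust in a build script. The package names, the search directory and the `Probe` type are all hypothetical; a real -sys crate has far more cases than this.

```rust
// Sketch: the per-platform fallback chain expressed as data plus a match.
// Every package name and path below is a placeholder, not a real recipe.

#[derive(Debug, PartialEq)]
enum Probe {
    /// Ask pkg-config for the named package.
    PkgConfig(&'static str),
    /// Ask Homebrew for the install prefix of the named formula.
    Homebrew(&'static str),
    /// Look for headers and libraries under a fixed directory.
    SearchDir(&'static str),
}

/// Return the probes to try, in order, for a given target OS.
fn probe_plan(target_os: &str) -> Vec<Probe> {
    match target_os {
        "macos" => vec![
            Probe::Homebrew("openssl@3"),
            Probe::Homebrew("openssl"),
            Probe::PkgConfig("openssl"),
        ],
        "freebsd" => vec![Probe::SearchDir("/usr/local"), Probe::PkgConfig("openssl")],
        _ => vec![Probe::PkgConfig("openssl")],
    }
}

fn main() {
    // Cargo sets CARGO_CFG_TARGET_OS for build scripts; fall back to the
    // host OS so the sketch also runs outside of Cargo.
    let os = std::env::var("CARGO_CFG_TARGET_OS")
        .unwrap_or_else(|_| std::env::consts::OS.to_string());
    for probe in probe_plan(&os) {
        // Written to stderr so it is not mistaken for a Cargo directive.
        eprintln!("would try {probe:?}");
    }
}
```

Even this toy version needs ordering, platform conditions and fallbacks, which is the part a purely declarative format would have to grow syntax for.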

Or you can say you don’t care about these hacks, and either the correct version is in /usr/lib or pkg-config or you get an error. Then we’re back to the pain of using dependencies in C, where every program has an INSTALL document with all the manual steps and workarounds for multiple platforms and the user has to fiddle with paths, config files, flags and env variables for every library.

Take a look at https://people.gnome.org/~federico/blog/rust-build-scripts.html for part of the inspiration for this thread.

When you’re just working in Rust, this is all well and fine to do in a build.rs.

But let’s say you’re doing this as part of a Rust component of a larger software project. How does the larger software project know about the things build.rs does? How do you avoid duplicated work? How do you tell build.rs to do something different, say for a cross-compile?

And thank you for the input. This is a really useful use case to consider long term.

I’ve seen that post. I’ve shown Federico a few non-trivial examples, and his response was:

I think it’s critical to recognize that “find a C library” is not a simple problem to generalize.

The build.rs solution isn’t ideal, but a replacement has to have a room for platform & package-specific hacks, because even major popular libraries like libpng, libjpeg, libclang, openssl are “snowflakes” that need handling unique to their library.


BTW, for cases that are pure code generation (like perfect hash table generation) I agree that’s a solvable problem and could be made more integrated with other build systems. It’s the -sys crates that are a tough issue.

In cases where you need a special-case handling for a package+platform combination, there can be two approaches:

  1. Either a -sys crate provides support for all platforms (horizontally cutting box in the image), or
  2. A platform provides support for all packages (vertically cutting box in the image).

So in case of Debian, they worry about all packages conforming to Debian's requirements and working together. Debian even insists on being the exclusive provider of "hacks" for all their packages.

In case of Cargo, we have each package worry about all platforms it supports. So the build.rs ends up as spaghetti code holding all the hacks for all platforms.

In case of systems that do their own packaging, "hey, let's have it declarative" is much simpler and quite desirable, because you align Cargo with what's already provided by the system (you take that green box in the image above).

But then there's Windows, where vcpkg barely exists, there's no real standard, and all libraries are a wild west. There's nothing that Cargo could align itself with. Cargo would have to be the provider of all hacks for all libraries.

macOS is somewhere in between, where Homebrew provides a lot of packages and gives some uniformity, but it's not the same as Linux distros. For example, binaries dynamically linked with Homebrew libraries are not redistributable (i.e. they crash if run on another machine). Homebrew also tries to balance use of their own packages and Apple-provided libraries, which is another source of quirks and exceptions for everything that Apple has touched.

In a side project which I have yet to release, I use build.rs to generate protocol handling code from a DSL-ish description written in a macro argument. Or at least that’s the plan; this isn’t done yet, and maybe I can get away with writing this mostly in a macro body. For now, I only use it for converting identifiers into str and &[u8] literals (with some fixups, like changing _ into '-'/b'-').
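As a rough illustration of that last step (not the poster's code, which isn't shown), a build script helper could produce the source text of both literals like this. The function name `literals_for` is made up, and the sketch assumes plain ASCII identifiers.

```rust
// Sketch: turn an identifier into source text for a &str literal and a
// byte-string literal, replacing '_' with '-'. Assumes ASCII identifiers.

/// Returns (str literal source, byte-string literal source).
fn literals_for(ident: &str) -> (String, String) {
    let fixed = ident.replace('_', "-");
    // {:?} on a string yields a quoted, escaped form that is valid Rust
    // source for ASCII input; prefixing it with `b` gives a byte string.
    (format!("{fixed:?}"), format!("b{fixed:?}"))
}

fn main() {
    let (s, b) = literals_for("content_type");
    // Written to stderr so it is not mistaken for a Cargo directive.
    eprintln!("str literal: {s}");
    eprintln!("byte literal: {b}");
}
```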

So in terms of broad use cases, I’m hearing two big ones: code generation and library linkage edge case logic.

So in terms of possible first steps, what about splitting out the code-gen functionality into its own script, something like gen.rs?

In Cargo.toml, what kind of parameters would you want to define? When to rerun gen.rs?

gen.rs would be constrained by the same rules that build.rs is. Should it be run before or after build.rs?

Sounds like your problem would be more directly solved once RFC-2627 is implemented.

Holy shit, that is exactly what I needed. Thanks!

Now to wait for impl... or help out on it.

Would it be feasible to standardize a binary calling convention similar to pkg-config that would find, for example, openssl?

You could have all your “find openssl” logic in a crate which produces a binary that takes the same args or works in a similar way as pkg-config.

Then it would be up to the overall build system to either use the real pkg-config or to use the rust crate.

That’s skipping the hardest part. If we had a bug-free universal tool, which already knows how to build/configure every library on every platform, then sure, we could standardize on it.

We already have pkg-config on some of the platforms, and it still requires workarounds in build.rs due to its limitations and bugs in package definitions (especially on macOS where it gives answers right for homebrew, but not necessarily right for Rust).

Also, pkg-config doesn't actually do everything that the build.rs script is doing. Even if it didn't need to work around configuration bugs, build.rs would still have to decide which features of the crate to enable or disable based on the version of openssl that's present. Something still has to generate the cargo:rustc-cfg=osslconf="(x)" definitions.
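A simplified sketch of that last step, assuming the version and the list of disabled features have already been detected somehow. The cfg names `ossl110`, `ossl111` and `ossl300` and the function name are illustrative, not a claim about what any particular -sys crate emits.

```rust
// Sketch: translate a detected OpenSSL version and its disabled features
// into cfg directives for Cargo. All cfg names here are illustrative.

fn cfg_directives(version: (u32, u32, u32), disabled: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    // Tuples compare lexicographically, so this reads as "version >= 1.1.0".
    if version >= (1, 1, 0) {
        out.push("cargo:rustc-cfg=ossl110".to_string());
    }
    if version >= (1, 1, 1) {
        out.push("cargo:rustc-cfg=ossl111".to_string());
    }
    if version >= (3, 0, 0) {
        out.push("cargo:rustc-cfg=ossl300".to_string());
    }
    // One key/value cfg per feature the library was built without.
    for feature in disabled {
        out.push(format!("cargo:rustc-cfg=osslconf=\"{feature}\""));
    }
    out
}

fn main() {
    // Hard-coded inputs for the sketch; a real script would detect these
    // from the headers it found.
    for line in cfg_directives((1, 1, 1), &["OPENSSL_NO_SSL3"]) {
        println!("{line}");
    }
}
```

The crate's source can then gate code with attributes such as `#[cfg(ossl111)]` or `#[cfg(osslconf = "OPENSSL_NO_SSL3")]`.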

That is fine; pkg-config has an argument for getting the version of the found dependency. Obviously the build.rs script or some sort of generated Rust code is needed for Rust-specific things, which is totally fine. My problem with build.rs is that part of what it does makes it incredibly difficult to integrate into other build systems.

It doesn’t have to be universal. My point was that a tool like “openssl-config” could be compiled. It would be a Rust project and it would do exactly the “find openssl” part of the build.rs. It would just follow the calling convention of pkgconf so that an external build system could use it. It would be a less opaque solution than what we have today.
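Roughly, such a tool might look like the sketch below. The flags `--modversion`, `--cflags` and `--libs` are borrowed from pkg-config; the `Found` struct and the hard-coded paths are placeholders standing in for the real detection logic, which is the hard part and is not shown.

```rust
// Sketch of a hypothetical "openssl-config" binary that answers a few
// pkg-config style flags. Detection is replaced by placeholder values.
use std::process::ExitCode;

struct Found {
    version: String,
    include_dir: String,
    lib_dir: String,
    libs: Vec<String>,
}

/// Answer a single flag, or None if the flag is not supported.
fn answer(flag: &str, found: &Found) -> Option<String> {
    match flag {
        "--modversion" => Some(found.version.clone()),
        "--cflags" => Some(format!("-I{}", found.include_dir)),
        "--libs" => {
            let mut parts = vec![format!("-L{}", found.lib_dir)];
            parts.extend(found.libs.iter().map(|l| format!("-l{l}")));
            Some(parts.join(" "))
        }
        _ => None,
    }
}

fn main() -> ExitCode {
    // Placeholder result; the real tool would run the "find openssl" logic.
    let found = Found {
        version: "3.0.0".to_string(),
        include_dir: "/usr/local/include".to_string(),
        lib_dir: "/usr/local/lib".to_string(),
        libs: vec!["ssl".to_string(), "crypto".to_string()],
    };
    let mut ok = true;
    for flag in std::env::args().skip(1) {
        match answer(&flag, &found) {
            Some(line) => println!("{line}"),
            None => {
                eprintln!("unsupported argument: {flag}");
                ok = false;
            }
        }
    }
    if ok { ExitCode::SUCCESS } else { ExitCode::FAILURE }
}
```

An external build system could then call this binary where it would otherwise call pkg-config, and a build.rs could parse the same output.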

But who provides that openssl-config (and libfoo-config and libz-config-in-context-of-libpng-config)? For which platforms?

If Rust/Cargo, then it’s like giving maintenance of all crates-io sys crates to the Cargo team, effort comparable to making a new Linux distro, except for all platforms.

If package maintainers, then that’s exactly what build.rs compiles to, with a layer of indirection.

The main thing I dislike about build.rs is that it hides dependency information - cargo is constrained to building an executable then running it for its side-effects. The executable can express some limited dependency info in the form of “rerun if X changes”, but that’s pretty coarse.

build.rs is used for a few distinct things:

  1. generate Rust source code for use later in the build process - bindgen and parser generators are common examples
  2. capture some environment for version/build id (time, git hash, etc)
  3. build an external library for ffi use - libjpeg/openssl/etc.

The first case has well defined dependency information, and generates deterministic output. I’d assume it’s reasonably straightforward to make cargo understand this case (and maybe the existing mechanisms already do).
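For reference, the first case usually looks something like the sketch below with today's mechanisms. The input file `keywords.txt`, the output file `keywords.rs` and the generated `KEYWORDS` table are invented for the example.

```rust
// build.rs (sketch): generate a Rust source file from an input file and
// declare the input so Cargo knows when to rerun. File names are made up.
use std::{env, fs, path::PathBuf};

/// Render the generated source: a static table of the given words.
fn generate_table(words: &[&str]) -> String {
    let mut src = String::from("pub static KEYWORDS: &[&str] = &[\n");
    for w in words {
        src.push_str(&format!("    {w:?},\n"));
    }
    src.push_str("];\n");
    src
}

fn main() {
    // The only dependency information Cargo gets: rerun if this file changes.
    println!("cargo:rerun-if-changed=keywords.txt");

    // A missing input yields an empty table here; a real script would
    // probably fail loudly instead.
    let input = fs::read_to_string("keywords.txt").unwrap_or_default();
    let words: Vec<&str> = input
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty())
        .collect();

    // Cargo sets OUT_DIR for build scripts; the temp dir fallback only
    // exists so the sketch can run outside of Cargo.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);
    fs::write(out_dir.join("keywords.rs"), generate_table(&words))
        .expect("failed to write generated source");
}
```

The crate then pulls the file in with `include!(concat!(env!("OUT_DIR"), "/keywords.rs"));`. The output depends only on the declared input, which is what makes this case deterministic.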

The second is inherently non-deterministic, since a timestamp will change all the time, and a git hash can change even if none of the code going into the executable has changed. Build systems generally have to special case these to avoid absurdities like “always rebuild because now has changed”.

The third is particularly awkward, since cargo doesn’t understand non-Rust dependencies, and nor should it. The approach I’ve suggested in the past is to split this into two parts: a declarative “I depend on openssl”, and some custom code which implements “here’s how to find/configure/build openssl”. This allows the two parts to be decoupled, so you can have different implementations for the second part for different environments, while giving cargo the abstracted information it needs.
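One way to picture that split, purely as a sketch: the declarative half is just the name "openssl", and the custom half is anything implementing a small interface. The trait `NativeProvider`, the `LinkInfo` struct and the `FixedPrefix` implementation are hypothetical and do not correspond to any existing Cargo feature.

```rust
// Sketch: decouple "I depend on openssl" from "here is how to find it".
// Every type and name here is hypothetical.

#[derive(Debug, Clone, PartialEq)]
struct LinkInfo {
    search_dir: String,
    libs: Vec<String>,
}

/// The custom half: environment-specific code that can locate a library.
trait NativeProvider {
    fn locate(&self, name: &str) -> Option<LinkInfo>;
}

/// One possible implementation: everything lives under a fixed prefix.
struct FixedPrefix {
    prefix: String,
}

impl NativeProvider for FixedPrefix {
    fn locate(&self, name: &str) -> Option<LinkInfo> {
        match name {
            "openssl" => Some(LinkInfo {
                search_dir: format!("{}/lib", self.prefix),
                libs: vec!["ssl".to_string(), "crypto".to_string()],
            }),
            _ => None,
        }
    }
}

/// Turn the abstract answer into the link directives Cargo understands.
fn link_directives(info: &LinkInfo) -> Vec<String> {
    let mut out = vec![format!("cargo:rustc-link-search=native={}", info.search_dir)];
    out.extend(info.libs.iter().map(|l| format!("cargo:rustc-link-lib={l}")));
    out
}

fn main() {
    // A different environment would swap in a different provider here.
    let provider: Box<dyn NativeProvider> = Box::new(FixedPrefix {
        prefix: "/usr/local".to_string(),
    });
    if let Some(info) = provider.locate("openssl") {
        for line in link_directives(&info) {
            println!("{line}");
        }
    }
}
```

A distro, a Homebrew setup or an enclosing build system could each supply its own provider while the crate keeps only the declarative dependency.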

So really, I think build.rs should be split up into at least 3 separate mechanisms to handle these cases (and maybe more that I’m overlooking now).