Build.rs use cases and stories sought!

Hello,

I am working on some FFI-related libraries, specific to Windows, which leads to the requirement to use a build.rs file.

However, this is… clunky.

Getting system parameters is not easy. For example, accessing registry values can differ depending on whether the host is 32-bit or 64-bit Windows. Finding required compilers is nicely abstracted away by the cc crate, but there’s no way to state this dependency in the Cargo.toml.

This is a hidden dependency, and it makes me unhappy.

So what I’m curious about is making things a little bit clearer to users of our crates in declarative ways, and to make build.rs files less clunky. If we iterate on this step by step, we may be able to eliminate them almost entirely.

So, how do you use build.rs files? Why? If you use it for codegen, why do you do that codegen? Is it possible for you to skip that codegen with better code design?

Thank you in advance for your contributions!

Can you please clarify what you mean here? How does the "build-dependencies" section not do what you want?

[build-dependencies]
cc = "1"

(Edit: Are you looking for a way for Cargo.toml to declare a dependency on a working compiler?)

I don’t think there is currently a way to say “you need this library and its header files installed” unless it’s a Rust library, nor a way to require a particular compiler (say, MSVC of a specific version).

So what I am doing is seeking out build.rs use cases and stories, so I can synthesize a first step RFC for working on this in Cargo.

In pest, we use build.rs to run bootstrap code generation in local development (bootstrap from crates-io) but not in a deployed format. This seems like a valid use of the buildscript to do something rather complicated.

Other good build.rs uses I’ve seen have been mainly for code generation that doesn’t use procedural macros, either because it was initially written before they were stable, or they just didn’t want that complexity.

The most declarative-friendly build.rs files are those of -sys crates that build and link some non-Rust library. I think best practice currently is to vendor the dependency but optionally link to a system resource if it exists? I don’t manage any -sys crates personally so I’m not exactly certain.

Here’s my own use case.

mscorlib-sys depends on a common system DLL, mscorlib.dll, on Windows.

The only problem is that straight linking doesn’t work: none of the symbols are available to the linker. The only solution I found was to build a C++ level wrapper which generates an mscorlib.tlb artifact, which is then used by the linker to actually find the necessary symbols.

Reflecting on this, it seems to me the core problem I’m having is that the linker doesn’t seem to handle mscorlib without some intermediary .tlb/.tlh file being generated.

So that’s one deficiency that can be addressed.

But I’d also like to just declare something like:

[link.dependencies.win64]
mscorlib = "native-tlb"

and then cargo handles the linking correctly.

And a base case would be

[link.dependencies]
mscoree = "native"

in Cargo.toml for common libraries included with the OS.

Here’s a link to the key piece of wrapping that gets done - the import statement is what generates the .tlb artifact used during linking: https://github.com/ZerothLaw/mscorlib-rs-sys/blob/master/src/c/mscorlib_wrapper.h

A common use of build.rs is to run bindgen, which is an official rust-lang project.

I’ve used it in the past for language version detection, which in turn drives the definition of various cfg flags to do conditional compilation.
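For readers unfamiliar with the version-detection pattern, here is a minimal sketch of what such a build.rs can look like. It assumes the usual `rustc --version` output shape, and the flag name `rustc_at_least_1_30` and the threshold 30 are made up for illustration, not taken from any particular crate.

```rust
// build.rs (sketch): emit a cfg flag based on the compiler's minor version.
use std::env;
use std::process::Command;

/// Parse the minor version out of text like "rustc 1.70.0 (90c541806 2023-05-31)".
fn parse_minor(version_output: &str) -> Option<u32> {
    let version = version_output.split_whitespace().nth(1)?;
    let mut parts = version.split('.');
    let _major = parts.next()?;
    parts.next()?.parse().ok()
}

/// Build the directive lines for a given minor version.
/// The flag name and the threshold are hypothetical.
fn cfg_directives(minor: u32) -> Vec<String> {
    let mut out = Vec::new();
    if minor >= 30 {
        out.push("cargo:rustc-cfg=rustc_at_least_1_30".to_string());
    }
    out
}

fn main() {
    // Cargo passes the resolved compiler in RUSTC; fall back to plain "rustc".
    let rustc = env::var("RUSTC").unwrap_or_else(|_| "rustc".to_string());
    if let Ok(output) = Command::new(rustc).arg("--version").output() {
        let text = String::from_utf8_lossy(&output.stdout);
        if let Some(minor) = parse_minor(&text) {
            for line in cfg_directives(minor) {
                println!("{}", line);
            }
        }
    }
}
```

The crate can then gate code with `#[cfg(rustc_at_least_1_30)]`. If the compiler cannot be run or its output cannot be parsed, the sketch simply emits nothing.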

I’ve also used it to generate, compile, and link resource files on Windows (specifically for producing a manifest, but I would have also used it for linking in an icon had I been bothered to do so).

I’ve used build.rs to determine my project version by querying git (and then set a variable that main.rs can incorporate).

Also, lalrpop’s canonical codegen process uses the build.rs mechanism.

Why did you need that?

Why did I need which part? Getting git revision via build.rs? I certainly could have used a wrapper script around cargo to gather this information, but I opted to use build.rs because it’s nicely cross platform and I didn’t want to deviate from the normal cargo build command.

Yes, why did you need the git revision for your project? Not just why you used build.rs for it, but the underlying reason why you needed that git revision?

I like being able to run “myApp --version” to be able to see what exact version (git revision) I’m running. I could use the version field in Cargo.toml, but I don’t update that field with every commit.
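A rough sketch of the git-revision use case, for context. This is not the poster's actual script; the variable name `GIT_REVISION` and the "unknown" fallback are assumptions made for the example.

```rust
// build.rs (sketch): expose the current git revision to the crate at compile time.
use std::process::Command;

/// Ask git for the short commit hash. Returns None if git is missing
/// or the build is not running inside a repository.
fn git_revision() -> Option<String> {
    let output = Command::new("git")
        .args(&["rev-parse", "--short", "HEAD"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    let rev = String::from_utf8(output.stdout).ok()?;
    let rev = rev.trim();
    if rev.is_empty() {
        None
    } else {
        Some(rev.to_string())
    }
}

/// Format the directive that makes `env!("GIT_REVISION")` available.
/// GIT_REVISION is a name chosen for this example.
fn rustc_env_directive(revision: Option<&str>) -> String {
    format!("cargo:rustc-env=GIT_REVISION={}", revision.unwrap_or("unknown"))
}

fn main() {
    let revision = git_revision();
    println!("{}", rustc_env_directive(revision.as_deref()));
    // Re-run when .git/HEAD changes, e.g. on a branch switch. New commits on the
    // same branch update a file under .git/refs instead, so this is approximate.
    println!("cargo:rerun-if-changed=.git/HEAD");
}
```

In main.rs the value can then be read with `env!("GIT_REVISION")` when printing the output of `--version`.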

Have a look at code needed to get OpenSSL working reliably across platforms:

Making that declarative seems pretty hard. Something has to know about quirks of OpenSSL in Homebrew, something has to know how to find it on FreeBSD, etc.

So where do you put these platform+package specific hacks? If you just tell cargo “I want OpenSSL, do your thing”, then all this mess would have to be built into Cargo — every exception for every package on every platform.

Or you’d have to declaratively state “try pkg-config with this package, unless on macos, then try brew with that package, and fall back to that other package, and on freebsd search that dir, unless the header is busted or the version is wrong, then do…”, etc. You’d end up with a markup-based programming language.

Or you can say you don’t care about these hacks, and either the correct version is in /usr/lib or pkg-config or you get an error. Then we’re back to the pain of using dependencies in C, where every program has an INSTALL document with all the manual steps and workarounds for multiple platforms and the user has to fiddle with paths, config files, flags and env variables for every library.

Take a look at https://people.gnome.org/~federico/blog/rust-build-scripts.html for part of the inspiration for this thread.

When you’re just working in Rust, this is all well and fine to do in a build.rs.

But let’s say you’re doing this as part of a Rust component of a larger software project. How does the larger software project know about these things build.rs does? How do you avoid duplicated work? How do you tell build.rs to do something different, say for a cross-compile?
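As a point of reference, today the main channel between an outer build system and build.rs is environment variables. Below is a minimal sketch of that approach. `HOST` and `TARGET` are set by Cargo for build scripts; the `MYLIB_LIB_DIR` override and the three-way plan are hypothetical, chosen only to illustrate the idea.

```rust
// build.rs (sketch): pick a linking plan from the environment.
use std::env;

#[derive(Debug, PartialEq)]
enum Plan {
    /// The outer build system told us where the library lives.
    UseProvided(String),
    /// Native build: look for the library on the host system.
    ProbeSystem,
    /// Cross-compile: host libraries are the wrong architecture, so build our own.
    BuildVendored,
}

fn choose_plan(host: &str, target: &str, override_dir: Option<&str>) -> Plan {
    if let Some(dir) = override_dir {
        return Plan::UseProvided(dir.to_string());
    }
    if host != target {
        Plan::BuildVendored
    } else {
        Plan::ProbeSystem
    }
}

fn main() {
    // MYLIB_LIB_DIR is a hypothetical override an outer build system could set.
    println!("cargo:rerun-if-env-changed=MYLIB_LIB_DIR");
    let host = env::var("HOST").unwrap_or_default();
    let target = env::var("TARGET").unwrap_or_default();
    let override_dir = env::var("MYLIB_LIB_DIR").ok();
    match choose_plan(&host, &target, override_dir.as_deref()) {
        Plan::UseProvided(dir) => println!("cargo:rustc-link-search=native={}", dir),
        Plan::ProbeSystem => println!("cargo:warning=would probe the host system here"),
        Plan::BuildVendored => println!("cargo:warning=would build the vendored copy here"),
    }
}
```

This works, but the override is only discoverable by reading the script, which is the hidden-dependency problem this thread is about.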

And thank you for the input. This is a really useful use case to consider long term.

I’ve seen that post. I’ve shown Federico a few non-trivial examples, and his response was:

I think it’s critical to recognize that “find a C library” is not a simple problem to generalize.

The build.rs solution isn’t ideal, but a replacement has to have room for platform & package-specific hacks, because even major popular libraries like libpng, libjpeg, libclang, openssl are “snowflakes” that need handling unique to their library.


BTW, for cases that are pure code generation (like perfect hash table generation) I agree that’s a solvable problem and could be made more integrated with other build systems. It’s the -sys crates that are a tough issue.

In cases where you need a special-case handling for a package+platform combination, there can be two approaches:

  1. Either a -sys crate provides support for all platforms (horizontally cutting box in the image), or
  2. A platform provides support for all packages (vertically cutting box in the image).

So in case of Debian, they worry about all packages conforming to Debian's requirements and working together. Debian even insists on being the exclusive provider of "hacks" for all their packages.

In case of Cargo, we have each package worry about all platforms it supports. So the build.rs is a spaghetti code of all hacks for all platforms.

In case of systems that do their own packaging, "hey, let's have it declarative" is much simpler and quite desirable, because you align Cargo with what's already provided by the system (you take that green box in the image above).

But then there's Windows, where vcpkg barely exists, there's no real standard, and all libraries are a wild west. There's nothing that Cargo could align itself with. Cargo would have to be the provider of all hacks for all libraries.

macOS is somewhere in between, where Homebrew provides a lot of packages and gives some uniformity, but it's not the same as Linux distros. For example, binaries dynamically linked with Homebrew libraries are not redistributable (i.e. they crash if run on another machine). Homebrew also tries to balance use of their own packages and Apple-provided libraries, which is another source of quirks and exceptions for everything that Apple has touched.

In a side project which I am yet to release, I use build.rs to generate protocol handling code from a DSL-ish description written in a macro argument. Or at least that’s the plan; this isn’t done yet, and maybe I can get away with writing this mostly in a macro body. For now, I only use it for converting identifiers into str and &[u8] literals (with some fixups, like changing _ into '-'/b'-').
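To make the identifier fixup concrete, here is a small sketch of the kind of generator a build script could call. It is not the poster's code: the `_STR` / `_BYTES` constant naming and the sample identifiers are invented for the example.

```rust
// Sketch: turn identifiers into str and byte-string constants, replacing `_` with `-`.

/// Apply the `_` to `-` fixup to one identifier.
fn wire_name(ident: &str) -> String {
    ident.replace('_', "-")
}

/// Emit Rust source declaring a str and a byte-string constant for one identifier.
/// The `_STR` / `_BYTES` suffixes are hypothetical.
fn generate_consts(ident: &str) -> String {
    let upper = ident.to_uppercase();
    let name = wire_name(ident);
    format!(
        "pub const {upper}_STR: &str = \"{name}\";\npub const {upper}_BYTES: &[u8] = b\"{name}\";\n",
        upper = upper,
        name = name
    )
}

fn main() {
    // A real build script would write this text to a file under OUT_DIR
    // and the crate would pull it in with include!; here it is only printed.
    for ident in ["content_length", "user_agent"].iter() {
        print!("{}", generate_consts(ident));
    }
}
```

For input this simple a `macro_rules!` macro with `stringify!` cannot do the character replacement, which is one reason to reach for build.rs or a procedural macro.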

So in terms of broad use cases, I’m hearing two big ones: code generation and library linkage edge case logic.

So in terms of possible first steps, what about splitting out the code-gen functionality into its own script, something like gen.rs?

In Cargo.toml, what kind of parameters would you want to define? When to rerun gen.rs?

gen.rs would be constrained by the same rules that build.rs is. Should it be run before or after build.rs?
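For comparison, this is roughly how a build.rs states its rerun conditions today, which any Cargo.toml parameters for a gen.rs would presumably need to cover. A minimal sketch: the file path and the `GEN_PROFILE` variable are placeholders, while the two `cargo:rerun-if-*` directives are existing Cargo build-script instructions.

```rust
// build.rs (sketch): tell Cargo when the script needs to run again.

/// Build `rerun-if-*` directives for the given input files and environment variables.
fn rerun_directives(files: &[&str], env_vars: &[&str]) -> Vec<String> {
    let mut lines = Vec::new();
    for file in files {
        lines.push(format!("cargo:rerun-if-changed={}", file));
    }
    for var in env_vars {
        lines.push(format!("cargo:rerun-if-env-changed={}", var));
    }
    lines
}

fn main() {
    // Both the path and the variable name are placeholders for this example.
    for line in rerun_directives(&["src/grammar.lalrpop"], &["GEN_PROFILE"]) {
        println!("{}", line);
    }
}
```

A declarative version would amount to listing the same files and variables in Cargo.toml instead of printing them from the script.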

Sounds like your problem would be more directly solved once RFC-2627 is implemented.
