Static binary support in Rust

This topic is about supporting the creation of binaries that do the following:

$ ldd binary
        not a dynamic executable
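
For anyone who wants to check this without ldd, here is a rough Rust sketch. It assumes that "not a dynamic executable" roughly corresponds to a 64-bit little-endian ELF file with no PT_INTERP program header, which is a simplification of what ldd really does. The names `lacks_interpreter` and `le_uint` are made up for illustration.

```rust
use std::{env, fs};

/// Program header type for the dynamic loader path (PT_INTERP in the ELF spec).
const PT_INTERP: u64 = 3;

/// Read a little-endian unsigned integer of `len` bytes (len <= 8) at `offset`.
fn le_uint(bytes: &[u8], offset: usize, len: usize) -> Option<u64> {
    let field = bytes.get(offset..offset.checked_add(len)?)?;
    Some(field.iter().rev().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
}

/// Some(true) if the image is a little-endian ELF64 file without a PT_INTERP
/// segment, Some(false) if it has one, None if it cannot be parsed that way.
/// Assumption: "no PT_INTERP" is used here as a stand-in for "static".
fn lacks_interpreter(image: &[u8]) -> Option<bool> {
    // Magic bytes, then EI_CLASS == 2 (64-bit) and EI_DATA == 1 (little-endian).
    if image.get(0..4)? != &b"\x7fELF"[..] || *image.get(4)? != 2 || *image.get(5)? != 1 {
        return None;
    }
    let phoff = le_uint(image, 0x20, 8)? as usize; // e_phoff
    let phentsize = le_uint(image, 0x36, 2)? as usize; // e_phentsize
    let phnum = le_uint(image, 0x38, 2)? as usize; // e_phnum
    for i in 0..phnum {
        let entry = phoff.checked_add(i.checked_mul(phentsize)?)?;
        // p_type is the first 4-byte field of each program header.
        if le_uint(image, entry, 4)? == PT_INTERP {
            return Some(false);
        }
    }
    Some(true)
}

fn main() {
    match env::args().nth(1) {
        Some(path) => match fs::read(&path) {
            Ok(image) => match lacks_interpreter(&image) {
                Some(true) => println!("{}: no PT_INTERP, looks static", path),
                Some(false) => println!("{}: has PT_INTERP, dynamic", path),
                None => println!("{}: not a little-endian ELF64 file", path),
            },
            Err(err) => eprintln!("cannot read {}: {}", path, err),
        },
        None => eprintln!("usage: pass the path of a binary to inspect"),
    }
}
```

Treat it as a sanity check only; ldd remains the authority on what the loader will actually do.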

I managed to do this after reading this post, but it requires:

  • compiling all crate deps with cargo build -p
  • using -Z print-link-args and the --emit obj argument with the new cargo rustc command to a) get a .o file and b) get a linker command
  • hand editing the linker command line to (among other things) add -static, change -lgcc_s to -lgcc_eh and remove -pie
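
The hand edits in the last step can be sketched as a small Rust helper. This is only an illustration of the three changes named above (add -static, swap -lgcc_s for -lgcc_eh, drop -pie); it does not cover the "other things", and the function name `make_static_link_args` and the sample command line are made up for the example.

```rust
/// Rewrite a linker command line for a fully static link by applying the
/// three edits described in the post. The first element is assumed to be the
/// linker driver (e.g. "cc") and is kept in place.
fn make_static_link_args(args: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len() + 1);
    let mut rest = args.iter();

    // Keep the driver first, then request a static link.
    if let Some(driver) = rest.next() {
        out.push(driver.clone());
        out.push("-static".to_string());
    }

    for arg in rest {
        match arg.as_str() {
            // Position-independent executables were removed by hand in the post.
            "-pie" => continue,
            // libgcc_s is the shared unwinder; libgcc_eh is its static counterpart.
            "-lgcc_s" => out.push("-lgcc_eh".to_string()),
            _ => out.push(arg.clone()),
        }
    }
    out
}

fn main() {
    // Hypothetical, heavily shortened linker command for illustration.
    let original: Vec<String> = ["cc", "-m64", "main.o", "-pie", "-lgcc_s", "-lc"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    // Prints: cc -static -m64 main.o -lgcc_eh -lc
    println!("{}", make_static_link_args(&original).join(" "));
}
```

A real command printed by -Z print-link-args is much longer, and the order of libraries matters, so expect to adjust the result by hand.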

This resulted in a binary (with glibc statically linked) that I can run anywhere, even without libc installed. Static linking is one of the selling points of Go and I think making this easier is worth pursuing with Rust. Compiling on old distros for maximum compatibility isn’t fun.

In terms of previous movements on this, an issue was raised referencing this comment on linking glibc. Another issue was raised with the possibility of supporting alternative C libraries.

So it seems I may not understand the blocker. Given I do have a working statically linked binary, why is glibc not suitable for static linking? There are some issues (see the performance issue outlined in my first link and this comment chain) but it does result in functional output.

I’m no expert on this, but experimental support for using MUSL instead of glibc was recently added.

That’s pretty interesting, thanks!

It looks like that PR allows you to use musl, under the condition that you configure the rust compiler with that target. Having seen the light of rustup, I am not overeager to return to the (long) dark days of compiling rust myself.

Now, I too am no expert in linking, but would it not be possible to offer musl as a(n approximate) swappable alternative to glibc? I’ve just compiled musl and upstream libunwind, added some shims for missing symbols, reordered link arguments and have a toy rust program running - no messing with rust itself needed.

Unfortunately, I now come full circle to my original suggestion of linking glibc into my static binary - when trying to link python, I observed a lot of things like undefined reference to '__isinf', because python was compiled with gcc and musl doesn’t have these as weak symbols - I would need to recompile python or add more shims. Given this actually works fine when statically linking glibc (per original post), this seems like lots of effort for not much gain.

Totally statically linked executables in Rust are probably best done through MUSL support at the moment (as you've discovered). I have not personally dabbled in statically linking glibc and its dependencies, although I would imagine that it's at least possible somehow!

I agree! I hope to work with @brson and set up a dist bot which uploads MUSL nightlies and also modify rustup itself to be able to download the libraries for a different target (e.g. a MUSL target). Then all you'd need to do is cargo build --target x86_64-unknown-linux-musl (and pray that the native dependencies work out ok)!
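
As a small sketch of what code built for that target can see: the `target_env` cfg value tells a crate which libc family it was compiled against, so a program can report it. The function name `libc_flavor` is made up for this example, and mapping `target_env = "gnu"` to "glibc" is only assumed to hold on Linux.

```rust
/// Report which C library family this binary was compiled against.
/// Assumption: on Linux, target_env "musl" means musl and "gnu" means glibc.
fn libc_flavor() -> &'static str {
    if cfg!(all(target_os = "linux", target_env = "musl")) {
        "musl"
    } else if cfg!(all(target_os = "linux", target_env = "gnu")) {
        "glibc"
    } else {
        "other"
    }
}

fn main() {
    println!("built against: {}", libc_flavor());
}
```

The check happens at compile time, so it reflects the --target the crate was built for, not the machine it later runs on.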

Reading back over my post, I can see that I wasn't clear:

Unfortunately, I now come full circle to my original suggestion of linking glibc into my static binary - when trying to link python, I observed a lot of things like undefined reference to '__isinf', because python was compiled with gcc and musl doesn't have these as weak symbols - I would need to recompile python or add more shims. Given this actually works fine when statically linking glibc (per original post), this seems like lots of effort for not much gain.

'gcc' above should say 'glibc'. It was late when I wrote that. The difficulty linking python above is encountered when linking with musl - glibc *.h files reference symbols (like __isinf) that musl does not define, so libraries compiled with glibc may not be linkable when using musl. This is a problem for other people as well. In a particularly cruel touch, there's actually a glibc-only symbol in rust (liballoc-4e7c5e5c.rlib) itself - __rawmemchr. Fortunately (per my reference to 'shims' above) you can work around it by adding your own definition:

#include <string.h>

/* Shim for the glibc-only __rawmemchr: search for c with no length limit. */
void *__rawmemchr(const void *s, int c) {
    return memchr(s, c, (size_t)-1);
}

In summary, I think musl is useful if you have complete control over your dependencies, but as soon as you want to link to existing libraries you may need to statically link glibc, so I'd really like to see this supported as a 'first-class' alternative. I am, of course, willing to help, but I'd need some guidance on how you'd want this implemented - rustc -C static-libc?

Shouldn’t these discussions take into account other OSs that don’t use glibc?

Not sure, I don’t have any experience of rust on other OSs. Full static binaries aren’t possible on Windows I think? So I’m not sure this specific topic would address it (though it is probably related). I assume a musl Linux (e.g. Alpine) would be addressed by the PR linked above and I don’t even know what libcs BSD/solaris use.

Do you have a specific OS in mind you want to see better static binary support on?

It’s not so much about static builds for me as an interest in Rust on FreeBSD. But the static builds issue seems intertwined with glibc problems, and it may be good to acknowledge the existence of systems where glibc is irrelevant.

Even with glibc, however, once you statically link it you've basically forced your hand into statically linking everything, which may not always be trivial to do. I don't think there's much the compiler itself can do to help you out with this (other than perhaps ferry special arguments along to the linker); the burden is largely on the builder to ensure that all native dependencies are available in static form (including all transitive dependencies).

This is the idea behind rustc --target x86_64-unknown-linux-musl (e.g. a whole separate compile mode for this special form of linkage), but I think the best way to help out here would be to forge ahead documenting what you run into!

Maybe it's clearer if I say 'system libc' rather than 'glibc'. For example, I can apt-get a package to get libpython2.7.a (compiled with 'system libc'). This can then be statically linked, assuming rust supports statically linking 'system libc'. But if rust only supports statically linking a 'non system libc', I am forced to recompile my libs if there are incompatibilities between 'system libc' and 'non system libc'. True, it doesn't help with libraries that are just nontrivial to statically link in general. But it does help with a number of common cases. I now realise this also relates to what @gkoz was saying - regardless of what the 'system libc' actually is, it'd be nice to support static linking with it. My glibc obsession is just a side effect of me working on Ubuntu.

Sure! Where? I could start a page in the rust book (under nightly?) talking about static binaries, or a 'going off-piste' in cargo? Do you have an 'experimental stuff' area anywhere? A gist/github repo is also possible...but I'm not so fond because I really do think there are going to be people wanting to do this and official docs stand some chance of being updated as musl (at least) moves along.

Note that for most distros, static linking against 'system libc' is an unsupported feature. See the glibc FAQ. Most distros will have '--disable-static-nss' turned on.

Linux is probably the only OS that supports static linking of libc (not with glibc though), as it has a stable kernel interface. Neither Windows nor Mac supports it. I believe solaris doesn't either.

That said, you can modify librustc_back to add a new platform, say x86_64-unknown-linux-static_glibc, and tweak the linker arguments that rustc passes to the linker, and maybe tweak some other places in the compiler (basically do what the pull request for experimental MUSL support does). It's probably not worth the effort though, considering that glibc officially recommends against static linking.

Linking against different libraries than the module was compiled for is never a good idea. If you want to do a static build, you should compile everything that will go in it. Those __isinf and __rawmemchr and similar references are there (usually) because of macros and inline functions in glibc's headers, not because the source code would mention them. They will go away if you recompile against correct headers.

That is illegal for non-GPL code; and for GPL code you don't have a reason to do it as the user can always recompile it if they have ancient libc in system. LGPL only allows linking non-(L)GPL code dynamically.

Full static binaries are definitely possible on Windows. Well, you still dynamically link kernel32.dll and ntdll.dll, but those are how the Win32 API is defined. They should be considered system calls, not libraries.

However, on Windows, code must be compiled differently depending on whether it will be linked statically or dynamically. So you need to make or get appropriate builds of all your dependencies (many ship them, but some, e.g. Python, don't).

Python uses dlopen to load modules and this will cause serious problems in static builds due to the one definition rule. On ELF-based systems (like Linux) it will work provided that all modules are linked against the same dependencies. On Windows it won't work, ever. On Windows, when you use any dlls, you must use the dynamic runtime.

And for a good reason, too. NSS (name service switch) is a security sensitive piece of code. Whenever it has a security issue fixed, any code that linked to it statically would have to be recompiled. By only permitting dynamic linking the distributions ensure that any fix applies to the whole system.

For the same reason, most distributions don't accept statically linked packages.

On Windows, "libc" is msvcrNN (where NN is the runtime version) and that can be linked statically. The kernel32.dll and ntdll.dll are the kernel interface.

As for MacOS I don't know, but iOS does not support shared libraries at all.

I've seen a number of links to that page in association with the word 'unsupported', but it does not actually say that:

Unless I've missed something, this is not a declaration of a lack of support. This is saying that if you happen to use something that needs NSS, then it's not recommended to truly statically link NSS (which, as you say, is not easy to do anyway due to distro configuration) because of possible inconsistencies. While these inconsistencies are not spelled out, I assume it relates to not using nsswitch.conf. Which musl doesn't do anyway(?).

For a subset of binaries!

I know all this - the point is that if you're given static libraries compiled with the system libc headers, you don't need to recompile (as long as rust supports statically linking the system libc).

Aside from this being untrue (faq), I feel licensing issues are outside the scope of this discussion.

Unless you're never going to use dlopen because the scripts you embed are controlled by you. It's just an example!

FreeBSD strives (with some exceptions) to keep the compatibility for static binaries built years ago (the FREEBSD_COMPAT* kernel options are enabled by default).

In terms of documenting what's there already it'd probably be fine to start growing a "platform specific" section of the docs where the musl page would document how to use it, what it's used for, and common gotchas, etc. There could also be a section for static glibc binaries which could grow over time as well.

Just for completeness, when it comes to OS X: statically linking libc or coding to the kernel directly is not supported and strongly recommended against, and e.g. Go programs dynamically link it. (However, I don’t know of any time when backwards compatibility in the syscall interface was ever actually broken.)

Just to throw this out there: How would folks feel about making musl the default choice on Linux (and bundling it)? People seem to really like the ability to copy Golang binaries around from system to system, and this would give us that functionality.

We would probably have to keep glibc support around for the distros that don’t want us doing that, but it would be a nice win for the out-of-the-box experience.

I’d personally love it. Are there any downsides?

One downside is that musl does not support NSS. My position on NSS is that it is a useful functionality, but it is better implemented as a server running on localhost that speaks DNS and tries multiple backends.

Maybe writing a DNS server was difficult in the past. These days, you can write one in tens of lines using things like node-dns.

I would want to investigate a bit more to flesh out some more details, but using MUSL by default runs the risk of reducing the interoperability of Rust by default. If a crate links to a system library then there are two C libraries in play, the one the system library linked to (probably glibc) and the one that the Rust code is linked to (MUSL). Functions which have global state in the C library will have distinct global state, which can possibly lead to some surprises.

Additionally, MUSL does not support dlopen (as it’s a static binary), so the compiler itself could not be built against MUSL (consequently requiring that plugins do not compile against MUSL, consequently requiring that we’d ship two versions of the target libraries).

There’s also the problem of compiling new native dependencies as part of a build script for a crate. In theory the compiled code should be compiled against MUSL, but most systems do not have the MUSL header files by default (or the musl-gcc wrapper).
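
To make the build script problem concrete, here is a minimal sketch of how a build.rs might choose a C compiler based on the target. Cargo sets the TARGET environment variable for build scripts; everything else here is an assumption for illustration: the helper name `c_compiler_for`, the idea of honouring a CC override, and the presence of a `musl-gcc` wrapper on the machine.

```rust
use std::env;

/// Pick a C compiler driver for native code compiled by a build script.
/// An explicit override wins; otherwise musl targets get the musl-gcc
/// wrapper and everything else gets the system "cc".
fn c_compiler_for(target: &str, cc_override: Option<&str>) -> String {
    if let Some(cc) = cc_override {
        return cc.to_string();
    }
    if target.contains("musl") {
        // Assumes the musl-gcc wrapper is installed, which most systems
        // do not have by default.
        "musl-gcc".to_string()
    } else {
        "cc".to_string()
    }
}

fn main() {
    // TARGET is set by Cargo when this runs as a build script; the fallback
    // only exists so the sketch also runs standalone.
    let target = env::var("TARGET")
        .unwrap_or_else(|_| "x86_64-unknown-linux-gnu".to_string());
    let cc_override = env::var("CC").ok();
    let cc = c_compiler_for(&target, cc_override.as_deref());
    println!("would compile native code for {} with {}", target, cc);
}
```

Choosing the compiler is the easy part; the native code still needs the MUSL headers and static versions of its own dependencies.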

All-in-all I would love to ship official builds of MUSL and make it super easy to acquire, but before making it the default for Rust I’d want to investigate how easy it is to work with the existing ecosystem and how many crates work on a MUSL target (wrt native dependencies).