Thoughts on Rust stdlib and C interfacing

Yes, exactly. Most of the things that lrs is doing are very good. The problem is that having two standard libraries (three, actually, because Rust also has the core subset of std) is bad. Also, I get the impression that there are some uncomfortable politics between the Rust team and the lrs team, but maybe I'm reading too much into it. Regardless, I hope that Rust adopts the approach to syscalls and avoiding libc on Linux that lrs does.

Maybe you think the above is overkill, but it is basically what one has to do to use seccomp-bpf most effectively.

seccomp-bpf is a very good point I hadn't thought of! I guess this means you need to know exactly what the system calls are, and you don't want e.g. open being rewritten to openat(...AT_FDCWD) behind your back.

Incidentally, Issue 24975, Audit and document the system calls made for each "primitive" IO API, sounds relevant to this discussion.

Note that almost all people I've asked "Go or Rust for server-side development" have answered "Go". There are lots of reasons for this, but one is particularly relevant here: They don't have to worry about DLL hell when compiling a Go program and then deploying it on their systems because Go executables are (usually) self-contained.

But so are Rust executables! Rust-language dependencies are statically linked; it's only libc, plus any other third-party crates that are dynamically linking third-party C libraries. If you're running pure-Rust code, you shouldn't have DLL hell per se. The only thing they depend on is the platform (kernel + libc).
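
To make the "kernel + libc" dependency concrete, here is a minimal sketch (not from the thread) that calls the C library's `strlen` directly from Rust. The helper name `c_strlen` is made up for illustration, and the sketch assumes a target where std already links the platform C library, as it does on typical Linux and macOS builds. No extra crate or shared library is needed beyond what a plain Rust binary already pulls in.

```rust
use std::ffi::{c_char, CString};

// `strlen` comes from the platform C library that std already links against.
// The `unsafe extern` spelling needs Rust 1.82 or newer; older toolchains
// write this block as plain `extern "C"`.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// Hypothetical helper: measures a Rust string using the C library's strlen.
fn c_strlen(text: &str) -> usize {
    // CString appends the trailing NUL byte that strlen relies on.
    let c_text = CString::new(text).expect("input must not contain NUL bytes");
    // Safety: `c_text` is a valid NUL-terminated buffer that outlives the call.
    unsafe { strlen(c_text.as_ptr()) }
}

fn main() {
    println!("strlen via libc: {}", c_strlen("hello"));
}
```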

If the problem is Rust binaries that depend on a newer version of glibc (via symbol versioning), it seems like it should be doable to get rustc to build things that only require an older glibc. At least, this is a relatively small amount of work compared to the work of implementing an entire libc. I'm assuming that the target systems do have a glibc, just an older one (e.g., RHEL 5 has glibc 2.5), but if there's a use case for systems with no glibc installed (empty chroots/containers? Android?) then this wouldn't work.

Go executables also link against glibc if you use networking stuff IIRC.

I have a contrary opinion: I'd prefer Rust to use as much libc as possible to minimize overhead of its own stdlib (edit: on OS X and Linux specifically).

I write Rust programs that depend on many system libraries (which are in C), and I write Rust plugins/libraries for use in other C/C++/ObjC programs, so for all my purposes libc is already present in these programs, and it's "free".

I don't see a point in removing libc as a dependency until the Rust ecosystem is so large that it's feasible to write a complex program using only pure Rust dependencies, so that Rust's stdlib is the only thing needing libc (and it might never be possible for programs using platforms' standard GUI toolkits).

I'd prefer Rust to use as much libc as possible to minimize overhead of its own stdlib.

Can you be more specific? What sort of libc features would you like to see used? Are you motivated by performance?

Regardless, I think it's a bad idea for us to use more C than is absolutely necessary. Rust is a systems language, and most of the logic you'd be pulling in from C could probably be more easily and more safely implemented in Rust. Even if it were a little harder to write in Rust, I think that's a price we should pay for having a self-supporting, self-contained memory-safe systems language.

If I've read closely enough, this conversation also hasn't had someone point out the fact that C standard libraries can and do have memory- and overflow-related bugs. Serguey Parkhomovsky and I patched an out-of-bounds read in OpenBSD's nlist(3) just last week. Glibc had a trivial buffer overflow in gethostbyname(3) this year. An integer overflow in its strncat(3) was reported two weeks ago. If we start pulling C logic into Rust for the sake of performance or code reuse, we risk being vulnerable to these. And they're exactly what Rust is meant to prevent.
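
As a small illustration of that last point (my own sketch, not code from any of the libraries mentioned), this is how an offset-plus-index read looks in safe Rust. The function name `read_at` is hypothetical. Both failure modes, the integer overflow and the out-of-bounds read, surface as `None` instead of touching memory outside the buffer.

```rust
// Hypothetical helper: reads the byte at `offset + index`, returning None
// if the addition overflows or the position is outside the buffer.
fn read_at(buf: &[u8], offset: usize, index: usize) -> Option<u8> {
    // checked_add reports overflow instead of silently wrapping around.
    let pos = offset.checked_add(index)?;
    // get() performs the bounds check and never reads past the slice.
    buf.get(pos).copied()
}

fn main() {
    let buf = [10u8, 20, 30];
    println!("{:?}", read_at(&buf, 1, 1)); // position 2 is inside the buffer
    println!("{:?}", read_at(&buf, 2, 5)); // position 7 is past the end
    println!("{:?}", read_at(&buf, usize::MAX, 1)); // the addition overflows
}
```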

Please not on Windows. On Windows you gain nothing by using libc: everything is available through system libraries (aside from math functions and memcpy + friends), and libc is just an additional layer of restrictive overhead that can't even do everything we need.

fn main(){} compiles to a 300KB executable even with -Clto. I realize it's not that much in the grand scheme of things, and that C has an unfair advantage here, but it still bothers me, and I'm afraid it also affects how others perceive Rust.

I know a 33KB C library with no dependencies is an easy sell for everyone. I'd like to rewrite it in Rust, but Rust makes it literally 10 times larger and adds a second stdlib to non-Rust programs using it.

The majority of that 300k is jemalloc. Switching to the system allocator pulls a couple hundred KB off of that.

Especially, you don't want to have to create a seccomp-bpf policy for glibc and then find out it doesn't work with musl C, or that you need different versions of your seccomp policy for different versions of glibc or other libcs that you might not even know about.

I understand that point of view and I think that makes a lot of sense for people. However, I also think it makes sense to start the work on a direct-syscall implementation of the Rust std library now, targeting, say, Linux kernel 4.4 and above (i.e. no polyfills for features missing in earlier kernel versions, at least in the first release). Such an implementation would immediately be useful for some cool embedded projects that are built on top of Linux. If somebody were to start such a project, I'd review and test patches for it and help with the planning. Feel free to ping me: brian@briansmith.org / briansmith (Brian Smith) · GitHub.
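
For readers who haven't seen what "direct syscall" means in practice, here is a rough sketch under stated assumptions. It is not how lrs or any proposed std port does it. It issues `getpid` on x86_64 Linux with inline assembly, so the program controls exactly which system call is made. The function name `raw_getpid` is made up, the syscall number 39 is specific to x86_64 Linux, and on any other target the sketch just falls back to `std::process::id()`.

```rust
// Direct getpid syscall on x86_64 Linux, bypassing the libc wrapper.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> u32 {
    use std::arch::asm;

    // x86_64 Linux syscall number for getpid (architecture-specific).
    const SYS_GETPID: isize = 39;
    let ret: isize;
    // Safety: getpid takes no arguments and has no side effects; the
    // `syscall` instruction clobbers rcx and r11, which are declared below.
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") SYS_GETPID => ret,
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

// Fallback for other targets so the sketch still compiles there;
// this path goes through std as usual.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn raw_getpid() -> u32 {
    std::process::id()
}

fn main() {
    println!("raw: {}, std: {}", raw_getpid(), std::process::id());
}
```

A std built this way would need one such wrapper per syscall and per architecture, which is a large part of the work being discussed.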

Even today, people are pointing out as-yet-unfiled bugs in musl C on Twitter:

https://twitter.com/johnregehr/status/684126374966198281

https://twitter.com/jfbastien/status/684128389750325250

with more to follow when people get around to it:

https://twitter.com/jfbastien/status/684143761866162176

(And the system allocator is now the default for static/dynamic libraries.)

fn main(){} compiles to a 300KB executable even with -Clto. I realize it's not that much in the grand scheme of things, and that C has an unfair advantage here, but it still bothers me, and I'm afraid it also affects how others perceive Rust.

I know a 33KB C library with no dependencies is an easy sell for everyone. I'd like to rewrite it in Rust, but Rust makes it literally 10 times larger and adds a second stdlib to non-Rust programs using it.

If you care a lot about the binary size, why not just link dynamically? In the latest nightly, your stub program compiles to 613 KB with no flags, 674 KB with -C lto (strangely), and 7.9 KB with -C prefer-dynamic.

Also, using a null program as an example may be most surprising, but I suspect that the binary size scales relatively slowly as the program grows. I suppose I can check the binary size of large Rust projects later today.

As another anecdote, it seems to me that the "really big binary" stigma hasn't harmed Go. To be fair, though, it's a slightly higher-level language.

I couldn't expect that binary to work on other people's computers.

Perhaps for server-side usage? I've rewritten a Go tool in C and won't be using Go for redistributable apps, because that one tool alone was 5 times larger than a bundle of 4 other C programs.

Whoa! I didn't realise it's that big. Thank you for the tip.

#![feature(alloc_system)]
extern crate alloc_system;
fn main() {}

goes down to 70KB which makes me happy :slightly_smiling:
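
Note for later readers: the nightly-only `alloc_system` crate used in the snippet above was eventually replaced by a stable attribute. A minimal sketch of the stable form follows; the helper `sum_on_heap` is only there to force a heap allocation and is not part of any API. Since Rust 1.32 the system allocator is the default for executables as well, so on current toolchains this opt-in is mostly redundant and the size numbers quoted in this thread will not reproduce exactly.

```rust
use std::alloc::System;

// Route every heap allocation in the program through the platform allocator
// (malloc on Unix-like systems, HeapAlloc on Windows).
#[global_allocator]
static GLOBAL: System = System;

// Illustrative helper: builds a Vec on the heap, which goes through GLOBAL.
fn sum_on_heap() -> u32 {
    let values: Vec<u32> = (1..=4).collect();
    values.iter().sum()
}

fn main() {
    println!("{}", sum_on_heap());
}
```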

As another anecdote, it seems to me that the "really big binary" stigma hasn't harmed Go.

Perhaps for server-side usage? I've rewritten a Go tool in C and won't be using Go for redistributable apps, because that one tool alone was 5 times larger than a bundle of 4 other C programs.

I don't really see the motive for this binary size fixation, though. The binaries aren't that big. Considering that almost everyone has at least 128 GB HDs these days and that network traffic is effectively free, it just doesn't matter. Maybe there are issues with paging, but that's a totally different topic.

Anyway, this is only tenuously related to the current topic: using libc in Rust. Which additional libc functions are you proposing that we use, and where?

With x86_64-pc-windows-msvc I get 91KB for such an executable. If I do #![feature(start)] #[start] fn start(_: isize, _: *const *const u8) -> isize { 0 } I can bring it down to 10KB, due to being able to avoid the Rust entry point, and so LTO can strip out all the expensive panic machinery that is no longer needed. Note that -msvc does not include jemalloc by default.

This is what Libsystem, or, the great libstd refactor - #11 by arcnmx was partially meant to make possible but no one seemed to have comments on whether the approach was worth pushing forward with or not...

I'll reply over there.

I've changed my mind. Given the recent (and probably not the last) glibc vulnerability, and that most of the filesize overhead is from jemalloc and not Rust's stdlib, I'm not so sure about depending on libc any more.
