Proposal: have the `gnu` toolchain save debugging information in separate files like the `msvc` toolchain


#1

Edit: following @Shnatsel’s comment, I updated my original post, removing all references to C runtime linkage and leaving only the main point about saving debugging information in separate files.

Rust programs are known to have larger binary sizes than C programs. There were, and in part still are, several reasons for this, as explained in the FAQ.

One of the reasons reported in the FAQ, Jemalloc, has already been removed on nightly. Another reason, debug symbols, is actually avoided for the msvc toolchain on Windows by saving debugging information in separate files (rather than in the executable itself as the gnu toolchain does).

I made a brief assessment of binary sizes of basic “Hello world” programs in C, C++, and Rust, on Windows and Linux, by using the following compilers:

Windows:

  • Build Tools for Visual Studio 2017 (cl)
  • MinGW-W64 gcc / g++ 8.1.0
  • rustc 1.32 nightly

Linux:

  • gcc / g++ 6.3.0
  • rustc 1.32 nightly

Note: the Rust results reported below were obtained with a default-mode compilation, but I verified that a release-mode compilation makes no significant difference.

C

#include <stdio.h>
int main() {
    printf("hello, world!\n");
}

Windows:

  • cl: 113.2 kb
  • cl -Zi: two files, executable 494.0 kb and PDB file 4894.7 kb
  • gcc: 54.0 kb
  • gcc -g: 54.7 kb

Linux:

  • gcc: 8.6 kb
  • gcc -g: 11.1 kb

C++

#include <iostream>
int main() {
    std::cout << "Hello world!" << std::endl;
}

Windows:

  • cl: 221.2 kb
  • cl -Zi: two files, executable 1041.4 kb and PDB file 8368.1 kb
  • g++: 56.9 kb
  • g++ -g: 76.6 kb

Linux:

  • g++: 9.3 kb
  • g++ -g: 28.8 kb

Rust

fn main() {
    println!("Hello, world!");
}

Windows:

  • rustc: two files, executable 141.8 kb and PDB file 1232.9 kb

Linux:

  • rustc: single executable file 2401.9 kb

The sizes of the two executables differ so much because the gnu toolchain on Linux (and the same applies to the gnu toolchain on Windows) saves the debugging information in the binary itself, while the msvc toolchain saves it in a separate PDB file.

Actually, the gnu build tools also allow saving debugging information in separate files, using the following commands:

objcopy --only-keep-debug foo foo.dbg
objcopy --strip-debug foo
objcopy --add-gnu-debuglink=foo.dbg foo

as documented here.

Alternatively, the following commands can be used, with the same results:

objcopy --only-keep-debug foo foo.dbg
strip -g foo
objcopy --add-gnu-debuglink=foo.dbg foo

as documented here.
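
To make the sequence concrete, here is a minimal Rust sketch of how a build step could drive the first set of objcopy commands. This is only an illustration and not how rustc or cargo actually work: the function names `split_debug_commands` and `run_all` are hypothetical, the `.dbg` suffix simply follows the `foo` / `foo.dbg` naming used above, and `objcopy` is assumed to be on the PATH.

```rust
use std::io;
use std::process::Command;

/// Build the three objcopy invocations for `exe` without running them.
/// (Hypothetical helper; the debug file is named `<exe>.dbg`.)
fn split_debug_commands(exe: &str) -> Vec<Vec<String>> {
    let dbg = format!("{}.dbg", exe);
    vec![
        // 1. Copy only the debugging information into the .dbg file.
        vec![
            "objcopy".to_string(),
            "--only-keep-debug".to_string(),
            exe.to_string(),
            dbg.clone(),
        ],
        // 2. Remove the debugging information from the executable.
        vec![
            "objcopy".to_string(),
            "--strip-debug".to_string(),
            exe.to_string(),
        ],
        // 3. Record a link to the .dbg file inside the executable.
        vec![
            "objcopy".to_string(),
            format!("--add-gnu-debuglink={}", dbg),
            exe.to_string(),
        ],
    ]
}

/// Run each command in order, stopping at the first failure.
fn run_all(commands: &[Vec<String>]) -> io::Result<()> {
    for cmd in commands {
        let status = Command::new(&cmd[0]).args(&cmd[1..]).status()?;
        if !status.success() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("{:?} failed with {}", cmd, status),
            ));
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    match std::env::args().nth(1) {
        // With a path argument, actually split that executable.
        Some(exe) => run_all(&split_debug_commands(&exe)),
        // Without arguments, just print the commands for `foo`.
        None => {
            for cmd in split_debug_commands("foo") {
                println!("{}", cmd.join(" "));
            }
            Ok(())
        }
    }
}
```

Run without arguments it only prints the command lines; run with the path of an executable it executes them in order. Keeping the command construction separate from the execution makes the sequence easy to check without having objcopy installed.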

Applying those commands to the Rust program compiled with the gnu toolchain on Linux, I get the results below. Note: the commands do not seem to work with the Rust program compiled with the gnu toolchain on Windows; this may be a limitation of the MinGW-W64 implementation, and IMHO it is not that relevant anyway.

Linux:

  • rustc: two files, executable 256.2 kb and dbg file 2201.9 kb
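
For reference, the arithmetic behind these numbers can be checked with a few lines of Rust. The sizes are the ones measured above; the helper name `reduction_percent` is just for illustration.

```rust
/// Percentage by which `after_kb` is smaller than `before_kb`.
fn reduction_percent(before_kb: f64, after_kb: f64) -> f64 {
    (1.0 - after_kb / before_kb) * 100.0
}

fn main() {
    let original = 2401.9; // single executable with embedded debug info (kb)
    let stripped = 256.2; // executable after running the objcopy commands (kb)
    let dbg_file = 2201.9; // separate dbg file (kb)

    println!(
        "executable shrinks by {:.1}%",
        reduction_percent(original, stripped)
    );
    println!("executable + dbg file: {:.1} kb", stripped + dbg_file);
}
```

The executable shrinks by roughly 89%. The two files together (2458.1 kb) are slightly larger than the original single file, so the total size on disk does not go down; the gain is only in the size of the executable that gets shipped.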

Proposal

To vastly reduce the binary size of Rust programs on Linux, I propose to have the gnu toolchain save debugging information in separate files (like the msvc toolchain already does) by using the objcopy tool.


#2

I would be very surprised if libc was linked statically on Linux. Can you run ldd on the generated binary on Linux to inspect what it links against? I’m pretty sure libc will be on the list.

Also, if you want to unify the linking, I’d move towards dynamic linking, not static linking. There are many ways in which linking to libc statically is a bad idea and is going to backfire, but the most obvious one is the inability to propagate security updates to it or even tell which Rust executables have been linked against a vulnerable version of libc and need to be rebuilt.


#3

Yes, you are right: ldd shows that the gnu toolchain on Linux also links the C runtime dynamically (and it does so even with +crt-static). This is also explained in the Rust Reference, so sorry, my fault. I am going to update my original post later today, keeping only the point about saving debugging information in separate files to reduce binary size.