Rustc can be "tricked" into generating exponentially large debug info

Thiez · May 21, 2016, 6:54pm

After reading this topic on the D forum, I was wondering how rustc would behave when prodded in the right places. I present the following program:

#[derive(Copy, Clone, Debug)]
struct S<T, U, V>(T, U, V);

fn f<T: Copy>(t: T) -> S<T, T, T> { S(t, t, t) }

fn main() {
    let val = f(f(f(f(f(f(f(f(f(f(f(5)))))))))));
    println!("{:?}", ((((((((((val.0).0).0).0).0).0).0).0).0).0).0);
}

This program, when compiled with rustc -g --emit asm explosion.rs, produces a 10 MB assembly file, most of which consists of generated function and type names. One of the shorter names looks like this:

"_ZN9explosion102f<explosion::S<explosion::S<i32, i32, i32>, explosion::S<i32, i32, i32>, explosion::S<i32, i32, i32>>>E"

When I compile with rustc -g --emit obj explosion.rs instead, I hit an LLVM assertion:

Assertion failed: isIntN(Size * 8 + 1, Value) && "Value does not fit in the Fixup field", file C:\bot\slave\nightly-dist-rustc-win-msvc-64\build\src\llvm\lib\Target\X86\MCTargetDesc\X86AsmBackend.cpp, line 115

This is on Windows; I’m not sure how it behaves on Linux.

Should this behavior (generating exponentially large debug symbols) be considered a bug? If so, what should be done about it?

sfackler · May 21, 2016, 7:31pm

I wouldn’t call that being tricked - if you write a program involving exponentially large types, it seems to me like you’d expect debuginfo to be exponentially large. The LLVM assert isn’t great though - sounds like there are some checks missing on the rustc side.
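The growth under discussion can be observed without reading the generated assembly. The following is a minimal sketch, not part of the original posts: it reuses S and f from the first program and measures the length of the string returned by std::any::type_name at a few nesting depths. The helpers name_len and depth_lengths are hypothetical names introduced for illustration, and the exact text returned by type_name is not guaranteed by the standard library, so the lengths only approximate what ends up in debug info.

```rust
use std::any::type_name;

#[allow(dead_code)]
#[derive(Copy, Clone, Debug)]
struct S<T, U, V>(T, U, V);

fn f<T: Copy>(t: T) -> S<T, T, T> {
    S(t, t, t)
}

/// Hypothetical helper: length of the compiler-provided name of a value's type.
fn name_len<T>(_: &T) -> usize {
    type_name::<T>().len()
}

/// Hypothetical helper: name lengths for nesting depths 0 through 4.
fn depth_lengths() -> Vec<usize> {
    vec![
        name_len(&5i32),
        name_len(&f(5i32)),
        name_len(&f(f(5i32))),
        name_len(&f(f(f(5i32)))),
        name_len(&f(f(f(f(5i32))))),
    ]
}

fn main() {
    // Each extra call to `f` repeats the previous type name three times,
    // so every length should be more than triple the one before it.
    for (depth, len) in depth_lengths().iter().enumerate() {
        println!("depth {}: type name is {} bytes long", depth, len);
    }
}
```

Each additional call to f embeds the previous type name three times, so the name length more than triples per level. At the eleven levels used in the original program, that is a factor of 3^11 = 177147 over the name of the innermost type.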

Thiez · May 22, 2016, 8:47am

With types like PhantomData, this can easily be done without making the actual type take an exponential amount of space. For example, the following program, when compiled with rustc -g main.rs, produces a binary of about 648 MB on my machine, even though no instance of S is ever larger than a byte. It also takes a long time to compile and uses a lot of memory while doing so.

use std::marker::PhantomData;

#[derive(Copy, Clone)]
struct S<T, U, V>{
    data: T,
    pd1: PhantomData<U>,
    pd2: PhantomData<V>
}

impl<T,U,V> S<T,U,V> {
    fn new(data: T) -> S<T, U, V> {
        S { data: data, pd1: PhantomData, pd2: PhantomData }
    }
    fn compound(self) -> S<T, S<T,U,V>, S<T,U,V>> {
        S::<T, S<T,U,V>, S<T,U,V>>::new(self.data)
    }
}

fn main() {
    let val = S::<u8, (), ()>::new(5);
    // Season to taste
    let val = val
        .compound().compound().compound().compound().compound().compound().compound()
        .compound().compound().compound().compound();
    println!("{}", val.data);
}

My question is: is it really desirable to have debug symbols that are more than a few kB in length? Because no human is going to be able to use those symbols anyway.
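The claim that no instance of S exceeds one byte can be checked directly. The sketch below is an illustration rather than part of the original post: it repeats the S definition from the program above and reports both std::mem::size_of_val and the length of std::any::type_name for a few levels of compound(). The helper measure is a hypothetical name, and the type_name output format is an assumption that the standard library does not guarantee.

```rust
use std::any::type_name;
use std::marker::PhantomData;
use std::mem::size_of_val;

#[allow(dead_code)]
#[derive(Copy, Clone)]
struct S<T, U, V> {
    data: T,
    pd1: PhantomData<U>,
    pd2: PhantomData<V>,
}

impl<T, U, V> S<T, U, V> {
    fn new(data: T) -> S<T, U, V> {
        S { data: data, pd1: PhantomData, pd2: PhantomData }
    }
    fn compound(self) -> S<T, S<T, U, V>, S<T, U, V>> {
        S::new(self.data)
    }
}

/// Hypothetical helper: (runtime size in bytes, length of the type's name).
fn measure<T>(value: &T) -> (usize, usize) {
    (size_of_val(value), type_name::<T>().len())
}

/// Measurements after 0, 1, and 3 calls to `compound`.
fn measurements() -> Vec<(usize, usize)> {
    let s0 = S::<u8, (), ()>::new(5);
    let m0 = measure(&s0);
    let s1 = s0.compound();
    let m1 = measure(&s1);
    let s3 = s1.compound().compound();
    let m3 = measure(&s3);
    vec![m0, m1, m3]
}

fn main() {
    for (size, name_len) in measurements() {
        println!("size = {} byte(s), type name = {} bytes", size, name_len);
    }
}
```

PhantomData is zero-sized, so the runtime size stays at the single u8 while the type name keeps growing with every compound() call.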

huon · May 22, 2016, 12:54pm

PhantomData/the actual runtime size of the types seems somewhat irrelevant to me: the types themselves contain an exponential amount of information. Of course, you are correct that most humans are unlikely to want to actually read every part of such a large type, but I could imagine tooling wanting to know non-corrupted details about types/functions.

matthieum · May 22, 2016, 1:59pm

Could you explain to the average user how these types cause an exponential blow-up?

I mean, if we look at @Thiez’s latest example, there are only a very limited number of types:

type S0 = S<u8, (), ()>;
type S1 = S<u8, S0, S0>;
…
type S11 = S<u8, S10, S10>;

That’s only ~12 different types, all in all, so I suspect that the debug information somehow writes every type name out in full instead of using aliases.

Am I correct? Would it be possible to use aliases?
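The gap between the two representations can be put into numbers. The sketch below is a simplified model and an assumption on my part, not a description of how rustc or DWARF actually encode types: it compares the length of the fully written-out name of S_n with the total length of one alias line per level, using the unqualified spelling from the list above. The function names expanded_len and aliased_len are hypothetical.

```rust
/// Length in bytes of the fully written-out name of S_n, where
/// S_0 = S<u8, (), ()> and S_n = S<u8, S_{n-1}, S_{n-1}>.
fn expanded_len(n: u32) -> u64 {
    let mut len = "S<u8, (), ()>".len() as u64;
    for _ in 0..n {
        // New name = "S<u8, " + inner + ", " + inner + ">"
        len = "S<u8, ".len() as u64 + len + ", ".len() as u64 + len + ">".len() as u64;
    }
    len
}

/// Total length in bytes of the alias lines `type S0 = ...;` through
/// `type Sn = ...;`, where each level refers to the previous alias.
fn aliased_len(n: u32) -> u64 {
    let mut total = "type S0 = S<u8, (), ()>;".len() as u64;
    for i in 1..=n {
        let line = format!("type S{} = S<u8, S{}, S{}>;", i, i - 1, i - 1);
        total += line.len() as u64;
    }
    total
}

fn main() {
    for &n in &[0u32, 1, 5, 11] {
        println!(
            "n = {:2}: expanded name = {} bytes, aliases = {} bytes",
            n,
            expanded_len(n),
            aliased_len(n)
        );
    }
}
```

In this model the expanded name roughly doubles per level and reaches 45047 bytes at n = 11, while the twelve alias lines together stay under 400 bytes. Real symbol names also carry crate paths and appear once per function and type, so the actual output is larger still.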

