Option to turn `Debug` into no-op

I think Rust/Cargo should have an option for making Debug a no-op (at least neuter derive(Debug) and perhaps {:?} in the formatter too), because:

  • It would help find and fix/banish crates that parse Debug output. Rust's derive(Debug) is not supposed to be a stable parseable format, and crates are not supposed to hack around privacy or lack of better interfaces by using the debug print format.

  • It would hopefully reduce binary sizes, since a bunch of formatting code and strings would be gone.

  • In closed-source software symbol names are not supposed to be in the executable, but Debug impls leak them even in stripped executables.
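To make the last two points concrete, here is a minimal sketch of what a derived `Debug` impl puts into the output. The `LicenseKey` type and the `render` helper are hypothetical names used only for illustration. The type name and every field name are embedded in the binary as string data, so stripping symbols alone does not remove them.

```rust
// Hypothetical type, used only for illustration.
#[derive(Debug)]
struct LicenseKey {
    customer_id: u32,
    secret: &'static str,
}

/// Renders a value through its derived `Debug` impl.
fn render() -> String {
    let key = LicenseKey { customer_id: 7, secret: "hunter2" };
    // Produces: LicenseKey { customer_id: 7, secret: "hunter2" }
    format!("{:?}", key)
}
```

The exact layout of this output is not a stability guarantee, which is why parsing it is considered a hack.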

I'm wondering what should be the scope of the no-Debug option. Should it just disable derive(Debug), or disable all debug formatting? Maybe go all the way and make assert! and unwrap() hardcode empty strings too?


People will parse whatever they can parse. You'll never work around that. There is already no guarantee about the details of Debug formatting, nor any guarantee that the type can be parsed in a meaningful way (parts of the data may just be omitted).

On the other hand, you are basically proposing to break all debuggers. Not that they work well with Rust to begin with. But it's even worse if you can't print out the debug representation of a type.

If you don't use it, it won't be included. If you want to do debug logging, you can write a macro which conditionally disables it.
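A conditionally disabled logging macro could look roughly like the sketch below. The `debug_log!` name and the `process` function are made up for this example, and the approach follows the same `if cfg!(debug_assertions)` pattern that `debug_assert!` uses. It relies on the compiler removing the constant-false branch in release builds.

```rust
/// Hypothetical macro: prints only when `debug_assertions` is enabled,
/// which is the default for non-release builds.
macro_rules! debug_log {
    ($($arg:tt)*) => {
        if cfg!(debug_assertions) {
            eprintln!($($arg)*);
        }
    };
}

/// Example caller: the `{:?}` use is still type-checked in release builds,
/// but the branch containing it is dead code there.
fn process(values: &[u32]) -> u32 {
    let total: u32 = values.iter().sum();
    debug_log!("summed {} values: {:?}", values.len(), values);
    total
}
```

This only covers call sites you control; it does nothing about `Debug` use inside dependencies.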

If you care about such things, then you should run your code through an obfuscator, and it can easily handle debug impls. It can replace all Debug::fmt impls with a no-op, if you want.
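For a single type you own, a no-op `Debug` in release builds can already be approximated by hand. The sketch below assumes `debug_assertions` is a good enough proxy for "debug build", and `Session` and `describe` are hypothetical names. It is not the proposed compiler option, just the per-type equivalent.

```rust
// Hypothetical type, for illustration only.
// In debug builds, use the normal derived impl.
#[cfg_attr(debug_assertions, derive(Debug))]
struct Session {
    user: String,
    token: u64,
}

// In builds without debug assertions, keep `Session: Debug` satisfied
// but emit nothing: no type name, no field names, no field values.
#[cfg(not(debug_assertions))]
impl std::fmt::Debug for Session {
    fn fmt(&self, _f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        Ok(())
    }
}

/// Formats a session with `{:?}`; the result depends on the build profile.
fn describe(session: &Session) -> String {
    format!("{:?}", session)
}
```

Code that only needs the `Debug` bound keeps compiling either way, while anything that depends on the text of the output breaks in release builds.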


Many crates use Debug impls in logging and similar places, often because a type doesn't have a Display impl. For example, consider Duration.

I'm not suggesting we shouldn't try this option; on the contrary, it makes sense to experiment with. But the scope of what would need to be "fixed" would be much broader than just "parsing Debug output".

Clippy already has a "restriction" lint for using Debug formatting in any way.
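The lint in question is `clippy::use_debug`, as far as I know. A small sketch of opting in on individual functions follows; the function names are hypothetical, and the lint only takes effect under `cargo clippy` (plain `rustc` accepts the attribute and ignores it).

```rust
use std::time::Duration;

#[warn(clippy::use_debug)]
fn describe_timeout(timeout: Duration) -> String {
    // Clippy should flag this `{:?}`. `Duration` has no `Display` impl,
    // so `Debug` is the path of least resistance here.
    format!("timeout = {:?}", timeout)
}

#[warn(clippy::use_debug)]
fn describe_timeout_explicit(timeout: Duration) -> String {
    // Spelling the format out by hand avoids `Debug` entirely.
    format!("timeout = {}ms", timeout.as_millis())
}
```

Being a restriction lint, it is off by default and only covers crates that choose to enable it.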

We could also, theoretically, introduce a mechanism in miri to do taint-tracking of bytes obtained via a Debug impl, and complain if they're ever parsed in a value-dependent way.

Of the concerns you're raising here, is your primary concern "parsing Debug output", or size / proprietary software? There are changes we could make that would help more with one than the other.

There are also lots of things we could do with build-std, if we can get that stabilized.

Currently parsing of debug formatting works in practice, so people don't have much incentive to avoid it. If an option that breaks it existed, it would be a bigger motivation to avoid parsing it. This is similar to abuse of panic for non-critical control flow — you can point out that panic=abort exists.

Debuggers don't use Debug impl. Debuggers are already "broken" by options like strip = true and opt-level = 2. Such options already exist, because in release binaries debugging ability may be irrelevant, or even explicitly undesirable, and then breaking debugging is a feature.

That's not true in practice. It's extremely impractical to remove derive(Debug) from all dependencies, and debug-printing machinery is pulled in by various panicking macros and functions, which are also hard to remove from all places.
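As an illustration of how panicking helpers pull in `Debug`: `Result::unwrap` requires `E: Debug` because it formats the error into the panic message. A hand-written alternative without that bound might look like the sketch below, where `unwrap_or_abort`, `Opaque`, and `parse_port` are hypothetical names. Whether this measurably shrinks a real binary depends on what else still references the formatting machinery.

```rust
use std::process;

/// Like `Result::unwrap`, but with no `E: Debug` bound:
/// on `Err` it aborts the process without formatting anything.
fn unwrap_or_abort<T, E>(result: Result<T, E>) -> T {
    match result {
        Ok(value) => value,
        Err(_) => process::abort(),
    }
}

// Hypothetical error type that deliberately has no `Debug` impl,
// so calling `.unwrap()` on `Result<_, Opaque>` would not compile.
struct Opaque;

fn parse_port(text: &str) -> Result<u16, Opaque> {
    text.parse::<u16>().map_err(|_| Opaque)
}
```

Usage would be `unwrap_or_abort(parse_port("8080"))`. Applying this across every dependency is exactly the part that is impractical today.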


Obviously if the option was something like debug_print = false, then users should not expect their debug printing to keep working exactly as before. But Duration in particular should have Display IMHO. SystemTime won't have Display, but also its Debug impl is awful, so I don't think anyone will miss it.

I do care about all the issues I've listed, but most about small executable sizes. I don't want any Rust-generated strings in my executables, and I do not care what panics or debug logs look like in release mode. I do use build-std already for features like panic-immediate-abort.

I think eradicating the parsing of Debug impls is important for the health of the Rust ecosystem. It's a hack, and if people continue to rely on it, it will ossify Rust's Debug so that it can't be changed in the future, and Rust will be left maintaining the worst reflection API. Finding it with miri is an interesting idea, but that only finds the offending code; it doesn't create a motivation to stop the bad practice.

Where have you seen Debug-parsing? Who's doing that? Any reputable crates?

Do you have any specific examples of debug parsing in mind? The thing I'm most familiar with is parsing logs, which should be a benefit, not a problem to solve. You can't really violate privacy by parsing debug strings, since you won't be able to create the types anyway.

A much bigger privacy violation is using transmute and offset_of! (which you seem to not consider an issue). That cannot be disabled in any way, and it allows directly reading (and writing) the private memory of any type.

That's true, but Debug is not the only thing which affects the panic formatting. It seems more reasonable to have an option for shorter or no formatting messages in panics, for the sake of small binaries. Or even remove unwinding entirely, and use panic=immediate-abort. That should solve the Debug issue as well.

You'll get no argument from me on that.

Neither does an option that people can just not set.

It depends on who sets the option.

The panic=abort option absolutely is a motivator for libraries to avoid using unwinding for control flow.

In the same fashion, if the binary is the one to set debug=none, that will be a motivator for libraries to avoid relying on Debug's output.


There is one case where Debug is the only way I can access the information I need:

I wholeheartedly agree that this should really be fixed, but the issue lies with std::error::Error and std::io::Error — these simply do not provide the means to extract useful information.

@kornel Fixing this situation should be a precondition for taking steps against using Debug formatting.


Are you referring to the issue with io::Error::source not giving back the immediate error?

There is an inherent method for getting it (std::io::Error::get_ref) so it should be possible to inspect any error in the chain using a loop going through Error::source and attempting downcasting to go through io::Error::get_ref as linked in the last comment on the issue.
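A rough sketch of such a loop, using only `std`: `find_in_chain` and `HandshakeFailed` are hypothetical names, with `HandshakeFailed` standing in for whatever library-specific error sits at the bottom of the chain.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hypothetical leaf error, standing in for a library-specific error type.
#[derive(Debug)]
struct HandshakeFailed;

impl fmt::Display for HandshakeFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("handshake failed")
    }
}

impl Error for HandshakeFailed {}

/// Walks an error chain looking for a `T`, stepping through `io::Error`
/// wrappers with `get_ref` and through everything else with `source`.
fn find_in_chain<'a, T: Error + 'static>(top: &'a (dyn Error + 'static)) -> Option<&'a T> {
    let mut current = Some(top);
    while let Some(err) = current {
        if let Some(found) = err.downcast_ref::<T>() {
            return Some(found);
        }
        current = match err.downcast_ref::<io::Error>() {
            // `io::Error::source` skips the immediately wrapped error,
            // so use `get_ref` to reach it instead.
            Some(io_err) => io_err
                .get_ref()
                .map(|inner| inner as &(dyn Error + 'static)),
            None => err.source(),
        };
    }
    None
}
```

With this, `find_in_chain::<HandshakeFailed>(&outer)` finds the leaf even when it is wrapped in two layers of `io::Error`, without looking at any `Debug` output.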

Regarding the first point, maybe we could somehow randomize the output of derived Debug impls? Change the field order, omit some sigils, include (or not) some parts of the module path. Things that keep it good enough for humans but make a parser's life miserable.

It would make it harder to write a parser, yes, but with Rust's strong parser ecosystem it would be a minor speed bump rather than any real deterrent. Debug impls usually include field names, and those are unique, making a parse of debug representation simply an unordered sequence of individual field parses.


Thanks for that link, that hint didn’t exist when the code in question was written. I have tried it out and it looks like I can indeed get rid of the Debug inspection, yay!

resulting code
fn is_sim_open(error: &TransportError<std::io::Error>) -> bool {
    match error {
        libp2p::TransportError::MultiaddrNotSupported(_x) => false,
        libp2p::TransportError::Other(err) => {
            let err = err
                .get_ref()
                .and_then(|e| e.downcast_ref::<DnsErr<std::io::Error>>());
            if let Some(DnsErr::Transport(err)) = err {
                let err = err
                    .get_ref()
                    .and_then(|e| e.downcast_ref::<super::TransportError>());
                if let Some(TransportTimeoutError::Other(EitherError::A(EitherError::B(
                    UpgradeError::Apply(NoiseError::Io(err)),
                )))) = err
                {
                    err.kind() == ErrorKind::InvalidData
                } else {
                    false
                }
            } else {
                false
            }
        }
    }
}

I don't know about him but what I really want is to reduce the size of binaries, both for wasm and embedded. Having a way to make Debug codegen more compact across all crates of a binary would be very useful.

But just stripping out all Debug (or more generally, all uses of the format machinery, even in the stdlib; and, in particular, not ever invoking format in panics!) would be very useful already. This should perhaps include manual Debug impls too.


But, and this sidetracks a bit from the original proposal, having an alternative, more compact codegen for #[derive(Debug)] would be even better, maybe using the same techniques as #[derive(uDebug)] from ufmt. The problem here is that the Rust ecosystem has already adopted Debug, deriving it in multiple crates, but almost no crate derives uDebug, so using ufmt directly is a non-starter.

I thought the only real way to improve the situation is to fork the stdlib and make the appropriate changes to the Debug derive (at least) and maybe to the formatting machinery, but if we could configure this somehow (perhaps with std-aware Cargo and feature flags for the stdlib), it would be perfect.

FWIW, format_args! and the formatting machinery are finally moving towards fixing up maintainability and improving their implementation, e.g.

It's certain that std won't be as size-optimal as ufmt, but it's a lot more feasible to improve std's formatting code size now with that PR landed. std's current implementation is known to not be great, and is definitely improvable; just not to the point of removing functionality.


FWIW, I've definitely encountered code using some of std's Debug impls in ways that wouldn't be caught by miri or any form of static analysis. Typically this took the form of relying on the Debug output of String, slices, etc. to generate Rust source code strings from values, similar to the internals discussion "Repr formatter with ShowRepr trait". If it were just strings, escape_debug would be an option here.

The problem with attempting static analysis of this is that the Debug output is only being parsed in a separate process. In this case the Debug output was being sent back to the Rust compiler itself. I've managed to remove most of this from the project, except for some opaque types where we currently lack any trait bounds that would allow us to do anything else.

And FWIW, before I get dog-piled: macros aren't really an option for the project in question. The input isn't Rust source code, and source code generation is actually entirely optional.
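For the strings-only case, a minimal sketch of what using `str::escape_debug` instead of `{:?}` might look like is below. The `to_rust_literal` helper is a hypothetical name, and this assumes the goal is to emit a Rust string literal into generated source.

```rust
/// Builds a Rust string literal from arbitrary text by escaping it with
/// `str::escape_debug` and adding the surrounding quotes explicitly.
fn to_rust_literal(text: &str) -> String {
    format!("\"{}\"", text.escape_debug())
}
```

For example, an input consisting of `say "hi"` followed by a newline comes out as the literal `"say \"hi\"\n"`. This keeps the escaping explicit instead of depending on the output of the `Debug` impl for `str`.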

This would make it incredibly annoying to diff two log files which contain Debug output.

