Option to turn `Debug` into no-op

kornel · October 18, 2022, 11:21am

I think Rust/Cargo should have an option for making Debug a no-op (at least neuter derive(Debug) and perhaps {:?} in the formatter too), because:

It would help find and fix/banish crates that parse Debug output. Rust's derive(Debug) is not supposed to be a stable parseable format, and crates are not supposed to hack around privacy or lack of better interfaces by using the debug print format.
It would hopefully reduce binary sizes, since a bunch of formatting code and strings would be gone.
In closed-source software symbol names are not supposed to be in the executable, but Debug impls leak them even in stripped executables.
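
To make the last point concrete, here is a minimal sketch. The type `LicenseCheck` and its fields are hypothetical and exist only for illustration. It shows that the text produced by a derived `Debug` impl contains the type and field names, which therefore have to be stored as string constants somewhere in the binary whenever the impl is kept.

```rust
// Hypothetical type, used only for illustration.
#[derive(Debug)]
struct LicenseCheck {
    secret_key_id: u32,
    is_trial: bool,
}

/// Formats a value with `{:?}`. The type name and the field names in the
/// result come from string constants emitted by the derive.
fn debug_text() -> String {
    let check = LicenseCheck { secret_key_id: 7, is_trial: false };
    format!("{:?}", check)
}
```

Calling `debug_text()` returns a string that mentions both `LicenseCheck` and `secret_key_id`, regardless of whether symbols were stripped from the executable.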

I'm wondering what should be the scope of the no-Debug option. Should it just disable derive(Debug), or disable all debug formatting? Maybe go all the way and make assert! and unwrap() hardcode empty strings too?

afetisov · October 18, 2022, 11:38am

People will parse whatever they can parse. You'll never work around that. There is already no guarantee about the details of Debug formatting, nor any guarantee that the type can be parsed in a meaningful way (parts of the data may just be omitted).

On the other hand, you are basically proposing to break all debuggers. Not that they work well with Rust to begin with. But it's even worse if you can't print out the debug representation of a type.

If you don't use it, it won't be included. If you want to do debug logging, you can write a macro which conditionally disables it.
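
A rough sketch of such a macro, under the assumption that "debug logging" means printing to stderr. The names `debug_log!`, `Request` and `handle` are made up for this example. The condition is a compile-time constant, so an optimized build without debug assertions can drop the branch and the formatting call inside it.

```rust
/// Hypothetical logging macro: prints only when debug assertions are on.
/// `cfg!(debug_assertions)` expands to a constant `true` or `false`.
macro_rules! debug_log {
    ($($arg:tt)*) => {
        if cfg!(debug_assertions) {
            eprintln!($($arg)*);
        }
    };
}

// Hypothetical type, used only for illustration.
#[derive(Debug)]
struct Request {
    id: u32,
}

fn handle(req: &Request) -> u32 {
    // Only reaches the Debug impl in builds with debug assertions enabled.
    debug_log!("handling {:?}", req);
    req.id + 1
}
```

Note that this only covers logging in your own crate. It does nothing about `Debug` impls or formatting calls inside dependencies.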

If you care about such things, then you should run your code through an obfuscator, and it can easily handle debug impls. It can replace all Debug::fmt impls with a no-op, if you want.

josh · October 18, 2022, 11:45am

Many crates use Debug impls in logging and similar places, often because a type doesn't have a Display impl. For example, consider Duration.
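
For illustration, one way a crate could avoid `{:?}` for `Duration` today is a small newtype with its own `Display` impl. This is only a sketch: the wrapper name `Seconds` and the chosen output format are assumptions, not anything std provides.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical newtype that gives `Duration` a `Display` impl for logging.
struct Seconds(Duration);

impl fmt::Display for Seconds {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Whole seconds, then milliseconds padded to three digits.
        write!(f, "{}.{:03}s", self.0.as_secs(), self.0.subsec_millis())
    }
}
```

With this, `Seconds(Duration::from_millis(1500)).to_string()` yields `1.500s`, and the format is under the crate's control rather than tied to whatever `Debug` prints.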

I'm not suggesting we shouldn't try this option; on the contrary, it makes sense to experiment with. But the scope of what would need to be "fixed" would be much broader than just "parsing Debug output".

Clippy already has a "restriction" lint for using Debug formatting in any way.
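
The lint in question is, as far as I know, `clippy::use_debug`. Since restriction lints are allowed by default, it has to be turned on explicitly. A minimal sketch, with a made-up function `describe`:

```rust
// `clippy::use_debug` is only checked when running `cargo clippy`;
// plain `rustc` accepts the attribute and otherwise ignores it.
#[deny(clippy::use_debug)]
fn describe(value: u32) -> String {
    // Display formatting is accepted. Switching this to `{value:?}` is the
    // kind of use the lint is meant to report.
    format!("value = {value}")
}
```

The same attribute can be applied crate-wide, but it only covers the crate that opts in, not its dependencies.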

We could also, theoretically, introduce a mechanism in miri to do taint-tracking of bytes obtained via a Debug impl, and complain if they're ever parsed in a value-dependent way.

Of the concerns you're raising here, is your primary concern "parsing Debug output", or size / proprietary software? There are changes we could make that would help more with one than the other.

There are also lots of things we could do with build-std, if we can get that stabilized.

kornel · October 18, 2022, 11:58am

Currently parsing of debug formatting works in practice, so people don't have much incentive to avoid it. If an option that breaks it existed, it would be a bigger motivation to avoid parsing it. This is similar to abuse of panic for non-critical control flow — you can point out that panic=abort exists.

Debuggers don't use Debug impls. Debuggers are already "broken" by options like strip = true and opt-level = 2. Such options already exist, because in release binaries debugging ability may be irrelevant, or even explicitly undesirable, and then breaking debugging is a feature.

The claim that Debug code isn't included if you don't use it is not true in practice. It's extremely impractical to remove derive(Debug) from all dependencies, and debug-printing machinery is pulled in by various panicking macros and functions, which are also hard to remove from all places.
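
As one example of a panicking function pulling in debug printing: `Result::unwrap` requires `E: Debug`, because its panic message formats the error with `{:?}`. The sketch below uses a hypothetical error type `OpaqueError` and a hypothetical helper `unwrap_or_abort` to show what a formatting-free alternative could look like.

```rust
use std::process;

// Hypothetical error type that deliberately has no `Debug` impl.
struct OpaqueError;

/// Returns the `Ok` value, or aborts the process without formatting anything.
/// `result.unwrap()` would not compile here, since `OpaqueError` is not `Debug`.
fn unwrap_or_abort<T>(result: Result<T, OpaqueError>) -> T {
    match result {
        Ok(value) => value,
        Err(_) => process::abort(),
    }
}
```

Doing this by hand in your own code is feasible. Doing it across every dependency is the impractical part.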

kornel · October 18, 2022, 12:23pm

Obviously if the option was something like debug_print = false, then users should not expect their debug printing to keep working exactly as before. But Duration in particular should have Display IMHO. SystemTime won't have Display, but also its Debug impl is awful, so I don't think anyone will miss it.

I do care about all the issues I've listed, but most about small executable sizes. I don't want any Rust-generated strings in my executables, and I do not care what panics or debug logs look like in release mode. I do use build-std already for features like panic-immediate-abort.

I think eradicating the parsing of Debug impls is important for the health of the Rust ecosystem. It's a hack, and if people continue to rely on it, it will ossify Rust's Debug: it won't be possible to change it in the future, and Rust will be left maintaining the worst reflection API. Finding it with miri is an interesting idea, but that only finds the offending code; it doesn't create a motivation to stop the bad practice.

jkugelman · October 18, 2022, 12:54pm

Where have you seen Debug-parsing? Who's doing that? Any reputable crates?

afetisov · October 18, 2022, 1:00pm

Do you have any specific examples of debug parsing in mind? The thing I'm most familiar with is parsing logs, which should be a benefit, not a problem to solve. You can't really violate privacy by parsing debug strings, since you won't be able to create the types anyway.

A much bigger privacy violation is using transmute and offset_of! (which you seem to not consider an issue). That cannot be disabled in any way, and it allows directly reading (and writing) the private memory of any type.

That's true, but Debug is not the only thing which affects the panic formatting. It seems more reasonable to have an option for shorter or no formatting messages in panics, for the sake of small binaries. Or even remove unwinding entirely, and use panic=immediate-abort. That should solve the Debug issue as well.

josh · October 18, 2022, 2:11pm

You'll get no argument from me on that.

Neither does an option that people can just not set.

CAD97 · October 18, 2022, 4:51pm

It depends on who sets the option.

The panic=abort option absolutely is a motivator for libraries to avoid using unwinding for control flow.

In the same fashion, if the binary is the one to set debug=none, that will be a motivator for libraries to avoid relying on Debug's output.

rkuhn · October 19, 2022, 8:12am

There is one case where Debug is the only way I can access the information I need:

github.com/ipfs-rust/ipfs-embed/blob/master/src/net/peers.rs#L772-L777

          let error = format!("{:?}", error);
          tracing::debug!(addr = %&addr, error = %&error, "non-validation dial failure");
          info.push_failure(normalize_addr_ref(addr, &peer_id).as_ref(), failure, true);
          // TCP simultaneous open leads to both sides being initiator in the Noise
          // handshake, which yields this particular error
          if error.contains("Other(A(B(Apply(Io(Kind(InvalidData))))))") {

I wholeheartedly agree that this should really be fixed, but the issue lies with std::error::Error and std::io::Error — these simply do not provide the means to extract useful information.

@kornel Fixing this situation should be a precondition for taking steps against using Debug formatting.

Nemo157 · October 19, 2022, 9:03am

Are you referring to the issue with io::Error::source not giving back the immediate error?

github.com/rust-lang/rust

Error::source and Error::cause do not expose immediate error in custom io::Error

opened 06:58PM - 14 Sep 22 UTC

jonhoo

T-libs-api C-bug A-error-handling

I tried this code ([playground](https://play.rust-lang.org/?version=stable&mode=…debug&edition=2021&gist=8652db49f5a01a73b64051a4aa9bc7d2)):

```rust
use std::error::Error;

#[derive(Debug, PartialEq, Eq)]
struct E;

impl std::fmt::Display for E {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "E")
    }
}

impl Error for E {}

#[test]
fn custom_io_has_source() {
    let e = E;
    let e = std::io::Error::new(std::io::ErrorKind::Other, e);
    assert_eq!(e.source().and_then(|s| s.downcast_ref::<E>()), Some(&E));
}
```

I expected to see this happen: the test passed

Instead, this happened: the test failed with

```
thread 'custom_io_has_source' panicked at 'assertion failed: `(left == right)`
  left: `None`,
 right: `Some(E)`', src/lib.rs:18:5
```

I don't _believe_ this is intentional, since the code standard library `impl Error for io::Error` specifically has: https://github.com/rust-lang/rust/blob/a92669638461836f41f54f95e396f9082bb91391/library/std/src/io/error.rs#L963

In fact, it was _specifically_ implemented in https://github.com/rust-lang/rust/pull/58963, but that PR doesn't appear to have included any tests. The test above fails even on 1.30.0 (which is the first Rust version with `fn source`). What's even more interesting is that `assert!(e.cause().is_some())` fails all the way back to Rust 1.0, so it sure seems like something is fishy here. I wonder if @thomcc may know how the source might be disappearing given his work in #87869?

Another interesting tidbit here is that using [`io_error_downcast`](https://github.com/rust-lang/rust/issues/99262) we _do_ get the inner error, meaning there will (when that feature lands) be a discrepancy between `io::Error`'s `.source()` and `.downcast()`.

```rust
assert_eq!(e.downcast::<E>().unwrap(), Box::new(E));
```

### Meta

`rustc --version --verbose`:

```
rustc 1.63.0 (4b91a6ea7 2022-08-08)
binary: rustc
commit-hash: 4b91a6ea7258a947e59c6522cd5898e7c0a6a88f
commit-date: 2022-08-08
host: aarch64-unknown-linux-gnu
release: 1.63.0
LLVM version: 14.0.5
```

There is an inherent method for getting it (std::io::Error::get_ref) so it should be possible to inspect any error in the chain using a loop going through Error::source and attempting downcasting to go through io::Error::get_ref as linked in the last comment on the issue.
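
A sketch of that loop, assuming a generic helper is wanted. The function name `find_in_chain` and the error type `Inner` are made up for this example. Each step first tries the downcast, then moves to the next link: through `get_ref` when the current error is an `io::Error`, and through `Error::source` otherwise.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hypothetical inner error type, used only for illustration.
#[derive(Debug, PartialEq)]
struct Inner;

impl fmt::Display for Inner {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("inner error")
    }
}

impl Error for Inner {}

/// Walks the chain starting at `err` and returns the first `T` found.
fn find_in_chain<'a, T: Error + 'static>(err: &'a (dyn Error + 'static)) -> Option<&'a T> {
    let mut current = Some(err);
    while let Some(e) = current {
        if let Some(found) = e.downcast_ref::<T>() {
            return Some(found);
        }
        current = match e.downcast_ref::<io::Error>() {
            // For io::Error, look at the wrapped custom error directly,
            // since `source` does not return it (see the issue above).
            Some(io_err) => io_err.get_ref().map(|inner| inner as &(dyn Error + 'static)),
            None => e.source(),
        };
    }
    None
}
```

For example, `find_in_chain::<Inner>(&io::Error::new(io::ErrorKind::Other, Inner))` returns `Some(&Inner)` without looking at any Debug output.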

the8472 · October 19, 2022, 2:17pm

Regarding the first point maybe we could somehow randomize derived Debugs? Change the field order, omit some sigils, include (or not) some parts of the module path. Things that keep it good enough for humans but make a parser's life miserable.

afetisov · October 19, 2022, 2:22pm

It would make it harder to write a parser, yes, but with Rust's strong parser ecosystem it would be a minor speed bump rather than any real deterrent. Debug impls usually include field names, and those are unique, making a parse of debug representation simply an unordered sequence of individual field parses.
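
To illustrate how little shuffling the field order would cost a determined parser, here is a deliberately naive sketch. The function `field_value` is hypothetical, it does not handle nested values or field names that are substrings of other names, and the inputs in the note below are hand-written strings rather than real Debug output.

```rust
/// Naive lookup of `name: value` in a Debug-style string.
/// The position of the field within the string does not matter.
fn field_value<'a>(debug: &'a str, name: &str) -> Option<&'a str> {
    let key = format!("{name}: ");
    let start = debug.find(&key)? + key.len();
    let rest = &debug[start..];
    // The value ends at the next separator, or at the end of the input.
    let end = rest
        .find(|c: char| c == ',' || c == ' ' || c == '}')
        .unwrap_or(rest.len());
    Some(&rest[..end])
}
```

Both `field_value("Point { x: 1, y: 2 }", "y")` and `field_value("Point { y: 2, x: 1 }", "y")` return `Some("2")`, which is the sense in which reordering is only a speed bump.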

rkuhn · October 19, 2022, 3:26pm

Thanks for that link, that hint didn’t exist when the code in question was written. I have tried it out and it looks like I can indeed get rid of the Debug inspection, yay!

resulting code

fn is_sim_open(error: &TransportError<std::io::Error>) -> bool {
    match error {
        libp2p::TransportError::MultiaddrNotSupported(_x) => false,
        libp2p::TransportError::Other(err) => {
            let err = err
                .get_ref()
                .and_then(|e| e.downcast_ref::<DnsErr<std::io::Error>>());
            if let Some(DnsErr::Transport(err)) = err {
                let err = err
                    .get_ref()
                    .and_then(|e| e.downcast_ref::<super::TransportError>());
                if let Some(TransportTimeoutError::Other(EitherError::A(EitherError::B(
                    UpgradeError::Apply(NoiseError::Io(err)),
                )))) = err
                {
                    err.kind() == ErrorKind::InvalidData
                } else {
                    false
                }
            } else {
                false
            }
        }
    }
}

dlight · October 19, 2022, 9:10pm

I don't know about him but what I really want is to reduce the size of binaries, both for wasm and embedded. Having a way to make Debug codegen more compact across all crates of a binary would be very useful.

But just stripping out all Debug (or more generally, all uses of the format machinery, even in the stdlib; and, in particular, never invoking format in panics!) would be very useful already. This should perhaps include manual Debug impls too.

But, and this sidetracks a bit from the original proposal, having an alternative, more compact codegen for #[derive(Debug)] would be even better, maybe using the same techniques as #[derive(uDebug)] from ufmt. The problem here is that the Rust ecosystem has already adopted Debug, deriving it in multiple crates, but almost no crate derives uDebug, so using ufmt directly is a non-starter.

I thought the only real way to improve the situation is to fork the stdlib and make the appropriate changes to the Debug derive (at least) and maybe to the formatting machinery, but if we could configure this somehow (perhaps with std-aware Cargo and feature flags for the stdlib), it would be perfect.

CAD97 · October 19, 2022, 11:38pm

FWIW, format_args! and the formatting machinery are finally moving towards fixing up maintainability and improving their implementation, e.g.

github.com/rust-lang/rust

Rewrite and refactor format_args!() builtin macro.

rust-lang:master ← m-ou-se:format-args-2

opened 10:57AM - 25 Aug 22 UTC

m-ou-se

+1449 -1540

This is a near complete rewrite of `compiler/rustc_builtin_macros/src/format.rs`. This gets rid of the massive unmaintainable [`Context` struct](https://github.com/rust-lang/rust/blob/76531befc4b0352247ada67bd225e8cf71ee5686/compiler/rustc_builtin_macros/src/format.rs#L176-L263), and splits the macro expansion into three parts:

1. First, `parse_args` will parse the `(literal, arg, arg, name=arg, name=arg)` syntax, but doesn't parse the template (the literal) itself.
2. Second, `make_format_args` will parse the template, the format options, resolve argument references, produce diagnostics, and turn the whole thing into a `FormatArgs` structure.
3. Finally, `expand_parsed_format_args` will turn that `FormatArgs` structure into the expression that the macro expands to.

In other words, the `format_args` builtin macro used to be a hard-to-maintain 'single pass compiler', which I've split into a three phase compiler with a parser/tokenizer (step 1), semantic analysis (step 2), and backend (step 3). (It's compilers all the way down. ^^)

This can serve as a great starting point for https://github.com/rust-lang/rust/issues/99012, which will only need to change the implementation of 3, while leaving step 1 and 2 unchanged. It also makes https://github.com/rust-lang/compiler-team/issues/541 easier, which could then upgrade the new `FormatArgs` struct to an `ast` node and remove step 3, moving that step to later in the compilation process.

It also fixes a few diagnostics bugs. This also [significantly reduces](https://gist.github.com/m-ou-se/b67b2d54172c4837a5ab1b26fa3e5284) the amount of generated code for cases with arguments in non-default order without formatting options, like `"{1} {0}"` or `"{a} {}"`, etc.

It's certain that std won't be as size-optimal as ufmt, but it's much more feasible to improve std's formatting code size now that that PR has landed. std's current implementation is known to not be great, and is definitely improvable; just not to the point of removing functionality.

ratmice · October 20, 2022, 3:13am

FWIW, I've definitely encountered code using some of std's Debug impls in ways which wouldn't be caught by miri or any form of static analysis. Typically this took the form of relying on the Debug output of String, slices, etc. to generate Rust source code strings from string values, in ways similar to the internals discussion "Repr formatter with ShowRepr trait". If it was just strings, escape_debug would be an option here.

The problem with attempting static analysis of this is that the Debug output is only being parsed in a separate process. In this case the Debug output was being sent back to the Rust compiler itself. I've managed to remove most of this from the project, except for some opaque types where we currently lack any trait bounds that would allow us to do anything else.

And FWIW, before I get dog-piled: macros aren't really an option for the project in question. The input isn't Rust source code, and source code generation is actually entirely optional.

ChrisJefferson · October 24, 2022, 3:38pm

This would make it incredibly annoying to diff two log files which contain Debug output.

system · January 22, 2023, 3:38pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
