pre-RFC: Numeric Debug Formatting

Currently format! supports combining hex formatting (x) and debug formatting (?). This allows one to get hexadecimal numbers in the debug output.

However, this is currently very limited:

  • Uses an internal-only API not available to third-party types
  • Debug alt-mode affects formatting at all levels: you can't use # to get the 0x in front of hexadecimals without also triggering "alt mode" (usually pretty-printed) debug output for everything else
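
For reference, the current behavior can be reproduced on stable Rust. This is only a small illustration; the `Reg` struct is a hypothetical example type, not something from the proposal.

```rust
// `Reg` is a hypothetical type used only to show how `#` affects nested output.
#[derive(Debug)]
struct Reg {
    addr: u32,
}

fn main() {
    let arr = [8u8, 9, 10, 11, 12];

    // Hex digits inside ordinary (single-line) debug output.
    println!("{:x?}", arr); // prints [8, 9, a, b, c]

    // `#` adds the 0x prefix, but also switches the whole value
    // to pretty-printed (multi-line) debug output.
    println!("{:#x?}", Reg { addr: 255 });
}
```

The second call prints the struct across three lines with `addr: 0xff`, which shows that the prefix and the pretty-printing cannot currently be requested separately.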

These limitations have caused the library team to pause adding more number formatting options in Debug output.

Sketch

Essentially, the idea is implemented in three parts:

  1. Change the syntax so # applies only to the subsequent character, rather than the whole formatter.
     • {:x?} behavior does not change
     • use {:#x#?} to get the same behavior as the current {:#x?}
     • use {:#x?} to get 0x prefix without pretty-printing
     • use {:x#?} to get pretty-printing without 0x prefix
  2. Extend the syntax to support octal, binary, etc.
  3. Provide a .debug_list()-like helper for delegating to LowerHex, etc.:

pub struct DebugNumber<'b, 'c, T: ?Sized> {
    /// Number being formatted
    number: &'c T,

    /// New formatter built from settings after the `?`
    formatter: fmt::Formatter<'b>,
    /// Number formatting mode
    mode: DebugNumberMode,

    /// Function to use for `?`
    default: fn(&T, &mut fmt::Formatter<'b>) -> fmt::Result,
    /// Function to use for `?x`
    lower_hex: fn(&T, &mut fmt::Formatter<'b>) -> fmt::Result,
    /// Function to use for `?b`
    binary: fn(&T, &mut fmt::Formatter<'b>) -> fmt::Result,

    // ....
}

impl<'a> Formatter<'a> {
    /// Creates a `DebugNumber` builder designed to assist 
    /// with creation of `fmt::Debug` implementations for
    /// numeric values.
    ///
    /// Used to enable alternate number formatting flags
    /// in combination with `Debug`, like in `{:?x}`.
    ///
    /// # Examples
    /// 
    /// ```
    /// impl fmt::Debug for u32 {
    ///     fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
    ///         fmt
    ///             .debug_number(self, fmt::Display::fmt)
    ///             .lower_hex()
    ///             .binary()
    ///             .finish()
    ///     }
    /// }
    /// ```
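    pub fn debug_number<'b, 'c, T: fmt::Debug + ?Sized>(
        &'b mut self,
        number: &'c T,
        default: fn(&T, &mut fmt::Formatter<'b>) -> fmt::Result,
    ) -> DebugNumber<'b, 'c, T> {
        // `formatter` and `mode` are derived from `self`; their construction is elided in this sketch.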
        DebugNumber {
            number,
            formatter,
            mode,
            default,
            lower_hex: default,
            binary: default,
        }
    }
}

The builder has a method to enable each number formatting mode:

impl<'b, 'c, T: fmt::Debug + ?Sized> DebugNumber<'b, 'c, T> {
    /// Enable `?x` formatting for this number.
    pub fn lower_hex(&mut self) -> &mut Self
    where
        T: fmt::LowerHex
    {
        self.lower_hex = |number, formatter| {
            fmt::LowerHex::fmt(number, formatter)
        };

        self
    }

    /// Enable `?b` formatting for this number.
    pub fn binary(&mut self) -> &mut Self
    where
        T: fmt::Binary
    {
        self.binary = |number, formatter| {
            fmt::Binary::fmt(number, formatter)
        };

        self
    }

    // ....

    /// Complete the formatting, delegating to whichever
    /// numeric formatting was specified after the `?`.
    pub fn finish(&mut self) -> fmt::Result {
        match self.mode {
            DebugNumberMode::LowerHex => (self.lower_hex)(self.number, &mut self.formatter),
            DebugNumberMode::Binary => (self.binary)(self.number, &mut self.formatter),

            // ....

            _ => (self.default)(self.number, &mut self.formatter),
        }
    }
}
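
The delegation logic can also be tried outside the standard library. The following is a minimal user-land sketch, not the proposed implementation: a third-party type cannot read a number mode from the format string, so the hypothetical `Mode` enum is stored in a hypothetical wrapper type `Num`, and `DebugNumber::new` takes the formatter explicitly instead of being a `Formatter` method. The builder methods also take `self` by value here, which differs from the `&mut self` signatures above.

```rust
use std::fmt;

/// Hypothetical stand-in for the mode that would be parsed from the format string.
#[derive(Clone, Copy)]
enum Mode {
    Default,
    LowerHex,
    Binary,
}

type FmtFn<T> = fn(&T, &mut fmt::Formatter<'_>) -> fmt::Result;

struct DebugNumber<'a, 'f, T> {
    number: &'a T,
    formatter: &'a mut fmt::Formatter<'f>,
    mode: Mode,
    default: FmtFn<T>,
    lower_hex: FmtFn<T>,
    binary: FmtFn<T>,
}

impl<'a, 'f, T> DebugNumber<'a, 'f, T> {
    /// Every mode falls back to `default` until it is explicitly enabled.
    fn new(formatter: &'a mut fmt::Formatter<'f>, number: &'a T, mode: Mode, default: FmtFn<T>) -> Self {
        DebugNumber { number, formatter, mode, default, lower_hex: default, binary: default }
    }

    /// Enable lower-hex output; only callable when `T: LowerHex`.
    fn lower_hex(mut self) -> Self
    where
        T: fmt::LowerHex,
    {
        self.lower_hex = <T as fmt::LowerHex>::fmt;
        self
    }

    /// Enable binary output; only callable when `T: Binary`.
    fn binary(mut self) -> Self
    where
        T: fmt::Binary,
    {
        self.binary = <T as fmt::Binary>::fmt;
        self
    }

    /// Delegate to whichever function matches the selected mode.
    fn finish(self) -> fmt::Result {
        match self.mode {
            Mode::LowerHex => (self.lower_hex)(self.number, self.formatter),
            Mode::Binary => (self.binary)(self.number, self.formatter),
            Mode::Default => (self.default)(self.number, self.formatter),
        }
    }
}

/// Hypothetical wrapper that carries its own mode, since user code
/// cannot observe the mode from the format string today.
struct Num {
    value: u32,
    mode: Mode,
}

impl fmt::Debug for Num {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        DebugNumber::new(f, &self.value, self.mode, <u32 as fmt::Display>::fmt)
            .lower_hex()
            .binary()
            .finish()
    }
}

fn main() {
    let hex = [
        Num { value: 10, mode: Mode::LowerHex },
        Num { value: 255, mode: Mode::LowerHex },
    ];
    println!("{:?}", hex); // prints [a, ff]
    println!("{:#?}", Num { value: 5, mode: Mode::Binary }); // prints 0b101
    println!("{:04?}", Num { value: 10, mode: Mode::Default }); // prints 0010
}
```

Because the formatter is passed through unchanged, flags such as `#` and `04` still reach the delegated implementation, which is the property the builder relies on.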

API proof-of-concept playground

Technically the API could be used for the current syntax. But given how ambiguous it is, and that it only supports upper/lower hex, I figure they should probably go together.

Just to be clear - the goal here is to allow you to separately print these three things, correct?

// current `{arr:x?}`
[8, 9, a, b, c]

// current `{arr:#x?}`
[
    0x8,
    0x9,
    0xa,
    0xb,
    0xc,
]

// Currently not possible without iteration
[0x08, 0x09, 0x0a, 0x0b, 0x0c]

Or could you add some examples to the post?
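
For comparison, here is a sketch of the iteration workaround that the third output currently requires. The helper name `hex_list` is hypothetical and chosen only for this example.

```rust
/// Hypothetical helper: formats each byte as 0x-prefixed, zero-padded hex
/// and joins the results into a single-line list.
fn hex_list(values: &[u8]) -> String {
    let items: Vec<String> = values
        .iter()
        // Width 4 counts the "0x" prefix, so this yields two hex digits.
        .map(|n| format!("{:#04x}", n))
        .collect();
    format!("[{}]", items.join(", "))
}

fn main() {
    let arr = [8u8, 9, 10, 11, 12];
    println!("{}", hex_list(&arr)); // prints [0x08, 0x09, 0x0a, 0x0b, 0x0c]
}
```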


  • {:x?} would become {:?x}
  • {:#x?} would become {:#?#x}

I think that rearranging the formatters would be a pretty big pill to swallow at this point, is that necessary? Instead, maybe we could apply a rule like "format modifiers (#, 02, etc) only apply to the next formatter (x, b, o, ?, etc)". For example:

  1. :#02x? would print 0x00 style in a one line list (currently prints 0x0 format in pretty list, note there is no leading 0)
  2. :#02x#? would print 0x00 style in a pretty list (currently errors)
  3. :02x#? would print 00 in a pretty list (currently errors)

If this were the case, I believe that the only breaking change would be that :#x? would go from printing a 0x0-style pretty list to a 0x0-style one-line list. Which is annoying but probably doable without an edition, since I don't think we guarantee the debug output.

This would also open the door for something like :n16?/:#02xn16? to prettyprint with 16 items per line rather than one per line.

If I missed something about why this wouldn't be possible, I suppose that would just be good to include in the summary.

Correct


It is certainly possible to reuse the existing syntax with this same proposed API. However, there are a few reasons I would prefer to change it:

  1. Even though we make no guarantees for Debug output, I would rather not change the behavior out from under people. Introducing new behavior is likely to cause confusion for people who are used to the existing behavior, and leads to decay of older Rust tutorials, books, and other materials.

  2. ? acts as a very useful delimiter that visually splits the overall formatting options from the numeric ones. This is likely to prevent mistakes and confusion.

  3. By using ? as a delimiter, you can specify the number formatting options without specifying an alternate number mode. Example: {:?04} can pad just the numbers to four digits. This is probably the strongest argument, as I can't think of any other way to make this unambiguous without adding a decimal specifier.

I've made a note to include this reasoning in the eventual RFC under alternatives.

My 2¢ is that I'd rather do something like throw a "this output may have changed, do x instead" warning for {:#x?} during edition 2024 only, and allow suppressing it. This would only affect anyone using this formatting (about 470 uses per this quick search), and adding a #[allow(...)] is a once-per-crate change. Compare that to reordering the sigils, which affects everybody using debug formatting and requires every instance to be updated (cargo fix can help, but it's still a lot more churny).

  1. By using ? as a delimiter, you can specify the number formatting options without specifying an alternate number mode. Example: {:?04} can pad just the numbers to four digits. This is probably the strongest argument, as I can't think of any other way to make this unambiguous without adding a decimal specifier.

This is a good use case, but why would {:04?} be ambiguous? It currently seems to do the right thing:

fn main() {
    let arr = [1, 2, 3u32];
    println!("{:04?}", arr);
}
[0001, 0002, 0003]

Okay I guess I haven't experimented enough. It appears that most (if not all besides #) debug format options are currently only applied for numbers.

For some reason I thought that {: ^30?} would pad the result of the debug output. But that is not the general case. For a list of numbers, it applies to each number in the list rather than the list as a whole.
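
A small check of this on stable Rust, using a narrower width and a visible fill character than the example above so that the padding is easy to see:

```rust
fn main() {
    // Fill `*`, centered, width 5: applied to each number, not to the list.
    let s = format!("{:*^5?}", [1, 2]);
    println!("{}", s); // prints [**1**, **2**]
}
```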

I'll need to think about this and revise the proposal. Thanks for pointing it out.

Okay so I've modified the proposal to use the {:#x#?} syntax. It works just as well as my original proposal, with one hypothetical exception. My original proposal supports separate formatting options (such as padding) for numbers vs the whole value.

With the revised proposal, there would be no way to express "pad the whole debug output to a width of 30 and pad each number to 2 digits", because it would look like {:0230?} (ambiguous with "pad each number to 230 digits") unless we use the number mode as a delimiter and add a new one for "default" or "decimal": {:02d30?}.

This is not currently supported, and probably isn't nearly as helpful in Debug output as it is in Display output. So I decided it wasn't worth the churn.
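
If that combination is ever needed, it can be approximated today in two steps. This is a sketch with a hypothetical helper name, `padded_debug`: format the debug output first, then pad the resulting string.

```rust
/// Hypothetical helper: pads each number to 2 digits via the debug flags,
/// then pads the whole debug string to a width of 30.
fn padded_debug(values: &[u32]) -> String {
    let inner = format!("{:02?}", values); // width applies to each number
    format!("{:<30}", inner) // width applies to the finished string
}

fn main() {
    let arr = [1u32, 2, 3];
    println!("{}|", padded_debug(&arr)); // the `|` marks the end of the padding
}
```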

Awesome, looks RFC-ready to me 🙂. I always find myself looping, so I would love to have this.