Idea: expose the buffer for `fmt::Arguments`, so users can pass it around

theme · September 23, 2024, 4:52am

Motivation

As soon as you try to use fmt::Arguments in any way other than "immediately pass it to some function", you're in a lot of pain. For example, you can't even store a fmt::Arguments in a local variable, much less return one from a function or store a collection of them.

This is because fmt::Arguments needs to refer to buffers containing variable-length slices that hold information about each of the things passed to format_args!(). Therefore, format_args!() essentially creates an array as a hidden local variable, and then produces a fmt::Arguments that refers to that local variable.
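A minimal sketch of the limitation described above. The function names `render`, `works`, and `broken` are made up for illustration; only `format_args!` and `std::fmt::format` are real standard-library items. The commented-out function is the shape that is expected to be rejected, because the returned value would borrow the hidden temporaries created inside the function.

```rust
use std::fmt;

// Consumes the `Arguments` right away, which is the one pattern that is
// always painless.
fn render(args: fmt::Arguments<'_>) -> String {
    fmt::format(args)
}

// Fine: the hidden buffer created by `format_args!` lives until the end of
// the call expression, which is long enough for `render` to use it.
fn works(x: i32) -> String {
    render(format_args!("x = {}", x))
}

// Expected to fail to compile: the `Arguments` would refer to a hidden
// temporary that is dropped when the function returns.
//
// fn broken<'a>(x: &'a i32) -> fmt::Arguments<'a> {
//     format_args!("x = {}", x)
// }
```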

To fix this, I propose exposing an API that allows users to store these buffers wherever they want. This will not affect the performance of existing code in any way, and is fully backwards-compatible. I believe it is also forwards-compatible with any changes we could make in the future.

Proposed API

Currently, fmt::Arguments is implemented as follows:

pub struct Arguments<'a> {
    pieces: &'a [&'static str],
    fmt: Option<&'a [Placeholder]>,
    args: &'a [Argument<'a>],
}

I would propose adding the following to the fmt module (where pub marks public API):

pub trait ArgumentsBuffer: Sealed {}

trait Sealed {
    fn fmt<'a>(&'a self) -> Option<&'a [Placeholder]>;
    fn args<'a>(&'a self) -> &'a [Argument<'a>];
}

// Implements ArgumentsBuffer and Sealed. Is covariant.
struct ArgumentsBufferArrays<'a, const N: usize, const M: usize> {
    fmt: Option<[Placeholder; N]>,
    args: [Argument<'a>; M],
}

pub struct RawArguments<'a, B: ?Sized> {
    pieces: &'a [&'static str],
    buffer: B,
}

impl<'a, B: ArgumentsBuffer + 'a + ?Sized> RawArguments<'a, B> {
    pub fn to_arguments(&'a self) -> Arguments<'a> {
        Arguments {
            pieces: self.pieces,
            fmt: self.buffer.fmt(),
            args: self.buffer.args(),
        }
    }
}

impl<'a, T: Unsize<U> + ?Sized, U: ?Sized>
    CoerceUnsized<RawArguments<'a, U>> for RawArguments<'a, T> { }

There would be a raw_format_args!() macro with the same syntax as format_args!(). This raw_format_args!() macro would evaluate to a value of type RawArguments<'a, impl ArgumentsBuffer + 'a>, where the hidden type inside the impl is ArgumentsBufferArrays with the appropriate array lengths, and 'a is the lifetime where references to all of the formatted values are valid.

fn raw_format_args_example<'a>(x: &'a (impl Display + ?Sized))
    -> RawArguments<'a, impl ArgumentsBuffer + 'a>
{
    raw_format_args!("{}", *x)
}

API Usage

For existing users of format_args!() and fmt::Arguments, nothing would change. However, users who wish to pass around format arguments would be able to use raw_format_args!() to create a RawArguments. This RawArguments value would contain a buffer inside, as opposed to referring to a local variable like Arguments does. Therefore, RawArguments can be passed around wherever the user wants.

If the user wants to store a heterogeneous collection of RawArguments, they can unsize-coerce a Box<RawArguments<'a, impl ArgumentsBuffer + 'a>> into a Box<RawArguments<'a, dyn ArgumentsBuffer + 'a>>, and put them into a collection of their choice.

To actually print a RawArguments, it can be converted into an Arguments and then printed as normal.
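For comparison, here is roughly what storing a heterogeneous collection of deferred format arguments looks like on stable Rust today, without the proposed API. This is a sketch under stated assumptions: `DeferredArgs`, `Greeting`, `Pair`, `render_all`, and `demo` are hypothetical names, and the callback-based `with_arguments` is a workaround, not the proposed `to_arguments`. A callback is needed precisely because a method cannot return a fmt::Arguments that borrows buffers created inside it.

```rust
use std::fmt::{self, Write};

// Hypothetical stand-in for the proposed `RawArguments`: the value is stored
// freely, and a `fmt::Arguments` is only built inside `with_arguments`.
trait DeferredArgs {
    fn with_arguments(&self, f: &mut dyn FnMut(fmt::Arguments<'_>));
}

struct Greeting<'a>(&'a str);
struct Pair(i32, i32);

impl DeferredArgs for Greeting<'_> {
    fn with_arguments(&self, f: &mut dyn FnMut(fmt::Arguments<'_>)) {
        // The `Arguments` is passed on immediately, so its hidden buffer
        // lives long enough.
        f(format_args!("hello, {}", self.0))
    }
}

impl DeferredArgs for Pair {
    fn with_arguments(&self, f: &mut dyn FnMut(fmt::Arguments<'_>)) {
        f(format_args!("({}, {})", self.0, self.1))
    }
}

// Renders every stored item into its own `String`.
fn render_all(items: &[Box<dyn DeferredArgs + '_>]) -> Vec<String> {
    items
        .iter()
        .map(|item| {
            let mut out = String::new();
            item.with_arguments(&mut |args: fmt::Arguments<'_>| {
                out.write_fmt(args).expect("writing to a String cannot fail")
            });
            out
        })
        .collect()
}

// Builds a heterogeneous collection and renders it.
fn demo() -> Vec<String> {
    let name = String::from("world");
    let items: Vec<Box<dyn DeferredArgs + '_>> =
        vec![Box::new(Greeting(&name)), Box::new(Pair(1, 2))];
    render_all(&items)
}
```

The cost of this workaround is one hand-written type and impl per format string, which is the boilerplate that a raw_format_args!() macro would generate automatically.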

Links

Links for background information:

bjorn3 · September 23, 2024, 12:18pm

It wouldn't be forwards compatible with merging the fmt and args fields. I would personally prefer that we get the formatting machinery to a state where we are actually happy with how it works before adding any new features that could constrain how we implement it or otherwise make it harder to change.

theme · September 23, 2024, 1:01pm

It's perfectly forwards-compatible with merging the fmt and args fields. They are accessible only via the Sealed trait, which is private. Therefore, the trait can be changed without breaking any user's code.

I don't think that this interface would constrain the implementation in any way.

theme · September 23, 2024, 1:12pm

For reference, here is all the API that users see:

pub trait ArgumentsBuffer: Sealed {}

pub struct RawArguments<'a, B: ?Sized> { .... }

impl<'a, B: ArgumentsBuffer + 'a + ?Sized> RawArguments<'a, B> {
    pub fn to_arguments(&'a self) -> Arguments<'a> { .... }
}

impl<'a, T: Unsize<U> + ?Sized, U: ?Sized>
    CoerceUnsized<RawArguments<'a, U>> for RawArguments<'a, T> { }

In addition, a RawArguments can be constructed via raw_format_args!().

scottmcm · September 23, 2024, 5:39pm

I think part of the plan with super let is that the macro will be able to use super let and thus at least putting format_args into a local will work.

theme · September 24, 2024, 1:52am

Now that I've thought about it more, if RawArguments is changed from a struct to a trait, then the public API can be reduced to just the following:

// object-safe, so people can store Box<dyn RawArguments>
pub trait RawArguments: Sealed {
    fn to_arguments<'a>(&'a self) -> Arguments<'a>;
}

Usage:

fn raw_format_args_example<'a>(x: &'a (impl Display + ?Sized))
    -> impl RawArguments + 'a
{
    raw_format_args!("{}", *x)
}

Would that work better for forwards compatibility?
