Pre-RFC v2 - Static Function Argument Unpacking

This is the second version of my previous pre-RFC on Static Function Argument Unpacking. It's still quite rough around the edges, but the major points in the previous thread have been taken into account in this new proposal. Most notably, the proposed syntax has been changed to ...expr from ..expr, and unpacking of structs has been moved under future possibilities. Heartfelt thanks to those who submitted comments to the previous thread, and sorry for disappearing for half a year.

Edit 1: Replace a todo with a link to this IRLO thread.

Edit 2: Use … in example list and suggest its use in drawbacks. Mention postfix macros in alternatives. Link to C-variadic functions and to previous syntax for ..= for more references to .... Miscellaneous language and markdown tweaks and clarifications throughout.

Edit 3: Resolved the questions of references and mutability and expanded on them under subchapters "References and Mutability in Function Parameters" and "Assigning Automatic Reference Status or Mutability". Miscellaneous tweaks and fixes.

Edit 4: Solved the rest of the todos. Known remaining work includes changing to present tense ("would" -> "is/does") and other mostly language-related improvements.

Edit 5: Moved some paragraphs and subchapters to more logical places. Added further elaboration on some subjects.

Edit 6: Further polishing and reorganizing of the text.


Pre-RFC v2 - Static Function Argument Unpacking

  • Feature Name: static_fn_arg_unpacking
  • Start Date: 2024-10-xx
  • ...

Summary

This RFC adds call-site unpacking of tuples, tuple structs, and fixed-size arrays, using ...expr within the function call's parentheses as a shorthand for passing arguments. The full contents of these collections with known sizes are unpacked directly as the next arguments of a function call, desugaring into the corresponding element accesses during compilation.

Motivation

Argument unpacking reduces the verbosity and increases the ergonomics of Rust:

  • Improves code writing ergonomics by skipping repetitive, unneeded intermediate steps.
  • Allows more concise code in terms of number of lines and, occasionally, line length.
  • Allows reducing the number of named local variables in scope.
  • Is intuitive for developers accustomed to argument unpacking from other programming languages.
  • Adds a missing piece to the family of a certain kind of syntactic sugar already in Rust, with features such as struct update syntax and destructuring assignment.

Furthermore, this proposal provides groundwork for both the syntax and its intended use for possible next steps and related proposals: As long as compatibility is sufficiently considered, the proposed feature could also reduce the workload and scope of more general and ambitious initiatives by splitting the work and iterating towards them in smaller steps. This may be a double-edged sword, however, as demonstrated under Drawbacks.

Guide-Level Explanation

Consider the ellipsis, i.e. the three dots, ..., as a machine-readable shorthand for et cetera, used as a symbol for telling the compiler where to get the rest of the stuff from. The where part – written immediately after the ellipsis operator – is an expression that has a collection type with a size known during program compilation. This ...expr is used within the parentheses of a function call to enter items in the collection as arguments to the function being called. Passing the arguments automatically from the collection in such a way is called argument unpacking.

The collection types that can be unpacked as arguments this way are tuples, tuple structs, and fixed-size arrays. Instead of first taking items out of a collection and then immediately passing them on, one by one, to a user of the items, argument unpacking streamlines this operation by allowing the developer to directly forward the items to where they are used. Argument unpacking applies to function, method, and closure calls, but it does not work with macro calls.

The types of the elements in the collection being unpacked, in the order in which they are in the collection, must be compatible with the next of the remaining function parameters being filled, i.e. function parameters that don't have an argument yet, and they must be known at compile time. Furthermore, the number of the elements may not exceed the number of the remaining unfilled parameter slots.

As a rule of thumb: For a collection with length n, if, inside a function call's parentheses, the collection's fields could currently be accessed manually, in order, with .0, .1, …, .(n-1) for tuples and tuple structs or with [0], [1], …, [n-1] for fixed-size arrays, entering each of the fields/elements using that syntax as arguments to consecutive parameter slots, unpacking as proposed here would also be valid.

Consider code with the following functions defined:

fn print_rgb(r: u8, g: u8, b: u8) {
    println!("r: {r}, g: {g}, b: {b}");
}

fn hex2rgb(hexcode: &str) -> (u8, u8, u8) {
    let r = u8::from_str_radix(&hexcode[1..3], 16).unwrap();
    let g = u8::from_str_radix(&hexcode[3..5], 16).unwrap();
    let b = u8::from_str_radix(&hexcode[5..7], 16).unwrap();
    (r, g, b)
}

Currently, to pass the output of hex2rgb to print_rgb, the return value of hex2rgb needs to be first stored as a local variable:

fn main() {
    // Store the result of `hex2rgb` and pass it to `print_rgb`:
    let rgb = hex2rgb("#123456");
    print_rgb(rgb.0, rgb.1, rgb.2);
}

or local variables:

fn main() {
    // Store the results of `hex2rgb` and pass them to `print_rgb`:
    let (r, g, b) = hex2rgb("#123456");
    print_rgb(r, g, b);
}

Whereas with argument unpacking as defined in this RFC, the intermediate step can be skipped:

fn main() {
    // Unpack the expression into function arguments:
    print_rgb(...hex2rgb("#123456"));
}

In the familiar context of instantiating structs, ..another_struct is the struct update syntax that automatically fills the remaining fields of the new struct from another_struct of the same type. In the likeness of that, when calling a function, argument unpacking as defined in this proposal allows automatically entering the next arguments into the function call from a collection whose elements match the next remaining function parameters.
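For readers less familiar with the struct update syntax mentioned above, here is a minimal sketch of it in today's Rust. The `Color` struct and the `with_alpha` helper are hypothetical names chosen for illustration only.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Color {
    r: u8,
    g: u8,
    b: u8,
    a: u8,
}

/// Overrides only the alpha channel; the remaining fields are filled
/// from `base` with the existing struct update syntax `..base`.
fn with_alpha(base: Color, a: u8) -> Color {
    Color { a, ..base }
}

fn main() {
    let opaque = Color { r: 0x12, g: 0x34, b: 0x56, a: 255 };
    let translucent = with_alpha(opaque.clone(), 128);
    println!("{opaque:?} -> {translucent:?}");
}
```

Argument unpacking aims for the same "fill in the rest from here" reading, but inside call parentheses and by position rather than by field name.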

Reference-Level Explanation

This RFC proposes a zero-cost abstraction to improve the ergonomics and readability of code related to function, method, and closure calling, specifically of passing of arguments. The feature is syntactic sugar commonly known as argument unpacking.

Scope of Planned Use

Argument unpacking works when calling any functions, methods, and closures that accept arguments. This is in contrast to some other programming languages that only allow unpacking arguments when the parameters of the function being called are named, variadic, positioned at the end of parameter list, or have default values. As tuple struct and tuple-like enum variant instantiations use the call expression, argument unpacking works on them too.
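As a sketch of the last point, the following shows what unpacking into a tuple struct and a tuple-like enum variant would desugar to. The types `Rgb` and `Shape` and both helper functions are made up for this example; the proposed syntax appears only in comments, since it does not compile today.

```rust
#[derive(Debug, PartialEq)]
struct Rgb(u8, u8, u8);

#[derive(Debug, PartialEq)]
enum Shape {
    Rect(u32, u32),
}

fn rgb_from_tuple(t: (u8, u8, u8)) -> Rgb {
    // Proposed: Rgb(...t)
    Rgb(t.0, t.1, t.2)
}

fn rect_from_array(size: [u32; 2]) -> Shape {
    // Proposed: Shape::Rect(...size)
    Shape::Rect(size[0], size[1])
}

fn main() {
    println!("{:?}", rgb_from_tuple((0x12, 0x34, 0x56)));
    println!("{:?}", rect_from_array([640, 480]));
}
```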

The scope of argument unpacking, for now, is restricted to compile-time context during which the number, types, and order of appearance of the unpacked arguments are known. To all intents and purposes, the proposed form of argument unpacking is infallible at run time. Infallibility is not a part of the specification – rather, it's a side effect arising from the restricted scope of this proposal. Errors are thus prevented by the compiler rejecting scenarios that would not work.

This version of argument unpacking only affects:

  • Functions. Only function, method, and closure calls are affected. Tuple struct and tuple-like enum variant instantiations are affected as well. Macro calls are out of scope.
  • Call-site. The feature is only about argument unpacking, not parameter packing or variadic functions.
  • Compile-time context. Hence the word static in the RFC name. The feature is not about run-time behavior.
  • Provably successful situations. The collection types usable for the feature are selected to make the use of the proposed feature infallible.

Collections That Can Be Unpacked

Tuples, tuple structs, and fixed-size arrays can be unpacked. These collection types have a size known at compile time, and their elements have an unambiguous order, which allows the compiler to determine success.

Structs with named fields also have a size known at compile time, but instead of an unambiguous order, they have unambiguously named fields. A design that, for example, matches these field names with parameter names is intentionally left under Future Possibilities because it raises questions that are difficult to solve.

Syntax

A unary prefix ellipsis symbol, ..., i.e. three consecutive ASCII dot characters, followed by an expression is proposed as the syntax for argument unpacking. The unpacking operator ... has a high precedence, forcing the developer to explicitly use parentheses or braces with any complicated expressions following it. The syntax may only be used within the call parentheses.

This RFC proposes that argument unpacking can occur at any comma-separated location, in place of a conventional argument, in the function call and arbitrarily many times as well, as long as there are corresponding valid parameter slots left to pass the next arguments into. Basically, CallParams in Call expressions are modified to allow a comma-separated list of (argument unpacking OR expression). The order in which a function call's argument unpackings are desugared does not matter.

For example, if a function is defined with three parameters and it is called with argument unpacking of a 2-tuple and one conventional argument, the unpacked 2-tuple at the first parameter slot consumes the first and the second slots, and the conventional argument goes to the third slot.
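The three-parameter case just described can be sketched as follows. The functions `label` and `label_at` are hypothetical; the body of `label_at` is the desugared form that the proposed call in the comment would stand for.

```rust
/// A function with three parameter slots of mixed types.
fn label(x: i32, y: i32, name: &str) -> String {
    format!("{name}@({x}, {y})")
}

fn label_at(position: (i32, i32), name: &str) -> String {
    // Proposed: label(...position, name)
    // The 2-tuple fills the first and second slots,
    // the conventional argument goes to the third slot.
    label(position.0, position.1, name)
}

fn main() {
    println!("{}", label_at((3, 4), "home"));
}
```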

Unpacking Rules

Unpacking of tuples, tuple structs, and fixed-size arrays is proposed in this RFC. Other collections are out of scope. Whether unpacking is successful is checked during compilation, and unsuccessful attempts are rejected, having the side effect that this first version of the design is infallible at run time.

Successful unpacking requires that:

  1. All of the items inside the collection are unpacked.
    • For example, attempting to unpack a thousand-element array just to pass the first two elements as arguments to a function taking two parameters seems like a mistake that should be explicitly prevented.
  2. There must be at least as many unfilled parameters left in the function call as there are items inside the collection being unpacked.
    • This is a consequence of the above rule.
  3. Each item inside the collection is passed as an argument matching one parameter.
  4. The types of the items in the collection must be compatible with the corresponding parameters.
  5. If there are N items in the collection being unpacked, the immediately next N parameters in the function call are filled with the collection's items as the arguments.
  6. The order of the items in the collection is the same as the order in which they are unpacked.
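A small sketch of how the rules above play out for a fixed-size array. The functions `sum3` and `sum_pair_then_one` are hypothetical. The rejected calls are listed as comments because they use the proposed syntax, and the stated reasons are this sketch's reading of the rules rather than specified diagnostics.

```rust
fn sum3(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}

fn sum_pair_then_one(pair: [i32; 2], last: i32) -> i32 {
    // Proposed: sum3(...pair, last)
    // Both array items are unpacked, in order, into the next two slots.
    sum3(pair[0], pair[1], last)
}

fn main() {
    println!("{}", sum_pair_then_one([1, 2], 3));

    // Calls that the rules would reject at compile time:
    //   sum3(...[1, 2, 3, 4])    // rule 2: four items, but only three slots
    //   sum3(1, 2, ...[3, 4])    // rule 2: two items, but only one slot left
    //   sum3(...(1, "two", 3))   // rule 4: &str is not compatible with i32
}
```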

Number of Arguments, Allowed Forms

For clarity, the following text refers to parameter slots, meaning the individual comma-separated places in a function call that each need to be filled by an argument for the function call to proceed.

As long as there are parameter slots remaining unfilled by arguments, filling them with arguments or by argument unpacking is allowed. Figure 1 illustrates how unpacking arguments from a collection of size n fills in the next n parameter slots of a function call.

// Function `takes_five` has five parameter slots p0–p4.
fn takes_five(p0: u8, p1: u8, p2: u8, p3: u8, p4: u8) {}

fn main() {
    let tup1 = (2,);
    let tup2a = (0, 1);
    let tup2b = (3, 4);
    let tup3 = (1, 2, 3);
    let tup4 = (0, 1, 2, 3);
    let tup5 = (0, 1, 2, 3, 4);
    
    // Each of the following function calls leads to the same result:
    takes_five(...tup5); // Parameter slots p0–p4 filled with values from `tup5`
    takes_five(...tup4, 4); // Parameter slots p0–p3 filled with values from `tup4`, p4 filled with literal 4
    takes_five(0, ...tup3, 4); // p0 and p4 filled with literals 0 and 4, respectively, and p1–p3 filled with values from `tup3`
    takes_five(...tup2a, 2, ...tup2b); // p2 filled with literal 2; p0–p1 and p3–p4, respectively, filled with values from `tup2a` and `tup2b`
    takes_five(...tup2a, ...tup1, ...tup2b); // p2 filled with the value from `tup1`; p0–p1 and p3–p4, respectively, filled with values from `tup2a` and `tup2b`
    
    // Let's take a closer look at the following call:
    takes_five(0, ...tup3, 4);
    // At call-site, there seem to be three comma-separated places, while the function has five parameter slots.
    
    // Desugared, the call above looks like this:
    takes_five(0, tup3.0, tup3.1, tup3.2, 4);
    //            ^^^^^^^^^^^^^^^^^^^^^^^
    // The whole tuple of three fields was unpacked into the underlined parameter slots.
}

Figure 1. Unpacking a collection of three items fills in as many consecutive parameter slots in a function call, starting from the slot it was defined in.

The function call syntax and the method-call expressions are changed in the following way:

  • CallParams is defined as (Expression | ...Expression) (, (Expression | ...Expression))*,?.

If the function has fewer unfilled parameters left than are being unpacked, compilation fails.

Provided that ultimately all parameter slots are filled with arguments, the function having more parameters than are being unpacked is a valid use-case, since unpacking always unpacks everything from the collection and fills the next of the remaining parameters. E.g.

set_color(my_alpha, ...get_rgb());

or

set_color(...get_rgb(), my_alpha);

Unpacking can occur multiple times within the same function call as well. E.g.

let diff = color_difference(...get_rgb(), ...another_rgb());
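When the unpacked expression is itself a function call, as in the examples above, one plausible desugaring binds the result to a temporary once and then accesses its fields. That single evaluation is an assumption of this sketch, not something the text above specifies. The functions `pack_argb`, `get_rgb`, and `argb_with_alpha` are hypothetical, and the `Cell` counter exists only to make the number of calls observable.

```rust
use std::cell::Cell;

/// Packs four channels into one u32, alpha first.
fn pack_argb(a: u8, r: u8, g: u8, b: u8) -> u32 {
    u32::from_be_bytes([a, r, g, b])
}

/// Returns a fixed colour and counts how often it was called.
fn get_rgb(calls: &Cell<u32>) -> (u8, u8, u8) {
    calls.set(calls.get() + 1);
    (0x12, 0x34, 0x56)
}

fn argb_with_alpha(alpha: u8, calls: &Cell<u32>) -> u32 {
    // Proposed: pack_argb(alpha, ...get_rgb(calls))
    // Assumed desugaring: evaluate the expression once, then access fields.
    let rgb = get_rgb(calls);
    pack_argb(alpha, rgb.0, rgb.1, rgb.2)
}

fn main() {
    let calls = Cell::new(0);
    let packed = argb_with_alpha(0xFF, &calls);
    println!("{packed:#010X}, get_rgb called {} time(s)", calls.get());
}
```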

For example, all of the following are allowed, if the total number of arguments passed to function foo matches the number of parameters defined for function foo:

  • foo(...expr1)
  • foo(...expr1, arg1)
  • foo(...expr1, arg1, ...expr2)
  • foo(...expr1, arg1, ...expr2, arg2)
  • foo(...expr1, arg1, arg2, ...expr2)
  • foo(arg1)
  • foo(arg1, ...expr1)
  • foo(arg1, arg2, ...expr1)
  • foo(arg1, arg2, …, argN, ...expr1)
  • foo(arg1, ...expr1, ...expr2)
  • foo(arg1, arg2, ...expr1, ...expr2)
  • foo(arg1, arg2, …, argN, ...expr1, ...expr2, …, ...exprN)

Non-Trivial Cases

If, inside the call parentheses, the collection's fields can currently be accessed manually, in order, with .idx/[idx], entering each as arguments to consecutive parameter slots, unpacking is valid.

Generic Parameters

When function parameters are generic, using <T>, impl or dyn, exactly the same should happen as when the arguments are passed by hand. That is, the argument's type must be compatible with the parameter's type, just as when entering that argument manually with a field access expression.
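A minimal sketch of the generic case, using the desugared form. The functions `describe` and `describe_pair` are hypothetical; each tuple field is type-checked against its parameter exactly as a hand-written field access would be.

```rust
use std::fmt::Display;

/// One `impl Trait` parameter and one explicitly generic parameter.
fn describe<T: Display>(label: impl Display, value: T) -> String {
    format!("{label} = {value}")
}

fn describe_pair(pair: (&str, f64)) -> String {
    // Proposed: describe(...pair)
    // `pair.0: &str` satisfies `impl Display`, `pair.1: f64` satisfies `T: Display`.
    describe(pair.0, pair.1)
}

fn main() {
    println!("{}", describe_pair(("ratio", 0.5)));
}
```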

References and Mutability in Function Parameters

Rust's current syntax for calling functions that take one parameter by value, by reference, or by mutable reference requires the developer to be explicit about what they want:

fn ret_one_arg() -> u8 {
    123
}

fn use_one_arg(p: u8) {
    println!("{p}");
}

fn use_one_refarg(p: &u8) {
    println!("{p}");
}

fn use_one_refmutarg(p: &mut u8) {
    println!("{p}");
}

fn main() {
    use_one_arg(ret_one_arg());
    use_one_refarg(&ret_one_arg()); // `&` needed to compile!
    use_one_refmutarg(&mut ret_one_arg()); // `&mut` needed to compile!
}

Rust also requires the developer to explicitly dereference references:

const CONST_NUMBER: u8 = 42;

fn ret_one_refarg() -> &'static u8 {
    &CONST_NUMBER
}

fn use_one_arg(p: u8) {
    println!("{p}");
}

fn use_one_refarg(p: &u8) {
    println!("{p}");
}

fn main() {
    use_one_arg(*ret_one_refarg()); // `*` needed to compile!
    use_one_refarg(ret_one_refarg());
}

Explicitly indicating varying degrees of (de)reference status or mutability on arguments being unpacked does not follow from the proposed syntax in any straightforward way. Thus, although it limits the usefulness of the feature, the design for such possibility is left out of the scope of this proposal. Consequently, the code will only compile if passing the arguments one by one with the corresponding field access expressions would compile.
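To illustrate the consequence stated above, here is a sketch with a hypothetical function `add_refs` that takes its parameters by reference. A tuple of owned values cannot be passed field by field without adding `&`, so unpacking it would be rejected as well, while a tuple that already holds references fits.

```rust
fn add_refs(a: &u8, b: &u8) -> u8 {
    *a + *b
}

fn sum_from_refs(refs: (&u8, &u8)) -> u8 {
    // Proposed: add_refs(...refs)
    // Works, because `refs.0` and `refs.1` are already `&u8`.
    add_refs(refs.0, refs.1)
}

fn main() {
    let owned: (u8, u8) = (1, 2);

    // `add_refs(owned.0, owned.1)` does not compile (expected `&u8`, found `u8`),
    // so `add_refs(...owned)` would not compile under this proposal either.
    // Today, and under the proposal, the references are written by hand:
    let by_hand = add_refs(&owned.0, &owned.1);

    let via_ref_tuple = sum_from_refs((&owned.0, &owned.1));
    println!("{by_hand} {via_ref_tuple}");
}
```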

Type Coercions of Collections

If the collection being unpacked is a reference to the collection type, whether argument unpacking works depends on whether accessing it directly with the element access syntax (.idx or [idx]) would work at compile time. If it would, then argument unpacking should work. See type coercions and std::ops::Deref.

For example, the following will work, since the alternative works:

fn consume(a: u8, b: u8, c: u8) {
    println!("{a}, {b}, {c}");
}

fn main() {
    let tup = &(1, 2, 3);
    consume(...tup);
    // Alternative: consume(tup.0, tup.1, tup.2);
}
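The same reasoning can be sketched for other cases where element access works through automatic dereferencing, here a boxed tuple and a reference to an array. The function `total` and the two helpers are hypothetical, and only the desugared form is compiled.

```rust
fn total(a: u8, b: u8, c: u8) -> u16 {
    u16::from(a) + u16::from(b) + u16::from(c)
}

fn total_of_boxed(tup: Box<(u8, u8, u8)>) -> u16 {
    // Proposed: total(...tup)
    // Field access auto-dereferences the Box; `u8` is Copy, so nothing is moved out.
    total(tup.0, tup.1, tup.2)
}

fn total_of_array_ref(arr: &[u8; 3]) -> u16 {
    // Proposed: total(...arr)
    // Indexing works through the reference as well.
    total(arr[0], arr[1], arr[2])
}

fn main() {
    println!("{}", total_of_boxed(Box::new((1, 2, 3))));
    println!("{}", total_of_array_ref(&[4, 5, 6]));
}
```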

Corner Cases

Empty Collections

A minimum of one element/field is required in the collection being unpacked.

Attempting to unpack a unit tuple struct, the unit type, or an empty array is disallowed and an error is emitted. Unpacking them wouldn't make sense since there are no arguments to unpack. Attempting that would also violate the rule of thumb of argument unpacking being a sugared alternative for valid field access expressions, since none exist for an empty collection.

Diagnostics

  • Error: Attempt to unpack an expression that has zero items.

    • Note that collections with no items can't be unpacked.
  • Error: Attempt to pass the expression itself as an argument without unpacking it, if and only if the conditions that would allow argument unpacking are fulfilled.

    • Suggest refactor: Did you mean (same but with the unpacking syntax)?
  • Error: Attempt to unpack an expression where a specific element/field is incorrect (e.g. has the wrong type).

    • Point out the incorrect field by underlining it, telling what it incorrectly is, and what is expected instead.
  • Error: Attempt to unpack a slice, trait object, iterator, vector, or HashMap.

    • Note that fallible unpacking of Dynamically Sized Types is not supported.
  • Error: Attempt to unpack a struct instance whose fields are visible at call-site.

    • Note that structs cannot be unpacked.
  • Error: Attempt to unpack any other type that cannot be unpacked.

    • Note that unpacking this type is not supported.
  • Lint: When directly unpacking arguments from an expression could be done instead of exhaustively using temporary variables that are not used elsewhere or accessing the elements/fields by hand.

    • Suggest refactor: Use unpacking instead.
  • Lint: When unnecessarily building a collection and unpacking that, e.g. passing ...(1, 2, 3) instead of 1, 2, 3.

    • Suggest refactor: Pass the arguments one by one instead of unpacking.
  • Lint: When unnecessarily unpacking a collection that has one item.

    • Suggest refactor: Pass the only value in the collection using the more explicit .0/[0] instead.

Guide/Documentation Changes

The Rust Reference:

Standard library documentation that may benefit from mentioning the new syntax:

Various Rust books would preferably teach the feature. For example, The Rust Programming Language book's Appendix B: Operators and Symbols could include the syntax.

Drawbacks

Functions that accept many parameters may already be a code smell, and the proposed change would likely help calling such functions the most, becoming an enabler for anti-patterns. At the same time, unpacking three or four arguments by hand is not much work, decreasing the usefulness of the change in normal code.

A sufficiently smart language server could automate argument unpacking, slightly decreasing the usefulness of having the feature in the language itself when writing new code. However, there are many scenarios where a language server doesn't help, such as code examples in books.

Although the proposed syntax is familiar from other contexts, e.g. from other programming languages, it still burdens developers with additional syntax to understand. Possibly, depending on how intuitive the syntax is or how familiar the developer is with similar features from other programming languages, this may or may not imply an additional mental overhead when working with Rust code.

However, as the new syntax comes in the form of syntactic sugar, this shouldn't be so bad: no-one is forced to use this even though they may be forced to understand this when reading code written by others. Additionally, it could be reasonably argued that the proposed change makes the language a bit more consistent, since a similar feature for struct instantiation already exists. Anecdotally, the author of this RFC tried to use the syntax for the proposed feature only to notice it doesn't exist yet.

Any initiatives for the distinct features of named parameters, optional/variadic parameters, parameters with default values or combinations thereof will need to consider the corresponding proposals' interactions with argument unpacking. The selected syntax of ... will also be cemented to specific uses, possibly denying its use in some other contexts.

Ecosystem churn with MSRV (Minimum Supported Rust Version) bumps may be expected as some crate authors may decide to use argument unpacking in places where a workaround was previously used.

The ellipsis symbol composed from three consecutive ASCII dot characters is used in the "et cetera" or "and so on" sense in many design documents and code examples. Giving it an actual syntactical meaning could lead to some confusion or readability issues. Preferring …, i.e. the Unicode character U+2026, Horizontal Ellipsis, in those places could help.

Rationale and Alternatives

The supported use cases are limited to avoid problems that come with large scope; however, to help avoid metaphorical cul-de-sacs, i.e. incompatibilities with future features, some out-of-scope expansions are laid out under Future Possibilities.

Guiding principles in this design are:

  • Familiarity of syntax.
  • Compatibility with other features.
  • Zero-cost – this is just syntactic sugar for passing the arguments by hand.
  • Intuitiveness of use and the principle of least astonishment.
  • Avoiding ambiguity with simple rules and by requiring explicit control by the user (developer).

Some programming languages such as JavaScript and PHP use an ellipsis prefix, ..., as the syntax for a similar feature. Using this same syntax benefits inter-language consistency and familiarity for new users of Rust. There's an ongoing effort on variadic generics proposing a ... operator for unpacking in a compatible but wider setting than in this RFC.

Commonly, in other programming languages, the order in which the tokens appear is that inside the parentheses of a function call syntax, the collection to unpack the arguments from is prefixed by the symbol that is used for unpacking (e.g. ... or *). Thus, the same prefix order is proposed in this RFC. One notable exception to this is Julia, in which argument unpacking – known as splatting – is performed with f(args...).

Other Terms

Of the known alternative terms, argument unpacking is arguably a strong contender in intuitiveness for a general programmer audience: the name makes it clear that the feature relates to arguments, and unpacking seems a somewhat typical operation that can be performed on a collection in a neat and orderly fashion.

Several names amongst programming languages and programmer lingo refer to argument unpacking or to a similar feature. Various terms include, in alphabetical order:

  • deconstruction,
  • destructuring,
  • expanding,
  • exploding,
  • scattering,
  • splatting,
  • spreading,
  • unpacking.

Probably, developers experienced in a specific programming language are most familiar with the term used for the feature in that programming language.

People sometimes mistakenly conflate arguments and parameters. Selecting a term that is unlikely to feed into that confusion is a priority.

It is also worth pointing out that many Rust users have a non-English background. Thus, consulting the dictionary entries for the term alternatives before committing to a specific selection may be prudent.

Different Syntax

Some programming languages (e.g. Python and Ruby) use the asterisk * character in place of the proposed .... In Rust, such syntax would be confusing, since it's already used for dereferencing. Table 1 collects alternatives for the symbol and its place in the syntax.

Table 1. Operator symbol alternatives.

| Place | Operator | Notes |
|---|---|---|
| Prefix | ... | The proposed syntax. Used already for C-variadic functions. Used in place of ..= for inclusive ranges previously. Used in JavaScript and PHP for argument unpacking. |
| Suffix | ... | Used in Julia for argument unpacking. |
| Prefix | ...? | Used in Dart as null-aware spread operator. |
| Prefix | .. | Used already for Functional Record Updates. Clashes with RangeTo<T>. |
| Prefix | * | Used in Python and Ruby for argument unpacking, in Kotlin for spreading. Clashes with dereferencing. |
| Prefix | ** | Used in Python for unpacking dictionaries into named keyword parameters. |
| Prefix | @ | At. Emphasis on where the arguments come from. Used in PowerShell for splatting. |
| Prefix | ^ | Connotation of "up and out". Used for XOR in binary contexts. |
| Prefix | ~ | Connotation of "inverting the collection inside-out", or C++ destructors. |

Alternative Syntax of .. Prefix

Functional Record Updates (i.e., Struct Update Syntax) already allow automatically filling fields when instantiating structs. This RFC recognizes the likeness of this feature with the proposed argument unpacking and treats them as belonging to the same family of syntactic sugar. Using similar syntax is tempting. However, adopting its current syntax of a .. prefix would clash with RangeTo<T>'s syntactic sugar.

Inside struct instantiation braces is a parsing context with a comma-separated list of { field_name: expr, field_name: expr, ... }, where instead of field_name: expr, the alternative ..other_struct is allowed, i.e. ..expr by itself is not a valid item, mitigating the clash with RangeTo<T> in Functional Record Updates. However, inside function call parentheses is a parsing context of ( expr, expr, ... ) comma-separated list of expressions (which this RFC proposes to change). As ..expr is already itself a valid expression, producing a RangeTo<T>, some additional workaround would need to be designed to overcome the possible breakage from a syntax change.

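For example, ..(1, 2, 3) is valid syntax producing a RangeTo of a 3-tuple: RangeTo<(i32, i32, i32)> under default integer inference, or RangeTo<(u8, u8, u8)> if the context asks for u8 elements. More generally, ..expr works for any type T emitted by expr, producing a RangeTo<T>.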
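The clash can be checked in today's Rust. In this sketch, `end_of` is a hypothetical function taking a range argument; the call `end_of(..n)` shows that `..expr` inside call parentheses already has a meaning.

```rust
use std::ops::RangeTo;

/// Takes a `RangeTo<u8>` by value and returns its exclusive upper bound.
fn end_of(range: RangeTo<u8>) -> u8 {
    range.end
}

fn main() {
    let n = 5u8;

    // `..n` is an ordinary expression, so this call already compiles
    // and passes a single `RangeTo<u8>` argument.
    println!("{}", end_of(..n));

    // `..expr` accepts any operand type, including a tuple.
    let r: RangeTo<(u8, u8, u8)> = ..(1, 2, 3);
    println!("{:?}", r.end);
}
```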

For consistency, if ..expr were selected for argument unpacking, the argument unpacking syntax could be favored and take precedence instead of the RangeTo<T> sugar. Conversely, if a developer actually wants to pass a constructed-on-the-fly-using-syntactic-sugar RangeTo<T> argument, it could be wrapped inside braces or parentheses: i.e. {..expr} or (..expr). As this change in syntax would be a breaking change, it could be stabilized in the next edition.

Implement a Smaller Subset

Aside from not implementing the proposed change at all, some smaller but still useful subset of it could be implemented instead. For instance, the scope could be further restricted by only allowing unpacking

  • of tuples,
  • once in the function call at the end of the argument list,
  • once in the function call without additional normal arguments, or
  • when the function has variadic parameters, if such feature is implemented.

Limit to Unpacking Tuples

Tuples already look very much like argument lists, being a comma-separated list of items in parentheses.

Limit to Unpacking at End of Argument List

Other programming languages where argument unpacking is restricted such that it is only allowed at the end of argument list seem to do it for reasons connected with variadic/optional, named, or with-default-value parameters.

Limit to Unpacking Xor Conventional Arguments

Limiting function calls to use either conventional arguments or argument unpacking, but not both, seems superficially a bit arbitrary. It is hard to come up with a good technical explanation for it, other than perhaps it being simpler to implement.

Limit to Unpacking into Variadic Parameter Slots

Limiting unpacking to only work with variadic parameter slots may be a natural consequence of a more variadic parameter centric approach, with less thought put into argument unpacking.

Give Macros More Control over Arguments

Empowering macros with new features would avoid new syntax.

Argument unpacking would follow naturally as a part of a more ambitious initiative of treating function argument places as distinct tokens accessible by macros. In Lua, the function table.unpack seems to superficially lead to the same result. An example of what this might look like in Rust:

fn main() {
    // Turns (u8, u8, u8) into three u8 arguments in the function call
    print_rgb(to_args!(hex2rgb("#123456")));
}

If postfix macros are implemented, the following could be done instead:

fn main() {
    // Turns (u8, u8, u8) into three u8 arguments in the function call
    print_rgb.call_with_args!(hex2rgb("#123456"));
}

How, or if, multiple collections could be unpacked, possibly along with other, conventional arguments, would need further design.

One obvious downside with these approaches would be including another macro in std; including the macro in a separate external crate via the ecosystem could be done as a workaround, but the cost-to-benefit ratio of including another dependency may not make it worth it for some users.
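For comparison, something in this direction is already expressible with `macro_rules!` in today's Rust, at the cost of wrapping the whole call and spelling out the field indices by hand. The macro name `call_unpacked!`, its `;`-separated syntax, and the helper functions are all invented for this sketch; it is not an existing std or crate API.

```rust
/// Calls `$f` with the listed fields of `$collection` as consecutive arguments.
/// The caller has to write the indices, which is a clear limitation.
macro_rules! call_unpacked {
    ($f:expr; $collection:expr; $($idx:tt)+) => {{
        // Evaluate the collection expression once.
        let unpacked = $collection;
        $f($(unpacked.$idx),+)
    }};
}

fn rgb_to_u32(r: u8, g: u8, b: u8) -> u32 {
    u32::from_be_bytes([0, r, g, b])
}

fn get_rgb() -> (u8, u8, u8) {
    (0x12, 0x34, 0x56)
}

fn main() {
    // Roughly what `rgb_to_u32(...get_rgb())` would express.
    let packed = call_unpacked!(rgb_to_u32; get_rgb(); 0 1 2);
    println!("{packed:#08x}");
}
```

Because the macro has to wrap the entire call rather than a single argument, mixing it with conventional arguments or several collections would need a more elaborate design.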

Capturing Arguments from the Variables in the Current Scope

A somewhat different design, allowing the use of bare ... as a shorthand for passing variables in the current scope as arguments in the function call, would still make code shorter. Technically, this wouldn't conflict with the design proposed in this RFC. However, having two different but syntactically similar shorthands for functionality resembling each other might be confusing, which may be a reason to only commit to one or the other. Closures already capture the environment, and this would be similar in that the parameters with matching names would be filled with arguments. A downside in this approach would be that tracking what actually goes into the function becomes harder, and changes to variable names within the calling function's scope could make this approach error-prone. This approach would also have the problems related to exposing parameter names in the public API as described below for the future possibility of unpacking structs.

Workarounds If RFC Is Not Implemented

Unpack Tuples into Arguments with fn_traits

Instead of changing the language to include the syntactic sugar, a standard library method from the fn_traits feature could be used. A slightly more verbose example:

fn main() {
    std::ops::Fn::call(&print_rgb, hex2rgb("#123456"));
}

One downside of this is that the syntax diverges from a normal function call, i.e. superficially, the code seems to be calling call, with the actual function to be called being just one argument. Given the verbosity and unfamiliar syntax compared to argument unpacking in other programming languages, this option also doesn't increase ergonomics that much. Relying on this might also confuse language servers when trying to locate uses of the called function. Directly unpacking tuple structs or fixed-size arrays isn't supported either, although .into() can be called on the latter. This does not work directly with methods either – extra work is required to call the associated function using fully qualified syntax and add &self to the beginning of the tuple, which is impractical if the tuple comes from a return value as in the provided example.

Refactor the Callee

Another simple way to avoid the verbosity of having to pass the arguments in a collection by hand is to change the type signature of the function being called to accept the tuple/tuple struct/array instead. In some cases, defining the function parameters as patterns can be useful. For example, the callee can be refactored to accept a tuple instead:

fn print_rgb((r, g, b): (u8, u8, u8)) {
    println!("r: {r}, g: {g}, b: {b}");
}

and calling it simply becomes:

print_rgb(hex2rgb("#123456"));

This removes the flexibility of accepting arguments one by one: if single arguments need to be passed, a tuple must be constructed at the call site. Refactoring is also not always possible, for example if the function is in a third-party crate. In these cases, it's possible to manually implement a wrapper for the third-party function.
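
Such a wrapper could look like the following minimal sketch. The third_party module stands in for an external crate, and format_rgb, format_rgb_tuple, and the simplified hex2rgb are hypothetical names used only for illustration:

```rust
// Stand-in for a third-party crate whose function signature cannot be changed.
mod third_party {
    pub fn format_rgb(r: u8, g: u8, b: u8) -> String {
        format!("r: {r}, g: {g}, b: {b}")
    }
}

// Hypothetical local wrapper: accepts the tuple, destructures it in the
// parameter pattern, and forwards the elements one by one.
fn format_rgb_tuple((r, g, b): (u8, u8, u8)) -> String {
    third_party::format_rgb(r, g, b)
}

// Simplified stand-in for the hex2rgb function used in the examples; it
// assumes a well-formed "#rrggbb" string and panics otherwise.
fn hex2rgb(hexcode: &str) -> (u8, u8, u8) {
    let channel = |range: std::ops::Range<usize>| {
        u8::from_str_radix(&hexcode[range], 16).expect("valid hex digits")
    };
    (channel(1..3), channel(3..5), channel(5..7))
}
```

With these definitions, format_rgb_tuple(hex2rgb("#123456")) passes the returned tuple straight through, at the cost of one hand-written wrapper per wrapped function.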

Prior Art

Argument Unpacking in Different Programming Languages

Python has argument unpacking (also see the subchapter Calls under Expressions in the Python Language Reference), which allows using the * or ** operators at the call site to extract values from tuples or dictionaries, respectively, into distinct arguments.

The same example as in Guide-Level Explanation, but implemented in Python:

def print_rgb(r: int, g: int, b: int) -> None:
    print(f"r: {r}, g: {g}, b: {b}")

def hex2rgb(hexcode: str) -> tuple[int, int, int]:
    r = int(hexcode[1:3], 16)
    g = int(hexcode[3:5], 16)
    b = int(hexcode[5:7], 16)
    return r, g, b

if __name__ == "__main__":
    print_rgb(*hex2rgb("#123456"))

A similar-looking Python feature, packing of parameters, is outside the scope of this proposal and connected to the distinct concept of variadic functions. However, as it uses the same syntax in a different context (function definitions) as argument unpacking does, it's worth mentioning as an example of how programming languages may reuse syntax.

Crystal has Splatting.

  • Syntax: sum_of_three *numbers

Dart has Spread Collections.

  • Used for inserting multiple elements into a collection.

JavaScript has Spread Syntax.

  • Syntax: sum_of_three(...numbers)

Julia has Splat.

  • Syntax: sum_of_three(numbers...)

Kotlin has Spread operator.

  • Syntax: sum_of_three(*numbers)

Lisps have apply (for example, Clojure, Emacs Lisp, Racket, Scheme).

  • Syntax: (apply 'sum_of_three numbers)

Lua has the function table.unpack.

  • Syntax: sum_of_three(table.unpack(numbers))

PHP has Argument Unpacking (see PHP RFC and its mailing list discussion).

  • Syntax: sum_of_three(...$numbers)

PowerShell has Splatting.

  • Syntax: sum_of_three @numbers

Ruby has the Splat operator.

  • Syntax: sum_of_three(*numbers)

Haskell has no separate syntactic sugar for argument unpacking, but various uncurryN functions can be implemented, where N is the number of items in a tuple, e.g.:

uncurry3 :: (a -> b -> c -> d) -> (a, b, c) -> d
uncurry3 f (a, b, c) = f a b c
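
For comparison, roughly the same helper can be written in stable Rust today. This is only a sketch: uncurry3 is a hypothetical name mirroring the Haskell function above, not an existing standard library item, and a separate helper would be needed for each tuple size:

```rust
// Hypothetical helper, analogous to Haskell's uncurry3: takes a function of
// three parameters and a three-element tuple, and calls the function with
// the tuple's elements as its arguments.
fn uncurry3<A, B, C, R>(f: impl FnOnce(A, B, C) -> R, (a, b, c): (A, B, C)) -> R {
    f(a, b, c)
}

// Example callee, named after the function used in the syntax list above.
fn sum_of_three(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}
```

A call then reads uncurry3(sum_of_three, (1, 2, 3)), which, like the fn_traits workaround, makes the helper rather than the callee the visually dominant part of the call.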

Notable Differences to Existing Implementations

For example, in Python, fallible unpacking occurs dynamically, at run time. Use cases such as unpacking data structures created at run time with a varying number of elements are supported. On the other hand, whether unpacking can happen at all is not known until it is attempted during program execution, resulting in errors such as the following when attempting something that wouldn't work:

TypeError: print_rgb() takes 3 positional arguments but 4 were given

The proposed feature in this RFC is different: unpacking is only allowed when it is proven to succeed during compilation, which is what makes it static. Consequently, the feature is also infallible.

Ellipsis in Rust

The three ASCII dots syntax is already used for C-variadic functions.

Previously, ellipsis was used as syntax for inclusive ranges, i.e. in place of ..=.

Use of Ellipsis in Different Programming Languages

The ellipsis syntax is also used for features other than argument unpacking. E.g., C++ uses ellipsis suffix for Pack expansion.

Existing Rust Work on Subject

Specifically on Argument Unpacking

Rust Internals:

Stack Overflow questions:

Unpacking Arrays

Rust Internals:

Using Tuples in Place of Argument Lists

Rust GitHub:

Rust Zulip t-lang:

Rust Users Forum:

Unpacking Structs

Rust Internals:

Rust Users Forum:

Other Related

On variadic generics in general: This might in the end solve the same problem along with many others as well. Variadic generics is a more ambitious feature with a large design space, and the design progress seems to have been ongoing since 2014. Meanwhile, argument unpacking essentially provides a subset of the consequences of variadic generics designs seen so far.

Rust GitHub:

Rust Internals:

Unresolved Questions

  • Should argument unpacking desugar into Alternative A or Alternative B below, or does it make any difference?

    Alternative A:

    let tup = (1, 2, 3);
    
    // callee_fn(...tup); desugars into:
    {
        let _tmp0 = tup.0;
        let _tmp1 = tup.1;
        let _tmp2 = tup.2;
        callee_fn(_tmp0, _tmp1, _tmp2);
    }
    

    Alternative B:

    let tup = (1, 2, 3);
    
    // callee_fn(...tup); desugars into:
    callee_fn(tup.0, tup.1, tup.2);
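
For a plain variable such as tup, both alternatives pass the same values. One case where the choice might matter, shown here as a sketch under assumptions rather than as part of the proposal, is an operand that is an arbitrary expression: if it were repeated textually for each element, it would be evaluated once per element, whereas binding it to a temporary first evaluates it once. The names make_tuple, bind_once, and substitute_textually are hypothetical:

```rust
use std::cell::Cell;

// Hypothetical producer that counts how many times it has been evaluated.
fn make_tuple(calls: &Cell<u32>) -> (u8, u8, u8) {
    calls.set(calls.get() + 1);
    (1, 2, 3)
}

fn callee_fn(a: u8, b: u8, c: u8) -> u8 {
    a + b + c
}

// The operand is bound to a temporary first, so it is evaluated exactly once.
fn bind_once(calls: &Cell<u32>) -> u8 {
    let tmp = make_tuple(calls);
    callee_fn(tmp.0, tmp.1, tmp.2)
}

// The operand is repeated for every element, so it is evaluated three times.
fn substitute_textually(calls: &Cell<u32>) -> u8 {
    callee_fn(make_tuple(calls).0, make_tuple(calls).1, make_tuple(calls).2)
}
```

Both functions return the same sum, but the counter ends at 1 for bind_once and at 3 for substitute_textually, which would be observable whenever the operand has side effects.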
    

Future Possibilities

Future RFCs should freely venture outside the scope of this RFC and complement or build on this limited form of argument unpacking.

Unpacking in Fixed-Size Array and Tuple Literals

The same ellipsis syntax with a very similar meaning could be adopted for defining fixed-size array and tuple literals as well. For example:

const CDE: [char; 3] = ['C', 'D', 'E'];
const ABCDEFG1: [char; 7] = ['A', 'B', ...CDE, 'F', 'G'];
const ABCDEFG2: [char; 7] = ['A', 'B', CDE[0], CDE[1], CDE[2], 'F', 'G'];

assert_eq!(ABCDEFG1, ABCDEFG2);

At least Python and JavaScript have this feature.

Notably, tuple structs can already be built with the adoption of argument unpacking in general, since their constructors use the call expression that is modified by the addition of argument unpacking.

Prior work on designing such a feature exists at least in the Rust Internals thread Pre-RFC: Array expansion syntax.

Assigning Automatic Reference Status or Mutability

Technically, during argument unpacking, it's possible to automatically assign varying degrees of (de)reference status or mutability such that code compiles. The following trivial code could most likely be inferred by the compiler in a way that it would compile successfully, for example:

fn create_collection() -> (u8, u8, u8) {
    (1, 2, 3)
}

fn consume(a: u8, b: &u8, c: &mut u8) {
    *c = a + b;
}

fn main() {
    // consume(...create_collection()); desugars to:
    {
        let (_tmp0, _tmp1, mut _tmp2) = create_collection();
        consume(_tmp0, &_tmp1, &mut _tmp2);
    }
}

Further specification of automatically fitting argument unpacking to the reference or mutability status of the parameters in the function being called would merit a separate RFC.

Unpacking Arguments for Macro Calls

Macros, callable with the macro_name!(...) syntax, have been omitted from the scope of this proposal. The only reason for the omission is the time required for the due diligence of researching the differences in design. For example, some macros (e.g. println!) accept an indefinite number of arguments. Unpacking tuples, tuple structs, and fixed-size arrays probably still makes sense – after all, argument unpacking is only syntactic sugar for something that can be done already in the desugared form. Further design, meriting a separate RFC amending this one, is needed.

Unpacking Struct with Named Fields as Arguments

Unpacking structs with named fields has been omitted as well. A design could be considered where the struct's fields are unpacked as arguments to the correspondingly named parameters, provided that the types are compatible. However, there are major concerns in this design related to API guarantees, and the current lack thereof, regarding function parameter naming. Suppose a function in crate A is called in crate B such that its parameters are filled with arguments unpacked from correspondingly named struct fields. If the parameter names of the function in crate A are then changed, crate B refuses to compile, so a major version bump would be required for crate A to avoid a Semantic Versioning violation. Currently, since parameter names have no effect on the user's side, such a change can be made with a patch version bump. Essentially, allowing name-based matching of struct fields to parameters would introduce parameter names as part of the public API of libraries.

To support future work on unpacking structs, an opt-in attribute that declares a function's parameter's name as stable could be considered. This could unlock other possibilities related to argument unpacking as well, for example, overriding an argument that was already unpacked by explicitly using a named argument for it, after unpacking.

Another aspect to consider could be introducing a #[derive]able trait for structs, allowing them to be unpacked in the field to parameter name fashion.

Sketch of Unpacking Structs

It may be important to give this some thought before accepting any argument unpacking rules whatsoever. The reason is that if the unpacking of structs is seen as an actual future possibility, we wouldn't want to introduce rules that are incompatible with it. Importantly, the design space has some notable overlap with another future possibility described below: fallible run-time unpacking of, e.g., HashMaps.

The basic idea when unpacking structs could be to match the struct's field names with the called function's parameter names. Some rules can already be thought of:

  • If unpacking a struct whose field names exactly match the parameter names, the order of the struct's fields vis-à-vis the arguments doesn't matter. Just pass the struct fields as the correspondingly named parameters.
  • The struct fields need to be visible at call-site, e.g. pub or pub(crate).
  • Attempting to unpack a struct with named fields, where the number and types of fields match, but the names are different, is rejected.
    • Technically, under certain circumstances, it would be possible to emit syntactically correct code from the sugar, but the motivation is ambiguous. Therefore, it's better to leave it up to the developer to decide what it is that they want to accomplish.
    • Also, it's difficult to specify what would happen when there are multiple arguments of the same type: What should the order be when the names don't match? What would happen if one of the struct's fields was renamed into one of the parameter names?

However, several unresolved questions when unpacking structs would need to be considered as well:

  • What to do when unpacking structs with named fields into macro call arguments?
  • How does unpacking more than one struct work?
  • How to reconcile trait methods having differently named parameters?
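
As a point of reference for the name-matching rules sketched above, this is roughly what name-based unpacking would stand for when written out by hand in today's Rust. It is only an illustrative sketch: Rgb, format_rgb, and format_color are hypothetical names, and no desugaring for structs is specified by this RFC:

```rust
// Hypothetical struct whose field names match the parameter names below.
struct Rgb {
    r: u8,
    g: u8,
    b: u8,
}

fn format_rgb(r: u8, g: u8, b: u8) -> String {
    format!("r: {r}, g: {g}, b: {b}")
}

// Manual equivalent of name-based unpacking: fields are matched by name, so
// the order in which the pattern lists them does not matter.
fn format_color(color: Rgb) -> String {
    let Rgb { b, g, r } = color;
    format_rgb(r, g, b)
}
```

Renaming a parameter of format_rgb leaves this hand-written version compiling, which is exactly the guarantee that name-based sugar would put at risk.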

Fallible Runtime Unpacking of Dynamic Collections

The scope of argument unpacking could be expanded to dynamic contexts as well. Runtime unpacking of dyn Trait trait objects, slices, Vecs, HashMaps, iterators in general etc. would be fallible, since the existence of a correct number, order, typing and naming of items to match the parameters can't be guaranteed at compile time. A syntax such as ...expr? or ...?expr could be considered to improve ergonomics of argument passing for those cases as well, but that would definitely merit a separate RFC.

Possibly, this would involve an stdlib trait, e.g. TryArgUnpack, whose implementation the language would use to get the arguments. This would enable unpacking custom collections as well.
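
To illustrate what such fallible unpacking would replace, the following sketch does the same thing by hand for a slice, using the existing TryFrom conversion from a slice to a fixed-size array. The function names are hypothetical, and no TryArgUnpack trait exists today:

```rust
use std::convert::TryFrom;

fn sum_of_three(a: u8, b: u8, c: u8) -> u16 {
    u16::from(a) + u16::from(b) + u16::from(c)
}

// Hand-written equivalent of fallibly unpacking a slice: its length is only
// known at run time, so the conversion to a fixed-size array can fail.
fn call_with_slice(numbers: &[u8]) -> Option<u16> {
    let [a, b, c] = <[u8; 3]>::try_from(numbers).ok()?;
    Some(sum_of_three(a, b, c))
}
```

Here a slice of exactly three elements yields Some with the sum, while any other length yields None instead of a run-time error like the Python TypeError shown earlier.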

Syntax Change for Functional Record Updates

Adopting the ...expr syntax for argument unpacking means that it is now part of the general "take stuff from here and expand it" family of syntactic sugar. As Rust already uses the .. syntax for Functional Record Updates, changing that to use an ellipsis instead would be congruent.

Comments in the first pre-RFC thread suggest that the specific way the Functional Record Updates feature is currently implemented, syntax-wise, has a mixed community appraisal.

I'd like to see this act as a simpler subset of something like @Jules-Bertholet's variadic generics design sketch; I don't see any obvious conflicts between what you've written and that design, but I'd be grateful if you could confirm that in a world where that sketch or something similar was implemented, your design just falls out of the bottom as an obvious corollary of the variadics design.

I can confirm this, yes. It's actually something I had in mind while writing this pre-RFC – iterating with smaller steps can be a bit faster, as those steps may be easier for everyone to grasp.

I'd be happy to improve anything in this pre-RFC to further accommodate its compatibility with a/the variadic generics design.

It might be useful to denote omissions with the Unicode ellipsis U+2026 … in the examples for the allowed forms, i.e. foo(arg1, arg2, …, argN, ...expr1) instead of foo(arg1, arg2, ..., argN, ...expr1). It took me some time to parse the cases. There are a lot of ellipses there :wink:

Overlap with the range syntax is unfortunate. Rust has barely finished migrating from ... to ..=, because the difference between .. and ... was too subtle.

This is a place where postfix macros could be used without need for a bespoke syntax:

print_rgb.tuple_args!(hex2rgb("#123456"))

Rust has used ... for C variadic functionality for the last seven years. If/when it gets stabilized, it would be internally consistent.

Thanks, I definitely agree. :smiley: I edited the document thus, and also included this as a suggestion in the drawbacks chapter. On a related note, I should rewrite, possibly autogenerate, that example list…

Good point! Added some references to that in syntax alternatives table.

Added an adaptation of this as well to the macro alternative.

Thanks, this is a very good reference – I added a link to the syntax alternative table.

I've made some edits to the pre-RFC text and am confident that it's getting closer to the state where it could be submitted as an actual RFC. I'll wait a while before that, though, if the community has more comments.

Thanks to everyone who's contributed to the discussion so far. :slight_smile:

Another edit with more polish.

I guess I shouldn't make predictions about the text being close to its final state until I can't come up with changes anymore. :smiley:

I have now published the RFC on rust-lang GitHub: [RFC] Static Function Argument Unpacking by miikkas · Pull Request #3723 · rust-lang/rfcs · GitHub

Thanks for all the comments in both this and the previous Pre-RFC thread!