Pre-RFC: Auto-tupling at call-sites

pnkfelix · July 16, 2014, 1:05pm

Summary

Automatically turn excess arguments at a given call site into a tuple of the final argument and the excess arguments. Automatically turn an omitted argument at a given call site into unit (()). The prior two transformations (the combination of which I am calling “auto-tupling”) can be used in tandem with the trait system as a way to express optional parameters and multiple-arity functions.

Motivation

People have been asking for optional arguments for a while:

On the mailing list: Polymorphism & default parameters in rust mail.mozilla.org/pipermail/rust-dev/2012-August/002228.html
On the rust repo: Default arguments and keyword arguments rust-lang/rust/issues/6973
On the RFC repo: optional parameters https://github.com/rust-lang/rfcs/pull/152, Arity-based parameter overloading https://github.com/rust-lang/rfcs/pull/153

Auto-tupling at the call site provides a clean syntax for defining functions that support a variety of calling protocols: you make the last argument for the function a trait, and then implement the trait for every combination of tuple that you want to support.

This strategy supports optional arguments and arity-based overloading for statically-dispatched call sites.

At the same time, it is a relatively simple change to the language: nothing changes about function definitions nor about the calling convention; it is just a local transformation on each call-site where the number of actual arguments does not match the number of formal parameters.

The expected outcome is that we reap many of the benefits already associated with optional arguments and arity-based overloading, assuming that the standard library is revised to make full use of the feature.

Detailed design

For any function F, if the following two conditions hold for its definition:

F is defined as taking k+1 arguments, and
the final formal argument to F is some generic type parameter,

then at all of the call sites for F, it can be passed any number of arguments >= k.

When F is passed k arguments, then the missing final k+1'th argument is automatically inserted as the unit value ().

When F is passed k+1 arguments, then everything operates the same as today (i.e. this RFC has no effect on it).

When F is passed k+j arguments for j > 1, then the final j arguments are converted into a tuple of length j.

The rest of the compilation proceeds as normal.

In the common case, the final argument to F will have one or more trait bounds, and the call sites will be expected to pass a set of arguments whose auto-tupling is compatible with those trait bounds. That is how we get all the way to enforcing a strict protocol on what the optional arguments are, or what multiple arities of F are.
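
The three call-site rules can be approximated in current Rust with a small macro. This is only a rough sketch and not part of the proposal: the `Rest` trait, the function `f`, the macro `f_call!`, and the helper `arities` are hypothetical names chosen for illustration, and `i32` stands in for the `int` used elsewhere in this post.

```rust
// Hypothetical trait bounding the final, generic parameter.
trait Rest {
    fn arity(&self) -> usize;
}

impl Rest for () {
    fn arity(&self) -> usize { 0 }
}

impl Rest for i32 {
    fn arity(&self) -> usize { 1 }
}

impl Rest for (i32, i32) {
    fn arity(&self) -> usize { 2 }
}

// F takes k + 1 = 2 formal arguments; the last one is a generic type parameter.
fn f<T: Rest>(_first: i32, rest: T) -> usize {
    rest.arity()
}

// Emulates the proposed call-site rewrite for `f` (k = 1).
macro_rules! f_call {
    // k arguments: the missing final argument becomes `()`.
    ($a:expr) => { f($a, ()) };
    // k + 1 arguments: passed through unchanged.
    ($a:expr, $b:expr) => { f($a, $b) };
    // k + j arguments with j > 1: the final j arguments become a tuple.
    ($a:expr, $($rest:expr),+) => { f($a, ($($rest),+)) };
}

// Reports what `f` received for one, two, and three actual arguments.
fn arities() -> (usize, usize, usize) {
    (f_call!(10), f_call!(10, 20), f_call!(10, 20, 30))
}
```

The macro has to be written per function here, whereas the proposal would have the compiler apply the same rewrite to any function whose final parameter is generic.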

Note: The strategy of this RFC does not work for closures and dynamic dispatch because closures are monomorphic and object methods cannot have generic type parameters. I deem this an acceptable price to pay to keep the language change simple. (In general, supporting a combination of optional arguments and dynamic dispatch would require some way of communicating the type and number of parameters from the call-site to the method definition.)

As a concrete example, assume the following definition (where nothing new from this RFC is being used):

fn foo<T:FooArgs>(required_x: int, rest: T) -> int {
    required_x + rest.y() + rest.z()
}

trait FooArgs {
    fn y(&self) -> int;
    fn z(&self) -> int;
}

impl FooArgs for () {
    fn y(&self) -> int { 0 }
    fn z(&self) -> int { 0 }
}

impl FooArgs for int {
    fn y(&self) -> int { *self }
    fn z(&self) -> int { 0 }
}

impl FooArgs for (int, int) {
    fn y(&self) -> int { self.val0() }
    fn z(&self) -> int { self.val1() }
}

Under this RFC, here are some legal expressions:

foo(1)       // expands to foo(1, ()), evaluates to 1
foo(1, 2)    // expands to foo(1, 2), evaluates to 3
foo(1, 2, 3) // expands to foo(1, (2, 3)), evaluates to 6

This illustrates how one expresses optional arguments for foo under this RFC.
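
For readers who want to check the arithmetic with a present-day compiler, here is a sketch of the same definitions with the expansions written out by hand. It assumes `i32` in place of `int` and tuple field access (`.0`, `.1`) in place of `val0()`/`val1()`; the helper `expanded_calls` is a hypothetical name added only to collect the results.

```rust
trait FooArgs {
    fn y(&self) -> i32;
    fn z(&self) -> i32;
}

impl FooArgs for () {
    fn y(&self) -> i32 { 0 }
    fn z(&self) -> i32 { 0 }
}

impl FooArgs for i32 {
    fn y(&self) -> i32 { *self }
    fn z(&self) -> i32 { 0 }
}

impl FooArgs for (i32, i32) {
    fn y(&self) -> i32 { self.0 }
    fn z(&self) -> i32 { self.1 }
}

fn foo<T: FooArgs>(required_x: i32, rest: T) -> i32 {
    required_x + rest.y() + rest.z()
}

// The three calls from the text, in the form they would expand to.
fn expanded_calls() -> [i32; 3] {
    [foo(1, ()), foo(1, 2), foo(1, (2, 3))]
}
```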

As another example, the GLM library for C++ defines vec2/vec3/vec4 structures that define vectors of 2/3/4 numeric components, respectively. The constructors provided in GLM for vecN (for N in {2,3,4}) include both a unary and N-ary variant: the unary variant copies its input argument to all N members, and the N-ary variant copies each of the inputs to the corresponding member.

Without this RFC, one can emulate this in Rust via tuples:

fn vec4<A:Vec4Args>(a: A) -> Vec4 {
    Vec4{ x: a.x(), y: a.y(), z: a.z(), w: a.w() }
}

impl Vec4Args for f32 {
    fn x(&self) -> f32 { *self }
    fn y(&self) -> f32 { *self }
    fn z(&self) -> f32 { *self }
    fn w(&self) -> f32 { *self }
}

impl Vec4Args for (f32,f32,f32,f32) {
    fn x(&self) -> f32 { self.val0() }
    fn y(&self) -> f32 { self.val1() }
    fn z(&self) -> f32 { self.val2() }
    fn w(&self) -> f32 { self.val3() }
}

vec4(9.0f32)                           // ==> Vec4{ x: 9.0, y: 9.0, z: 9.0, w: 9.0 }
vec4((1.0f32, 2.0f32, 3.0f32, 4.0f32)) // ==> Vec4{ x: 1.0, y: 2.0, z: 3.0, w: 4.0 }

But with this RFC in place, the syntax for the last line becomes a bit nicer:

vec4(1.0f32, 2.0f32, 3.0f32, 4.0f32)   // ==> Vec4{ x: 1.0, y: 2.0, z: 3.0, w: 4.0 }
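
The tuple-based emulation can be tried in current Rust roughly as follows. The `Vec4` struct and the `Vec4Args` trait declaration are not shown in the text above, so their definitions here are assumptions, and tuple field access (`.0` through `.3`) replaces the `valN()` accessors.

```rust
// Assumed shape of the vector type; the post does not show its definition.
#[derive(Debug, PartialEq)]
struct Vec4 { x: f32, y: f32, z: f32, w: f32 }

// Assumed trait declaration matching the impls in the post.
trait Vec4Args {
    fn x(&self) -> f32;
    fn y(&self) -> f32;
    fn z(&self) -> f32;
    fn w(&self) -> f32;
}

// Unary variant: one value copied to every component.
impl Vec4Args for f32 {
    fn x(&self) -> f32 { *self }
    fn y(&self) -> f32 { *self }
    fn z(&self) -> f32 { *self }
    fn w(&self) -> f32 { *self }
}

// 4-ary variant: each tuple element goes to the corresponding component.
impl Vec4Args for (f32, f32, f32, f32) {
    fn x(&self) -> f32 { self.0 }
    fn y(&self) -> f32 { self.1 }
    fn z(&self) -> f32 { self.2 }
    fn w(&self) -> f32 { self.3 }
}

fn vec4<A: Vec4Args>(a: A) -> Vec4 {
    Vec4 { x: a.x(), y: a.y(), z: a.z(), w: a.w() }
}
```

Without auto-tupling the 4-ary call still needs the inner parentheses, as in `vec4((1.0f32, 2.0f32, 3.0f32, 4.0f32))`.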

The two examples above followed a general rule of treating the trait as a bundle of all of the remaining arguments. However, the scheme of this RFC can also express multiple-arity dispatch, where one may want a function to have two totally different behaviors depending on the arguments passed at the call-site. The way you do this: just make the trait implementation itself hold the bulk of the function’s behavior, rather than the function body, which just dispatches off to the trait.

So as an example:

fn print_report<P:ReportPrinter>(report: &Report, output: P) {
    output.print_it(report)
}

impl ReportPrinter for () {
    fn print_it(&self, report: &Report) { /* just print to stdout */ }
}

impl ReportPrinter for std::io::File {
    fn print_it(&self, report: &Report) { /* print to the file */ }
}

struct Verbose;
impl ReportPrinter for (Verbose, std::io::File) {
    fn print_it(&self, report: &Report) { /* print to the file, with verbose content */ }
}

impl ReportPrinter for gui::Window {
    fn print_it(&self, report: &Report) { /* print to a text area in the window */ }
}

The design philosophy espoused by this RFC allows for client code to add new instances of the arguments trait. As a concrete example, in the previous example of ReportPrinter, it’s entirely possible that the code for impl ReportPrinter for gui::Window lives in the crate that defines gui::Window, rather than the crate that defines fn print_report. (Of course it falls upon the author of the ReportPrinter trait to document its API well enough to support such usage, if that is desired.)
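
A compilable sketch of this dispatch style, under several assumptions: a `String` buffer stands in for the file so the output is easy to inspect, `print_it` takes `self` by value so that a `&mut String` can be written to, and the fields of `Report` are invented for illustration. The `gui::Window` case is left out because no such crate is assumed here.

```rust
use std::fmt::Write;

// Hypothetical report type; the post does not define its fields.
struct Report {
    title: String,
    details: String,
}

trait ReportPrinter {
    fn print_it(self, report: &Report);
}

// No output argument: print to stdout.
impl ReportPrinter for () {
    fn print_it(self, report: &Report) {
        println!("{}", report.title);
    }
}

// One output argument: a String buffer stands in for the file.
impl ReportPrinter for &mut String {
    fn print_it(self, report: &Report) {
        writeln!(self, "{}", report.title).unwrap();
    }
}

// Marker selecting the verbose behavior.
struct Verbose;

// Two output arguments: marker plus buffer, with entirely different behavior.
impl ReportPrinter for (Verbose, &mut String) {
    fn print_it(self, report: &Report) {
        let (_, out) = self;
        writeln!(out, "{}: {}", report.title, report.details).unwrap();
    }
}

// The function body only dispatches to the trait.
fn print_report<P: ReportPrinter>(report: &Report, output: P) {
    output.print_it(report)
}
```

With auto-tupling, the verbose call could be written `print_report(&report, Verbose, &mut buf)` instead of `print_report(&report, (Verbose, &mut buf))`.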

Drawbacks

Some people may prefer explicit sugar on the function definition to indicate optional arguments and/or argument-based dispatch, rather than indirectly expressing it via a trait. So adopting auto-tupling may not satisfy such persons’ desire for so-called “true” optional arguments.
As a concrete example of why one might prefer baked-in support: rustdoc would not show you the various potential arguments with which one might invoke the function.
Auto-tupling may delay the reporting of legitimate errors. Reporting errors as eagerly as possible is the reason I included the condition that the final formal argument to the function be some generic type parameter, but obviously that still does not immediately catch the case where one e.g. invokes vec4(1.0f32, 2.0f32), which would expand into vec4((1.0f32, 2.0f32)) and lead to an error like: “error: failed to find an implementation of trait Vec4Args for (f32,f32)”; presumably the rustc compiler can be adapted to report a better error message when a tuple has been introduced by auto-tupling.
Maybe we are already pushing our traits to their limit and should not be attempting to use them to express a feature like this.
Support for auto-tupling steals away other potential uses for excess arguments.
- (E.g. I think somewhere else in discuss.rust-lang.org someone has proposed desugaring excess arguments into a curried function application, f(x,y) ==> f(x)(y). I think auto-tupling is more “rustic” than auto-currying, but my ears are open to arguments for why currying is preferable.)

Alternatives

We can choose to not add any support for optional arguments at all. We have been getting by without them. (But I think the sugar proposed in this RFC is pretty lightweight.)

We can add a more complex protocol for supporting optional arguments that includes changes at the function definition site (and potentially the calling convention, depending on how extreme you want to be). The main reason I could see for going down that path is to support optional arguments on closures and object methods.

Unresolved questions

None yet.

cmr · July 16, 2014, 5:07pm

The case of k+1 is strangely inconsistent, in that it uses a single value rather than a one-tuple, (foo,). I imagine the motivation for this is that you don’t want to have to implement the trait for one-tuples. However, it seems harmless to me to introduce a coercion between types and one-tuples containing those types. I haven’t thought through all of the ramifications of it. I think patching the inconsistency could be worth it.

On the other hand, this auto-tupling seems like it goes along very well with eddyb’s “variadic generics” strawman proposal. It is also a very small sugar adding no power to the language and interacts with no other features, which I like.

pnkfelix · July 16, 2014, 5:17pm

The k+1 case uses a single value in order to satisfy two goals: (1.) be backwards-compatible with code from today, where no auto-tupling occurs, and (2.) keep the desugaring “local” (no knowledge of trait impls required) and simple to understand. (Note that the function is defined as taking k+1 arguments, not k arguments.)

Let’s say that the final formal argument is of type T with trait bound FnArgs, and the actual k+1’st argument at the call-site is int. If we were to apply auto-tupling to create a singleton tuple in the k+1 case, then either we break code from today, or we have to speculatively try to find an implementation of FnArgs for int and for (int,). (And more generally, if the k+1’st argument is U, then we’d have to try to find an implementation of FnArgs for U and (U,), which is a little funky when U is itself a singleton tuple like (int,).)

(Oh, I just noticed your sentence about introducing a coercion between types and one-tuples containing those types. . . indeed, it would be nice if we might figure out some way to make a one-tuple (T,) synonymous with its content T, which would be a more elegant way to retain backwards compatibility here, but I do not know if that is actually feasible.)

iopq · July 16, 2014, 10:34pm

What do you do when a parameter is completely optional and the code path doesn’t reference it if it’s not there?

Let’s say foo is defined as taking 1 parameter and using it if it’s there. If it’s not there, the parameter doesn’t have a default value, it just doesn’t use that code path where the parameter is referenced.

In JavaScript you would just do a runtime check for undefined and an if statement, but in C++ you would overload the method so that there’s two versions of it with two different codepaths. The C++ approach is more efficient since it dispatches at compile time.

pnkfelix · July 17, 2014, 5:03am

The implementer of foo would decide which of the two strategies (“Javascript” versus “C++”) to use. The two strategies match up with the distinction drawn in the RFC between “optional argument” style and “multiple-arity dispatch” style.

I will illustrate here concretely:

JavaScript style:

trait Foo2ArgsJS { fn x(&self) -> Option<int>; }
fn foo2<A:Foo2ArgsJS>(args: A) -> String {
  let x = args.x();
  ...
  // Here, you do the runtime check (though in practice for the
  // impl's given below, since monomorphization creates two
  // copies of the code, I am pretty confident it would turn into
  // just two copies of the code with the check optimized away).
  if x.is_none() { ... } else { ... }
  ...
}

impl Foo2ArgsJS for () {
  fn x(&self) -> Option<int> { None }
}
impl Foo2ArgsJS for int {
  fn x(&self) -> Option<int> { Some(*self) }
}

C++ style:

trait Foo2ArgsCXX { fn body(&self) -> String; }
fn foo2<A:Foo2ArgsCXX>(args: A) -> String {
  args.body()
}

impl Foo2ArgsCXX for () {
  fn body(&self) -> String {
    ... /* body when param missing here */ ...
  }
}
impl Foo2ArgsCXX for int {
  fn body(&self) -> String {
    ... /* body when param present here */ ...
  }
}

Update: renamed foo in this comment to foo2 in order to disambiguate it from the foo in the RFC.

iopq · July 17, 2014, 5:15am

Yeah, but do you have to call it like foo( () ) when you want to use the option without any arguments? Or do you just go foo() and it puts unit as the argument list?

pnkfelix · July 17, 2014, 11:47am

The fn foo2 function in my comment above takes a single formal argument; so when looking at the rules from the Detailed Design in the RFC, the appropriate choice for k is 0 (since 1 == 0+1).

Therefore, in both the Javascript and C++ styles above, the calls can be written as follows:

// F is passed k = 0 actual arguments;
// ==> expands to foo2( () ) by the rules of the RFC
foo2();

// F is passed k+1 = 1 actual argument
// ==> does not expand by the rules of the RFC
foo2(3);

Handling this case is precisely why the Detailed Design is written in the funny way of talking about a function taking k+1 formal arguments (rather than talking about n formal arguments and then having to refer to cases where the number of actual arguments is n-1).
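
To try both styles with a current compiler, here is a sketch with the expansion written out by hand. It assumes `i32` for `int`; the names `foo2_js` and `foo2_cxx` are hypothetical (both are called foo2 in the earlier comment), and the returned strings are placeholders for the elided bodies.

```rust
// "JavaScript" style: one function body with a runtime check on an Option.
trait Foo2ArgsJS {
    fn x(&self) -> Option<i32>;
}

impl Foo2ArgsJS for () {
    fn x(&self) -> Option<i32> { None }
}

impl Foo2ArgsJS for i32 {
    fn x(&self) -> Option<i32> { Some(*self) }
}

fn foo2_js<A: Foo2ArgsJS>(args: A) -> String {
    match args.x() {
        Some(x) => format!("got {}", x),
        None => String::from("no argument"),
    }
}

// "C++" style: each impl carries its own body; the function only dispatches.
trait Foo2ArgsCXX {
    fn body(&self) -> String;
}

impl Foo2ArgsCXX for () {
    fn body(&self) -> String { String::from("no argument") }
}

impl Foo2ArgsCXX for i32 {
    fn body(&self) -> String { format!("got {}", self) }
}

fn foo2_cxx<A: Foo2ArgsCXX>(args: A) -> String {
    args.body()
}
```

Today the zero-argument call has to be spelled `foo2_js(())`; under the RFC it could be written with an empty argument list.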

huon · July 17, 2014, 12:18pm

I'm not so sure about this special case, but I actually think it's the only thing that works (unless there is the coercion T ⇆ (T,) that @cmr mentions):

If you have

fn foo<T: Overloads>(x: T) { bar(1, x) }
fn bar<T: Overloads>(x: i32, y: T) { ... }

then foo(1, 2) is effectively setting x == (1, 2) (right?), and thus, if there wasn't the single-argument special case, bar would be being called with ((1, 2), ), not with (1, 2); you'd effectively need to create some new way to directly pass arguments through.
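
A small sketch of that forwarding, with the tupling written explicitly so it compiles today. The `describe` method and the `String` return types are assumptions added so the result can be observed; they are not part of the example above.

```rust
trait Overloads {
    // Hypothetical method that shows what the final argument looks like.
    fn describe(&self) -> String;
}

impl Overloads for i32 {
    fn describe(&self) -> String { self.to_string() }
}

impl Overloads for (i32, i32) {
    fn describe(&self) -> String { format!("({}, {})", self.0, self.1) }
}

fn bar<T: Overloads>(x: i32, y: T) -> String {
    format!("bar({}, {})", x, y.describe())
}

// `bar` receives k + 1 = 2 arguments here, so `x` is forwarded untouched.
// If a lone final argument were wrapped into a one-tuple, this call would
// need an impl of `Overloads` for `((i32, i32),)`, which does not exist.
fn foo<T: Overloads>(x: T) -> String {
    bar(1, x)
}
```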

It would link to the trait, though, right?

I wonder if

there should be some marking on the type parameter which enables this, to make it obvious from the source.
rustdoc could detect the conditions under which the arity generalisation can work, and thus render the function differently, and
if this could be restricted to generics with a single bound of a specially marked trait, something like
```
  #[arity_tupling] trait FooArgs { ... }

  fn foo<T: FooArgs>(x: T) { ... }
```
Furthermore, such a trait could be restricted to being implemented only on tuples (this restriction would have to be rather... careful with the one-argument case, since the impl would have to be impl FooArgs for (T,), but, as described above, the call site needs to be handling it "unboxed"... maybe this impl-only-for-tuples restriction isn't such a good idea.)

pnkfelix · July 17, 2014, 2:34pm

Eh, I would not be too psyched about limiting the traits to solely being implemented on tuples.

I opted not to get into it in the RFC itself, but from my POV, part of the advantage of this approach is that if the function designer makes a function that takes optional arguments, the client code can decide "I prefer to see names next to my arguments. I'm going to make a struct instead of a tuple for the Args trait."

To make this really concrete: one crate provides the foo function from the RFC:

fn foo<T:FooArgs>(required_x: int, rest: T) -> int { ... }
trait FooArgs { fn y(&self) -> int; fn z(&self) -> int; }

but the client crate can say "Fooie on your API":

struct NamedFooArgs { y: int, z: int }
impl FooArgs for NamedFooArgs { ... }
fn client() {
    use Args = self::NamedFooArgs;
    ...
    ... foo(3, Args{ y: 4, z: 5 }) ...
}
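
A version of the same idea that compiles with a current toolchain might look like this. It is a sketch only: `i32` replaces `int`, a local type alias replaces the old `use Args = ...` syntax, and `client` returns the result so it can be inspected.

```rust
// Provided by the library crate.
trait FooArgs {
    fn y(&self) -> i32;
    fn z(&self) -> i32;
}

fn foo<T: FooArgs>(required_x: i32, rest: T) -> i32 {
    required_x + rest.y() + rest.z()
}

// Defined by the client crate: a struct with named fields instead of a tuple.
struct NamedFooArgs {
    y: i32,
    z: i32,
}

impl FooArgs for NamedFooArgs {
    fn y(&self) -> i32 { self.y }
    fn z(&self) -> i32 { self.z }
}

fn client() -> i32 {
    // Short local name for the argument struct.
    type Args = NamedFooArgs;
    foo(3, Args { y: 4, z: 5 })
}
```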

(But I didn't want the RFC to get diverted into a discussion about whether this is good or bad practice, so I left it out of the RFC body itself. I guess I would appreciate feedback on whether I should incorporate it into the examples.)

(but the other notes/suggestions from @huon seem fine to me.)

Note to @huon:

Just a request for clarification: you are describing merely the final argument that bar is being called with. The function bar in your example is being called with bar(1, x), which expands in its entirety to either: bar(1, (1,2)) (as specified by the RFC) or to: bar(1, ((1, 2),)) (if one removed the single-argument special case).

I just point this out because someone unfamiliar with the RFC might misinterpret what you are saying as if you mean the call bar(1,x) is somehow turned into either the nonsensical bar(((1,2),)) or the potentially sensible (but incorrect in this context): bar(1,2).

erickt · July 17, 2014, 8:55pm

@pnkfelix: I have to admit I think the auto-tupling rules feel a bit too magical for me, but I do like the overall spirit of this. This makes me wonder about going the other way, and using O’Caml’s curry-able function argument list style to support overloading. I think it could work pretty well with traits:

fn foo<T: FooArgs> (required_x: int) T -> int {
    required_x + t.y() + t.z()
}

...

Then generic calls would be:

foo 1 ();
foo 1 2;
foo 1 (2, 3);

This though is probably too radical of a change to consider at this point.

Perhaps a more practical approach would be to provide for more sugar for the builder pattern. Right now we can do:

fn foo(required_x: int) -> Foo1 {
    Foo1 { required_x: required_x }
}

struct Foo1 {
    required_x: int,
}

impl Foo1 {
    fn y(self, y: int) -> Foo2 {
        Foo2 { foo1: self, y: y }
    }

    fn call(self) -> int {
        self.y(0).z(0).call()
    }
}

struct Foo2 {
    foo1: Foo1,
    y: int,
}

impl Foo2 {
    fn z(self, z: int) -> Foo3 {
        Foo3 { foo2: self, z: z }
    }

    fn call(self) -> int {
        self.z(0).call()
    }
}

struct Foo3 {
    foo2: Foo2,
    z: int,
}

impl Foo3 {
    fn call(self) -> int {
        self.foo2.foo1.required_x + self.foo2.y + self.z
    }
}

And used as:

foo(0).call();
foo(0).y(1).call();
foo(0).y(1).z(2).call();

It’s a pretty heavyweight pattern, but provides for overloading and named parameters. A good macro could probably reduce this down really well.

dobkeratops · July 18, 2014, 11:50am

interesting idea,

the off-putting part is it seems to me you now have 3 types of arguments… the receiver, the ‘normal arguments’ and the autotupled arguments, but it does seem to fit

are other suggestions for leveraging arity simpler/more versatile ? … e.g. the idea of treating a different number of args as a different function entirely seemed interesting as you’d be able to retrofit currying or defaults as you desire, via some macros… … what were the downsides to that

pnkfelix · July 18, 2014, 12:41pm

Hmm. I didn't interpret the auto-tupled arguments as a "new type of argument". In particular, I am anticipating the use of a trait to represent different sets of potential arguments, as outlined in the Vec4Args example where it says "Without this RFC, one can emulate this in Rust", so I saw this RFC as giving special support for that pattern when the trait is the final formal parameter in the function. Maybe I have a function-definition centric view of things (rather than a call-site centric view).

pnkfelix · July 18, 2014, 12:44pm

Ah, I should address the builder pattern in the RFC, at least to acknowledge it as an alternative way to encode optional and keyword arguments (as opposed to my personal approach of using a trait for such arguments).

And I should address macros as an alternative as well. E.g. if we added support for macros in the method-identifier position like so: a.foo!(b, c) then that might lead to an alternative encoding for optional and keyword arguments that might be more flexible (but also a tiny bit less natural than the approach outlined in the RFC, due to the exclamation point in the call syntax).

dobkeratops · July 18, 2014, 1:45pm

Hmm. I didn't interpret the auto-tupled arguments as a "new type of argument".

What I really mean there is at the minute you can juggle things into the 'receiver',... you already have the option of tupling and calling a postfix fn. This is how i've handled it in my current maths vecmaths library:

(x,y,z).to_vec4()                  // inserts 0's
(x,y).to_vec4()                    
(some_vec3,w).to_vec4()  // impl of 'ToVec4' for (Vec3<T>,T)
etc

so basically you're adding more choices for similar capability. (macro? receiver? tuple-args?)
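
For comparison, the receiver-based approach can be sketched like this. The struct layouts and the non-generic `f32` components are assumptions made to keep the example short; the library mentioned above is described as using a generic `Vec3<T>`.

```rust
// Assumed vector types; the actual library's definitions are not shown.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Vec3 { x: f32, y: f32, z: f32 }

#[derive(Debug, PartialEq)]
struct Vec4 { x: f32, y: f32, z: f32, w: f32 }

// The tuple is the receiver of a postfix conversion method.
trait ToVec4 {
    fn to_vec4(self) -> Vec4;
}

// Three components: w is filled with 0.
impl ToVec4 for (f32, f32, f32) {
    fn to_vec4(self) -> Vec4 {
        Vec4 { x: self.0, y: self.1, z: self.2, w: 0.0 }
    }
}

// Two components: z and w are filled with 0.
impl ToVec4 for (f32, f32) {
    fn to_vec4(self) -> Vec4 {
        Vec4 { x: self.0, y: self.1, z: 0.0, w: 0.0 }
    }
}

// A Vec3 plus an explicit w.
impl ToVec4 for (Vec3, f32) {
    fn to_vec4(self) -> Vec4 {
        Vec4 { x: self.0.x, y: self.0.y, z: self.0.z, w: self.1 }
    }
}
```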

this kind of juggling is what has me missing C++ overloading and also longing for D UFCS ... (I realise either would be too big a departure). But i can certainly see the reasoning of how some sort of variadic tuple fits into the existing system.

E.g. if we added support for macros in the method-identifier position like so: a.foo!(b, c) but also a tiny bit less natural than the approach outlined in the RFC

agree, and agree. You could use this for macros that read naturally whilst fighting nesting, that would be a great feature if you're supposed to leverage macros instead of adding complexity elsewhere in the language, although personally i would prefer to see natural function calls beefed up

pnkfelix · August 3, 2014, 9:30am

I just noticed in Servo another instance of the pattern where this RFC would be useful: in rust-cocoa, the send family of methods all take A:ObjCMethodArgs.

https://github.com/servo/rust-cocoa/blob/master/base.rs#L62

pnkfelix · August 5, 2014, 2:41pm

Some people expressed concern about the k+1 case appearing inconsistent.

I gave this further thought today, and thought of a generalization of this Pre-RFC that might address those concerns.

The Pre-RFC as written above tries to treat unit () and tuples (x,y,...) as concepts that can be unified in terms of how this RFC works: the final function argument, if it is a trait, is automatically turned into either () or (x, y, ...) when the number of arguments does not match.

But here is a generalization of the Pre-RFC that does not attempt such unification: Given a function F that takes n formal parameters,

fn F(x_1: X1, x_2: X2, ..., x_n: Xn) { ... }

then at each call-site for F with k actual arguments e_1, e_2, … e_k:

If k == n, then everything works like today
If k < n, then the arguments k+1, k+2, ..., n are replaced with unit () at the call-site.
If k > n, then the n’th argument is replaced with the tuple (e_n, e_n+1, ..., e_k).

(As written, these rules are very simple; perhaps too simple, since they make every call legal and delay the error detection to the trait impl resolution phase. We can circumvent this by only applying the k < n rule when the missing parameters are actually generic type parameters. That is, when the missing parameters are concrete types, then we can detect that error early and report it as a missing argument at the call-site.)

Anyway, this is basically the same as the proposal above, except that if you want to have multiple optional parameters, instead of only modelling them via a single trait packaging all arguments that is implemented for (), T1, (T1,T2), (T1,T2,T3), … (T1,...,Tk) (which is the usage pattern one would always use for the Pre-RFC as originally envisaged), now you can instead pass a trait for each optional argument, and implement each such trait for (). (Or you can keep using the original usage pattern.)
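
As a sketch of the one-trait-per-optional-argument pattern: the function `endpoint`, the traits `PortArg` and `TlsArg`, and the default values are all hypothetical, chosen only to show the shape. The unit arguments are written out explicitly, since no compiler performs the k < n rewrite.

```rust
// One trait per optional parameter, each implemented for `()` as the default.
trait PortArg {
    fn port(&self) -> u16;
}

impl PortArg for () {
    fn port(&self) -> u16 { 80 } // assumed default
}

impl PortArg for u16 {
    fn port(&self) -> u16 { *self }
}

trait TlsArg {
    fn tls(&self) -> bool;
}

impl TlsArg for () {
    fn tls(&self) -> bool { false } // assumed default
}

impl TlsArg for bool {
    fn tls(&self) -> bool { *self }
}

// n = 3 formal parameters; the last two are generic, so they may be omitted
// under the generalized rules.
fn endpoint<P: PortArg, T: TlsArg>(host: &str, port: P, tls: T) -> String {
    let scheme = if tls.tls() { "https" } else { "http" };
    format!("{}://{}:{}", scheme, host, port.port())
}
```

Under the generalized rules, a call with only the host would expand to `endpoint(host, (), ())`, and a call with host and port to `endpoint(host, port, ())`.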

I am not actually sure that this suggestion is an improvement over the change described in the original Pre-RFC draft above. But these rules may seem less “magical”; I am not sure, what do you think?

pnkfelix · March 25, 2019, 8:22am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
