Pre-RFC: #[repr(rust)] variadic functions

I was referring to Iterators in general. It's likely that not all cases will involve IntoIter for arrays. For example, due to the current lack of genericity I tend to shun arrays in favor of Vec in extant code, even though that brings with it a heap allocation penalty. Perhaps that will someday be solved, but today is not that day, nor will it soon come.

I don't see what using Vec<T> has to do with variadic arguments. Iterators are built on top of whatever data structure they iterate over. It doesn't seem to me that the fact that the data happens to be on the heap is a concern.

Vec is an example. The point is that you have no control over, and don't even know, what implements IntoIter in general, or how it does so. All you know is that it adheres to the trait contract.

Yes, that is the point: to make the feature as general as possible. However, the alternative of allocating a known-size array or Vec and then copying over all the elements is substantially more work in general.

Sebastian Malton

Well, solved ones:

This is why you're equating Iterator<=>allocation. The easiest (and it will be for a while) way to get an [into]iterator over T is a Vec<T>, because [T; N] isn't IntoIterator yet, even for N <= 32. There's work on future-compat lints to make adding this impl break less (poorly written) code that coerces to using &[T] as IntoIterator (is it in yet? I don't know).

If you take Vec<T> for your variadic because it's easy, it will require allocation. If you take impl IntoIterator<Item=T>, then you can take either the easy Vec<T> or the allocation-free combination of iter::Chain and iter::Once that I've provided macro sugar for above in isplat!.
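As a rough sketch of that choice (the function name `sum_all` and its body are assumptions for illustration, and the once-chain is written out by hand rather than through isplat!), the same function can accept both the allocating and the allocation-free form:

```rust
use std::iter::once;

// Hypothetical stand-in for a "variadic" function that sums its arguments.
fn sum_all(start: usize, items: impl IntoIterator<Item = usize>) -> usize {
    items.into_iter().fold(start, |acc, x| acc + x)
}

fn main() {
    // Easy option: put the arguments in a Vec, which allocates on the heap.
    let from_vec = sum_all(0, vec![1, 2, 3]);

    // Allocation-free option: chain single-element iterators together.
    let from_chain = sum_all(0, once(1).chain(once(2)).chain(once(3)));

    assert_eq!(from_vec, from_chain);
    println!("{} {}", from_vec, from_chain);
}
```

The callee is identical in both cases; only the caller decides whether to pay for the allocation.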

And don't underestimate how clever the optimizer can be. If the function is generic and thus knows the exact size/contents of the iterator at compile time, it might even be able to optimize the function to passing them as individual arguments in the calling convention.

And that's a bad thing how? There's no real reason to prevent the use of a Vec for a variadic. Let the caller decide which is better for their case, the static isplat or the dynamic Vec. For huge arrays, or when optimizing for code size with calls to your function at many different lengths, the Vec may be the best option.

It's the same principle as (&dyn Trait): Trait: let the caller decide their tradeoffs. You write the generic version, but the caller is allowed to choose virtual dispatch (for variadics, a dynamically sized collection such as Vec), because in some cases that's better than copy/pasting the generic function for every use.
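A small sketch of the caller opting into virtual dispatch, assuming a hypothetical `sum_all` that takes `impl Iterator`. It relies on the standard library's `Iterator` impl for `&mut I` where `I: Iterator + ?Sized`, which covers `&mut dyn Iterator`:

```rust
// Hypothetical generic function: one copy is compiled per concrete iterator type.
fn sum_all(start: usize, items: impl Iterator<Item = usize>) -> usize {
    items.fold(start, |acc, x| acc + x)
}

fn main() {
    // Static dispatch: the concrete iterator type is known at compile time.
    let a = sum_all(0, [1usize, 2, 3].iter().copied());

    // Dynamic dispatch: `&mut dyn Iterator` is itself an `Iterator`, so every
    // call made this way shares one instantiation of `sum_all`.
    let mut source = vec![1usize, 2, 3].into_iter();
    let erased: &mut dyn Iterator<Item = usize> = &mut source;
    let b = sum_all(0, erased);

    assert_eq!(a, b);
    println!("{} {}", a, b);
}
```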

To be completely fair: that is the main downside of using impl IntoIterator over Vec or even [_; N]: generic code duplication. In the [_; N] case every same-length call uses the same overload. In the Vec case, every call uses the same overload. With impl IntoIterator, it's much easier to have every call create a new unique overload, with the code duplication that comes along with it. The ones purely iter::chaining together iter::onces will all be the same overload, but the moment you introduce a different iterator into the chain, it's likely a unique type and thus overload. And this one can't be handled via trait objects, because IntoIterator isn't object safe (yet? alloca?).
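To make the "unique type, thus unique overload" point concrete, here is a minimal sketch that compares iterator types via `TypeId`. The helper `same_type` is a made-up name for this illustration; a generic function would typically be instantiated once per distinct type it is called with:

```rust
use std::any::TypeId;
use std::iter::once;

// Hypothetical helper: true when both arguments have the same concrete type.
fn same_type<A: 'static, B: 'static>(_: &A, _: &B) -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn main() {
    // Two chains built purely from `once` share one type...
    let a = once(1u32).chain(once(2));
    let b = once(7u32).chain(once(8));
    assert!(same_type(&a, &b));

    // ...but splicing in a different iterator produces a different type.
    let c = once(1u32).chain(vec![2, 3]);
    assert!(!same_type(&a, &c));
}
```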

So maybe you could use impl Iterator instead of impl IntoIterator; after all, ichain!/isplat! return iterators directly, not just iterables. This would even help discourage the simple case of just passing a Vec or other collection, as it'd have to be vec![...].into_iter(), which would hopefully hint at the alternative allocation-free options.

That is a very well written post, thank you.

One point though: why have both ichain! and isplat!, since isplat! covers the cases where ichain! works?

The main reason I wrote both is that ichain is something that might actually be upstreamed to itertools. ichain follows the existing pattern with izip; isplat is more unique/specialized, and would be a harder sell to include in itertools.
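For readers without the earlier post at hand, something in the spirit of ichain! could look roughly like the following. This is a hypothetical reconstruction written for illustration, not the definition posted earlier in the thread, and `collect_chain` is a made-up helper:

```rust
// Hypothetical reconstruction: chains any number of `IntoIterator` values,
// in the same variadic-macro style as itertools' `izip!`.
macro_rules! ichain {
    ($first:expr $(, $rest:expr)*) => {
        ::std::iter::IntoIterator::into_iter($first)
            $(.chain($rest))*
    };
}

// Made-up helper showing the macro with three different iterable types.
fn collect_chain() -> Vec<i32> {
    ichain!(vec![1, 2], Some(3), 4..6).collect()
}

fn main() {
    println!("{:?}", collect_chain());
}
```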

What about interoperability with c_variadic?

I think that there need only be a macro (probably built-in) that converts from any iterator to whatever representation is needed for c variadics.

They are unsafe anyway.

(args: ...) and (args: ...T) are two different things.

I always feel there's a need for some "tuple_slice" thing, which is an abstraction over ordered heterogeneous sequence. Not sure how to make it zero cost though.

Variadic generics.

So the only real concern that I see for a feature like this is the lack of the ability to use non-boxed trait objects. However, since they are not supported at all I don't think that this is/should be a blocker for such a feature.

There is also no need for a "spread" operator, like in golang, since the way that it would be implemented seems to be with iterators and IntoIterator<T>.

It would be nice if there was interoperability between this feature and c_variadic, but I need to do some more investigation as to how this would work. If my understanding of the spec for C-style variadic functions is correct, the type T would have to be Sized, and va_arg, va_start, va_end, and va_copy seem to form a sort of iterator, which, hopefully, makes it "easy" to put on top of Rust's iterators.

I don't think there would be much to gain from that. C variadics are kind of awful, and extremely type-unsafe, even more than regular C.

If you want to call, e.g., C's printf from Rust, you'd need a fairly heavy Rust wrapper to enforce type safety, at which point you might as well use a native implementation.

More fundamentally, the c_variadic feature in Rust is an FFI feature. Unless your project includes some C code declaring or calling a C variadic function, IIUC there's no benefit to using c_variadic in your Rust code that you can't achieve far more easily and safely with standard Rust containers, slices, trait objects and/or variadic macros. So even if stronger interop was actually possible, I don't see how we'd get any significant value out of it.

I have written up the beginnings of an RFC, here is the rendered link: https://github.com/Nokel81/rfcs/blob/be1af954c7c013f4e6735cc2d20e360dd70f57db/text/0000-rust-variadics.md.

It would be great if it could be read over before I put it up against the rfc repo.

State of the art is to open a self-PR (i.e. from your/branch to your/master) so people can leave comments in the thread there. I've got a few minor notes I'd leave there.

But the TL;DR of my notes is that you need to show a meaningful reason this would be better than just taking impl [Into]Iterator.

And one other big note: what about a T that implements IntoIterator<T>?

I agree with CAD97 here. A language change for the homogeneous case doesn't seem worth it when people are already working on making [T; N]: IntoIterator<Item=T>, at which point just calling foobar(MyType::new(), [1, 2, 3]) seems plenty good enough.

I do think a language change will be worth it at some point to allow heterogeneous variadic functions, which includes the homogeneous case but will also open up a bunch of interesting possibilities for Fn and tuples and friends, like const generics are doing for arrays.

I like the whole idea!

Note that std::iter::once and Iterator::chain have some memory overhead, that can't be optimized away unless the function is inlined (see this godbolt link).

Once<T> is internally represented as Option<T>, and Chain<A, B> is defined as

pub struct Chain<A, B> {
    a: A,
    b: B,
    state: ChainState,
}

If sum_all(0, 1, 2, 3, 4) desugars to

sum_all(0, once(1).chain(once(2)).chain(once(3)).chain(once(4)))

then the concrete type of the iterator is

Chain<Chain<Chain<Once<i32>, Once<i32>>, Once<i32>>, Once<i32>>

So it might be better to use something like ArrayIter<[i32; 4]> instead to minimise memory overhead. Such an iterator doesn't exist yet, but I believe there are plans to implement IntoIterator for [T; N].
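The overhead can be checked directly with `std::mem::size_of`. The sketch below only prints the sizes, since the exact layout of `Chain` and `Once` is an implementation detail that can differ between compiler versions; the type alias and function names are made up for this example:

```rust
use std::iter::{Chain, Once};
use std::mem::size_of;

// The concrete iterator type from the desugaring above.
type OnceChain4 = Chain<Chain<Chain<Once<i32>, Once<i32>>, Once<i32>>, Once<i32>>;

// Size of the nested chain; depends on the standard library's layout.
fn chain_size() -> usize {
    size_of::<OnceChain4>()
}

// Size of a plain array holding the same four values.
fn array_size() -> usize {
    size_of::<[i32; 4]>()
}

fn main() {
    println!("chain of four Once<i32>: {} bytes", chain_size());
    println!("[i32; 4]:                {} bytes", array_size());
}
```

Each `Once<i32>` has to track whether its value was already yielded, so the chain cannot be as compact as the bare array.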

So sum_all(0, 1, 2, rest...) could desugar to

sum_all(0, [1, 2].into_iter().chain(rest))
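For reference, Rust releases from 1.53 onward do implement IntoIterator for arrays by value, so a hand-written version of that desugaring can be tried today. This is only a sketch: `sum_all` is a hypothetical function, and the fully qualified `IntoIterator::into_iter` call is used because the method-call form `[1, 2].into_iter()` still yields references in editions before 2021:

```rust
// Hypothetical function from the discussion above.
fn sum_all(start: usize, items: impl Iterator<Item = usize>) -> usize {
    items.fold(start, |acc, x| acc + x)
}

fn main() {
    let rest = vec![3usize, 4];

    // By-value array iterator (Rust 1.53+), chained with the remaining items.
    let total = sum_all(0, IntoIterator::into_iter([1, 2]).chain(rest));

    assert_eq!(total, 10);
    println!("{}", total);
}
```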

Syntax

I'd prefer a slightly more explicit syntax:

fn sum_all(start: usize, items: ...impl Iterator<Item = usize>) -> usize;

This has several benefits:

  • The type of items is visible, so less "magic" is happening
  • You can add lifetimes or trait bounds, e.g.
    items: ...impl Iterator<Item = Foo> + 'a
    
  • You can use a generic type and specify bounds in a where clause, giving you more flexibility
  • We can later add support for other types of variadics, some examples that come to mind:
    • impl DoubleEndedIterator<Item = Foo>
    • impl Stream<Item = Foo>
    • Box<dyn Iterator<Item = Foo>>
    • Vec<Foo>

We could just do this now

fn sum_all(start: usize, items: impl IntoIterator<Item = usize>) -> usize;

and call it like

sum_all(0, [1, 2]); // assuming arrays implement `IntoIterator` sometime in the future

// in the meantime, we can use something like `arrayvec` or once chains

So there is no real need for homogeneous variadic functions.
