Idea: Making unnameable types nameable


#1

To start, this idea isn’t fully formed, and I’m not sure if it would be tractable to implement in practice. But I’m not sure it wouldn’t be, so I figured I’d throw it out there, in case others can improve upon it.

I thought of this in a conversation in #rust on April 29, 2018, but didn’t write it up properly until now.

Motivation

For a long time, it was impossible to return a closure by value, because its type can’t be named. This was recently solved by impl Trait:

fn foo()->impl Fn()->i32 + Clone {
  || 5
}

This is good if what you want is an existential type. But that isn’t exactly the same problem as type nameability. And either way, this return value is still an unnameable type, which can be inconvenient for the caller. For instance, I can’t annotate the types of these variables, no matter how much I want to:

let closure = foo();
let clone = closure.clone();
let iter = std::iter::repeat (closure.clone()).take (3);

Also, some conceptual return types are inherently impossible to express through the impl Trait concept (Not my example; suggested by someone in the #rust channel, although unfortunately I don’t have a record of who it was):

fn bar()->impl Into <impl Debug> + Into <impl Debug>

Here, we have a return type that can be converted into either of 2 different Debug types. But the caller could never use the into() implementations of the returned value, because it would be ambiguous which one you were trying to call. In order to call either of them, the caller would have to name one of the types, like so:

let value = bar();
let debug: UnnameableType = value.into();
or
let debug = Into::<UnnameableType>::into(value);
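The ambiguity can be reproduced on stable Rust with ordinary named types. This is only a sketch: `Both`, `Debug1`, `Debug2` and `convert_both` are hypothetical stand-ins for the value returned by `bar()` and its two conversion targets. It shows that each call to `into()` compiles only once the target type has been spelled out.

```rust
// Hypothetical stand-ins for the two `Debug` types that `bar()` converts into.
#[derive(Debug, PartialEq)]
struct Debug1(u32);
#[derive(Debug, PartialEq)]
struct Debug2(u64);

// Hypothetical stand-in for the value returned by `bar()`.
#[derive(Clone, Copy)]
struct Both(u8);

// Implementing `From` provides the matching `Into` impls through std's blanket impl.
impl From<Both> for Debug1 {
    fn from(b: Both) -> Self {
        Debug1(u32::from(b.0))
    }
}
impl From<Both> for Debug2 {
    fn from(b: Both) -> Self {
        Debug2(u64::from(b.0))
    }
}

fn convert_both(value: Both) -> (Debug1, Debug2) {
    // A bare `let d = value.into();` would be ambiguous between the two impls,
    // so the target type has to be named in one of these two ways.
    let first: Debug1 = value.into();
    let second = Into::<Debug2>::into(value);
    (first, second)
}
```

With unnameable targets, neither the annotation nor the turbofish form can be written, which is the gap described above.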

This situation would be resolved by having a way to name all of the unnameable types in question. But how to do that?

Here’s one way that wouldn’t fully solve the problem: C++ partially deals with type naming through the decltype keyword. However, this wouldn’t solve the second problem, and often wouldn’t be an ergonomic approach even if it did work:

fn baz()->decltype(|| {…very long and complex closure…}) {
  || {…very long and complex closure…}
}

Idea

My idea has 2 components: first, a way to assign a name to the type of any value. (This could be used to name the types of closures.) Second, a way to return a type from a function to its caller.

Idea 1: To name the type of a value

Currently, there is no legal type annotation for a variable with an unnameable type. So, I thought, what if we made the type annotation be a way to assign a name to the type?

let closure: MyClosure = || 5; // this line *defines* the name MyClosure to refer to the type of that closure
let clone: MyClosure = closure.clone();
let iter: Take<Repeat<MyClosure>> = std::iter::repeat (closure.clone()).take (3);

If you annotate multiple variables with the same identifier, they would be required to be the same type, much the way match arms have to be the same type even if you don’t specify the type explicitly.

The biggest open question is, what would be the scope in which the identifier MyClosure would have its meaning? My starting assumption was that it would extend to the boundaries of the function it was defined in, but I’m not exactly sure of my reason for assuming that.
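For comparison, the closest thing I know of on stable Rust is a generic parameter, which already gives a function-scoped name to a type the caller cannot spell. This is a sketch of that existing mechanism, not of the proposed syntax; `use_closure` is a hypothetical helper.

```rust
use std::iter::{Repeat, Take};

// Inside this function, `MyClosure` names whatever closure type the caller
// passed in, so every local can carry a type annotation.
fn use_closure<MyClosure>(closure: MyClosure) -> Vec<i32>
where
    MyClosure: Fn() -> i32 + Clone,
{
    let clone: MyClosure = closure.clone();
    let iter: Take<Repeat<MyClosure>> = std::iter::repeat(closure.clone()).take(3);
    // Call each repeated closure, then the extra clone, and collect the results.
    let mut results: Vec<i32> = iter.map(|f| f()).collect();
    results.push((clone)());
    results
}

fn foo() -> impl Fn() -> i32 + Clone {
    || 5
}
```

The name only exists inside `use_closure`, which matches the "boundaries of the function" scoping assumed above, but it requires moving the code into a separate function.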

Idea 2: To “return” one or more types from a function

In the case of the return value with 2 Into implementations, we’d like to communicate 3 types back to the caller. That’s more than Idea 1 can take care of. So the function needs to have a list of types associated with it.

A function can already have a list of types associated with it: its generic parameters. So, to borrow the syntax from that:

fn foo<returntype T: Fn()->i32 + Clone>()->T {
  let result: T = || 5;
  result
}

That’s explicitly using the syntax from Idea 1. But this implicit form would also work:

fn foo<returntype T: Fn()->i32 + Clone>()->T {
  || 5
}

Any parameter labeled with the keyword returntype would be determined by the function definition, rather than by the caller. But it could be named by the caller as an (otherwise-undefined) identifier in order to receive the type, as follows:

let closure = foo::<MyClosure>();

which would have the same effect as the first line of the example from Idea 1. To deal with the 2 Into implementations, you would do this:

fn bar<returntype T: Into <U> + Into <V>, returntype U: Debug, returntype V: Debug>()->T {
  …
}

…

let value = bar::<_, Debug1, Debug2>();
let debug1: Debug1 = value.into();
or
let debug2 = Into::<Debug2>::into(value);

Notes/complications

In the examples I wrote above, I wrote specific trait bounds for the returntypes. On one hand, it seems like it isn’t necessary because they wouldn’t have to be existential types – the caller could be permitted to rely on the concrete type, whatever it is. On the other hand, that means the function signature isn’t very clear about what it’s returning – you’d have to read the body of the function in order to figure out what you were permitted to do with the return values. So it might be clearer to only allow the returntypes to function as existential types. On the third hand, it does seem like it would sometimes be useful to allow the caller to rely on the concrete type. Maybe the pragmatic compromise would be to allow that only within the same module or the same crate.

In the case of a function with no ordinary generic parameters, the returntypes are actually fixed regardless of the inputs. So in that case, it might be possible for the type name created by Idea 1 to exist as an item in the same module as the function. Then it could potentially be used more easily, and it wouldn’t require Idea 2 to communicate it to the caller.

On the other hand, for functions with generic parameters, the function essentially has a mapping from generic parameters to returntypes. The syntax I suggested might not be ideal, because it means there’s no way to refer to that mapping WITHOUT calling the function. (I haven’t thought of a scenario where that would be desirable, but it could theoretically happen.)
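For what it's worth, when the output types are nameable, stable Rust can already express such a mapping without calling anything, through a trait with an associated type. A minimal sketch, assuming a hypothetical `Describe` trait (not part of the proposal above):

```rust
use std::fmt::Debug;

// Hypothetical trait: maps an input type `Self` to the type produced for it.
trait Describe {
    type Output: Debug;
    fn describe(self) -> Self::Output;
}

impl Describe for u8 {
    type Output = u32;
    fn describe(self) -> u32 {
        u32::from(self) + 1
    }
}

impl Describe for bool {
    type Output = &'static str;
    fn describe(self) -> &'static str {
        if self { "yes" } else { "no" }
    }
}

// The mapping is referred to by path, `<u8 as Describe>::Output`, with no call involved.
fn widened() -> <u8 as Describe>::Output {
    7u8.describe()
}
```

This does not help when `Output` would have to be a closure or other unnameable type, which is exactly the case the returntype syntax is aimed at.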

In conclusion, there’s a lot about these ideas that I’m unsure about. But I did some web searches for previous discussions about type nameability, and I didn’t find anything along these lines, so I’d be interested to hear what people think.


#2

What about something simpler:

fn foo() -> foo::Closure {
    let result: type Closure = || 5i32;
    result
}

type in type position (as opposed to item position) could be used to declare and match a type, kind of like a binding in a pattern match.


#3

You will, see RFC 2071.

That RFC solves most of these problems.


#4

I don’t know… RFC 2071 cures the symptoms, not the cause.

I can’t say I like the proposed syntax here, or a way to „return“ a type, but assuming this would be something to discuss and find a good solution for, being able to name all types would probably solve a lot of problems ‒ and it attacks the root cause, not just the symptom. On the other hand, I’m not sure if these problems still need to be solved, with impl Trait and all that ‒ if we could have been able to name all the types from the beginning, we probably wouldn’t have needed impl Trait at all.


#5

I can’t really understand this sentiment. Superficially, yes, the root cause is that there are types which are introduced without being bound to an identifier. But the only direct solution to that would be a scheme to tie some unique “inherent” name to every unnameable type (similar to the Foo in struct Foo { ... }). That, in turn requires new syntax in every place – OP’s proposal, for example, adds new syntax for impl Trait return types, but it only helps with existential return types, not with closure types. Meanwhile, RFC 2071 solves all the same problems (and more), in all locations at once (return types, locals, and anywhere else), with a single mechanism.

So why do we want an “inherent” name in the first place? The only thing RFC 2071 has to “give up” (or rather, not pursue in the first place) is giving one special inherent name to each such type. If we instead just introduce existential type variables in the module scope and show the compiler “by example” which (nameable or unnameable) type each variable abstracts over, we achieve the exact same effects except that the unnameable type is no longer uniquely associated with “its name”; instead it can be referred to by several names if desired. That may not literally solve what we initially identified as root cause here, but it actually solves a broader (and also very natural) problem which subsumes all the practical problems caused by unnameable types.


#6

Slightly off-topic, but what I’m kind of missing in the RFC is using impl Trait in field position. Is that coming at all?


#7

Here’s an example for abstract types (as defined in the RFC @Centril mentioned):

abstract type Foo: MyTrait<Item = i32>;
fn foo() -> Foo

Note: The concrete syntax is in flux (see debate about this in the tracking issue).

Alternatively, there was also an extensive debate about making it possible to name the output of a function by accessing its Output associated type in Pre-Pre-RFC: async methods & bounding async fns. I think that this would work best in addition to abstract types, not as a replacement.

I don’t think this is coming. Generic structs require that you define type parameters for a reason (e.g. T in struct Vec<T>). Type params are needed to name the type later, e.g. Vec<i32>. With impl Trait there would be no type parameters and the system can’t work without them.

You’re disregarding a major use case of impl Trait. Instead of returning something like std::iter::Filter<std::iter::Map<std::vec::IntoIter<i32>, …>, …> you can return impl Iterator<Item = i32>. It makes it possible to conveniently not expose implementation details as public API.
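A small sketch of that use case on stable Rust; the function name `doubled_multiples_of_four` is made up for illustration.

```rust
// The concrete type is an adapter chain (a Filter over a Map over
// vec::IntoIter<i32>) with two closure types inside it. Callers only see the
// `Iterator<Item = i32>` bound.
fn doubled_multiples_of_four(values: Vec<i32>) -> impl Iterator<Item = i32> {
    values
        .into_iter()
        .map(|x| x * 2) // double every element
        .filter(|x| x % 4 == 0) // keep only results divisible by four
}
```

The adapter chain can later be swapped for a different implementation without changing the signature that callers depend on.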


#8

Well, I’ve had a night’s sleep and read RFC 2071, so now I’m going to have another go at thinking about this. I’m pretty sure I’ve seen 2071 before, but I didn’t realize how much it overlapped with this situation.

The named existential types from 2071 seem like a better, more well-thought-out version of my Idea 1. That’s pretty cool. After understanding that, the main thing I wasn’t sure about is this: Can 2071’s generic existential types handle the “2 Into implementations” issue when returning from a generic function? And having worked on it a bit, I think the answer is YES! Here, let me write a full example explicitly:

pub trait SomeTrait: Debug {}

pub existential type Debug1 <T: SomeTrait> = impl Debug;
pub existential type Debug2 <T: SomeTrait> = impl Debug;
pub existential type IntoDebugs <T: SomeTrait> = impl Into <Debug1 <T>> + Into <Debug2 <T>>;

struct ConcreteIntoDebugs <T> (T);
#[derive (Debug)]
struct ConcreteDebug1 <T> (T, u32);
#[derive (Debug)]
struct ConcreteDebug2 <T> (T, u64);

impl <T: SomeTrait> Into <Debug1 <T>> for ConcreteIntoDebugs <T> {
  fn into (self)->Debug1 <T> {ConcreteDebug1 (self.0, 5)}
}

impl <T: SomeTrait> Into <Debug2 <T>> for ConcreteIntoDebugs <T> {
  fn into (self)->Debug2 <T> {ConcreteDebug2 (self.0, 5)}
}

pub fn return_into_debugs <T: SomeTrait> (argument: T)->IntoDebugs <T> {
  ConcreteIntoDebugs (argument)
}

My Idea 2, in hindsight, just represented a mapping from the generic parameters to a series of types, and 2071 lets us present that mapping explicitly as a series of existential type declarations.

I do see one inconvenience left. What about this case:

pub existential type Debug1 = impl Debug; //I want this to be u32
pub existential type Debug2 = impl Debug; //I want this to be u64
pub existential type IntoDebugs = impl Into <Debug1> + Into <Debug2>;

pub fn return_into_debugs ()->IntoDebugs {
  0u32
}

The inconvenience here is that – unless I’ve missed something from 2071 – there’s no natural place to specify the concrete types of Debug1 and Debug2 if the module doesn’t actually use any objects of those types directly. Of course, I could force it:

#[allow (dead_code)]
fn force_debug1()->Debug1 {0u32}

But it does seem like a bit of a gap in the syntax.


#9

Maybe to clarify ‒ I was saying that when we already have impl Trait, the need to be able to name the types is not that big any more (even when there still could be some use cases).

On the other hand, if we had named closure types first, there still would be some use cases for impl Trait, but the need would be much smaller and that need would probably not warrant a big feature like impl Trait (and even if it happened eventually, considering how new impl Trait is, it would not be here yet without such a pressing need).

So, while either one has certain benefits over the other, and it would have been great to consider having named closure types before we had impl Trait as an alternative (I guess it has been considered), and I believe it might be worth a short while to think about a natural way of naming them, the time for that is probably mostly past.


#10

IMO, impl Trait has actually made the problem worse, not better.

Having unnameable types was always a serious problem in Rust, but as long as they couldn’t propagate across API boundaries, the damage was limited. impl Trait was designed to treat the symptoms of having unnameable types without addressing the root cause, and as a result, exacerbated the problem. Now, unnameable types are not only easy to use, but the language actively encourages them by making it easier to use them than to do things the proper way. Combined with the viral nature of impl Trait, this is a recipe for an explosion in unnameable types and hence the need to deal with them.


#11

Just thought of another small inelegance related to 2071’s named existential types, and some other thoughts expanding on it:

I presume you can use one of the existential types named using the 2071 syntax, like this:

mod module {
  pub existential type Existential = impl Clone + Hash + Debug;
  // … something that gives Existential a concrete type
}

pub use module::Existential;

And even if you can’t name it directly, you can use 2071 to give it a name:

mod module {
  pub fn returns_existential()->impl Clone + Hash + Debug;
}

pub existential type Existential = impl Clone + Hash + Debug;
// … something that gives the type Existential to the return value of module::returns_existential()

But you can’t use it, meaning that you have to duplicate the list of trait bounds, as seen above.

Again, it seems like there’s a slight difference between making these type declarations existential and making them implicit (defined by how the typename is used in the module, rather than defined on the type Typename = Type line). 2071 provides declarations that are both existential and implicit, and we already have declarations that are neither existential nor implicit, but we don’t have declarations that are existential but explicit (as I mentioned in my last post) nor ones that are implicit but not existential (the subject of this post). With implicit, non-existential type aliases, you could do this:

mod module {
  pub fn returns_existential()->impl Clone + Hash + Debug;
}

pub type Existential;
// … something that gives the type Existential to the return value of module::returns_existential()

Of course, this has the problem I mentioned earlier where the type line gives you no information about what the type can do, and you have to look through the whole module to find what it can do. So 2071 pragmatically hides this problem behind the fact that it’s also an existential type and therefore expresses what it can do directly. But that leaves a few gaps in what the feature can do. Ideally, you might be able to do something like this:

use [the return value of module::returns_existential] as Existential;

Which, like any other use statement, doesn’t explicitly say what the type is like, but tells you exactly where you can look it up.

Or maybe we could encourage returning 2071-style named existential types rather than ever returning an anonymous impl Trait, for the convenience of callers. Hmmm…


#12

That’s too bad, as it means that something like an iterator still cannot be stored in a struct without boxing, and that’s annoying at best, and has real performance costs at worst. Perhaps being able to store instances of unboxed DSTs in structs would help there?
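For reference, the boxed form being objected to looks roughly like this on stable Rust. It is a sketch; `Container`, `numbers` and `first_three` are illustrative names.

```rust
// Boxing erases the concrete iterator type, so the struct needs no type
// parameter, at the cost of a heap allocation and dynamic dispatch on `next`.
struct Container {
    inner: Box<dyn Iterator<Item = u32>>,
}

fn numbers() -> impl Iterator<Item = u32> {
    std::iter::repeat(0).map(|i| i + 5)
}

fn first_three() -> Vec<u32> {
    let mut c = Container { inner: Box::new(numbers()) };
    // Borrow the boxed iterator and pull three items through the vtable.
    c.inner.by_ref().take(3).collect()
}
```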


#13

You can use abstract type (assuming that happens) here, can’t you?

abstract type InnerIterator = impl Iterator<Item=u32>;
fn numbers() -> impl Iterator<Item=u32> {
    iter::repeat(0).map(|i| i+5)
}
struct Container(InnerIterator);

fn main() {
    let it = numbers();
    let c = Container(it);
}

@MajorBreakfast seemed to imply that impl Trait in structs would be generic instead of existential. I’d expect existential, but given that both generic and existential impl Trait exist in function argument and return position respectively, I can see either interpretation being used for within structs.

(That said, I’m still on the side of supporting argument impl Trait as unturbofishable generics but that’s off-topic here. I just wish that didn’t preclude existential struct members using impl Trait syntax.)

EDIT:

Looking at my example again my brain decided to intuit that tuple struct position impl Trait should be generic (it looks too much like function syntax) and named member impl Trait should be existential. That pretty much convinced me that impl Trait in structs doesn’t have that one obvious meaning that function argument and function return (modulo details) do.


#14

@jjpe I meant this:

struct MyStruct1<I: Iterator<Item = i32>> {
    inner: I,
}
fn my_fn1(x: MyStruct1<MyIter>) { ... }

struct MyStruct2 {
    inner: impl Iterator<Item = i32>,
} 
fn my_fn2(x: MyStruct2<???>) { ... } // No way to name the type

As you can see it’s already possible with generics. In particular it doesn’t require boxing. It just can’t use the impl Trait syntax.
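A runnable sketch of the generic variant, assuming a hypothetical `make_iter` that returns an unnameable iterator. Code that cannot name the concrete type simply stays generic over it.

```rust
// A generic field stores the iterator inline; no Box is involved.
struct MyStruct1<I: Iterator<Item = i32>> {
    inner: I,
}

// Hypothetical source of an iterator whose concrete type cannot be written out.
fn make_iter() -> impl Iterator<Item = i32> {
    (1..4).map(|x| x * 10)
}

// Consumers that cannot name the iterator type remain generic over it.
fn sum_all<I: Iterator<Item = i32>>(s: MyStruct1<I>) -> i32 {
    s.inner.sum()
}

fn build_and_sum() -> i32 {
    // The type parameter `I` is inferred from the value returned by `make_iter`.
    let wrapped = MyStruct1 { inner: make_iter() };
    sum_all(wrapped)
}
```

The cost is that the type parameter spreads to every signature that mentions the struct.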

How would existential work?


#15

The exact way that an abstract type or return position existential works. The compiler figures out the concrete type and complains if it can’t unify it. You can only use it through the trait contract.

The field probably wouldn’t be able to be public.


#16

Can you give a code example? I don’t understand how this can work.

  • How can I name the resulting struct type?
  • How is a concrete type assigned to the abstract type?
  • How is it indicated which part is abstract?

#17

fn make_iter() -> impl Iterator<Item=u32> { ... }

struct MyIterator {
    inner: impl Iterator<Item=u32>,
}

impl MyIterator {
    fn new() -> Self {
        MyIterator { inner: make_iter() }
    }
}

impl Iterator for MyIterator {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        self.inner.next()
    }
}

MyIterator. There’s only one such type.

There would have to be at least one instance in the crate where the field is assigned. It would be treated the same as abstract type would – the compiler complains if more than one type is used for it. I’m not sure exactly how abstract type would function, I’ve just seen it in passing.

In effect, this is a simpler way of using abstract type to embed an unnameable type in a struct, except it still leaves the type unnamed.

I don’t think the field could be assignable from outside of the current crate, as the other crate a) should only rely on the trait impl anyway and b) has no way of knowing what the actual concrete type is.

I’m not sure exactly what the question is here. It should function the same way as I (and probably others) intuit impl Trait would work on let binding type hints:

let it: impl Iterator<Item=u32> = make_iter();

That is, to be specific, that the field is bound to implement the given trait(s), will only be used through said trait(s), and I don’t care what the actual concrete type is, figure it out for me.


(EDIT: It’s possible I mixed up terminology and used existential as the wrong polarity thus the confusion, if so, I apologise. It’s so late where I am that it’s early again. I’m going to go recharge and respond more coherently later.)


#18

@CAD97 Thanks for the example!

I think that this would work. However, I think that it isn’t better than our current system. Here’s what I don’t like so much:

  • The context doesn’t make it clear that this impl Trait is existential. To me it looks like it could be generic. With return-impl Trait it is intuitively clear that it has to be existential.
  • The struct field can’t tell us where its concrete type is assigned. It can be anywhere, even in another file because the impl for the struct can be in another file.
  • It’s not so simple as return-impl Trait because there can be more than one point where we assign a concrete type. These concrete types all need to be one and the same.

#19

WARNING: this is a bit of a rant; it solely represents my personal opinion.

Honestly impl Trait was always about existential types.

The whole thing about making it universal (generic) for function parameters was because 1. any (current) use case which works with it being existential for params also works with it being generic and 2. because it is supposedly easier for people new to Rust.

I say supposedly because now people have to learn that it’s sometimes existential and sometimes isn’t, which in my POV is more complex, not simpler. (And implicitly adding a generic parameter to a function brings a bunch of problems, from how the turbofish operator is supposed to work with it to making some forms of HKT impossible, or at least less consistent, for Rust (not that we are anywhere close to them anyway).)
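To make the turbofish point concrete, here is a small sketch on stable Rust; the function names are made up for illustration.

```rust
use std::fmt::Debug;

// Argument-position impl Trait: the function is generic, but the parameter is anonymous.
fn show_impl(x: impl Debug) -> String {
    format!("{:?}", x)
}

// Named generic parameter: callers may spell it out with the turbofish.
fn show_generic<T: Debug>(x: T) -> String {
    format!("{:?}", x)
}

fn compare() -> (String, String, String) {
    (
        show_impl(5u8),
        show_generic(5u8),
        // The equivalent `show_impl::<u8>(5)` is rejected by the compiler.
        show_generic::<u8>(5),
    )
}
```

Both functions are monomorphised the same way; the only difference visible to the caller is whether the type argument can be written explicitly.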

With the original idea behind impl Trait, enabling it in other positions would be straightforward:

type X = impl SomeTrait; //<- type X always did just name a Trait and so does it now
struct A {
  x: impl SomeTrait //<- clearly just one specific unnamed type
}
let x: impl Trait = ...; //<- also clear 
fn bla(x: impl Trait) {} //<- bla is not generic; slightly confusing if you didn't learn about impl Trait yet, but we could just discourage this usage with a lint or not enable it
trait T {
  type Y: SomeTrait;
  fn a() -> Self::Y;
}
impl T for S  {
  type Y = impl SomeTrait; 
  fn a() -> Self::Y; //<- would be lovely, maybe a bit tricky to impl in the compiler
}

But now none of these things are clear for a beginner, and in most cases where I wrote impl Trait in a function parameter I changed it later on, as impl Trait was too inflexible…

IMHO making impl Trait possibly universal [crossed out: was the worst decision made in Rust since 1.0] (though there were also arguments for it, it’s just that I personally don’t think it was worth it) was a bad decision.

sorry, the crossed out text might have been a bit impolite