[Pre-RFC] Capturing inferred types in a type alias

Hi rustaceans!

This is my first time posting here. Please tell me what you think about this suggestion. (Maybe it has already been considered in one form or another, but I couldn't find anything similar.)

The intent is primarily to enable referring to otherwise anonymous types (closures, futures) by means of a type alias, allowing us to use them for function return values and store them in structs, while also being broadly useful anywhere you get a composition of some more complex generic types.

Basic idea

There are many functions that compose generic types. When using the returned value in a local variable, the compiler is able to infer the return type and save us a lot of pain writing it out. Consider the following code (it's a bit silly, as there are far better ways to achieve the same thing, but let's roll with it for the sake of example):

use std::iter::once;
let foo = once("Hello")
    .chain(once(","))
    .chain(once("world"))
    .chain(once("!"));

My suggestion is to add a more explicit version of this feature that also introduces a type alias for the inferred type:

use std::iter::once;
let foo: type HelloWorldIter = once("Hello")
    .chain(once(","))
    .chain(once("world"))
    .chain(once("!"));

Now that we captured the inferred type in an alias, we can do stuff with it and e.g. use it as a struct member:

struct Foo {
    hello: HelloWorldIter,
}

Such type aliases would be confined to the current scope, so to use them globally, we would need to define them in a function return type instead.

Function return types

Currently, the compiler won't infer the return type of a function for us, so we're forced to write out the full type, which can get out of hand pretty fast with more complex generic types.

use std::iter::{once, Chain, Once};

// It's a bit annoying to write out this type,
// even before considering that you might
// want to change it later while refactoring...
type Chain2 = Chain<Once<&'static str>, Once<&'static str>>;
type Chain3 = Chain<Chain2, Once<&'static str>>;
type Chain4 = Chain<Chain3, Once<&'static str>>;
type HelloWorldIter = Chain4;

fn hello_world_iter_the_hard_way()
-> HelloWorldIter {
    once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!"))
} 

Sometimes we can return a boxed trait object instead, but this introduces unnecessary runtime overhead:

// This is easier to write, but introduces
// memory allocations and vtables
fn hello_world_iter_runtime_overhead()
-> Box<dyn Iterator<Item = &'static str>> {
    Box::new(once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!")))
}

Instead, we could let the compiler infer the return type and capture it in an alias that we can use elsewhere in the program:

fn hello_world_iter() -> type HelloWorldIter {
    once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!"))
}

If trait bounds are added, the compiler checks that the aliased type really implements them. I think this is also better than stating the exact return type, because it communicates the intent of the function much more clearly:

fn hello_world_iter()
-> type HelloWorldIter: Iterator<Item = &'static str> {
    once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!"))
} 
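
For comparison, here is a minimal sketch of the closest thing stable Rust offers today: return-position impl Trait. The compiler still infers the concrete type and nothing is boxed, but the type stays unnamed. The function name `hello_world_iter_opaque` is made up for this illustration.

```rust
use std::iter::once;

// The concrete Chain<...> type is inferred by the compiler and hidden
// behind the `Iterator` bound; no allocation or vtable is involved.
// Unlike the proposed `-> type HelloWorldIter`, the type has no name,
// so it cannot be written out as a struct field type elsewhere.
fn hello_world_iter_opaque() -> impl Iterator<Item = &'static str> {
    once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!"))
}
```

Callers can iterate over the result as usual, but they only ever see the `Iterator` interface.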

Generics and lifetimes

It should also work with generic types and lifetimes. The generic parameters get captured in a generic type alias:

fn foo1<'t, A>(a: &'t A) -> type RefOption<'t, A> {
    Some(a)
}

Which would be the same as:

type RefOption<'t, A> = Option<&'t A>;
fn foo1<'t, A>(a: &'t A) -> RefOption<'t, A> {
    Some(a)
}

Also, instead of capturing the complete return type, pattern matching would allow capturing generic parameters and allow multiple type aliases:

fn foo2(p: bool) -> Result<type A, type B> {
    if p {Ok("foo")} else {Err(1u32)}
}

This would be the same as writing:

type A = &'static str;
type B = u32;
fn foo2(p: bool) -> Result<A, B> {
    if p {Ok("foo")} else {Err(1u32)}
}

Async Functions

What I'm really after with this, though, are closures and the futures created by async functions. Their types currently have no name, so we can't refer to them explicitly, but the compiler infers their type when they are used as a local variable:

async fn async_hello<'t>(name: &'t str) -> [&'t str; 4] {
    ["Hello", ",", name, "!"]
}

async fn async_hello_world() -> [&'static str; 4] {
    // Compiler infers type here, so why can we have this
    // as a local variable, but are unable to use this as a
    // function return value or in a struct?
    let hello = async_hello("world");
    hello.await
}

Because of this limitation, if we want to return those, or store them in a struct, we are currently forced to use a boxed trait object instead, which results in runtime overhead:

use std::future::Future;

async fn async_hello<'t>(name: &'t str) -> [&'t str; 4] {
    ["Hello", ",", name, "!"]
}

fn hello_future_runtime_overhead<'t>(name: &'t str)
-> Box<dyn Future<Output = [&'t str; 4]> + 't> {
    // This memory allocation is completely unnecessary,
    // yet we are forced to do it.
    Box::new(async_hello(name))
}

Being able to create a type alias for those types would solve that problem. Having an alias to call it by, we can return it from the function by value and use it in the rest of our program:

use std::future::Future;

async fn async_hello<'t>(name: &'t str) -> [&'t str; 4] {
    ["Hello", ",", name, "!"]
}

fn hello_future<'t>(name: &'t str)
-> type HelloFuture<'t>: Future<Output = [&'t str; 4]> + 't {
    async_hello(name)
}

struct Foo<'t> {
    hello: HelloFuture<'t>,
}

Maybe there could also be some more syntactic sugar for async functions, making the hello_future function in the previous example unnecessary:

async fn hello<'t>(name: &'t str)
-> async type HelloFuture<'t> -> [&'t str; 4] {
    ["Hello", ",", name, "!"]
}
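
Until something like this exists, one workaround on stable Rust is to return impl Future and make the struct generic over the future type. This is only a sketch: `NoopWaker` and `block_on` are hypothetical helpers written here so the example runs without an async runtime, and the busy-polling loop is only suitable for futures that complete without waiting on anything.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

async fn async_hello<'t>(name: &'t str) -> [&'t str; 4] {
    ["Hello", ",", name, "!"]
}

// The future is returned by value, without boxing, but its type is unnamed.
fn hello_future<'t>(name: &'t str) -> impl Future<Output = [&'t str; 4]> + 't {
    async_hello(name)
}

// Since the future type cannot be named, the struct takes it as a
// generic parameter instead of using an alias like `HelloFuture<'t>`.
struct Foo<F> {
    hello: F,
}

// Hypothetical helper: a waker that does nothing when woken.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical helper: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The downside compared to a named alias is that the type parameter `F` leaks into every signature that mentions `Foo`.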

Closures

In the same way this would allow us to directly return a closure from a function, so instead of returning boxed trait objects

fn incrementing(mut a: u32) 
-> Box<dyn FnMut() -> u32> {
    Box::new(move || {a = a + 1; a})
}

fn incrementing_ref<'t>(a: &'t mut u32) 
-> Box<dyn FnMut() -> u32 + 't> {
    Box::new(move || {*a = *a + 1; *a})
}

we could return the closures directly:

fn incrementing(mut a: u32) 
-> type IncrementingClosure: FnMut() -> u32 {
    move || {a = a + 1; a}
}

fn incrementing_ref<'t>(a: &'t mut u32) 
-> type IncrementingRefClosure<'t>: FnMut() -> u32 + 't {
    move || {*a = *a + 1; *a}
}
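
For reference, a sketch of how the same two functions can be written on stable Rust with return-position impl Trait. The closures are returned by value without boxing; the difference to the proposal is only that the closure types get no alias. The `_opaque` suffixes are made up for this illustration.

```rust
// Returns the closure by value; its type is inferred and stays unnamed.
fn incrementing_opaque(mut a: u32) -> impl FnMut() -> u32 {
    move || {
        a += 1;
        a
    }
}

// The `+ 't` bound records that the closure borrows from `a`.
fn incrementing_ref_opaque<'t>(a: &'t mut u32) -> impl FnMut() -> u32 + 't {
    move || {
        *a += 1;
        *a
    }
}
```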

Public interface

For normal types, pub could just be added to make them part of the public interface:

fn foo() -> pub type Foo {1u32} 

The same rules apply as if the type alias were declared the usual way:

pub type Foo = u32;

With closures and async functions, there is a slight problem though. We cannot declare the type itself as public as we have not declared it in the first place and the compiler implicitly added it. Defining such types to always be public should work though, as there is no way to refer to them without our alias (which we can choose to be public or private as we please). So doing this should also just work:

fn incrementing(mut a: u32) 
-> pub type IncrementingClosure: FnMut() -> u32 {
    move || {a = a + 1; a}
}

Impls

struct Foo();
impl Foo {
    fn foo() -> type Associated {0u32}
}

This would add an associated type to struct Foo, as soon as RFC 195 is implemented. Until then it would just be invalid.

Traits

This could also be used to set the associated type of a trait:

struct Foo();
impl Iterator for Foo {
    fn next(&mut self) -> Option<type Item> {Some(0u32)}
}

would be equal to

struct Foo();
impl Iterator for Foo {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {Some(0u32)}
}

This sounds like https://github.com/rust-lang/rfcs/blob/master/text/2071-impl-trait-existential-types.md.

I believe the current version of this is "existential type aliases" (or whatever it was called) which is mildly more expressive because it allows for more unification points:

existential type T;
fn foo() -> T { .. }
fn bar() -> T { .. }

I believe this is in nightly; I suggest searching existing RFCs for it.

That said, I think your syntax is vastly superior, since I consider cross-function unification like this a bit gross. I have also separately proposed being able to write

fn foo() -> u32 { .. }
let x: foo::return = foo();  // x: u32

which, combined with impl Trait, provides a somewhat nicer version of what you want than -> type T: Trait. In particular, if you want to put parameters on the return type alias, you get them for free from the function: foo::<'a, T>::return.

Thanks! The list of RFCs is pretty long already, so I must have missed that one. It indeed looks similar in what it can do, and is probably a lot more thought through.

Yes, that's indeed achieving the same goals as my proposal. Good to know this already exists. I like the way it hides unnecessary implementation details, which my proposal doesn't, so I totally agree that it's more expressive. The only annoying thing I see with it is that it adds another keyword to the language.

The current tracking issue for "existential types" is https://github.com/rust-lang/rust/issues/63063, RFC 2515, "Permit impl Trait in type aliases".

The syntax used is type T = impl Trait; (so no existential positional keyword). It defines a type that exposes only the implemented trait, but has a name and can be used to unify multiple type locations.

I do actually like this syntax as sugar for existentials (the same way impl Trait in argument position is sugar for generics, or in return position is sugar for (unnamed) existentials), but it has a small issue that's solved by the existential syntax: how much of the type is actually exposed?

Given your one example,

type Chain2 = Chain<Once<&'static str>, Once<&'static str>>;
type Chain3 = Chain<Chain2, Once<&'static str>>;
type Chain4 = Chain<Chain3, Once<&'static str>>;
type HelloWorldIter = Chain4;

fn hello_world_iter_the_hard_way()
-> HelloWorldIter {
    once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!"))
} 

// sugared to

fn hello_world_iter() -> type HelloWorldIter {
    once("Hello")
        .chain(once(","))
        .chain(once("world"))
        .chain(once("!"))
}

would suggest that the entire information about the type is available. (In other words, it's just sugar for the type alias rather than for the existential type.) I think this is actually useful in some cases, but it creates a semver hazard. Even if you provide a bound that the compiler can enforce, the compiler has no way to help you avoid changing the concrete type in the future, which would be a semver-major change to your API.
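
To make the hazard concrete, here is a small sketch on stable Rust. All names in it (`PairIter`, `pair_transparent`, `downstream`, `pair_opaque_v1`, `pair_opaque_v2`) are made up for illustration, and return-position impl Trait stands in for the opaque case.

```rust
use std::iter::{once, Chain, Once};

// Transparent: the concrete type is part of the public signature.
pub type PairIter = Chain<Once<&'static str>, Once<&'static str>>;

pub fn pair_transparent() -> PairIter {
    once("Hello").chain(once("world"))
}

// Downstream code may legitimately spell out the concrete type, so
// changing how `pair_transparent` builds its iterator would break it.
pub fn downstream(it: Chain<Once<&'static str>, Once<&'static str>>) -> usize {
    it.count()
}

// Opaque: callers only see `Iterator`, so these two versions can be
// swapped for each other without changing the signature.
pub fn pair_opaque_v1() -> impl Iterator<Item = &'static str> {
    once("Hello").chain(once("world"))
}

pub fn pair_opaque_v2() -> impl Iterator<Item = &'static str> {
    vec!["Hello", "world"].into_iter()
}
```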

This is why existential types are useful: they allow you to hide the implementation details of what concrete type is used, only promising to fulfill some trait interface. This just comes at the cost of conditionally implementing traits.

Perhaps that can be recovered, though, via "partial existentials". Just as a small example,

// this is currently allowed
fn test1<A: Iterator<Item=&'static str>>(x: A) -> Chain<A, impl Iterator<Item=&'static str>> {
    x.chain(once(","))
}

// but this (with type_alias_impl_trait) isn't (yet)
type Chained<A> = Chain<A, impl Iterator<Item=&'static str>>;
fn test2<A: Iterator<Item=&'static str>>(x: A) -> Chained<A> {
    x.chain(once(","))
}

This could just be an oversight of the current implementation (I've raised this on the tracking issue) and I'm just rambling at this point so I'm done talking for now.
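
The stable half of that comparison can be tried directly. A sketch, with the imports added and a hypothetical caller `first_two` showing what "partial" means in practice: the outer Chain and its first parameter are visible to the caller, and only the second parameter is opaque.

```rust
use std::iter::{once, Chain, Once};

fn test1<A: Iterator<Item = &'static str>>(
    x: A,
) -> Chain<A, impl Iterator<Item = &'static str>> {
    x.chain(once(","))
}

// A caller can name the outer `Chain` and its first parameter; the
// opaque second parameter cannot be written, so `_` leaves it to inference.
fn first_two() -> Vec<&'static str> {
    let chained: Chain<Once<&'static str>, _> = test1(once("Hello"));
    chained.collect()
}
```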

The way I understand it, the "existential types" syntax allows the current module to know the actual type while hiding it from the rest of the program. That is usually the desired behavior, so I really prefer this over just my suggestion.

You're the second person who said they liked the syntax, so maybe it would still be nice to have this as a syntactic sugar for the simpler cases of both existential types and type aliases. Something like:

// sugar for type aliases, the trait bounds being optional, but enforced if present:
fn foo1() -> type IterFoo1: Iterator<Item = &'static str> { once("foo1") }

// sugar for existential types, the trait bounds are mandatory and the only
// information exposed about this type. Maybe it could even go further and
// hide the exact type from the current module.
fn foo2() -> type IterFoo2 impl Iterator<Item = &'static str> { once("foo2") }

It's interesting that fn name() -> type Name would be functionally equivalent to (at least part of) RFC#2524, "Permit _ in type aliases" (as a transparent type as opposed to opaque).

And I think it makes sense for the scoping rules to be smaller:

// This is existential within the scope of the module,
// and can have defining usages anywhere within the module:
type Iter1 = impl Iterator;
fn iter1() -> Iter1 { ... }

// This is an existential within the scope of the function,
// and is definitely defined by the return type of the function:
fn iter2() -> type Iter2: impl Iterator { ... }

Whether the tightened scope and more compact syntax are worth it, I don't know.

Additionally, given RFC#2524's "slight negative" response, I think the transparent alias version (as opposed to TAIT) is probably also undesirable. It's the same reason we don't allow, say, fn test() -> _ for return type inference and require full nominal typing of these declarative interfaces.

In fact, there was a change a few versions back to "accept" -> _ in the compiler for the purpose of diagnostics. If you write -> _, it will actually go through the full rustc compiler frontend with the return type as a type error, but will actually tell you exactly what type to fill in.

error[E0121]: the type placeholder `_` is not allowed within types on item signatures
 --> src/lib.rs:1:15
  |
1 | fn test1() -> _ {
  |               ^
  |               |
  |               not allowed in type signatures
  |               help: replace `_` with the correct return type: `i32`

Is there currently a way for a type, however defined, to be transparent within its module/crate but opaque outside? Or for the outside world to be aware of fewer of the traits it implements?

A struct with all non-pub members sort of does it but still?
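
A minimal sketch of that struct approach, assuming a made-up module `greeting` with a wrapper type `Greeting`: the module sees the concrete iterator type through a private alias, while the outside world only gets the traits implemented on the wrapper.

```rust
pub mod greeting {
    use std::iter::{once, Chain, Once};

    // Private alias: only this module can see the concrete iterator type.
    type Inner = Chain<Once<&'static str>, Once<&'static str>>;

    // Public wrapper with a private field. Outside the module, only the
    // traits implemented below are usable.
    pub struct Greeting(Inner);

    pub fn greeting() -> Greeting {
        Greeting(once("Hello").chain(once("world")))
    }

    // Forwarding impl: this is the only capability exposed to callers.
    impl Iterator for Greeting {
        type Item = &'static str;

        fn next(&mut self) -> Option<Self::Item> {
            self.0.next()
        }
    }
}
```

The cost is the boilerplate: every trait that should be visible outside needs its own forwarding impl.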

I suppose a related thread is having local scope inference variables. I.e.,

let iter: type MyIter = ...;
let size = size_of::<MyIter>();

It is unclear how useful this is in practice, though.
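
For what it's worth, something close to this can be approximated today by passing the value through a generic function, whose type parameter then names the inferred type. A sketch, with `size_of_inferred` and `closure_sizes` being made-up helper names:

```rust
use std::mem::size_of;

// The generic parameter `T` plays the role of the proposed local alias:
// inside this function, the caller's inferred type has a name.
fn size_of_inferred<T>(_value: &T) -> usize {
    size_of::<T>()
}

// Measures two closures whose types cannot be written out by hand.
fn closure_sizes() -> (usize, usize) {
    let captured = 7u64;
    let with_capture = move || captured + 1;
    let without_capture = || 1u64;
    (
        size_of_inferred(&with_capture),
        size_of_inferred(&without_capture),
    )
}
```

For the size case specifically, std::mem::size_of_val does the same job without a helper.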
