Syntax sugar for turbofish/generics

Turbofish can be tamed with syntax sugar for types that take a single type parameter. If we imagine nested types as a hierarchical structure (like a directory tree), we can use a backslash \ to chain the elements.

To understand how common a long turbofish is, I did an overview of the Bevy and Tokio codebases. They have a lot of >>> and >> sequences, and a limited number of >>>> turbofish "heads".

Here are some examples (including cases where the sugar is not effective), with and without the sugar applied:

From Bevy:

pub type RemovedIter<'a> = iter::Map<
    iter::Flatten\option::IntoIter\iter::Cloned\EventIterator<'a, RemovedComponentEntity>,
    fn(RemovedComponentEntity) -> Entity,
>;

// de-sugared
pub type RemovedIter<'a> = iter::Map<
    iter::Flatten<option::IntoIter<iter::Cloned<EventIterator<'a, RemovedComponentEntity>>>>,
    fn(RemovedComponentEntity) -> Entity,
>;

mut sysinfo: Local\Option\Arc\Mutex\System,
mut sysinfo: Local<Option<Arc<Mutex<System>>>>,

changed_by: MaybeLocation\ThinArrayPtr\UnsafeCell<&'static Location\'static>,
changed_by: MaybeLocation<ThinArrayPtr<UnsafeCell<&'static Location<'static>>>>,

let spawned: ConcurrentQueue\FallibleTask\Result<T, Box\(dyn core::any::Any + Send)>
let spawned: ConcurrentQueue<FallibleTask<Result<T, Box<(dyn core::any::Any + Send)>>>>

From Tokio:

#[derive(Clone)]
struct DropWaker(
    Arc\AtomicBool,
    Arc\Mutex\Vec\Pin\Box\tokio::time::Sleep,
);

// de-sugared
#[derive(Clone)]
struct DropWaker(
    Arc<AtomicBool>,
    Arc<Mutex<Vec<Pin<Box<tokio::time::Sleep>>>>>,
);

pub(crate) struct RcCell\T {
    inner: UnsafeCell\Option\Rc\T,
}

// de-sugared
pub(crate) struct RcCell<T> {
    inner: UnsafeCell<Option<Rc<T>>>,
}

Mutex\Option\VecDeque\task::Notified\Arc\Shared
Mutex<Option<VecDeque<task::Notified<Arc<Shared>>>>>

Poll\Result<OwnedPermit\T, PollSendError\T>
Poll<Result<OwnedPermit<T>, PollSendError<T>>>

type InnerFuture<'a, T> = ReusableBoxFuture<'a, Result<OwnedPermit\T, PollSendError\T>>;
type InnerFuture<'a, T> = ReusableBoxFuture<'a, Result<OwnedPermit<T>, PollSendError<T>>>;

assert_send_sync::<JoinHandle\std::cell::Cell\()>()
assert_send_sync::<JoinHandle<std::cell::Cell<()>>>()

async_assert_fn!(tokio::sync::OnceCell<NN>::get_or_try_init( _, fn() -> Pin\Box<dyn Future<Output = std::io::Result\NN>>): !Send & !Sync & !Unpin);
async_assert_fn!(tokio::sync::OnceCell<NN>::get_or_try_init( _, fn() -> Pin<Box<dyn Future<Output = std::io::Result<NN>>>>): !Send & !Sync & !Unpin);

Previous works

From: Uther
If you want that kind of syntax you will need another character for generics or you will need to change the macro invocation. For instance `Option`Rc`RefCell`Node` would seem pretty clean to me
Ref: Cleaner syntax for generics - #23 by Uther

Pros (subjective ofc):

  • Improves readability (everyone knows how directories work)
  • Removes the turbofish head

Cons (subjective):

  • Introduces another way to do the same thing,
  • No one cares about the turbofish head, we have modern editors to help us with the nesting of T's,
  • Doesn't make it significantly more concise,
  • Doesn't work with entities which accept two type-parameters.

Opinions?

1 Like

Probably not enough pros to justify the change.

https://lang-team.rust-lang.org/frequently-requested-changes.html#fundamental-changes-to-rust-syntax

12 Likes

I think you really like Windows paths :sweat_smile:

15 Likes

There's a fundamental problem with any attempt to remove the right-hand delimiter from generic notation: in combination with default type parameters, it means you can no longer tell the difference between T<U<A>,B> and T<U<A,B>>.
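A minimal sketch of why the closing delimiter carries information once default type parameters are involved. The types `T`, `U`, `A` and `B` below are hypothetical stand-ins matching the names in the paragraph above; both spellings are valid today and name different types.

```rust
use std::any::TypeId;
use std::marker::PhantomData;

// Hypothetical marker types, named after the ones in the post.
struct A;
struct B;

// Both generic types take two parameters, the second with a default.
#[allow(dead_code)]
struct U<X, Y = ()>(PhantomData<(X, Y)>);
#[allow(dead_code)]
struct T<X, Y = ()>(PhantomData<(X, Y)>);

// B is the second parameter of T; U falls back to its default.
type First = T<U<A>, B>;
// B is the second parameter of U; T falls back to its default.
type Second = T<U<A, B>>;

fn main() {
    // The two spellings differ only in where the `>` sits, yet the types differ.
    assert_ne!(TypeId::of::<First>(), TypeId::of::<Second>());

    // Spelled out with the defaults made explicit.
    assert_eq!(TypeId::of::<First>(), TypeId::of::<T<U<A, ()>, B>>());
    assert_eq!(TypeId::of::<Second>(), TypeId::of::<T<U<A, B>, ()>>());
}
```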

If we were going to change anything in this area, I think more reasonable changes would be

  • Take the turbo off the fish. size_of<u32>() instead of size_of::<u32>(). I know there are syntactic ambiguity reasons why we don't already do this but I bet the cases where the shorter construct is actually ambiguous are very rare.
  • Square brackets instead of angle brackets: Result[T, E]. Square brackets are underutilized in the language and are less visually obtrusive than angle brackets. This trades one set of syntactic ambiguities for another; we should probably make a list of all of them on both sides of the change before actually deciding to do it.
  • A keyword that lets you pre-declare type variables so you don't have to write <T> twice in impl<T> Trait for Struct<T>.

4 Likes
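For readers who have not seen the ambiguity mentioned in the first bullet, here is a small sketch of the usual example. The variable names and the helper `tuple_of_comparisons` are made up for illustration. In expression position, `a < b, c > (d)` is parsed today as two comparisons, which is the reading that `name<args>(...)` without `::` would compete with.

```rust
use std::mem::size_of;

// Hypothetical helper: returns the tuple that the ambiguous-looking
// expression evaluates to under today's grammar.
#[allow(unused_parens)]
fn tuple_of_comparisons() -> (bool, bool) {
    let (a, b, c, d) = (1, 2, 3, 4);
    // Today: a tuple of two comparisons, (a < b) and (c > d).
    // If generics were allowed here without `::`, the same tokens could also
    // be read as calling a generic function `a` with type arguments `b, c`.
    (a < b, c > (d))
}

fn main() {
    assert_eq!(tuple_of_comparisons(), (true, false));

    // The turbofish form is what disambiguates generic arguments in expressions.
    assert_eq!(size_of::<u32>(), 4);
}
```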

This seems likely to be a very subjective matter, and in particular one that'll vary based on people's mental lexers. Personally, given the strong association between [] and array indexing, when I encounter a language that uses [] for anything other than arrays/lists I find that visually confusing. (That is not meant to be an objective assessment, just an observation that different people will have different subjective assessments.)

The combination of that with the high level of justification needed to make a core syntactic change like this makes it seem unlikely to me.

I imagine we could probably do it over an edition, if we really wanted to and didn't mind ruling out the ambiguous cases. However, compiler folks have said they'd really like to avoid adding any more cases of edition-sensitive parsing than are absolutely necessary to support language features. It doesn't seem worth it for this case.

I can imagine a few other possible solutions for this, if there's energy to pursue them. For instance, we could pick a new, unambiguous syntax for postfix type ascription on expressions, which would also have the advantage of supporting .into(). By way of example but not meant for syntax bikeshedding (since the syntax is less important than the concept), .as(u32). (That doesn't solve all cases of turbofish, but it solves many.)
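To make the `.into()` case concrete, here is a sketch of the ways the target type can be named today. The `.as(u32)` form from the paragraph above is hypothetical and appears only in a comment; the function names are made up for illustration.

```rust
// 1. Annotate the binding, so inference picks the target of `.into()`.
fn widen_by_annotation(small: u16) -> u32 {
    let wide: u32 = small.into();
    wide
}

// 2. Name the target type up front through `From`.
fn widen_by_from(small: u16) -> u32 {
    u32::from(small)
}

// 3. Put a turbofish on the trait itself.
fn widen_by_turbofish(small: u16) -> u32 {
    Into::<u32>::into(small)
}

fn main() {
    let small: u16 = 7;
    assert_eq!(widen_by_annotation(small), 7);
    assert_eq!(widen_by_from(small), 7);
    assert_eq!(widen_by_turbofish(small), 7);

    // With the hypothetical postfix ascription this might read
    // `small.into().as(u32)`, which is not valid Rust today.
}
```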

Not that I am in favor of the proposal, but it’s described as sugar for the single-argument case. So T<U\A, B> in this proposal is unambiguously the first interpretation; the second can be written T\U<A, B> or just left as is.
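One possible reading of that single-argument rule, as a toy sketch operating on strings. The `desugar` function and its depth-tracking are my own assumption, not part of the proposal or of any compiler. A `\` opens a `<`, and the matching `>` is supplied at the next `,`, `>` or `)` at the same nesting depth, or at the end of the input.

```rust
/// Emit every `>` still owed at the current nesting depth.
fn close_pending(out: &mut String, pending: &mut [usize]) {
    let top = pending.last_mut().expect("depth stack is never empty");
    for _ in 0..*top {
        out.push('>');
    }
    *top = 0;
}

/// Hypothetical desugarer: rewrites `A\B\C` into `A<B<C>>`.
fn desugar(src: &str) -> String {
    let mut out = String::new();
    // pending[i] counts the `>` owed by backslashes seen at bracket depth i.
    let mut pending: Vec<usize> = vec![0];
    let mut prev = '\0';
    for ch in src.chars() {
        match ch {
            '\\' => {
                out.push('<');
                *pending.last_mut().expect("depth stack is never empty") += 1;
            }
            '<' | '(' => {
                out.push(ch);
                pending.push(0);
            }
            // The `>` of `->` in fn types is not a closing delimiter.
            '>' if prev == '-' => out.push(ch),
            '>' | ')' => {
                close_pending(&mut out, &mut pending);
                if pending.len() > 1 {
                    pending.pop();
                }
                out.push(ch);
            }
            ',' => {
                close_pending(&mut out, &mut pending);
                out.push(ch);
            }
            _ => out.push(ch),
        }
        prev = ch;
    }
    close_pending(&mut out, &mut pending);
    out
}

fn main() {
    // The two readings discussed above.
    assert_eq!(desugar(r"T<U\A, B>"), "T<U<A>, B>");
    assert_eq!(desugar(r"T\U<A, B>"), "T<U<A, B>>");
    // Two examples from the original post.
    assert_eq!(
        desugar(r"Local\Option\Arc\Mutex\System"),
        "Local<Option<Arc<Mutex<System>>>>"
    );
    assert_eq!(
        desugar(r"Poll\Result<OwnedPermit\T, PollSendError\T>"),
        "Poll<Result<OwnedPermit<T>, PollSendError<T>>>"
    );
}
```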

4 Likes

impl<T> Trait for Struct<T> and impl Trait for Struct<T> mean different things if you have a type T in scope.
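A short sketch of that difference. The traits `ForAll` and `ForOne` are hypothetical, and two separate traits are used only so that the impls do not overlap.

```rust
// A concrete type that happens to be named T.
struct T;

#[allow(dead_code)]
struct Struct<X>(X);

trait ForAll {
    fn name() -> &'static str;
}

trait ForOne {
    fn name() -> &'static str;
}

// Here T is a fresh type parameter; it shadows the struct T above,
// so this impl covers every Struct<_>.
impl<T> ForAll for Struct<T> {
    fn name() -> &'static str {
        "every Struct<_>"
    }
}

// No parameter list: T resolves to the concrete struct T,
// so this impl covers exactly one type.
impl ForOne for Struct<T> {
    fn name() -> &'static str {
        "only Struct<T>"
    }
}

fn main() {
    assert_eq!(<Struct<u8> as ForAll>::name(), "every Struct<_>");
    assert_eq!(<Struct<T> as ForAll>::name(), "every Struct<_>");
    assert_eq!(<Struct<T> as ForOne>::name(), "only Struct<T>");
    // `<Struct<u8> as ForOne>::name()` would not compile:
    // ForOne is implemented for Struct<T> only.
}
```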

I interpreted the post to say that you could do something like

impl Trait for Struct<fresh T>

to declare a named param at the point of first use.

That said, despite having an RFC for it -- 2115-argument-lifetimes - The Rust RFC Book -- we unaccepted the "use lifetimes without needing a separate declaration" feature, so I don't know how much appetite there'd be for something like this.

More likely would be some kind of type elision or impl trait thing instead, I think. If the type doesn't need a bound and you don't need to mention it again (just Self is enough), maybe it'd be fine for impl Trait for Struct<_> to just work, like how impl Trait for Struct<'_> just works. For example, impl Default for Option<_> { fn default() -> Self { None } } seems fine?
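For comparison, a sketch of what compiles today. The types `Borrowed`, `Wrapper` and `NotDefault` are hypothetical. The lifetime can already be elided with `'_`, while the type parameter still has to be declared and named even though the body never uses the name.

```rust
struct Borrowed<'a>(&'a str);

// Works today: the lifetime is elided with '_.
impl Default for Borrowed<'_> {
    fn default() -> Self {
        Borrowed("")
    }
}

struct Wrapper<T>(Option<T>);

// Today's spelling of what `impl Default for Wrapper<_>` might mean:
// T is declared once and mentioned once, with no bound.
impl<T> Default for Wrapper<T> {
    fn default() -> Self {
        Wrapper(None)
    }
}

// A type without a Default impl, to show that no bound on T is involved.
struct NotDefault;

fn main() {
    assert_eq!(Borrowed::default().0, "");
    assert!(Wrapper::<NotDefault>::default().0.is_none());
}
```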

5 Likes

In a similar vein to accepting _, I hope at some point we get to use impl Trait in impl-generic-arguments:

impl Default for Wrap<impl Default> {
    fn default() -> Self {
        Self(<_>::default())
    }
}

Then the only time you need to name the parameter is when you actually need to use that name.

4 Likes

That's not quite what I had in mind; I was imagining something more like Python's TypeVar. The point would not just be to shorten individual impl block heads, but to factor out repeated trait bounds.

A simple, real example from my own code:

trait DmOp: Sized { ... }

struct Request<Op: DmOp> { ... }
struct Response<Op: DmOp> { ... }

impl<Op: DmOp> Request<Op> { ... }
impl<Op: DmOp> Response<Op> { ... }

unsafe impl<Op: DmOp> rustix::ioctl::Ioctl for Request<Op> { ... }

could become something like

trait DmOp: Sized { ... }
typevar Op: DmOp;

struct Request<Op> { ... }
struct Response<Op> { ... }

impl Request<Op> { ... }
impl Response<Op> { ... }

unsafe impl rustix::ioctl::Ioctl for Request<Op> { ... }

I imagine this would be handy any time there's a complex trait bound that needs to be written in many places.

(Note: This is meant to be strictly shorthand - the semantics of the second example are to be exactly the same as the semantics of the first example. I brought this concept up once before and the discussion went straight into the weeds because people thought I was proposing something with semantic implications.)

While I like the idea of an unambiguous postfix type ascription notation, it doesn't help with size_of.

I have the size_of case on the brain because I have been writing code that needs to use size_of and align_of a whole lot. size_of is a generic function where you always have to supply the generic parameter explicitly, and there are no normal arguments. To me, the turbofish feels extra annoying in the context of such functions, because it's not just an escape hatch for them, it's the only way to use them. Compare size_of::<c_int>() with sizeof(int) -- four extra characters to type (six counting the underscores) and all of them require the shift key.

Both the "nix the turbine" and the "square brackets instead of angle brackets" suggestions are motivated by this situation. size_of[c_int]() is quite a bit easier to type on my keyboard, and size_of[c_int] would be even smoother, if we could manage it.

Yeah, size_of and align_of are the rare case where you have to turbofish, and code that uses them often uses them a lot.

Personally, I sometimes wish that we could define a type-valued function (size_of(u32)), and in the absence of that, I think size_of!(u32) might be worth defining.
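A user-level version of such a macro is easy to sketch with `macro_rules!`. The macro below is my own illustration, not something the standard library provides. It forwards to the existing function, so the turbofish is written once in the macro body rather than at every call site.

```rust
// Hypothetical `size_of!` macro wrapping the standard function.
// Macros live in a separate namespace, so the name does not clash
// with the `size_of` function.
macro_rules! size_of {
    ($t:ty) => {
        ::core::mem::size_of::<$t>()
    };
}

fn main() {
    assert_eq!(size_of!(u32), 4);
    assert_eq!(size_of!([u8; 3]), 3);
    // Nested generics need no turbofish at the call site either.
    assert_eq!(size_of!(Option<Box<u8>>), size_of!(Box<u8>));
}
```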

Just for context, this thread has a long-ish history of attempts at removing the turbofish (the test itself was moved here).

1 Like

TBH, I think the better answer there is to just deprecate size_of. It's a legacy of back when we didn't have associated constants on traits; today there's no reason for it to be a function.

T::SIZE_IN_BYTES doesn't need a turbofish ever.

2 Likes

T::SIZE_IN_BYTES is not namespaced. Would it work with a trait in prelude (with appropriate fallback to avoid clobbering user-defined constants)?

Otherwise it may need to be <T as std::mem::SizeOf>::SIZE_IN_BYTES, which isn't pretty.
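A sketch of how this could look as a user-defined trait. The trait name `SizeOf` is taken from the path above and is hypothetical; no such trait exists in `std::mem`. With a blanket impl in scope, `T::SIZE_IN_BYTES` needs no turbofish, and a user type that already defines its own inherent constant of the same name keeps it, because inherent associated items take precedence over trait ones.

```rust
// Hypothetical trait; not part of the standard library.
trait SizeOf {
    const SIZE_IN_BYTES: usize;
}

// Blanket impl for every sized type.
impl<T> SizeOf for T {
    const SIZE_IN_BYTES: usize = core::mem::size_of::<T>();
}

// A user type that already has a constant with the same name.
struct Frame;

impl Frame {
    const SIZE_IN_BYTES: usize = 1500;
}

// In generic code the constant is reachable without a turbofish.
fn bytes_of<T>() -> usize {
    T::SIZE_IN_BYTES
}

fn main() {
    assert_eq!(u32::SIZE_IN_BYTES, 4);
    assert_eq!(bytes_of::<[u16; 4]>(), 8);

    // The inherent constant wins over the trait constant...
    assert_eq!(Frame::SIZE_IN_BYTES, 1500);
    // ...so reaching the trait's value needs the fully qualified form.
    assert_eq!(<Frame as SizeOf>::SIZE_IN_BYTES, 0);
}
```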

PHP has a similar syntax, but for a different thing – it uses \ for namespaces like Rust uses ::. I've used PHP for a while, so the proposed syntax looks wrong to me.

I don't think the <> syntax needs fixing. Rust doesn't have a problem with parsing >>. Error messages are pretty good at offering an autofix for a missing >.

Turbofish is confusing because <> has two syntaxes. Adding a third syntax, without removing anything, won't make it any less confusing.

7 Likes

Honestly I’m surprised size_of and friends are in the prelude at all. They’re not very useful if you stay in the realm of strong types, and they wouldn’t be hard to import when you need them. A trait you have to import would be fine, in theory.

Dunno. You could always add it as a constraint on the generic to bring it in scope when you need it.

Or use <LayoutInfo<T>>::bytes() or something instead, if people are really naming things SIZE_IN_BYTES often enough to cause a problem.

This might be a good reason for the special syntax you proposed for things like enum discriminants.

Either that, or some kind of "low-priority trait" mechanism.

Also, having a mechanism that makes conflicts less of a problem would mean this could be named something shorter, like SIZE rather than SIZE_IN_BYTES. (We don't write size_of_in_bytes, it's assumed to be bytes.)

I'll note that a hardware description language library we've been writing at libre-chip.org, fayalite, uses syntax like Array[AType[SomeOtherType[UInt[3]][Bool]]][my_vec.len()] as a runtime equivalent of Array<AType<SomeOtherType<UInt<3>, Bool>>, { my_vec.len() }> since different parts of it may not be known till runtime, e.g. my_vec.

a real example: fayalite/crates/fayalite/src/util/ready_valid.rs at cdd84953d076cd9a83db50e9d11a63b6181b0976 - libre-chip/fayalite - Libre-Chip.org

I would prefer a space over \, so you can write Option i32 instead of Option<i32>, or Arc Mutex T instead of Arc<Mutex<T>>. But I agree that this is not necessary; the current syntax is fine.

1 Like