Cleaner syntax for generics

userxfce · January 4, 2023, 5:02pm

Reason: cleaner syntax

Reference: template syntax in the D programming language (Templates - D Programming Language)

Proposal: adopt the instantiation (or even definition) of generics with "name!(type1,type2)" and make the parentheses optional when there is a single type, that is, allow "name!type1" instead of "name!(type1)"

Quick example 1: "to!u32(variable)" would be allowed, for example to convert/cast, but also to!any_type

Quick example 2: "type Link<T> = Option<Rc<RefCell<Node<T>>>>;" would become "type Link<T> = Option!Rc!RefCell!Node<T>;"

A more elaborate example, transformed from Generics - Rust By Example:

struct A;
struct Single(A);
struct SingleGen<T>(T); //or struct SingleGen!(T)(T);
fn main() {
    let _s = Single(A);
    let _char: SingleGen!char = SingleGen('a'); //or SingleGen!(char), explicit type
    let _t    = SingleGen(A); // Uses `A` defined at the top. //implicit type (no change)
    let _i32  = SingleGen(6); // Uses `i32`. //implicit type (no change)
    let _char = SingleGen('a'); // Uses `char`. //implicit type (no change)
}
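For comparison, here is roughly what the two examples look like in the syntax that compiles today. This is only a sketch: `Node` and `singleton` are made-up helpers added so the `Link<T>` alias from quick example 2 has something to point at.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct SingleGen<T>(T);

// Hypothetical node type, added only so that `Link<T>` is usable.
struct Node<T> {
    value: T,
    next: Link<T>,
}

// Quick example 2 in today's syntax: every nesting level needs its own `<...>` pair.
type Link<T> = Option<Rc<RefCell<Node<T>>>>;

// Hypothetical helper that builds a one-element list.
fn singleton<T>(value: T) -> Link<T> {
    Some(Rc::new(RefCell::new(Node { value, next: None })))
}

fn main() {
    // Explicit type annotation, written with angle brackets.
    let c: SingleGen<char> = SingleGen('a');
    println!("{}", c.0);

    let link = singleton(7u32);
    let node = link.as_ref().unwrap().borrow();
    println!("{} {}", node.value, node.next.is_none());
}
```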

Cons: possible conflict with macro syntax

Pros: the macro and generics concepts are somewhat related

steffahn · January 4, 2023, 5:21pm

I don't think it's a realistic proposal to fundamentally change the syntax of Rust generics. I don't particularly love the syntax we have right now either, in particular since <> are no actual parentheses so that editors, error messages, and macros have a harder time or worse support for handling them than for ()/[]/{}. But still, everyone is already used to it, and there's lots of existing code, so in a cost vs. benefits calculation for such a widespread yet mostly aesthetic and only syntactic change, the costs will always come out on top.

CAD97 · January 4, 2023, 5:56pm

Fundamental syntax changes like this aren't going to happen. Even if done over an edition and transparently interoperable with “syntax 1,” it introduces an incompatible breaking “syntax 2” that has all of the social drawbacks of a separate “Rust 2.0” language with none of the benefits.

“Rust 2.0” would be a softer migration than Python 2 to Python 3, because I expect the only way it could ever happen is as a transparently both-ways compatible successor language (like how bidirectional interop between Java and other JVM languages like Kotlin functions, it'd share the same “Rust VM” specification), but it's still a giant migration that is not happening as part of the rust-lang project.

Even if transparent cross-edition interop maintains the letter of the 1.0 promise and of edition interop, it breaks the spirit of the promise. A key property of editions is that you can write code which is valid in any edition without losing semantic possibilities; it's just a bit more cumbersome. This is analogous to how you can write “fully elaborated” code which is resilient to the “minor breaking changes” allowed by e.g. the inference breakage of introducing new functions.

The fact that we're considering making inference breakage more difficult to run into via version-specification-sensitive name lookup should be enough to indicate that we want to make that “fully elaborated edition independent” dialect a user-invisible quirk of making cargo fix --edition function. A “syntax 2.0” migration breaks this ideal of never issuing breaking changes to the syntax and semantics of the Rust language.

The “edition2015 dialect” is minimally different from the “rolling stable MSRV dialect” or even the “stable MSRV fully elaborated edition agnostic dialect,” and a “syntax 2.0” dialect breaks that.

TL;DR: a drastic syntax change breaks the spirit of the 1.0 stability guarantee, even if it can be made to follow the letter of the guarantee via edition migration. This is such a change.

Musing on motivations for a “syntax 2.0” successor language

If anyone ever provides a “syntax 2.0”, it should be done via some compile-to-Rust or compile-to-stable-MIR separate successor language, a la Kotlin for Java or Carbon for C++.

Java was released in 1995; Kotlin 1.0 in 2016. Python 2 in 2000, Python 3 in 2008. Rust 1.0 was released 2015; I personally think nobody should realistically be trying to push a “Rust syntax 2.0” until 2035 (i.e. “syntax 2” has 20 years of PL design theory on “syntax 1”) matching the Java/Kotlin timeline. By the Python 3 timeline, edition2024 could introduce “syntax 2,” but we all know how poorly the Python 3 migration went. Bolstering the 20 year vibe for successor languages/syntaxes, Python 2 formally went EOL in 2020.

But also, such a successor language is only concretely useful if it extends (and/or removes deprecated footguns of) the type system in some fashion, like how Kotlin introduces nonnullable types on top of the JVM, or Swift migrates from Objective C. Just doing a “syntax 2” offers absolutely no benefit, since everyone now needs to be able to work with both “syntax 1” and “syntax 2.”

To this point, calling the successor language “Rust 2.0” is actively misleading. While the “syntax 2” language shares semantics with “Rust 1.0,” it's still a different language to learn, similar to migrating between Java/Kotlin, C++/Carbon, or even Javascript/Typescript as perhaps the most direct analog^[1].

Thus, like Carbon, Kotlin, and Typescript, “Rust 2.0” should never exist. A “spiritual Rust 2.0” successor language may have reason to exist by 2035, but it should have a separate name. My codename were I to work on such a project would be Patina. Feel free to figure out the meaning behind the name; it's relatively clever and not nonobvious. I'm not pursuing such a project and am not going to, though I did do some very minor (like, one afternoon) experimental exploration of what a “std 2.0” design could look like^[2] under that project codename.

[1] There are plenty of people who argue strongly that using the Typescript compiler to validate annotated Javascript is superior to using Typescript, in part because of the semantic weight of introducing a new “syntax 2.0” language. It's perhaps somewhat mitigable if the same process maintains/evolves “syntax 1.0” and “syntax 2.0” to avoid the issues where the Typescript compiler lags behind the Javascript evolution process, or where Javascript has assigned some meaning to syntax which already had different meaning in Typescript, breaking the strict superset quality, which is impossible to maintain without cooperation with maintaining the negative space in “syntax 1.” But the analogy for the difficulties exists, even if it's not perfect.

[2] In short, the perfect version of the portability lint and granular versioning of the standard distribution as separate subcrates you can use and declare manifest dependencies on. It's a significant amount of facade work to make it so “std::sync::mpsc version 1.0” and “std::sync::mpsc version 2.0” can share the same underlying implementation and both be used in the same compilation without undue duplication just to fix the known suboptimalities of the current std::sync::mpsc API, but for the core distribution, it might be worth it.

jdahlstrom · January 4, 2023, 11:49pm

Never mind the fact that identifier!(params) already has a well-established meaning in Rust: macro invocation.
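To make the clash concrete, here is a small sketch showing that `Name!(Type)` in type position already parses, and means "expand the macro `Name`". The macro below is made up for illustration; it only works alongside the struct because macros have their own namespace.

```rust
struct SingleGen<T>(T);

// A macro may share its name with a type, since macros are looked up separately.
// This one simply expands `SingleGen!(X)` to the type `SingleGen<X>`.
macro_rules! SingleGen {
    ($t:ty) => { SingleGen<$t> };
}

// `SingleGen!(char)` here is a macro invocation in type position, valid today.
fn wrap_char(c: char) -> SingleGen!(char) {
    SingleGen(c)
}

fn main() {
    let s: SingleGen!(char) = wrap_char('a');
    println!("{}", s.0);
}
```

So the parenthesized form of the proposal would have to change the meaning of syntax that existing code can already use. The paren-less `SingleGen!char` form is not a valid macro invocation today, since an invocation needs a delimiter after the `!`.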

zackw · January 5, 2023, 7:18am

I have occasionally thought about designing a language (also called Patina, heh) that would be precisely "Rust, but with all the syntactic papercuts (in my incontrovertibly correct opinion) corrected, no other changes whatsoever." Partially because I think the exercise of enumerating all those papercuts and determining how to fix them would be interesting, and partially to put a stake in the ground and say "yes, this is a useful thing to be doing all by itself."

I think career programming language designers get into a headspace where the syntax becomes this uninteresting surface thing, which is true in a mathematical sense (there is a morphism from any concrete syntax whatsoever to S-expression trees). But from an ergonomics perspective, the surface is the means by which we grip the deeper structure, and if it's got rough edges, that's frustrating on a day-to-day basis, in a way that deeper flaws often aren't -- no flow typing? OK, I guess there's stuff we just can't do, then, and you stop thinking about it.

newpavlov · January 5, 2023, 7:47am

One potential backwards-compatible change for generic functions is to allow passing of type and const arguments inside (..):

// This:
fn foo<const N: usize, T: Bar>(a: &str) -> T { ... }
let r1 = foo::<42, u8>("foo");
let r2: u8 = foo::<42, _>("foo");

// Optionally becomes:
fn foo(type T: Bar, a: &str, const N: usize) -> T { ... }
let r1 = foo(u8, "foo", 42);
let r2: u8 = foo(_, "foo", 42);

// You can mix both ways (e.g. if a generic type gets usually
// resolved from context):
fn foo<T: Bar>(a: T, const N: usize) { ... }
foo(1u32, 42);

Yes, we lose the clear distinction between compile and run time parameters, but arguably it's fine since all required parameters still get statically enforced. Unfortunately, it's quite unlikely we will get such extension.
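For anyone who wants to try the "This:" half of the comparison, here is a runnable sketch of it. The trait `Bar` and its `build` method are placeholders made up for the example; only the turbofish calls mirror the snippet above. The `(..)` form is a proposal and does not compile.

```rust
// Placeholder trait so that `foo` has something to call.
trait Bar {
    fn build(len: usize, text: &str) -> Self;
}

impl Bar for u8 {
    fn build(len: usize, text: &str) -> Self {
        (len + text.len()) as u8
    }
}

// Const and type parameters both go in the angle brackets today.
fn foo<const N: usize, T: Bar>(a: &str) -> T {
    T::build(N, a)
}

fn main() {
    // Fully explicit turbofish.
    let r1 = foo::<42, u8>("foo");
    // `_` lets the type be inferred from the annotation on `r2`.
    let r2: u8 = foo::<42, _>("foo");
    println!("{r1} {r2}");
}
```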

afetisov · January 5, 2023, 10:59am

This can't work, because function parameters are never inferred, but type generic (and in the future possibly const generic) parameters are subject to inference.

That's purely subjective. From my perspective, the syntax of Rust is very carefully designed and quite pretty. It borrows heavily from the traditions of C++ and OCaml. If you are well familiar with those languages, Rust feels very familiar as well, and it has many of the syntactic warts of those languages fixed. Of course, if you consider C++ a token soup, you'd be unlikely to agree.

scottmcm · January 5, 2023, 4:52pm

This is the core of the issue to me.

Is there a better syntax than the current Rust one? Probably.

But look at how much noise even tiny deviations -- like ! instead of ~ -- from the C++ usual end up causing.

Rust being syntactically boring to a C++ programmer is part of its value proposition. Small changes away from that, even if arguably better in isolation -- I'm no fan of <> generics and their consequences -- make Rust worse overall.

A big rethinking of all the syntax could work, but probably won't ever happen. (No matter how much I'd like to fix struct initializers to use = instead of :.)

newpavlov · January 5, 2023, 5:08pm

What exactly can not work? The following code works today without any issues:

fn foo<T: core::fmt::Debug>(a: T) {
    println!("{:?}", a);
}

foo(1u32);

afetisov · January 5, 2023, 7:15pm

This works today, thanks to bidirectional type inference:

fn foo<T>() -> T { todo!() }
fn bar<T>(_: T) {}

let x = foo();
bar(&x);
let y: u32 = x;
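A variant of the snippet above that can actually be run, as a sketch: `todo!()` is swapped for `T::default()` (which adds a `Default` bound the original does not have) and the statements are wrapped in a function.

```rust
// `T` is not mentioned in the arguments, so it must come from the surrounding context.
fn foo<T: Default>() -> T {
    T::default()
}

fn bar<T>(_: T) {}

fn demo() -> u32 {
    let x = foo(); // the type of `x` is still unknown here
    bar(&x);       // still unknown: `bar` accepts anything
    let y: u32 = x; // this annotation finally fixes `T = u32` for the call to `foo`
    y
}

fn main() {
    println!("{}", demo());
}
```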

If const generic inference is implemented, this will also work:

fn foo<T, const N: usize>() -> [T; N] { todo!() }
fn bar<T, const N: usize>(_: &[T; N]) {}

let x = foo();
bar(&x);
let y: [u32; 5] = x;

But this can't work:

fn foo<T>(const N: usize) -> [T; N] { todo!() }
fn bar<T>(_: &[T; N], const N: usize) {}

let x = foo(_);
bar(&x, _);
let y: [u32; 5] = x;

In principle, if we just take it as a syntax sugar for the current generic parameters, it's not an issue. But the use site is too similar to runtime parameters, and nothing like inference exists for runtime values, so it would cause too much confusion.

// Do we infer the value of `N`? Who knows! 
// Maybe it's just an unconventionally named variable.
foo(x, N);
// Maybe we require explicit `const` prefix?
// But this way we get even more use site boilerplate.
foo(x, const N);
// Or this way:
foo(x, const { N });

Ooooh, that's one of my favourite to hate things. The worst part is that I see absolutely no reason for that syntactic deviation. It doesn't match type ascription, or struct patterns. It has no precedent in C/C++, or ML. Just who the hell thought to do it this way, and why?

newpavlov · January 5, 2023, 7:43pm

This equivalent code again works without issues:

fn foo<const N: usize, T>() -> [T; N] { todo!() }
fn bar<const N: usize, T>(_: &[T; N]) {}

let x = foo();
bar(&x);
let y: [u32; 5] = x;
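Here is a runnable variant of the same thing, as a sketch. `todo!()` is replaced with a real body, which needs `Default + Copy` bounds that the snippet above does not have, and `bar` returns `N` so that the inferred length can be observed.

```rust
fn foo<const N: usize, T: Default + Copy>() -> [T; N] {
    [T::default(); N]
}

// Returns the inferred length so it can be checked.
fn bar<const N: usize, T>(_: &[T; N]) -> usize {
    N
}

fn demo() -> (usize, [u32; 5]) {
    let x = foo();       // neither `T` nor `N` is known yet
    let n = bar(&x);     // same `N` as `foo`, still not written anywhere
    let y: [u32; 5] = x; // fixes `T = u32` and `N = 5` for both calls
    (n, y)
}

fn main() {
    let (n, y) = demo();
    println!("{n} {y:?}");
}
```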

Also note that in my proposal _ works only for types and I am not 100% sure it should be allowed in the first place.

"This can't work" and "it may cause confusion" are two very different things.

Do we infer the value of N? Who knows!

No, if N is provided, then it must exist, so no inference is involved. You can not use "an unconventionally named variable" in place of a constant argument, it will be a compile-time error.

I don't think we need explicit const prefix at call sites in the same way as we do not need type annotations for variables (but const { .. } may be needed, if we want to evaluate a constant at call site). IDEs may hint it, but it's not a required feature for using such functions.

CAD97 · January 5, 2023, 9:44pm

To be extremely annoying, Rust actually does already stably support const function arguments in an extremely limited fashion. For the specific case of architecture vendor intrinsics, where the C intrinsic is defined to take a constant/literal value as an argument, the Rust version of the intrinsic also takes the constant argument as a parenthesized argument. As a random example, x86_64's _mm_sha1rnds4_epu32 (Intel's documentation), currently (1.66) defined in core as

#[allow(improper_ctypes)]
extern "C" {
    #[link_name = "llvm.x86.sha1rnds4"]
    fn sha1rnds4(a: i32x4, b: i32x4, c: i8) -> i32x4;
}

#[inline]
#[target_feature(enable = "sha")]
#[cfg_attr(test, assert_instr(sha1rnds4, FUNC = 0))]
#[rustc_legacy_const_generics(2)]
#[stable(feature = "simd_x86", since = "1.27.0")]
pub unsafe fn _mm_sha1rnds4_epu32<const FUNC: i32>(a: __m128i, b: __m128i) -> __m128i {
    static_assert_imm2!(FUNC);
    transmute(sha1rnds4(a.as_i32x4(), b.as_i32x4(), FUNC as i8))
}

and shows up in rustdoc as

pub unsafe fn _mm_sha1rnds4_epu32(
    a: __m128i,
    b: __m128i,
    const FUNC: i32
) -> __m128i

For historical information, the intrinsics (including the const-taking ones) were first made stable in 1.27; proper const generics were only stabilized in 1.51.

In 1.27, it was defined as follows (with significantly more magic):

#[inline]
#[target_feature(enable = "sha")]
#[cfg_attr(test, assert_instr(sha1rnds4, func = 0))]
#[rustc_args_required_const(2)]
#[stable(feature = "simd_x86", since = "1.27.0")]
pub unsafe fn _mm_sha1rnds4_epu32(
    a: __m128i, b: __m128i, func: i32
) -> __m128i {
    let a = a.as_i32x4();
    let b = b.as_i32x4();
    macro_rules! call {
        ($imm2:expr) => {
            sha1rnds4(a, b, $imm2)
        };
    }
    let ret = constify_imm2!(func, call);
    mem::transmute(ret)
}
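Outside the standard library, only the const-generic form is available (as far as I know the `rustc_legacy_const_generics` attribute is internal). Here is a platform-independent sketch of the same pattern. `Imm2` and `rounds` are made-up names, and the compile-time assertion only approximates what `static_assert_imm2!` checks; it is not the actual macro.

```rust
// Helper type carrying a compile-time range check for a 2-bit immediate.
#[allow(dead_code)]
struct Imm2<const V: i32>;

impl<const V: i32> Imm2<V> {
    // Evaluated at compile time for each `V` that is actually used.
    const VALID: () = assert!(V >= 0 && V < 4, "immediate must fit in 2 bits");
}

// Stand-in for an intrinsic that takes an immediate: the constant is a
// generic argument, not a value argument.
fn rounds<const FUNC: i32>(a: u32, b: u32) -> u32 {
    // Referencing the constant forces the check for this `FUNC`.
    let () = Imm2::<FUNC>::VALID;
    a.wrapping_add(b).rotate_left(FUNC as u32)
}

fn main() {
    println!("{}", rounds::<1>(1, 2));
    // `rounds::<7>(1, 2)` would fail to compile because of the assertion.
}
```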

scottmcm · January 6, 2023, 1:59am

'Twas before my time, so I don't know confidently, but AFAIK the idea is that use matches definition -- so because it's defined with x: i32, it's used with x: 4 in expressions and x: mybinding in patterns.
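Spelled out as a small sketch (the `Point` type is just an example), the same `field: something` shape shows up in all three places:

```rust
// Definition: `field: Type`.
struct Point {
    x: i32,
    y: i32,
}

fn swap(p: Point) -> Point {
    // Pattern: `field: binding`.
    let Point { x: a, y: b } = p;
    // Expression: `field: value`.
    Point { x: b, y: a }
}

fn main() {
    let q = swap(Point { x: 4, y: 2 });
    println!("{} {}", q.x, q.y);
}
```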

zackw · January 6, 2023, 5:27am

Yeah, see, if I was actually going to do this hypothetical "Patina" language I would be arguing exactly the opposite, that language designers not only can, but ought to, diverge from the "C++ usual" when the C++ usual is objectively bad. Yes, it might take some getting used to, but why should we have to live with syntactic mistakes from the 1970s forever, just because the current generation of programmers are basically used to them?

The first five items on my list of Things What Should Change, just for concreteness, are:

The logical operators (&& and ||) should have equal precedence and it should be an error to mix them without parenthesizing.
Same for binary * / << and >> (multiplication, division, shift): equal precedence, must parenthesize to mix.
And the same again for & | ^ and +.
Unary * (dereference) should be postfix.
; should be a statement terminator, not a statement separator. You should have to write (); at the end of a block that's supposed to return (), unless that block is completely empty (see Better help message for "Mismatched types" [E0308] when the issue is function implicitly returning `()` · Issue #104739 · rust-lang/rust · GitHub for one concrete reason why this would be better).
Fields, types, and symbols should not have separate namespaces, although they should continue to be scoped by their container. (This one may make more sense if I describe it as "the set of name resolution changes that would make it possible to fold the :: operator into the . operator.")
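For reference, here is how today's Rust silently resolves the kind of mixes that the first three items would turn into errors. This is a sketch; the function names are made up.

```rust
// `&&` binds tighter than `||`, so this is `a || (b && c)`.
fn logic(a: bool, b: bool, c: bool) -> bool {
    a || b && c
}

// `*` binds tighter than `<<`, so this is `a << (b * c)`.
fn shift_mul(a: u32, b: u32, c: u32) -> u32 {
    a << b * c
}

// `+` binds tighter than `&`, so this is `a & (b + c)`.
fn and_add(a: u32, b: u32, c: u32) -> u32 {
    a & b + c
}

fn main() {
    println!("{}", logic(true, false, false)); // the other grouping would give false
    println!("{}", shift_mul(1, 2, 3));        // (1 << 2) * 3 would be 12
    println!("{}", and_add(6, 3, 1));          // (6 & 3) + 1 would be 3
}
```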

(Mods: can we split this to its own thread maybe? Although I've said all I have to say about it at this point.)

mathstuf · January 6, 2023, 1:53pm

Can we get logical xor as well while we're at it? ^^

steffahn · January 6, 2023, 4:21pm

The “bitwise” operators work fine for boolean values, too. What && and || offer beyond & and | is short circuiting behavior. An exclusive-or cannot be short-circuited. Neither value of either operand alone can tell you the result, you always need to evaluate both.

In fact, we already have two xor operators on bool, one is written ^ and one is written !=.
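A quick sketch to check both claims: `^` and `!=` agree on every pair of bools, and the only thing `||` adds over `|` is that it skips the right-hand side. The helper names are made up.

```rust
use std::cell::Cell;

fn xor_caret(a: bool, b: bool) -> bool {
    a ^ b
}

fn xor_ne(a: bool, b: bool) -> bool {
    a != b
}

// Counts how often the right-hand operand gets evaluated:
// first after `true || rhs`, then after `true | rhs`.
fn evaluations() -> (u32, u32) {
    let calls = Cell::new(0u32);
    let touch = || {
        calls.set(calls.get() + 1);
        true
    };
    let _ = true || touch(); // short-circuits, `touch` is not called
    let after_lazy = calls.get();
    let _ = true | touch(); // always evaluates both sides
    let after_eager = calls.get();
    (after_lazy, after_eager)
}

fn main() {
    for a in [false, true] {
        for b in [false, true] {
            assert_eq!(xor_caret(a, b), xor_ne(a, b));
        }
    }
    println!("{:?}", evaluations());
}
```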

scottmcm · January 6, 2023, 6:39pm

All of the precedence things don't make it "look weird" to C++ programmers, though, so I think those are different. Notably, Vec!String looks weird to a C++ programmer in a way that Vec<String> doesn't, but (a << b) * n just looks like a coding standard, not a different syntax.

We could make those changes in an edition, if we really wanted, or do them via lints.

withoutboats · January 6, 2023, 8:05pm

(NOT A CONTRIBUTION)

I can say that when I worked on Rust this is definitely not how we thought about syntax. A lot of time - probably a disproportionate amount of time - was spent agonising over syntactic choices.

I think once a language reaches a backwards compatible state, substantial semantic changes are usually hard to discover but easy to reach consensus on. Something like GATs and how that sidestepped the currying problem was a breakthrough, but once it was figured out it was then easy to move forward with the design. On the other hand, there are a million small syntactic choices and it's very difficult to make objective arguments for their superiority so they really become the main points of design contention. And of course the fact that it requires much less specialist knowledge means the discussion of syntactic changes is much broader, with many different people tending to reiterate the same points of argument.

But no one thinks syntax doesn't matter. It's just a less exciting problem with more opaque criteria and a lot more noise around it.

jjpe · January 6, 2023, 8:38pm

offtopic:

Would you mind elaborating on this a bit? I've never connected GATs and currying as such.

withoutboats · January 6, 2023, 9:25pm

(NOT A CONTRIBUTION)

This blog post contains a description of the problem:

http://smallcultfollowing.com/babysteps/blog/2016/11/04/associated-type-constructors-part-3-what-higher-kinded-types-might-look-like/

TL;DR: HKT introduces a problem with type inference (inferring arbitrary multi-argument type functions is intractable); Haskell solves that problem using currying, hence it was called the "currying problem" (at first the only solution we knew of was to use currying). Rust doesn't have currying otherwise, so solving it that way seemed wrong for a bunch of reasons. GATs (at the time of this blog post called ATCs) provide another way to solve the same problem (as Niko alludes to later in the post).
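For readers who have not seen a GAT used as a type constructor, here is a minimal sketch. All names (`PointerFamily`, `RcFamily`, `ArcFamily`, `wrapped_len`) are made up for illustration and do not come from the blog post. As I understand it, the point is that the "type function" is named through a concrete family type chosen by the caller, so the compiler never has to infer a multi-argument type function.

```rust
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

// The generic associated type plays the role of a type constructor `T -> Pointer<T>`.
trait PointerFamily {
    type Pointer<T>: Deref<Target = T>;
    fn new<T>(value: T) -> Self::Pointer<T>;
}

struct RcFamily;
struct ArcFamily;

impl PointerFamily for RcFamily {
    type Pointer<T> = Rc<T>;
    fn new<T>(value: T) -> Self::Pointer<T> {
        Rc::new(value)
    }
}

impl PointerFamily for ArcFamily {
    type Pointer<T> = Arc<T>;
    fn new<T>(value: T) -> Self::Pointer<T> {
        Arc::new(value)
    }
}

// Generic over the pointer kind; the caller picks the family explicitly.
fn wrapped_len<F: PointerFamily>(text: String) -> usize {
    let p: F::Pointer<String> = F::new(text);
    p.len()
}

fn main() {
    println!("{}", wrapped_len::<RcFamily>("patina".to_string()));
    println!("{}", wrapped_len::<ArcFamily>("rust".to_string()));
}
```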
