Struct-syntax

This is inspired by box syntax. My idea is to create a kind of trait that, when implemented, lets you call a T::new()-like function just by writing a name that identifies the struct.

This makes it easier to create structs that need only one value to be constructed.

For example:

#[derive(Debug)]
struct MyStruct {
    field_1: i64,
    field_2: i64,
}

impl MyStruct {
    pub fn new(x: i64) -> Self {
        Self { field_1: x, field_2: x * 3 }
    }
}

impl StructSyntax<i64> for MyStruct {
    const initializer: (&'static str, fn(i64) -> Self) = ("my_struct", Self::new);
}

fn main() {
    println!("{:?}", my_struct 1 /* the same as MyStruct::new(1) */);
}

// this would be somewhere in `core`
trait StructSyntax<T> {
    const initializer: (&'static str, fn(T) -> Self);
}

This way of implementing the idea is a backwards-compatible change because T::new still exists and is just called under the hood.

It would also make creating boxes simpler (making box syntax obsolete) and more consistent, since Rcs, Arcs, and similar types would be created the same way.

box_syntax is unstable and its tracking issue mostly features discussion on how (and when) to remove the feature from the language entirely (as well as discussion about placement new). I think most people are happy with using T::new functions. Thus, AFAICT, the chance of getting your idea / proposal into the language is vanishingly small.

19 Likes

I agree that this would need a high bar to change. Importing a module which makes some variable names unusable?

let box = 1; // Not allowed because of `box` syntax
let my_struct = 1; // Not allowed just because I imported your module? No thank you.

This also has big knock-on effects for things like rustfmt (it now needs to resolve use references, look for impl StructSyntax, read the implementation, and then inject that into its formatter), rust-analyzer (though it has the infrastructure), or any other Rust-reading code for that matter. They'd all need to basically compile the code to even make an AST (which may require cross-compilation or something).

Another issue:

#[cfg(feature = "foobar")]
use module_which_declares_syntax;

#[cfg(feature = "foobar")]
fn foo() {
    let x = my_struct 1; // How do we make an AST of this when the `foobar` feature is off?
}

Sorry, but I feel that this is a bridge too far.

2 Likes

I think the syntax should only be loaded if the struct itself is in scope, not if you just import a module that has one inside.

A better example:

let x = my_struct (3, 4);  // create MyStruct with an (i32, i32)
let y = some_function(3, 4);  // function call with two arguments

Basically the only way to make this even feasible is if you said, "well, my_struct (3, 4) can't be allowed then, it has to be my_struct ((3, 4))." Or, the compiler's internal representation has to model it as some sort of MightBeFunctionCallOrStruct until enough of the type system has been resolved to disambiguate between the two.

(imagine how awkward this would be if we had named arguments!)

You are basically attempting to let users assign a new lexical class to certain identifiers, which is the same problem that makes C famously difficult to parse. Only, in Rust it would be far worse, because we also have to deal with path resolution, type inference, and trait solving.
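To make the ambiguity concrete, here is a minimal sketch in today's Rust. The field names and the tuple-taking `new` are my own assumptions for illustration, not part of the proposal. It only shows that whitespace before a parenthesis is not significant, so `name (3, 4)` already means "call with two arguments":

```rust
// Hypothetical types and names, used only to illustrate the parsing point.
#[derive(Debug, PartialEq)]
struct MyStruct {
    width: i32,
    height: i32,
}

impl MyStruct {
    // Constructor taking ONE argument that happens to be a tuple.
    fn new(dims: (i32, i32)) -> Self {
        MyStruct { width: dims.0, height: dims.1 }
    }
}

// Ordinary function taking TWO arguments.
fn some_function(a: i32, b: i32) -> i32 {
    a * b
}

fn main() {
    // The space before `(` changes nothing: this is a plain
    // two-argument call, exactly like `some_function(3, 4)`.
    let y = some_function (3, 4);

    // Passing a tuple as a single argument needs doubled parentheses.
    let x = MyStruct::new((3, 4));

    println!("{:?} {}", x, y);
}
```

Under the proposal, a parser seeing `my_struct (3, 4)` could not tell which of these two shapes it is looking at without first knowing what `my_struct` resolves to.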

4 Likes

I actually like that Rust doesn't have any special syntax for constructors. This makes new() just a regular function. I can use new() -> Result<Self>, or new_with_foo(Foo), and it's still the same regular syntax.
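A rough sketch of what I mean, with a made-up `Port` type (the type and function names are assumptions for illustration only). The infallible and fallible constructors use the same call syntax, and either can be passed around like any other function:

```rust
use std::num::ParseIntError;

// Hypothetical type, only here to show constructors as plain functions.
#[derive(Debug, PartialEq)]
struct Port(u16);

impl Port {
    // Infallible constructor.
    fn new(n: u16) -> Self {
        Port(n)
    }

    // Fallible constructor: same call syntax, different return type.
    fn parse(s: &str) -> Result<Self, ParseIntError> {
        Ok(Port(s.parse()?))
    }
}

// Collecting into a Result stops at the first parse error.
fn parse_all(inputs: &[&str]) -> Result<Vec<Port>, ParseIntError> {
    inputs.iter().map(|s| Port::parse(s)).collect()
}

fn main() {
    // A constructor can be passed by name, like any other function.
    let ports: Vec<Port> = vec![80u16, 443].into_iter().map(Port::new).collect();
    println!("{:?}", ports);
    println!("{:?}", parse_all(&["80", "443"]));
    println!("{:?}", parse_all(&["80", "not a port"]).is_err());
}
```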

If syntax sugar for a bare new existed, there would immediately be feature requests for overloading to support my_struct foo and my_struct (bar, baz), and for overloading by return type to support my_struct ? and my_struct .await, etc., just to avoid losing the specially favored syntax whenever an object can be constructed in several ways, or asynchronously, and so on.

25 Likes

That is even more complicated for rustfmt because now it needs to know use statements everywhere, know what symbols are introduced by each, then know whether there's a StructSyntax in force in that scope.

Indeed. This is also problematic.

Other fun things that come to mind for StructSyntax<T>::initializer.0 are listed below. Who errors for these things?

  • let
  • {} (or other syntactic soup)
  • ":slight_smile:"
  • <right-to-left> control character
  • \n
  • ;

This is obviously not "just" a &'static str, but it has some other restrictions applied to it. It really feels like a new literal that we don't even have today.

1 Like

FYI, you can already do this, if you want:

#[derive(Debug)]
struct MyStruct {
    field_1: i64,
    field_2: i64,
}

#[allow(non_snake_case)]
fn MyStruct(x: i64) -> MyStruct {
    MyStruct {
        field_1: x,
        field_2: x * 3,
    }
}

fn main() {
    println!("{:?}", MyStruct(1));
}

(That's what tuple-structs do under the covers.)

5 Likes

We need to be explicit that while this is possible, it is entirely unidiomatic. It’s even worse than OP's suggestion, which at least featured the name being in snake_case. Seeing a PascalCase function call like this makes me believe it’s a tuple-struct constructor, not something executing arbitrary code; not breaking this convention is valuable in keeping Rust code easy to understand.

The correct way to do something like this while staying completely within the realm of acceptable Rust code would be to use a regularly named freestanding function. [Sometimes this approach can even be completely idiomatic, e.g. std does it sometimes (though arguably in these cases the struct is named after the function and not the other way around).]

#[derive(Debug)]
struct MyStruct {
    field_1: i64,
    field_2: i64,
}

fn my_struct(x: i64) -> MyStruct {
    MyStruct {
        field_1: x,
        field_2: x * 3,
    }
}

fn main() {
    println!("{:?}", my_struct(1));
}

4 Likes

Isn't it what tuple structs do?

Tuple structs do this, kind of, but the behavior is predictable (e.g. nothing is implicitly allocated), and the constructor can be used in pattern matching, too.
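A small sketch of both halves of that, using a made-up `Meters` newtype (the names are assumptions for illustration):

```rust
// Hypothetical newtype, used only for illustration.
#[derive(Debug, PartialEq)]
struct Meters(f64);

fn wrap_all(values: Vec<f64>) -> Vec<Meters> {
    // The tuple-struct name is usable as a function `fn(f64) -> Meters`.
    values.into_iter().map(Meters).collect()
}

fn total(lengths: &[Meters]) -> f64 {
    // The same name works as a pattern to take the value apart again.
    lengths.iter().map(|Meters(m)| m).sum()
}

fn main() {
    let lengths = wrap_all(vec![1.0, 2.5]);
    println!("{:?} total = {}", lengths, total(&lengths));
}
```

A hand-written `fn MyStruct(..)` gives you only the first half: it can be called, but it cannot appear in a pattern.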

4 Likes

I definitely wouldn't use it for something like a big database record, but there are places it's reasonable.

For example, I wouldn't complain about

struct Color { r: u8, g: u8, b: u8 }
fn Color(r: u8, g: u8, b: u8) -> Color { Color { r, g, b } }

where positional initialization is reasonable but you still want named fields.

8 Likes

But then the only difference is the absence of ::new, which would only have a significant impact if a significant portion of your code consists of struct construction. And even if that's the case, I see no reason to subvert expectations and use CamelCase for a free function.

4 Likes

The proposed syntax looks similar to the way things are constructed on the Haskell side. I really liked Haskell's brackets-are-optional approach. I don't have much of an opinion on how this would go here. I believe that, in general, the fewer brackets, colons, and semicolons a language has, the more fun it is to experiment with and to create mind-bending stuff.

I’ve done this, but always with a snake_case name. So a struct Color {…} has a constructor shorthand fn color(…) etc.

2 Likes

That goes hand-in-hand with currying every function by default. I wouldn’t know how to do this otherwise. Haskell’s syntactic ability to avoid parentheses for function calls also goes hand-in-hand with its laziness, so you don’t need any () -> T functions.

Without laziness and without currying, you could directly adapt one thing: no need for parentheses for single-argument functions (but they’d be needed for multiple-argument or zero-argument). That’s a rather narrowly applicable syntax then and would probably be a weird quirk if introduced to Rust.

In other words, technically all functions in Haskell are single-argument functions (and that’s why its syntax works great there), while Rust differentiates different numbers of arguments on the most basic language level.
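For comparison, here is roughly what a curried function looks like when written by hand in Rust. This is only a sketch with made-up names (`add`, `add_curried`); it is not how Rust functions normally work, which is the point:

```rust
// Ordinary Rust function: the number of arguments is part of the signature.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Hand-written curried form: one argument in, a closure out.
fn add_curried(a: i32) -> impl Fn(i32) -> i32 {
    move |b| a + b
}

fn main() {
    let direct = add(1, 2);

    // Each application needs its own pair of parentheses.
    let curried = add_curried(1)(2);

    // Partial application falls out of the curried form for free.
    let add_ten = add_curried(10);

    println!("{} {} {}", direct, curried, add_ten(5));
}
```

In Haskell every function already has the second shape, so juxtaposition is enough to express application. In Rust the two shapes are different types, so parenthesis-free calls would need to know the arity up front.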

6 Likes

Yeah, Haskell is too pure to be practical, which Rust is. I think the action will probably happen on the Haskell side in the future, where they create some more language extensions to copy Rust's innovation of ownership/lifetimes. Currently they seem to be busy trying to work around dependent types and absorb Agda. I also have some ideas that I would like to experiment with around dependent types and narrowing; current versions of languages could do a lot if only they could understand the predicates they are working with well enough before entering a scope.

I thought Haskell syntax was inspired by ML, so you should be able to find the desired lack of parentheses in SML and OCaml as well. I understand these were an inspiration for Rust. It appears, however, that the ML-style parenthesis-free syntax was rejected in favor of a C/C++/Java-inspired one.

I have to concede that for a Java guy OCaml looks pretty difficult to read, at least as you start dealing with it. This is even more true for Haskell of course.

1 Like

The point of box syntax is to have box patterns. Otherwise it's just more syntax to support.

1 Like

box syntax consists of two constructs: box expressions (Rust's equivalent to C++'s new expressions) and box patterns.
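To illustrate why box patterns are the interesting half, here is a sketch of what stable Rust asks for instead. The `Expr` type and `strip_double_neg` are made up for illustration; the commented-out arm shows the shape the unstable `box_patterns` feature would allow:

```rust
// Hypothetical expression tree, used only for illustration.
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Neg(Box<Expr>),
}

// Rewrites `--x` to `x`. With the unstable `box_patterns` feature the
// interesting arm could be written as one nested pattern:
//     Expr::Neg(box Expr::Neg(box inner)) => inner,
// On stable, each Box has to be dereferenced in its own `match`.
fn strip_double_neg(e: Expr) -> Expr {
    match e {
        Expr::Neg(outer) => match *outer {
            Expr::Neg(inner) => *inner,
            other => Expr::Neg(Box::new(other)),
        },
        other => other,
    }
}

fn main() {
    let double = Expr::Neg(Box::new(Expr::Neg(Box::new(Expr::Num(7)))));
    println!("{:?}", strip_double_neg(double));
}
```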

box expressions are currently (at least in theory) still more powerful than Box::new(), as they could support in-place construction and the direct construction of boxed unsized types.

However, we do have aggregate expressions and the ::new() method convention, so I agree that we don't need new syntax here.
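For reference, a sketch of the usual stable route to a boxed unsized type: box a sized value and let unsized coercion do the rest. The function names are made up for illustration. Semantically the value is created first and then moved into the box, which is the step a box expression could in principle skip (whether the optimizer removes the move in practice is a separate question):

```rust
use std::fmt::Display;

// The array is a sized `[i32; 3]`; the return type triggers the
// coercion from `Box<[i32; 3]>` to `Box<[i32]>`.
fn boxed_slice() -> Box<[i32]> {
    Box::new([1, 2, 3])
}

// Same idea for trait objects: `Box<i32>` coerces to `Box<dyn Display>`.
fn boxed_display() -> Box<dyn Display> {
    Box::new(42)
}

fn main() {
    println!("{:?} {}", boxed_slice(), boxed_display());
}
```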