[Pre-Pre-RFC] The ..Default::default() shorthand is misleading

That's very true. I can speak only for myself by saying that after some 3.5 years of using Rust, this misunderstanding of the syntax was the hardest to clear up until now. The special difficulty is that my alternative interpretation of the syntax made too much sense to me and that I had not seen that kind of syntax before. This is only from my side of course.

That's a good point. This would actually argue well that it makes sense to keep the syntax this way, assuming they are reasonably similar.

Yeah, that is what I tried to achieve when using the struct update syntax without understanding it. But yeah, not what I wanted to argue for here, even though a solution for this would be useful as well.

It's not reasonably similar. To get the equivalent to Rust's struct update syntax in JS, you have to put the spread operation first, otherwise it won't work:

// Rust
Foo {
    bar: Bar::new(1),
    ..default
}
// JavaScript equivalent
{
    ...default,
    bar: new Bar(1),
}

The JS syntax is also much more powerful:

  • Objects with different fields can be merged by using the spread syntax multiple times: { ...foo, ...bar }

  • You can define which field overwrites which: { ...foo, bar: 5 } has a different meaning than { bar: 5, ...foo }

However, ... doesn't copy the prototype, so it doesn't work with classes.
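For comparison, here is a minimal Rust sketch of the points above. The `Settings` type and the `with_width` helper are made up for illustration. In Rust the base expression must come last, must have the same type as the struct being built, and explicitly listed fields always take precedence over it.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    width: u32,
    height: u32,
    title: String,
}

// Hypothetical helper: replace one field, take the rest from `base`.
fn with_width(base: Settings, width: u32) -> Settings {
    // The explicit `width` wins; `..base` only supplies the fields that are
    // not listed. Writing `Settings { ..base, width }` is rejected by the
    // compiler, because the base expression has to come last.
    Settings { width, ..base }
}

fn main() {
    let base = Settings {
        width: 640,
        height: 480,
        title: String::from("demo"),
    };
    let wide = with_width(base.clone(), 1920);
    println!("{:?}", wide);
}
```

Unlike the JS spread, there is no way to merge two differently typed values or to let the base overwrite an explicit field.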

2 Likes


Yes, but all of these properties originate in the fact that JS is dynamically typed while Rust is statically typed.

To a large extent spread operator could be made to work with Rust. For example the case of merging different structs could be implemented by mapping fields by name. This doesn't rely on dynamic properties:

struct A { a: i32 }
struct B { b: i32 }
struct AB { a: i32, b: i32 }

AB { ..a, ..b }

It's a bit magic, and more like structural rather than nominal typing that Rust uses, but I would actually find it useful to copy fields from a "builder" struct to a struct it builds.
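To be clear, `AB { ..a, ..b }` is hypothetical syntax and does not compile today. A rough sketch of what one has to write in current Rust instead, with a made-up `merge` function, could look like this. Destructuring both inputs keeps the copy checked by the compiler: if a field is later added to `A` or `B`, the patterns stop compiling until it is handled.

```rust
struct A { a: i32 }
struct B { b: i32 }

#[derive(Debug, PartialEq)]
struct AB { a: i32, b: i32 }

// Hypothetical helper standing in for the proposed `AB { ..a, ..b }`.
fn merge(a: A, b: B) -> AB {
    // Exhaustive patterns: every field of `A` and `B` must be named here.
    let A { a } = a;
    let B { b } = b;
    AB { a, b }
}

fn main() {
    let ab = merge(A { a: 1 }, B { b: 2 });
    println!("{:?}", ab);
}
```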

1 Like

I remember seeing an RFC for that somewhere, but like you said, it's magical (not that I'm personally against this). Naturally it should be the same type.

5 Likes

Yeah, that was what I meant.

Please also note the edit on my first reply to you, as I guess you will not be notified about that edit by the forum software.

So you are saying my question can only be answered with yes, because people not knowing about something means they can only guess randomly, right?

There are two points I would like to make here. One is that syntax can be guessed right without having learnt about a specific piece of syntax. And the second is that even if there is no "logical" way for people to guess what a syntax does, their guesses might still be distributed differently than 50:50.

So what is an example of syntax that can be guessed right without even knowing about Rust? I would argue that all the basic arithmetic operations like 10 / (2-3) can be understood by anyone with a bachelor's degree in computer science or related engineering, even if they have never heard about Rust before. And we can continue: method calls, written a.b(), are also pretty standard between programming languages, as is calling freestanding functions by writing simply f(). Declarations of functions are also a pretty standard thing in programming, so if someone reads fn x() -> u32 {...} they will probably understand that u32 is a return type, even if they have only seen languages like Java, Python or C++ before, where the return type is notated differently. So for all these examples, I would expect my question to be answered with no by most people. And, assuming that many languages used the same struct update syntax as Rust, many people would also answer "no" to my exact question, stating that many other languages do the same and therefore most people know how it works.

For the second point ("even if there is no 'logical' way for people to guess what a syntax does, their guesses might still be distributed differently than 50:50"), which is more what I was arguing in my opening post, I would like to stress that people are biased. These biases can be manifold and can have many sources, and they are not evenly distributed as a rule. I believe that if we presented an example of the struct update syntax to people who do not know it but do have some programming background, and gave them two possible interpretations to choose from (e.g. mine from the opening post, and the correct one), the answers would not as a rule be distributed 50:50, but there would be some bias. And I also believe that different syntactic ways of expressing the same thing will produce different biases.

These biases do not come magically, but they come from the biases in the people themselves. For example a JavaScript developer might more likely guess the struct update syntax correctly since there is a similar syntax in JavaScript.

While I understand that designing a sound syntax system for a practical programming language is an immense challenge on its own, I believe that, given that resources are available, designing for these human biases is also a valuable goal.

Let's consider the example of some syntax X in some programming language L, and a poll with a large enough cohort of people that often use L but have never heard of syntax X. If 95 percent of people in that cohort would interpret syntax X wrongly, then I think it would be appropriate to call syntax X "misleading".


Coming back to the matter at hand:

It would be helpful for the discussion if you would give some arguments for that, and/or refute the arguments I made in the opening post.

Remember that the syntax was proposed as a replacement, so it will not be redundant in the long term. About intuitiveness, see my next message.

This is a good point. I agree that the originally proposed syntax with two words was not the best idea, having just one keyword would be better. And right, the precedence rules might actually make for a more complicated syntax in the end.

True, two-word operations do not fit into current Rust, and are also not necessary for solving the problem I posed.

2 Likes

No, I am saying that answering "no" will make people uncomfortable, because of how the question is formulated. It "can," of course, technically be answered by "no", but that makes the respondent look silly, since the phrasing itself suggests that it is "obviously" confusing.

This is correct, except that it doesn't support your point. All of these examples actually assume prior knowledge and/or experience of the cited syntax from math or other programming languages. Those who weren't exposed to either programming or math won't know what o.f() or fn x() -> u32 means. It's obvious to us who have experience with ≥1 other languages, but for someone whose first programming language is Rust, it's not any more obvious than FRU.

Same here – I don't see how or why, without actually conducting representative, scientifically sound research on how different syntaxes are distributed, one could claim that a replacement syntax would be an improvement. Based on this argument, it might be better, or it might be worse, or it might be just as good/bad as the status quo. And since you are proposing the change, the burden of proof to show it's a substantial and worthwhile improvement (or an improvement at all) is on you. One shouldn't have to go out of one's way to defend the status quo.

I did that by explaining why, in the 3 points in the subsequent paragraph.

In that case, such strongly breaking changes are IMO absolutely off the table.

5 Likes

How would you have phrased it then? I have argued for the "yes" answer and then asked if people agree. That seems like a very natural thing to me if I want to propose something.

About your two longer paragraphs: I do not intend to argue anymore for a syntax change. That long argument I made was merely meant as a response to you being unhappy with one of my opening questions. It was not intended to be supporting (or refuting) anything related to the matter we are discussing. If you want to complain about my way of discussing, then please at least keep the meta separate from the actual discussion.

You were only talking about the syntax change. There are other solutions to the problem I raised, like warnings, lints, education or documentation that people have also considered. If you answer "no" to there being a problem at all, then you also answer no to fixing it with a warning, lint, some education or some documentation. Please clarify if your "no" was merely towards the syntax change, or if it was also to there being a problem at all.

Evaluating the results of the poll

I posted a poll on URLO about 3 days ago, trying to see if beginners would intuitively guess the struct update syntax correctly in the case where it is used as ..Default::default().

I specifically asked about the following code example:

struct Tree;
struct Bush {twigs: i32}

struct Forest {
    tree: Tree,
    bush: Bush,
}

impl Default for Bush {
    fn default() -> Self {
        Self {
            twigs: 2
        }
    }
}

impl Default for Forest {
    fn default() -> Self {
        Self {
            tree: Tree,
            bush: Bush {twigs: 1},
        }
    }
}

fn main() {
    let forest = Forest {
        tree: Tree,
        ..Default::default()
    };
    println!("{}", forest.bush.twigs);
}

This is where the answers currently stand, where 1 is the correct answer:

I also decided to separate people by their roles in URLO to estimate their knowledge of the language. Unfortunately, due to rounding errors, the numbers don't add up to 100%.

We can see that roughly 80% of all respondents got the question correct, and roughly 70% of those who say that they "rather learn" from URLO got it correct. This leaves us with roughly 30% of less experienced respondents who got the question wrong. We cannot know if those people just did not know the answer and guessed randomly, or if they guessed wrongly with confidence. However, I believe that it shows that there is a certain ambiguity that should be addressed in some way to make Rust code more readable in this specific case.
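For readers who want to see why 1 is the correct answer, here is a small sketch that spells out what the struct update expression in the poll does. The types are repeated from the poll; the two function names are mine. The base expression `Default::default()` has type `Forest`, so it calls `Forest::default()` once, and the fields that are not listed are taken from that value. `Bush::default()` is never called.

```rust
struct Tree;
struct Bush { twigs: i32 }

struct Forest {
    tree: Tree,
    bush: Bush,
}

impl Default for Bush {
    fn default() -> Self {
        Self { twigs: 2 }
    }
}

impl Default for Forest {
    fn default() -> Self {
        Self { tree: Tree, bush: Bush { twigs: 1 } }
    }
}

// The form used in the poll.
fn with_update_syntax() -> Forest {
    Forest { tree: Tree, ..Default::default() }
}

// The same thing written out by hand.
fn spelled_out() -> Forest {
    // The base is a whole `Forest`, built by `Forest::default()`.
    let base: Forest = Default::default();
    // Only the fields that were not listed explicitly are moved out of it.
    Forest { tree: Tree, bush: base.bush }
}

fn main() {
    println!("{}", with_update_syntax().bush.twigs); // prints 1
    println!("{}", spelled_out().bush.twigs); // prints 1
}
```

The interpretation I originally had in mind would correspond to `bush: Bush::default()`, which would give 2 instead.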

Summarising a bit

In general people seem to acknowledge that there is a problem with the case I raised, but the general mood seems to be against a syntax change. I myself am now also of the opinion that that would be too much, especially since I cannot provide good evidence for it changing things for the better. That is, I do not have the resources or the necessary background to make a reliable poll about this topic. This makes it impossible to answer whether the current syntax is intuitive or not to many people, and therefore there is no good argument for making a big change like this, at least in this thread.

People have also discussed that warnings or lints would be helpful. To give more details here, let's first distinguish the two main cases.

The first case is the potential unconditional recursion case that looks like this:

impl Default for World {
    fn default() -> Self {
        Self {
            tree: Tree::grow(),
            ..Default::default()
        }
    }
}
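As a sketch of what goes wrong and how it can be avoided: in the snippet above, `..Default::default()` has type `World`, so it calls `World::default()` again from inside itself. Listing the remaining fields with their own defaults avoids that. The `rock` field and the `Rock` and `Tree` definitions below are assumptions of mine, since the original `World` is not shown in full.

```rust
struct Tree { height: u32 }

impl Tree {
    // Hypothetical constructor matching the `Tree::grow()` call above.
    fn grow() -> Self {
        Tree { height: 3 }
    }
}

#[derive(Default)]
struct Rock { weight: u32 }

struct World {
    tree: Tree,
    rock: Rock, // assumed second field, for illustration only
}

impl Default for World {
    fn default() -> Self {
        Self {
            tree: Tree::grow(),
            // Per-field default. Writing `..Default::default()` here would
            // call `World::default()` again and never return.
            rock: Rock::default(),
        }
    }
}

fn main() {
    let world = World::default();
    println!("{} {}", world.tree.height, world.rock.weight);
}
```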

The second case is the one that might produce unexpected values if misunderstood, which I also used for the poll:

fn main() {
    let forest = Forest {
        tree: Tree,
        ..Default::default()
    };
}

About the potential unconditional recursion, the following things have been said:

  • Three people (other than me) have raised issues on GitHub about a missing warning for this case. Their issues can be found via this issue. Their raising these issues can clearly be interpreted as them being in favour of such a warning.
  • Seven people (including likes) wish for more compile-time checks for this case, and eight people acknowledge that the existing warning works only in very simple cases:
  • Six people (including likes) think that the corresponding warning could explain more or link to an explanation:
  • Eleven people (including likes, but since the message is longer it is hard to say what the likers actually meant) think that the main focus of a solution should be outside of the compiler or linter (although it is not clear whether this was said against changing the syntax, or also as a general comment):

About the case with unexpected values, not much has been said here. However, in the discussion of the poll on URLO, people seem to mostly agree that a lint for this case might be helpful. Specifically, if I understand correctly, people propose the following (the original idea seems to be from user ZiCog):

<typename> {
    /* ... */,
    ..Default::default()
}

should produce a linter warning suggesting:

<typename> {
    /* ... */,
    ..<typename>::default()
}

People have also argued that this might be problematic in the case where <typename>::default() is a different function than the first. People have continued to argue that both implementing default() on a type and the Default trait on that same type is not ideal.

The poll itself hints "that there is a certain ambiguity that should be addressed in some way to make Rust code more readable in this specific case," as presented in the first section of this post.

And there is one argument that might relate to either case:

Updating the proposal

Using the hints above, I would now propose the following:

We take over the idea of the lint from URLO. The lint needs to be clever enough to use fully qualified syntax <<typename> as Default>::default() in case the type implements default() as well. (There is then still a danger in case the implementation of default() on the type happens after writing the code with the struct update syntax, as it would then change the meaning of that code. But that is a general problem whenever using <typename>::default() like that, and despite that risk, there is already a lint that proposes the use of <typename>::default() in general. Also implementing default() on a type in Rust is surely not idiomatic.)
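To illustrate why the fully qualified form matters, here is a small sketch with a type that has both an inherent `default()` and a `Default` impl. The `berries` field, the inherent function and the three helper functions are made up for the example. A path like `Bush::default()` picks the inherent function first, so the lint's suggestion would silently change meaning for such a type unless it uses `<Bush as Default>::default()`.

```rust
struct Bush {
    twigs: i32,
    berries: i32,
}

impl Bush {
    // Hypothetical inherent function that shares its name with the trait method.
    fn default() -> Self {
        Bush { twigs: 7, berries: 0 }
    }
}

impl Default for Bush {
    fn default() -> Self {
        Bush { twigs: 2, berries: 0 }
    }
}

// What the code looks like before the lint fires: the trait method is used.
fn via_inference() -> Bush {
    Bush { berries: 1, ..Default::default() }
}

// Naive suggestion: resolves to the inherent function instead.
fn via_type_path() -> Bush {
    Bush { berries: 1, ..Bush::default() }
}

// Fully qualified suggestion: always the trait method.
fn via_trait_path() -> Bush {
    Bush { berries: 1, ..<Bush as Default>::default() }
}

fn main() {
    println!("{}", via_inference().twigs); // prints 2
    println!("{}", via_type_path().twigs); // prints 7
    println!("{}", via_trait_path().twigs); // prints 2
}
```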

I propose that the lint lives under category clippy::style in the general case. In the recursive case, I propose it to be in clippy::correctness, because I believe that hardly anyone would write a recursive implementation of Default::default(), so anyone who triggers this lint is more likely to have run into the same misunderstanding as I did.

The lint should explain the misunderstanding that can happen when using ..Default::default(), and have at least a link to the respective section in the book, or should explain the syntax with an example on its own.

I will not comment anything about the compiler warning, as it is not required to fix the problem I raised, assuming that this lint gets added. I am not against improving the warning though.

Please tell me:

  • Do you believe that this is an adequate way to address the problem at hand?
  • Would you propose changes to the lint?
2 Likes

Note that these categories do AFAIK also prescribe the default lint level. Lints often start out as allow-by-default, especially if there might still be any quirks with the lint, or it may be too annoying; in particular the one that's always warning about ..Default::default() should probably do that, too. (I think instead of style it would be pedantic then.) Similarly correctness being deny-by-default might be too harsh. After all, the unconditional recursion warning from the compiler is also only a warning. In this case, if it's supposed to be warn-by-default the appropriate category would probably be suspicious instead of correctness.
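As a side note on lint levels: whatever the default ends up being, the level can be changed per item or per crate with the usual attributes. A minimal sketch using rustc's existing `unconditional_recursion` lint (the `depth` function is only a placeholder):

```rust
// rustc's `unconditional_recursion` lint is warn-by-default. This attribute
// raises it to an error for this one function; `#[allow(...)]` or
// `#[warn(...)]` would work the same way in the other direction.
#[deny(unconditional_recursion)]
fn depth(n: u32) -> u32 {
    // The base case makes the recursion conditional, so the lint stays quiet.
    if n == 0 { 0 } else { 1 + depth(n - 1) }
}

fn main() {
    println!("{}", depth(5));
}
```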

1 Like

I would actually say that should be fixed in the compiler, since it meets my bar of "running the unit tests is useless" if there's an infinite loop because of the recursion.

It should still be a lint, because that kind of analysis isn't worth turning into a hard error, but any method that unconditionally recurses to itself should trigger a deny-by-default lint -- whether that be Default::default, Add::add, or anything. (Well, unless it's -> !, I guess.)

6 Likes

I don't think that a hard error is the best idea. IMO, during development, "unconditional"[1] recursion is a valid state of a half-finished recursive function where you didn't define the base case yet; I'd like to be able to type-check such a partially-written function. And if I want to type-check my code, then IMO deny-by-default lints are super annoying. I'm usually turning clippy's deny-by-default lints into warnings, too. I don't do that for rustc's lints, because there are also many "this will become a hard error in the future" kind of deny-by-default lints.

I don't believe that the compiler should be allowed to always assume that right after cargo check the next step is going to be running the tests. As mentioned, I might deliberately type-check code in a state where I definitely don't want to write the unit tests next.

[1] Which is not even going to be 100% truly "unconditional" anymore, if the lint is going to be expanded to be "smarter", and detect something like fn foo(x: u32) { foo(x - 1) + foo(x - 2) } even though x - 1 technically might panic, preventing the recursion. (This is an example of a half-finished Fibonacci function, in a state (still missing the base case, including its condition) where I'd expect it to type-check, i.e. compile without error.)


Of course it isn't a hard error. No weird heuristic that sometimes somehow detects some cases of - well - infinite loops and/or stack overflows, I guess - at compile time can reasonably be considered an actual part of the Rust language, instead of just a mere lint. How would you teach the language if that's one of its features? Suppose I want to teach someone about stack overflows; using an (unconditionally) recursive function would be a great demonstration. I want to run this example to demonstrate my point in such a setting. I also want to run it without jumping through some weird hoops to disable a lint that screams "error" instead of "warning", or by refactoring the code in the most trivial way (I'm imagining e.g. adding an if true condition would always do it) in order to outsmart the weird heuristic that somehow got promoted to an essential factor in what constitutes an error-free Rust program (by default).

That's a very different definition of "type check" from the one I use.

Such a lint, to me, would only happen after the type checker and borrow checker and move checker have already passed -- so if there were any type problems or lifetime problems or similar, those would be reported. Thus your cargo check workflow wouldn't be impacted at all, except that when everything else is happy, you'd get the "this always recurses" deny lint, at which point it's useful.

All it'd keep you from doing is running the code, which is good because things hanging from an infinite loop is really annoying. The minor cost of putting in a todo!() or similar so it's no longer infinite before you run it is well worth it. (Especially if you're doing trait implementations where you're delegating to **self < *other or something, where the wrong number of *s can easily be accidental recursion.)
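A simplified sketch of the kind of accidental recursion I mean, using a made-up `Version` newtype rather than the reference impls from the standard library: the comparison inside the impl has to go through the inner value, otherwise it resolves back to the impl being written.

```rust
use std::cmp::Ordering;

#[derive(PartialEq)]
struct Version(u32);

impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        // Delegate to the inner `u32`. Writing `self.partial_cmp(other)` or
        // `(*self).partial_cmp(other)` here would call this same method
        // again and never return.
        self.0.partial_cmp(&other.0)
    }
}

fn main() {
    println!("{}", Version(1) < Version(2));
}
```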

3 Likes
error[E0369]: cannot add `()` to `()`
 --> src/main.rs:1:29
  |
1 | fn foo(x: u32) { foo(x - 1) + foo(x - 2) }
  |                  ---------- ^ ---------- ()
  |                  |
  |                  ()

Point made, though.
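For completeness, a sketch of the footnote's example in a form that does type-check. E0369 only appears because the signature has no return type, so `foo(..)` evaluates to `()`. The names `fib_draft` and `fib` are mine, and the `todo!()` placeholder is the one suggested earlier in the thread.

```rust
// Half-finished state: the recursion is written, the base case is not.
// With the `-> u32` return type this passes the type checker.
fn fib_draft(x: u32) -> u32 {
    if x < 2 {
        todo!("base case not written yet");
    }
    fib_draft(x - 1) + fib_draft(x - 2)
}

// Finished version with the base case filled in.
fn fib(x: u32) -> u32 {
    if x < 2 {
        return x;
    }
    fib(x - 1) + fib(x - 2)
}

fn main() {
    // `fib_draft` is deliberately not called: it would panic at the `todo!()`.
    let _ = fib_draft;
    println!("{}", fib(10));
}
```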

But more constructively, perhaps a case could be made to have a more general warning for all ‘unconditional panic’ situations, in which case it becomes moot that a panic may prevent stack overflow, as the code is diagnosable under those criteria (it either panics due to integer underflow or due to stack overflow). People are known to have requested similar features elsewhere.

1 Like

See, that's why I need the type-checker to help me out, even in such a half-finished state :sweat_smile: I know the function is supposed to pass the compiler at that point; if it doesn't I've made a mistake. If I hadn't made a mistake and the compiler started throwing hard errors (i.e. anything that's not just a warning) at me anyways, because a (necessarily) dumb heuristic thinks it's smarter than I am, then I'd be highly annoyed.

3 Likes

The lint could easily be smarter than I am, when it comes to these things -- it knows that it didn't actually deref like I thought it would, or similar.

By running after name resolution and type checking (which are basically the same phase, because of methods) you would get the type error there.

Sure, if the lint is poorly implemented it could be annoying. But that's true of every lint ever. And this one doesn't even seem particularly complicated to implement quite reliably -- it's just walking the HAIR/MIR and looking for a branch before it calls its own defid.

1 Like

I meant "dumb" with regards to, e.g., that the lint could probably always be circumvented by inserting if false { panic!() }. I.e. "dumb" means that a static compiler simply cannot, in principle, know all that much about the runtime behavior of a program; naturally many programs that will unconditionally recurse can't ever be caught by the lint. And I meant "dumb" with regards to the fact that the compiler cannot know whether my intention is e.g. just to double-check my intuition of "this code should currently be in a valid state, it definitely won't pass any tests, but let's ask the compiler if I made any errors so far", or if I do want to run (all the) tests next, and that's why I'm invoking ... well ...

actually, if I'm going to run tests, I'll cargo build my code first (or call cargo test directly), so when this heuristic would error (by default) on cargo check, based on the condition "running the unit tests is useless if there's an infinite loop", then it would always be wrong. I'm not building the code, so I won't run tests next. If every workflow of using rustc would always imply “if the program compiles without error, then the next step is going to be running the unit tests” (which is kind-of implied by a rule of “if running the unit tests does not make sense yet, then the compiler may as well return an error instead of a warning”), then cargo check might as well always automatically subsequently invoke cargo test if it's successful. For some reason cargo check and cargo test are separate commands though, so maybe this is not everyone's workflow when working with rustc?

I know I would get the type error, too, and I know I wouldn't get a type-related error if "type checking" succeeds. But I also know that rustc's diagnostics don't really give me a good overview beyond "code compiles warning-free", "code compiles with warnings", "code doesn't compile". There's no good warnings/error prioritization, e.g. showing all the errors first, showing more important warnings first, possibly configurable too, indicating things like how far compilation came. If rustc clearly and concisely told me "hey, this code type-checks, there's even no hard errors either, only a few deny-by-default lints" then I couldn't care less about whether this is an error or a warning. Until this is the case, the distinction "code compiles" vs. "code doesn't compile" (i.e. "only warnings" vs. "also some errors") is hugely important.

Currently, as long as code doesn't compile, sometimes we don't even get all the error messages (because compilation didn't get as far), so it's an important rule-of-thumb when effectively working with the compiler to always eliminate all the errors. E.g. a syntax error might inhibit a bunch of type-checking, or similar, so I can't just ignore a handful of errors unless I want to risk that the code I'm working on isn't really checked at all. Thus if I want to regularly make sure that my code type-checks, what I have to do is make sure it compiles without any errors at all; at least I'm not aware of any other practical approach.


On a similar note, sometimes I want to format my code, and get surprised that I have some syntax error somewhere. I already ran cargo check on the code and saw the output, but my brain only saw "lots of errors", so I'm unaware that some of the errors are syntax errors, and surprised for a moment why automatic formatting with rustfmt doesn't work. I would appreciate if I already knew whether rustfmt will succeed by looking at the cargo check output; without needing to try the rustfmt myself first. Until that's the case, I guess "running rustfmt" is the way to explicitly ask the compiler "is there a syntax error or not?".

If rustc gave me an overview stating e.g. things like has syntax errors (red) or no syntax errors (green); has type errors (red) or no type errors (green) or some type checking inhibited (yellow) [inhibited e.g. by syntax errors], and so on, and I could immediately tell by the color pattern of this overview whether or not my code type-checks, then – fine – go ahead and start making more lints that catch code in a "running unit tests is useless" state into deny-by-default. Then I can still do my type-checking (and actually interpret the output without having to rely mainly on the "only warnings" vs "there are errors" distinction). Until then, I stand by my point that

and I hope this discussion above also explains my approach of the almost-synonymous use:

because AFAICT, the only easy way to test whether your code type-checks (without needing to read all the output of cargo check) is by seeing whether or not it compiles without error.

2 Likes
