Very nice article. Another way to improve the ergonomics of writing Rust code (one that I think is not emphasized enough in your article) concerns the standard library. Some functionality is missing, is not the most intuitive to use, or requires more code than your average Python programmer expects. In such cases you can usually improve the situation without changing the Rust language itself.
I've listed a few things here:
And here:
I can probably list some more.
Definitely! But the post was focused just on talking about what the language team is hoping to do on this front. We have separate initiatives on the library side, which will appear on the blog soon!
Another thing we should be alert to with respect to the “reasoning footprint” is the potential interactions between different “implicitness mechanisms”. The footprints of two features can both be reasonable in isolation, yet Yeti-sized when they interact. In particular, my experience was that deref coercions already made the ref and ref mut situation more confusing than it probably should’ve been, because it didn’t seem to make any difference how many refs I used or didn’t use. It was OK from the perspective of getting working code, but not so much for understanding the rules and being able to write code with confidence. I’m not sure if additional implicitness around ref would exacerbate this situation or even alleviate it (through “destructive interference”), but it’s something to bear in mind.
And tangentially to the main point:
“Usually you will be well aware of these bounds anyway, and when using a function like use_map, you’re generally going to be passing in an existing HashMap, which by construction will ensure that the bounds already hold.”
Did you have some specific problem in mind which would rely on this “by construction” logic for resolution? Because it’s quite easy to write a use_map_maybe<K, V>(map: Option<HashMap<K, V>>), and then call use_map_maybe(None). (I.e. the crux of Java’s recently-discovered unsoundness, as I’m sure you’ve read about.)
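To make the use_map_maybe(None) case concrete, here is a minimal sketch in today's Rust. NotHashable is a hypothetical type invented for this illustration, and the function body is an assumption. It compiles because std's HashMap<K, V> puts no Hash or Eq bound on the type itself, so no map is ever constructed that could vouch for those bounds:

```rust
use std::collections::HashMap;

// Hypothetical key type: deliberately implements neither Hash nor Eq.
struct NotHashable;

// No bounds on K are written, and none are needed to name the type.
fn use_map_maybe<K, V>(map: Option<HashMap<K, V>>) -> usize {
    // len() is available without K: Hash + Eq.
    map.map_or(0, |m| m.len())
}

fn main() {
    // No HashMap value exists here, so nothing was checked "by construction".
    let n = use_map_maybe::<NotHashable, i32>(None);
    println!("{}", n);
}
```

Whether an implied-bounds feature should let the body of use_map_maybe assume K: Hash + Eq in this situation is exactly the open question.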
With implicit vs explicit there’s a split between what I want to write and what I want to read.
I want to write the absolute minimum. I want everything implicit when I’m writing the code. Every kind of magic is fine, since I know the code I’m writing. In that context I want Rust to fill in all the blanks for me, so I can focus on the big picture and not on the tedious bureaucracy of declarations.
I want to see more information when I read the code. When I read the code (someone else’s code, or my own code sometime later) I don’t know exactly what things are references, which types are Copy, what cleverness is hidden.
Lifetime elision in functions is a prime example of this. I hate writing lifetimes. I don’t want to write them anywhere, ever. But when I look functions up, I do care whether the function returns an owned or borrowed value.
So perhaps this tension can be solved with tooling? Allow implicit syntax to be written, but change it to explicit in rustdoc, rustfmt, etc.
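As a sketch of what such tooling might display (the function names are invented for illustration, and no existing rustdoc or rustfmt option is implied), the two signatures below mean the same thing. The first is what one would write, the second is what a reader might prefer to see:

```rust
// As written: the lifetime is elided.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// As it could be displayed: the same signature with the lifetime spelled out,
// making it visible that the result borrows from the argument.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello world");
    assert_eq!(first_word(&owned), first_word_explicit(&owned));
    println!("{}", first_word(&owned));
}
```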
I am worried about implicit mod.
mod does not duplicate filesystem hierarchy information, because mod is (to me) more like a Makefile in C projects than a package/module in Java/Python.
Rust projects often implement a portability layer using conditional compilation (the cfg attribute) and module renaming (the path attribute). That is, the sys module is unix.rs on Linux but windows.rs on Windows. It would be bad if Rust tried to include windows.rs in a Linux build just because it is checked into version control and therefore lying around in the filesystem. So implicit mod at the very least needs a mechanism to prevent such implicit inclusion of files as modules.
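A minimal sketch of that portability pattern follows. The module contents and the platform function are made up for illustration, and inline modules stand in for the separate files so the example is self-contained:

```rust
// In a real project these would usually be separate files, e.g.
//     #[cfg(unix)]    #[path = "unix.rs"]    mod sys;
//     #[cfg(windows)] #[path = "windows.rs"] mod sys;
// Only one of the declarations below survives conditional compilation.

#[cfg(unix)]
mod sys {
    pub fn platform() -> &'static str {
        "unix"
    }
}

#[cfg(windows)]
mod sys {
    pub fn platform() -> &'static str {
        "windows"
    }
}

#[cfg(not(any(unix, windows)))]
mod sys {
    pub fn platform() -> &'static str {
        "other"
    }
}

fn main() {
    println!("built for: {}", sys::platform());
}
```

Here the explicit mod declarations are what tell the compiler which file belongs to the build, which is the information an implicit scheme would need to recover some other way.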
I'm a bit torn on this. I don't like typing (or reading) lots of repetitive code, but I've been burned by implicitness many times (*cough* Rails *cough*).
implied bounds
Pretty please? However, given:
fn use_map<K, V>(map: HashMap<K, V>) { ... }
I would not allow the programmer to use the fact that K: Hash + Eq except in the context of HashMap<K, V>. For example, I'd forbid some_key == some_other_key unless the programmer explicitly stated K: Eq.
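For reference, this is roughly what the fully explicit version looks like today. The function count_present is a hypothetical example, not something from the post. Under the restriction suggested above, the direct == comparison is the part that would still require K: Eq to be written out:

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Today every bound the body relies on has to be stated.
fn count_present<K: Hash + Eq, V>(map: &HashMap<K, V>, first: &K, second: &K) -> usize {
    if first == second {
        // Direct use of K: Eq, outside of any HashMap method.
        return map.contains_key(first) as usize;
    }
    // Uses K: Hash + Eq only through HashMap's own methods.
    map.contains_key(first) as usize + map.contains_key(second) as usize
}

fn main() {
    let mut map = HashMap::new();
    map.insert("a", 1);
    map.insert("b", 2);
    println!("{}", count_present(&map, &"a", &"c"));
}
```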
Discarding ownership
Along the same lines as @kornel's advice above, I'd strongly advise against implicit borrows ("discarding ownership") that outlive the called function.
As much as I hate writing &, I don't want to write...
let my_string = {
    let tmp_string = String::from("thing");
    some_fn(tmp_string) // Implicit borrow.
};
...only to have the compiler yell at me because tmp_string doesn't live long enough.
Also, when refactoring code, what if I decide that I need a copy of tmp_string? From reading this code, I'd have to assume that I need to clone it, but that's not the case. However, to be fair, you can (and the standard library often does) use AsRef to achieve the same effect.
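A small sketch of the AsRef approach mentioned here. The function shout is a made-up example. It accepts owned and borrowed strings alike, so the caller decides whether to give up ownership:

```rust
// Accepts anything that can be viewed as a &str, owned or borrowed.
fn shout<S: AsRef<str>>(text: S) -> String {
    text.as_ref().to_uppercase()
}

fn main() {
    let owned = String::from("thing");
    println!("{}", shout(&owned)); // borrowed: `owned` is still usable afterwards
    println!("{}", shout("thing")); // string literal
    println!("{}", shout(owned)); // owned: moved into the call
}
```

Note that this is opt-in per function and visible in the signature, unlike an implicit borrow inserted at the call site.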
Borrowing in match patterns
Unless I'm missing something, I think this would be a disaster.
Infer the need for ref (or ref mut) based on the borrowing usage in the match arm.
What if we're dealing with Copy types?
let mut thing = Some(1);
let thing_ref = &mut thing;
match thing_ref { // implicit deref?
    Some(item) => { // implicit ref mut?
        while item > 0 {
            item -= 1;
            // Do something
        }
    }
    None => {}
}
Does this mutate item?
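For comparison, in Rust as it stands the two possible readings have to be spelled out. A sketch, with hypothetical function names:

```rust
// Binds by reference: the loop counts the value inside the Option down to 0.
fn count_down_in_place(thing: &mut Option<u32>) {
    match *thing {
        Some(ref mut item) => {
            while *item > 0 {
                *item -= 1;
            }
        }
        None => {}
    }
}

// Binds by value: u32 is Copy, so only a local copy is counted down.
fn count_down_copy(thing: &mut Option<u32>) {
    match *thing {
        Some(mut item) => {
            while item > 0 {
                item -= 1;
            }
        }
        None => {}
    }
}

fn main() {
    let mut a = Some(3);
    count_down_in_place(&mut a);
    let mut b = Some(3);
    count_down_copy(&mut b);
    println!("{:?} {:?}", a, b);
}
```

The two functions differ only in the binding mode, yet one changes the caller's value and the other does not, which is why the choice of what to infer matters.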
much like we do for closures already
We don't. We automatically choose between ref and ref mut, but we have a move keyword for actually moving.
Thinking along similar, but more radical lines, an argument could be made about the need for mod itself.
From a why bother perspective, this is something you have to learn once and then never have to think about again so I don't think it's really worth it.
From a foot-gun perspective, what if I define a module with a bunch of exported symbols (C FFI)? You could say "only compile modules that are either used or declared" but now dropping a "use my_c_ffi_mod;" will silently drop the symbol. However, I admit this is a bit of an edge case.
From a code browsing standpoint, being able to see the module structure without looking at the filesystem is actually quite nice.
I like the discarding ownership idea. It makes sense to me that, since you can pass a &mut to something that just wants a &, you could also “downgrade” a (stronger) T to a &. Having foo(bar) mean “I don’t need bar any more after this, so do what you must” sounds great, and keeps (conceptual) liveness short.
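The existing coercion referred to here can be sketched as follows (function names are made up for the example). A &mut Vec<i32> is accepted where a &Vec<i32> is expected, with no extra syntax at the call site:

```rust
// Only needs shared access.
fn total(values: &Vec<i32>) -> i32 {
    values.iter().sum()
}

fn push_then_total(values: &mut Vec<i32>, extra: i32) -> i32 {
    values.push(extra);
    // `values` is a &mut Vec<i32>; it coerces to &Vec<i32> at this call.
    total(values)
}

fn main() {
    let mut v = vec![1, 2];
    println!("{}", push_then_total(&mut v, 3));
}
```

The proposal would extend this idea one step further, from &mut T to & down to an owned T to &.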
For @stebalien’s lifetime error point, maybe the feature could be limited to arguments of functions that return non-lifetimed types. Then the error would be something like "cannot auto-borrow temp_string in call to some_fn that returns a Foo<'a>" instead of getting a borrowck error pointing to an implicit borrow. (That rule is probably overly cautious, but that’s fine to start. Someone smarter than me will surely figure out how to define some kind of “because an output lifetime depends on that borrow’s lifetime” rule.)
I guess it wouldn’t auto-&mut? Mutating something just to drop it immediately sounds likely to be unintentional, even if it would be perfectly safe.
@stebalien Regarding the borrow in match patterns example, the implicit deref at match is a core part of the proposal and less controversial. Though what to infer in Some(item) is a good question. Since item is Copy we have a choice between mut item and ref mut item. Inferring mut item looks like mutpocalypse. I think inferring ref mut is what a closure would do. Not sure if it's best to infer nothing or to infer ref mut. But non-Copy types have no choice here, so it seems reasonable to infer ref mut for them.
We do: closures move from the environment if they have to; move is to force the closure to take ownership when it doesn't have to.
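A short sketch of the three situations being discussed. The function and variable names are invented for illustration:

```rust
fn capture_modes() -> (usize, usize, i32, i32) {
    let text = String::from("hello");

    // The body only needs &text, so the closure borrows it.
    let len_by_ref = || text.len();
    let borrowed_len = len_by_ref();

    // The body uses text by value, so it is moved in without writing `move`.
    let consume = || {
        let owned = text;
        owned.len()
    };
    let moved_len = consume();
    // `text` is no longer usable from here on.

    // n is only read, so it would normally be borrowed; `move` forces a copy in.
    let mut n = 1;
    let forced = move || n + 1;
    n += 10; // allowed: the closure owns its own copy of n
    (borrowed_len, moved_len, forced(), n)
}

fn main() {
    println!("{:?}", capture_modes());
}
```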
I'd suggest that you can only discard ownership if the lifetime parameter is not required to outlive anything. I think that's enough to define "lifetime parameter that will never get you in trouble with borrowck", addressing @stebalien's concern.
It should auto-&mut; the caller doesn't necessarily care about what happens to a &mut parameter. Consider vec.append(other: &mut Vec<T>). This function takes &mut other for maximum flexibility, since you may want to reuse the space allocated for other, but most of the time you are done with other and would rather discard ownership.
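The append case looks like this in current Rust. The helper concat is a hypothetical wrapper written only to show the call:

```rust
// Takes both vectors by value: the caller is done with them.
fn concat(mut head: Vec<i32>, mut tail: Vec<i32>) -> Vec<i32> {
    // append requires &mut tail even though tail is discarded right after.
    head.append(&mut tail);
    debug_assert!(tail.is_empty()); // append leaves `tail` empty
    head
}

fn main() {
    println!("{:?}", concat(vec![1, 2], vec![3]));
}
```

With discarding ownership extended to &mut, the call could presumably be written head.append(tail), though the exact rules are not spelled out in this thread.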
First of all, a general warning: while it sometimes helps ergonomics, implicitness often hurts learnability.
- It means more rules to learn (what the compiler will infer in any given situation).
- The compiler will "guess" more often so code may not work as intended (if the compiler guesses incorrectly) instead of simply failing to compile (in which case the compiler will often tell you what you should have done).
Note: Implicitness can make languages easier to use up front (and thus learn) when the implied behavior is obvious and uncontroversial. However, that only applies when the implied behavior is truly obvious (only one reasonable implementation).
In the case of "Borrowing in match patterns", I'd argue that this is neither obvious nor uncontroversial. It's also not something that's hard to learn given that the compiler will literally tell the programmer what to do. From an ergonomic perspective, the deref/ref dance isn't really that much of a hurdle (compared to the rest of the boilerplate in rust) so I'd argue that it simply isn't worth it.
Hm. I guess I've never noticed that.
Since item is Copy we have a choice between mut item and ref mut item. I think inferring ref mut is what a closure would do.
Any solution that requires the programmer to remember "this type is copy, this type isn't" really shouldn't be considered. Closures always act as if you did a ref mut/ref until you copy/move the value (at which point they obey the standard copy/move semantics). The only reasonable solution here would be to do the same (i.e., don't condition inference on Copy).
Since this wasn't explicitly specified I'll explicitly rule it out: unless rust does an implicit deref in the match statement, it can't do any of this inference in the match arms without breaking everything. If you want to opt-in to this auto-ref behavior, you'd have to explicitly ref (or start with a reference).
That is, given:
let x = Some(thing);
the following:
let x_ref = &x;
match x_ref { // Could have been `&x`
    Some(item) => ...,
    None => ...,
}
would always be equivalent to (nothing inferred except mut):
let x_ref = &x;
match *x_ref {
    Some(ref (mut?) item) => ...,
    None => ...,
}
Even in this restricted version, this is still significantly more magical than closure capture because whether or not we have item by reference depends on the type of x_ref. The blog post argues:
And in any case, it’s still quite local context.
However, this isn't correct. I could have:
struct Wrapper<'a>(&'a mut Option<u32>);
// ... Lots of code (even multiple files)...
fn foo<'a>(w: &mut Wrapper<'a>) {
    match w.0 { // Implicit dereference.
        Some(count) => { // Implicit ref mut
            while count > 0 {
                count -= 1; // Mutate by accident...
                // Do something
            }
        },
        None => {}
    }
}
Today, rust would give me a type error (and tell me to deref w.0). With this feature, this code would silently break.
Some other ergonomic pain points for consideration (none of these are fleshed-out proposals).
Terse Functions
Single-function impls and brace-less pure (?) functions would help with writing constructors:
struct MyType { a: u32, b: u32 };
pub fn MyType::new(a: u32, b: u32) -> Self = { a, b };
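For comparison, this is what the same constructor takes in current Rust (a plain sketch using the same names as the proposal above):

```rust
struct MyType {
    a: u32,
    b: u32,
}

// A whole impl block is needed even for a single associated function.
impl MyType {
    pub fn new(a: u32, b: u32) -> Self {
        Self { a, b }
    }
}

fn main() {
    let value = MyType::new(1, 2);
    println!("{} {}", value.a, value.b);
}
```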
Delegation
This has been brought up again and again for a reason. I know this is hard and requires a lot of complex design but it would really pay off.
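To illustrate the boilerplate that delegation would remove, here is a sketch of manual forwarding as it has to be written today. Stack is a hypothetical wrapper type invented for this example:

```rust
// Hypothetical newtype-style wrapper around a Vec.
struct Stack {
    items: Vec<u32>,
}

impl Stack {
    fn new() -> Self {
        Stack { items: Vec::new() }
    }

    // Every method below does nothing except forward to the inner Vec.
    fn push(&mut self, value: u32) {
        self.items.push(value)
    }

    fn pop(&mut self) -> Option<u32> {
        self.items.pop()
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    fn is_empty(&self) -> bool {
        self.items.is_empty()
    }
}

fn main() {
    let mut stack = Stack::new();
    stack.push(1);
    stack.push(2);
    println!("{:?} {} {}", stack.pop(), stack.len(), stack.is_empty());
}
```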
Smaller “use” headers
If you look at any reasonably complex rust program, the number of use statements at the top is insane and hurts readability (you have to crawl through a bunch of use statements before you can get to the actual code).
One way to improve this would be to have some way to say “always make methods from these traits available on this type” when defining a type. This way, programmers wouldn’t have to, e.g., import std::io::Read to read a file.
That is, in std:
pub struct File: Read + Write + Seek { ... }
In my crate:
use std::fs::File;
fn main() {
    let mut buf = String::new();
    File::open(...).unwrap().read_to_string(&mut buf).unwrap();
    // ...
}
For better learnability, these methods could even be inlined (into the type’s impl block) in the documentation.
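The current situation can be sketched without touching the filesystem, since Read is also implemented for byte slices. The function read_all is a made-up helper. The point is the use line at the top: without it, the method call does not resolve:

```rust
use std::io::Read; // remove this line and read_to_string is no longer found

fn read_all(mut source: &[u8]) -> String {
    let mut buf = String::new();
    // read_to_string is a method of the Read trait, which is implemented for &[u8].
    source
        .read_to_string(&mut buf)
        .expect("input should be valid UTF-8");
    buf
}

fn main() {
    println!("{}", read_all(b"hello"));
}
```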
A non-Copy value is moved if used as a value in a closure, so standard copy/move semantics there. But closures capture Copy types by reference when used as a value in the closure, unlike standard copy/move semantics:
let mut x = 1;
let x_ref : *mut i32 = &mut x;
// Copy semantics say x is copied here, so get_x should always be 1.
let get_x = || x;
unsafe { *x_ref += 1; } // Unsafe but not unsound.
println!("{}", get_x()); // And yet this prints '2'.
To get standard copy semantics you need to make it move || x. The point being that the language already special-cases inference around Copy types in closures.
edit: But you make good points, and I now think it's best to have the same behaviour for Copy and non-Copy, otherwise adding a Copy impl would break the code.
"Digging deeper, there’s a vital cross-cutting concern: empathy. The goal here is to try to imagine and evaluate ways that Rust could be different. To do this well, we need to be able to put ourselves back in the shoes of a newcomer. Of someone who prefers a different workflow. We need to be able to come to Rust fresh, shedding our current habits and mental models and trying on new ones.
“And, perhaps most importantly, we need empathy for each other. Transformative insights can be fragile; they can start out embedded in ideas that have lots of problems. If we’re too quick to shut down a line of thought based on those problems, we risk foreclosing on avenues to something better. We’ve got to have the patience to sit with ideas that are foreign and uncomfortable, and gain some new perspective from them. We’ve got to trust that we all want to make Rust better, and that good faith deliberation is the way to make productivity a core value, without sacrificing the others.”
I really liked this ending to the post.
I think a corollary is that, just as we should be mindful not to reflexively dismiss potential changes out of a visceral reaction to difference, or out of fear of the unknown, we should also be prepared to accept the possibility that, after careful investigation and thorough discussion, we might decide that few or none of the potential changes are actually beneficial on net, and worth making. That is, we should be wary of both “this is how things are, and it’s how they were meant to be” (status quo bias), as well as “we must do something, and this is something, therefore we must do this” (anti-status quo bias, if they call it that).
Apropos "guess": The "guess" RFC would allow us to have the compiler emit a "guess" with code replacements that, when applied, do the "right thing" in most cases. Together with some nice colors for the code in whichever IDE is used, this helps figure out the correct boilerplate without making it explicit, and hides the boilerplate through an appropriate color scheme.
A non-Copy value is moved if used as a value in a closure, so standard copy/move semantics there. But closures capture Copy types by reference when used as a value in the closure, unlike standard copy/move semantics
The way I see it, that Copy behavior is following normal copy semantics: it copies when used by value and no sooner. The non-Copy case is actually the odd one because it eagerly moves into the closure and drops the non-Copy value at the end of the closure. However, this difference can only really be observed on Drop. That is, you'll get an eager drop at the end of the closure if the value could have been used by value in the closure. To illustrate, given:
struct DebugDrop(&'static str);
impl Drop for DebugDrop {
    fn drop(&mut self) {
        println!("{}", self.0);
    }
}
Compare closure drop semantics:
fn main() {
    let a = DebugDrop("dropping a");
    let maybe_drop = || {
        if false {
            drop(a);
        }
    };
    maybe_drop(); // "dropping a" printed (eager drop)
    println!("end"); // "end" printed.
}
With normal drop semantics:
fn main() {
    let a = DebugDrop("dropping a");
    if false {
        drop(a);
    } // an eager drop would print "dropping a" here (but rust doesn't do that).
    println!("end"); // "end" printed.
    // "dropping a" printed (normal drop)
}
So you're right, the current behavior of closures does (subtly) deviate from copy semantics. However, I'd argue that it doesn't do so in a way that forces the programmer to remember "this type is copy, this type isn't" (any more than with usual control flow).
My concern with switching on Copy in this case is that, if you did, the following cases would behave wildly differently:
let mut number = Some(1);
match &mut number {
    Some(num) => {
        // num: u32
    },
    None => {},
}
let mut string = Some(String::new());
match &mut string {
    Some(s) => {
        // s: &mut String
    },
    None => {},
}
In closures, it would be s: String (and s would be moved).
FYI, I believe your example is technically unsound as you're mutably aliasing a shared reference. You can achieve something similar with a Cell (but not quite the same because Cell isn't Copy).
use std::cell::Cell;
fn main() {
    let x = Cell::new(1);
    // Copy semantics say x is copied here, so get_x should always be 1.
    let get_x = || x.get();
    x.set(2);
    println!("{}", get_x()); // And yet this prints '2'.
}
IMO, a better way to demonstrate this is by mutating from within the closure:
fn main() {
    let mut x = 1;
    {
        let mut get_x = || {
            x += 1;
            x
        };
        println!("{}", get_x()); // 2
    }
    println!("{}", x); // 2
}
@ker IntelliJ does something similar to rewrite non-idiomatic Kotlin to idiomatic Kotlin and I think this is a great way to teach people a language (tooling that helps programmers write code).
BTW, there’s now an RFC for the proposed match changes.