[Idea] Improving the ergonomics when using Rc

hyeonu · January 27, 2019, 1:59pm

Rc::clone() causes pointer indirection, which shouldn’t be considered trivial in performance-critical code because of the possibility of a cache miss.
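To make the indirection concrete, here is a minimal sketch (the helper name `strong_count_after_clones` is made up for illustration). Every `Rc::clone` has to follow the pointer to the shared heap allocation and bump the strong count stored there, which is the memory access that can miss the cache:

```rust
use std::rc::Rc;

/// Clones `rc` `n` times and returns the strong count observed while
/// all of those clones are still alive.
fn strong_count_after_clones(rc: &Rc<String>, n: usize) -> usize {
    // Each clone writes to the counter stored next to the String on the heap.
    let clones: Vec<Rc<String>> = (0..n).map(|_| Rc::clone(rc)).collect();
    let count = Rc::strong_count(rc);
    drop(clones); // dropping the clones decrements the same counter again
    count
}

fn main() {
    let shared = Rc::new(String::from("shared"));
    println!("while 3 clones are alive: {}", strong_count_after_clones(&shared, 3));
    println!("after they are dropped: {}", Rc::strong_count(&shared));
}
```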

dhm · January 27, 2019, 2:52pm

Regarding ergonomics there are other paths that would be more lightweight while remaining explicit:

Custom macros:

macro_rules! rc_vec {
  (
    $($expr:expr),*
  ) => (
    vec![ $(::std::rc::Rc::clone(&$expr), )* ]
  );
}

and then use rc_vec![foo, bar, baz]. This way, any reader can know something special is happening since the classic vec macro is not the one being used.
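A small usage sketch of the macro above, repeated here so the snippet compiles on its own (the `demo` function and its values are illustrative only). The originals stay usable after the call because the macro only borrows them:

```rust
use std::rc::Rc;

macro_rules! rc_vec {
    ( $($expr:expr),* ) => (
        vec![ $(::std::rc::Rc::clone(&$expr), )* ]
    );
}

/// Returns the length of the built vector and the strong count of `foo`.
fn demo() -> (usize, usize) {
    let foo = Rc::new(1);
    let bar = Rc::new(2);
    // Expands to vec![Rc::clone(&foo), Rc::clone(&bar)].
    let v: Vec<Rc<i32>> = rc_vec![foo, bar];
    // `foo` was only borrowed, so it can still be used here.
    (v.len(), Rc::strong_count(&foo))
}

fn main() {
    println!("{:?}", demo());
}
```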

A macro-less solution would be to use iterators:

let vec: Vec<Rc<_>> =
  [&foo, &bar, &baz]
    .iter()
    .map(|&x| x.clone())
    .collect();

Both may seem more cumbersome but scale better.

If you are writing your own function, you can require that the arguments be references to Rcs (and clone them in the first line of the function’s body), so that calling the function just requires writing &foo instead of foo.clone().
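A minimal sketch of that pattern, using a hypothetical `Node` type that is not from the thread. The clone still happens and is still spelled out, but only once, inside the callee:

```rust
use std::rc::Rc;

struct Node {
    label: Rc<str>,
}

impl Node {
    /// Takes a reference and clones it right away, so callers write `&label`.
    fn new(label: &Rc<str>) -> Node {
        let label = Rc::clone(label);
        Node { label }
    }
}

fn main() {
    let label: Rc<str> = Rc::from("shared");
    let a = Node::new(&label);
    let b = Node::new(&label);
    println!("{} {} count={}", a.label, b.label, Rc::strong_count(&label));
}
```

The trade-off is that a caller who no longer needs its own handle cannot hand it over without paying for one extra increment and decrement.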

I think that any “easier” or more lightweight way would do more harm than good, since implicitness is the very footgun that Rust fights against.

CAD97 · January 27, 2019, 4:02pm

let vec: Vec<Rc<_>> =
  [&foo, &bar, &baz]
    .iter()
    .cloned()
    .collect();

notriddle · January 27, 2019, 6:35pm

hyeonu already mentioned the pointer indirection, but as an additional note:

SSA compilers (which is basically all of them, including LLVM and Rust's MIR) are very good at removing redundant copies when no pointers are involved. If there are pointers involved, then you have to explicitly *ptr dereference it to copy it, so it's not implicit any more.

dhm · January 27, 2019, 11:16pm

My bad, it was .map(|&x| x.clone()) instead of .map(|x| x.clone()) (references being Copy prevents auto-deref), which means that .cloned() won’t work either, unless it’s used twice (iterator over &&Rc<_> which just get dereferenced instead). I hope we’ll end up having .into_iter() for arrays too.
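For reference, a sketch of the "used twice" variant (the function name `collect_clones` is made up here). The array holds `&Rc<i32>`, so `.iter()` yields `&&Rc<i32>` and two layers have to be peeled off:

```rust
use std::rc::Rc;

fn collect_clones(foo: &Rc<i32>, bar: &Rc<i32>, baz: &Rc<i32>) -> Vec<Rc<i32>> {
    // `[foo, bar, baz]` has type `[&Rc<i32>; 3]`, so `.iter()` yields `&&Rc<i32>`.
    [foo, bar, baz]
        .iter()
        .cloned() // `&&Rc<i32>` -> `&Rc<i32>`: only copies the reference
        .cloned() // `&Rc<i32>`  -> `Rc<i32>`: this one bumps the reference count
        .collect()
}

fn main() {
    let (foo, bar, baz) = (Rc::new(1), Rc::new(2), Rc::new(3));
    let v = collect_clones(&foo, &bar, &baz);
    println!("{:?}, count of foo = {}", v, Rc::strong_count(&foo));
}
```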

withoutboats · January 28, 2019, 1:35am

I've long been inclined to agree with the original author of this thread that there should be an autoclone mechanism, and that Rc should opt into it. The expensive part of reference counting is the initial allocation; incrementing and decrementing the reference count is comparatively cheap in contrast to the cost of, say, memcpying a [usize; 1024], which Rust will let you do implicitly without any warnings at all.
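A rough sketch of that comparison; the alias `Big` and the two functions are invented for illustration, and whether the optimizer elides any of the copies is not something this snippet can show. Passing the array by value is semantically a copy of the whole array with no marker at the call site, while the `Rc` version moves one pointer and needs an explicit clone:

```rust
use std::mem::size_of;
use std::rc::Rc;

type Big = [usize; 1024];

/// Takes the array by value: every call site implicitly copies all of it.
fn sum(data: Big) -> usize {
    data.iter().sum()
}

/// Takes a reference-counted handle: only a pointer is passed.
fn sum_shared(data: Rc<Big>) -> usize {
    data.iter().sum()
}

fn main() {
    let big: Big = [1; 1024];
    let first = sum(big); // implicit copy, nothing in the syntax hints at it
    let second = sum(big); // `big` is still usable because arrays of usize are Copy
    let shared = Rc::new(big);
    let third = sum_shared(Rc::clone(&shared)); // explicit, but only bumps a counter
    println!("{} {} {}", first, second, third);
    println!("{} bytes vs {} bytes", size_of::<Big>(), size_of::<Rc<Big>>());
}
```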

I think the current design of the language encourages the wrong behavior: the false pretense that anything that has costs is "explicit" in Rust leads users to believe that any Copy operation is cheap, whereas cloning an Rc is expensive. The current design misguides users into avoiding heap allocation dogmatically, even when it would be more performant than the memcpys they're doing.

If we actually believed that everything expensive should be explicit, even optionally, we would have some allow-by-default warning that warned on all but the genuinely cheap copies (anything over a word, or whatever), and probably other things (like operator overloading on types other than primitives). I don't think this should be a priority to implement (maybe someone who does would like to write a clippy-like tool!), but it would at least be intellectually honest, unlike the rigidity of pretending what we currently have made implicit is a well-reasoned set of cheap operations.

I doubt it's that hard with the analyses we're already doing with NLL to only clone when the value is used again without reinitializing it. In any event that could be a design constraint. If you wanted to error when you did it by mistake, putting such a wrapper up on crates.io would be trivial.

scottmcm · January 28, 2019, 1:43am

I think we do want that, and have wanted that for some time:

github.com/rust-lang/rust

Lint for undesirable, implicit copies

opened 01:30PM - 01 Nov 17 UTC

nikomatsakis

A-lint T-lang C-tracking-issue S-tracking-design-concerns

As part of https://github.com/rust-lang/rust/issues/44619, one topic that keeps coming up is that we have to find some way to mitigate the risk of large, implicit copies. Indeed, this risk exists today even without any changes to the language:

```rust
let x = [0; 1024 * 10];
let y = x; // maybe you meant to copy 10K words, but maybe you didn't.
```

In addition to performance hazards, implicit copies can have surprising semantics. For example, there are several iterator types that *would* implement `Copy`, but we were afraid that people would be surprised. Another, clearer example is a type like `Cell<i32>`, which could certainly be copy, but for this interaction:

```rust
let x = Cell::new(22);
let y = x;
x.set(23);
println!("{}", y.get()); // prints 22
```

For a time before 1.0, we briefly considered introducing a new `Pod` trait that acted like `Copy` (memcpy is safe) but without the implicit copy semantics. At the time, @eddyb argued persuasively against this, basically saying (and rightly so) that this is more of a linting concern than anything else -- the implicit copies in the example above, after all, don't lead to any sort of unsoundness, they just may not be the semantics you expected.

Since then, a number of use cases have arisen where having some kind of warning against implicit, unexpected copies would be useful:

- Iterators implementing `Copy`
- Copy/clone closures (closures that are copy can be surprising just as iterators can be)
- `Cell`, ~~`RefCell`, and other types with interior mutability implementing `Copy`~~
- Behavior of `#[derive(PartialEq)]` and friends with packed structs
- A coercion from `&T` to `T` (i.e., it'd be nice to be able to do `foo(x)` where `foo: fn(u32)` and `x: &u32`)

I will writeup a more specific proposal in the thread below.

withoutboats · January 28, 2019, 1:54am

My suggestion that we don’t want it has to do with our resourcing of the idea, not with its inherent worth. That lint is evidently not a high priority for the project.

Overall, I want to push back - hard - against the idea that what we made implicit/explicit initially was a perfect match for what is cheap/expensive. It was roughly accurate, but in many finer details it was wrong. I would like to adjust the defaults so that doing the right thing doesn’t feel like wearing the hairshirt (which is the case with Rc today), and for the cases where you need extra guard rails against overly-large copies, allocations, function call operators, dynamic dispatch, or whatever else, to provide targeted assistance optionally through newtypes or lint plugins.

skade · January 28, 2019, 9:03am

I was always under the impression that Copy/Clone was more about complexity. Copy of [u8; 24] is as complex as [u8; 1024], semantically speaking, whereas Clone hints at something custom happening.

I always teach Rust as a language where stuff is “pretty visible”, as compared to other languages, but there are still tons of things going on in the back (for good reason) and I think it’s important to keep that discussion of where visibility is useful going. “When to clone Rc” is definitely a subject that needs to be taught and I think there are still good wins to be made. I’m not sure if moving all of Clone to be optionally implicit is it, though.

H2CO3 · January 28, 2019, 9:14am

I beg to differ, it's not about "efficiency". It is, as I mentioned earlier, about triviality. I have to agree with @skade here:

It is also pretty much not the case that "the current design misguides users into avoiding heap allocation dogmatically". That is an exaggeration at best. I never regarded explicit .clone()s as something to be "avoided dogmatically" (nor does e.g. the book say that!); sometimes you just have to clone, and sometimes it's better to clone. The important thing is that you know it's there — and that is exactly what this proposal would forcibly take away from current users.

josh · January 28, 2019, 9:19am

Right. It's helpful to know that nothing "magic" is going on with y in the expression x = y or f(y), and they just do exactly what they look like. No copy constructors, no implicit code being run. It's not just that Rc::clone is more expensive than a simple memory/register copy (though that's also important), it's that it runs custom Rust code rather than being entirely a compiler-generated copy. Knowing where code is being run is important.

@scottmcm I agree entirely; that lint would push us more in the direction of explicit clones rather than implicit copies. I certainly wouldn't want to see us going in the opposite direction.

cksac · January 28, 2019, 9:23am

For Rc, its drop method decrements the reference counter, which is the opposite of Rc::clone. However, the drop is implicit. Are we required to make the drop explicit? And the proposal here is to add a mechanism in the compiler that allows some types (e.g. Rc) to be implicitly cloned, not to allow every T: Clone to be implicitly cloned.
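The asymmetry can be seen in a few lines (a sketch; the function name `counts` is arbitrary). The increment is written out, while the matching decrement happens with no syntax at all when the clone goes out of scope:

```rust
use std::rc::Rc;

/// Returns the strong count before, during, and after a scoped clone.
fn counts() -> (usize, usize, usize) {
    let a = Rc::new(5);
    let before = Rc::strong_count(&a);
    let during;
    {
        let _b = Rc::clone(&a); // explicit: count goes up by one
        during = Rc::strong_count(&a);
    } // implicit: `_b` is dropped here and the count goes back down
    let after = Rc::strong_count(&a);
    (before, during, after)
}

fn main() {
    println!("{:?}", counts());
}
```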

skade · January 28, 2019, 9:28am

Well, there’s a not so small group that would prefer complex drops to be explicit in Rust (it’s not as simple as it sounds though). At least, it’s big enough of a discussion that there’s blog posts on the subject: https://gankro.github.io/blah/linear-rust/

The current model there does come with its problems, like the inability to report errors from drop.

Your proposal would allow any Clone type to opt into being implicitly cloneable, which may lead people to just add it by default to anything that is Clone but not Copy. It does weaken an existing boundary.

cksac · January 28, 2019, 9:32am

As I mentioned before, we can make ImplicitClone a marker trait like Copy, where the set of types implementing it is controlled in std. An arbitrary type would not be allowed to implement it unless all of its fields implement ImplicitClone.
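A sketch of what such a marker trait might look like in today's Rust. Everything here is hypothetical: `ImplicitClone` does not exist in std, the `Handles` struct is invented, and the free function `implicit_clone` only stands in for the call the compiler would insert. The "all fields must implement it" rule would need compiler or derive support and is not enforced by this snippet:

```rust
use std::rc::Rc;
use std::sync::Arc;

/// HYPOTHETICAL marker trait, modelled on how `Copy` is a marker over `Clone`.
trait ImplicitClone: Clone {}

// In the proposal these impls would live in std and nowhere else.
impl<T: ?Sized> ImplicitClone for Rc<T> {}
impl<T: ?Sized> ImplicitClone for Arc<T> {}

/// A user type whose fields are all `ImplicitClone` could be allowed to opt in.
#[derive(Clone)]
struct Handles {
    local: Rc<str>,
    shared: Arc<str>,
}
impl ImplicitClone for Handles {}

/// Stand-in for the clone the compiler would insert at a use site.
fn implicit_clone<T: ImplicitClone>(value: &T) -> T {
    value.clone()
}

fn main() {
    let h = Handles { local: Rc::from("a"), shared: Arc::from("b") };
    let h2 = implicit_clone(&h);
    println!("{} {}", Rc::strong_count(&h2.local), Arc::strong_count(&h2.shared));
}
```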

H2CO3 · January 28, 2019, 9:57am

Drop is deterministically invoked at the end of the innermost scope (unless the value to be dropped is moved from or passed to std::mem::forget), therefore one does know when it's doing its magic even if it's implicit.
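A small sketch of that determinism, using a made-up `Noisy` type that records its own destruction in a shared log. Locals are dropped in reverse declaration order at the closing brace, and a value passed to `std::mem::forget` never runs its destructor:

```rust
use std::cell::RefCell;
use std::mem;

struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        let forgotten = Noisy { name: "forgotten", log: &log };
        mem::forget(forgotten); // its destructor will never run
        log.borrow_mut().push("end of block");
    } // `_second` is dropped here, then `_first`
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```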

There is another (perhaps more important) reason: Drop is associated with "teardown", i.e. that is exactly the time we stop caring about a value. When we stop caring about a value, it's way less important what is happening to it than it was before.

Nevertheless, implicit Drop does have the problem of invoking arbitrary code. It is possible to perform sneaky or even malicious things in a drop impl, nothing actually prevents you from doing so as far as I know.

However, and this is my third point, the benefits of the existence of an automatic cleanup mechanism enormously outweigh this disadvantage, so we can live with it: it's a net win. The same is not true about implicit cloning. The only advantage of implicit cloning would be slight writing convenience, and that is not nearly enough to justify the loss of control it would result in.

I do understand your proposal, please don't reiterate it for the third time. I do know that not every clonable type would be implicitly cloned under this proposal. However, I am still arguing that the current implicit copying capabilities of Rust are sufficient and correct, and that no type that is non-Copy should be cloned implicitly.

phaylon · January 28, 2019, 1:46pm

Unfortunately, both of these can implicitly run code already in a deref coercion context, but that is another reason to not add to the amount of hidden code invocations.

Personally, I think clone usually has a semantic meaning, and so should always stand out.

withoutboats · January 28, 2019, 1:52pm

Rust is full of implicit, user overloadable code execution points:

Drops, which are normally completely implicit
Dereferences, which can occur as part of coercions
All operators are overloadable

I don't see how autoclone is of a different category from these operations. I believe, in contrast to your claim, that it is more consistent with the overall design of Rust to have hooks for users to override the behavior of copy.
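The three hook points listed above can be shown with one illustrative type (`Meters`, `Log`, and `takes_ref` are invented for this sketch). None of the three lines in the inner block contains a method call, yet each one runs user-written code:

```rust
use std::cell::RefCell;
use std::ops::{Add, Deref};

type Log = RefCell<Vec<&'static str>>;

struct Meters<'a> {
    value: u32,
    log: &'a Log,
}

impl Deref for Meters<'_> {
    type Target = u32;
    fn deref(&self) -> &u32 {
        self.log.borrow_mut().push("deref");
        &self.value
    }
}

impl<'a> Add<u32> for &Meters<'a> {
    type Output = u32;
    fn add(self, rhs: u32) -> u32 {
        self.log.borrow_mut().push("add");
        self.value + rhs
    }
}

impl Drop for Meters<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

fn takes_ref(x: &u32) -> u32 {
    *x
}

fn hidden_calls() -> Vec<&'static str> {
    let log = Log::new(Vec::new());
    {
        let m = Meters { value: 1, log: &log };
        let _a = takes_ref(&m); // deref coercion: `&Meters` becomes `&u32`
        let _b = &m + 2; // overloaded operator
    } // `m` goes out of scope: `Drop::drop` runs
    log.into_inner()
}

fn main() {
    println!("{:?}", hidden_calls());
}
```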

You make one statement in particular which seems to mischaracterize my position:

This is only true in the sense that a user could make any type autoclone, not in the sense that it would be anything but extremely unidiomatic for anything that does a deep clone. I would see Rc and Arc as the only types in std that should be autoclone. This is exactly the same as the situation around the Deref trait, which allows you to run arbitrary code at so many different points in the program, and yet we have successfully formed a clear community consensus not to misuse this feature.

It's a perfectly valid position that all of these things are mistakes and Rust should have made a different choice, but that's very different from arguing that the current distinction made between these and copy is good, right, and consistent. I think what would even out Rust's design and be the most advantageous for users would be to include an autoclone hook, use it extremely sparingly (basically only for reference counting), and introduce additional community supported tooling for checking for all of the things users could possibly care about that happen implicitly.

For what it's worth, I don't see any connection between linear types and drop being "explicit" or not. I don't think "drop is too implicit" has ever been a motivation for exploring linear types in Rust.

H2CO3 · January 28, 2019, 2:41pm

The fact that we do have examples of implicitness in the language doesn't mean that it's a good thing to introduce more of it or that we need to/should do it. I would argue that in this context, the "should be used sparingly" argument applies on the language design level even more than on the everyday language user's level, and instead of making it possible to implicitly clone Rcs and warning people not to abuse implicit cloning (which someone will almost surely do eventually), we should just not introduce implicit cloning for Rc.

The consistency argument doesn't really stand here either, because:

they (cloning, drop, and deref) are indeed completely different aspects, and
there are not nearly as many benefits to implicit cloning as there are to implicit drop or deref. For example, Drop is pretty much a necessity for reliable automatic resource management (and as such, its whole point is basically implicitness), and Deref is also the best reasonable implementation of accessing the contents of smart pointer-like or collection-like data types. In contrast, implicit cloning would only allow one to not type .clone(). That is definitely not a comparable advantage (if at all – I think it's a downright disadvantage).

withoutboats · January 28, 2019, 2:58pm

Alternative designs for both drop and deref are trivial to imagine:

We could error if you did not call a .drop() method when a non-Copy value goes out of scope. That way it's explicit, but also guaranteed to occur properly.
Without deref coercions, you would "just" have to call .as_str() and .as_slice() whenever you wanted to coerce a smart collection into its reference type.

In both scenarios, I am completely certain that there would be people diminishing the advantage of inserting this code for you as "only allowing one to not type [.drop()/.as_slice()/etc]" and insisting on the importance of being able to see immediately in every instance of source code when a coercion happens and when a destructor is run.

And we even would have adapted code patterns around this limitation; for example, probably Vec and String would have a lot of methods of str and slice duplicated, to avoid the annoyance of calling the conversion methods. When we proposed just letting you call slice and str methods on these types, we'd hear: you don't need to call those conversion methods very often anyway, and it's just a little duplication in the standard library. It's so much more important to not have this conversion code being called implicitly all over the place, what if someone's deref impl was really expensive?
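For comparison, both spellings already compile today; the sketch below (with made-up helpers `shout` and `total`) shows what the hypothetical "no deref coercion" world would require at every call site:

```rust
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let name = String::from("rc");
    let nums = vec![1, 2, 3];

    // With deref coercion: `&String` becomes `&str`, `&Vec<i32>` becomes `&[i32]`.
    let a = (shout(&name), total(&nums));

    // Without it, every conversion would have to be written out.
    let b = (shout(name.as_str()), total(nums.as_slice()));

    // `trim` is a `str` method, reached from `String` through auto-deref.
    println!("{:?} {:?} {}", a, b, name.trim());
}
```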

skade · January 28, 2019, 3:40pm

The connection was that @cksac mentioned it as an example of implicitness, nothing more.

FWIW, I do actually agree with most of your points. In contrast to operators and deref, though, Copy already has no visible marker and Clone would enter that realm with this change.

My point is mostly a) that I felt there's at least one different thread of understanding of our early communication around Clone and Copy, which does influence today's thinking (potentially, both variants were even communicated), that one being that complexity is the differentiator, and b) that I do see a possible future where we fail to establish community practice. Then, Clone and Copy can essentially be merged. I don't want to speculate on how much we gain or lose in this case.

To be clear: I generally err on being a little conservative here, but I'm fine with either way that suits people and conveniently shields them from bugs.
