[Idea] Improving the ergonomics when using Rc

I know what you mean by this, but I want to point out that it isn't strictly true: you can identify every site of a potential copy if you try, and it's similar work to properly identifying every site of a potential deref coercion. For example, foo(arg) could perform a deref coercion (depending on the type of arg and the type expected by foo), just as it could perform a copy.
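To make the comparison concrete, here is a minimal sketch with hypothetical helper names (takes_str, takes_int), none of which come from the thread. Both calls have the same foo(arg) shape, and nothing at the call site says whether a coercion or a copy happens:

```rust
// Hypothetical helpers, invented for illustration only.
fn takes_str(s: &str) -> usize {
    s.len()
}

fn takes_int(n: u32) -> u32 {
    n + 1
}

fn main() {
    let owned = String::from("hello");
    let arg = &owned; // arg: &String
    let n: u32 = 41;

    // Deref coercion: &String becomes &str, with no marker at the call site.
    let len = takes_str(arg);
    // Implicit copy: u32 is Copy, so `n` stays usable after the call.
    let next = takes_int(n);

    println!("{len} {next} {n}");
}
```

Finding either kind of site requires knowing both the argument type and the parameter type.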

I definitely get that, intuitively, copy feels less visible than deref coercions, but I think it's important to consider that in the finer details it isn't. How often have you actually scoured your code for deref coercions and discovered how nuanced it is to find them all? I know I never have. This seems worth keeping in mind.

2 Likes

I'm quite paranoid about things like this in all my unsafe code. I hope I'm not the only one. The reason I don't do it in other code is that it's a lot of work to find all occurrences, and that it's basically impossible to enforce certain patterns long-term in all of the code.

1 Like

I wrote earlier that I think we should have targeted forms of tooling for specific code sections in which any kinds of implicit code insertions are significant to the user. One way we already do this is by the design of the types intended for use in unsafe - in my experience writing unsafe code, I am usually using types like raw pointers and NonNull and so on which do not implement Drop or Deref or any operators (and would not implement any kind of autoclone mechanism either). Another way we could do this is by providing more lint plugins that target this specific use case.

Once those lint plugins exist, if a user wanted to apply them to their entire code base, they'd certainly be within their rights (though I would recommend against it).
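On the point about types designed for unsafe code, a small sketch of what that looks like with NonNull (the function names read_through and demo are hypothetical). NonNull has no Deref impl and no destructor, so every access has to be spelled out and copying the pointer runs no hidden code:

```rust
use std::ptr::NonNull;

// Hypothetical helper: reads the pointee explicitly.
fn read_through(ptr: NonNull<i32>) -> i32 {
    // SAFETY: callers in this sketch only pass pointers to live, initialized i32s.
    unsafe { *ptr.as_ptr() }
}

fn demo() -> (i32, i32) {
    let mut value = 7;
    let ptr = NonNull::from(&mut value);
    let copy = ptr; // NonNull is Copy: a plain bitwise copy, nothing is inserted
    (read_through(ptr), read_through(copy))
}

fn main() {
    println!("{:?}", demo());
}
```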

3 Likes

I'd certainly appreciate the lints. I think we should have them either way, but not everyone thinks so.

For the specific topic of cloning, I would probably even go in the other direction and would like more distinction between Copy and Clone. For example, I'd love to have an Iterator::copied that is the same as Iterator::cloned except that it only operates on Copy types. That way I can skip thinking about what I'm actually cloning when reading the code. I already try not to use .cloned() for things like copying numbers when other semantically important cloning is going on in the surrounding context.
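For readers arriving later: as far as I know, Iterator::copied has since been stabilized in the standard library. The sketch below uses hypothetical function names to show the distinction being asked for, with copied reserved for Copy items and cloned left for the clones that matter:

```rust
use std::rc::Rc;

fn sum_copied(xs: &[u32]) -> u32 {
    // `copied` only compiles when the item type is Copy.
    xs.iter().copied().sum()
}

fn count_after_cloning() -> usize {
    let shared = vec![Rc::new(String::from("a"))];
    // `cloned` runs Clone::clone, which bumps each reference count here.
    // Swapping in `.copied()` would be a compile error: Rc is not Copy.
    let duplicates: Vec<Rc<String>> = shared.iter().cloned().collect();
    Rc::strong_count(&duplicates[0])
}

fn main() {
    println!("{} {}", sum_copied(&[1, 2, 3]), count_after_cloning());
}
```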

4 Likes

I think that's perfectly reasonable and not at odds with autoclone existing. I wouldn't want to subsume Copy into the autoclone mechanism; there are still plenty of cases in which it's useful and autoclone would not be (most obviously as a bound).

4 Likes

Yeah, it certainly wouldn't block it and there are indeed ways to satisfy all needs. I guess it would be a personal complication for me when reviewing others' code.

An anecdotal reason here is that explicit Rc cloning helps me detect nonsense in internal APIs that grew during iteration. For example, a trait method taking an Rc because at one point every implementation stored it, until one didn't and a &Rc became the better choice. So it makes it easier for me to communicate my intent (at this point mostly to my future self).
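A sketch of that situation, with a trait and types invented purely for illustration. Taking &Rc lets each implementation decide whether to clone, and the explicit Rc::clone marks the one implementation that actually keeps the value:

```rust
use std::rc::Rc;

// Hypothetical trait and implementors; not from any real code base.
trait Observer {
    fn notify(&mut self, config: &Rc<String>);
}

struct Keeps {
    stored: Option<Rc<String>>,
}

struct Peeks {
    last_len: usize,
}

impl Observer for Keeps {
    fn notify(&mut self, config: &Rc<String>) {
        // The explicit clone is the visible point where ownership is shared.
        self.stored = Some(Rc::clone(config));
    }
}

impl Observer for Peeks {
    fn notify(&mut self, config: &Rc<String>) {
        // Read-only use: no clone, no reference-count change.
        self.last_len = config.len();
    }
}

fn counts_after_notify() -> (usize, usize, usize) {
    let config = Rc::new(String::from("verbose"));
    let mut keeps = Keeps { stored: None };
    let mut peeks = Peeks { last_len: 0 };

    peeks.notify(&config);
    let after_peek = Rc::strong_count(&config);
    keeps.notify(&config);
    let after_keep = Rc::strong_count(&config);

    assert!(keeps.stored.is_some());
    (after_peek, after_keep, peeks.last_len)
}

fn main() {
    println!("{:?}", counts_after_notify());
}
```

Had the method taken Rc by value, every caller would have had to clone up front, including for Peeks.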

But I think we generally agree on the possibilities and on the existence of trade-offs in all the scenarios.

As an aside, big thanks for giving consideration to the downsides it would have for some people, and their concerns, and considering how to counter them. I know I usually just nag when I feel these things aren't given enough weight.

1 Like

We discussed this in our internal chat.

  1. I believe it's very important to improve the ergonomics of Arc/Rc, because it would make the learning curve much less steep for Rust beginners.
  2. It would be perfect to have a special AutoClone trait which is called implicitly in some cases, like Copy, when necessary. But it's better to keep it for internal use only.
  3. To avoid long Rc<RefCell<…>> types, it could be possible to add some modifiers like let rc or let arc or something like that, but this part is not directly connected to auto-clone.

I reread the thread and agree: a universal AutoClone solution could be dangerous if it's available for user-defined structures, but it's not so dangerous for the rc += 1 problem. You would know that only Rc could be autocloned, and nothing else, at least for the moment. That's why it could be better to avoid a universal AutoClone solution and just apply RcAutoClone. If it is not a heavy operation, I do not see why the "explicit advantage of Rust" did not make Copy explicit as well. It's even mentioned in the Rust Book (or rust-by-example?) that we see some code which should not work, but we have a hidden Copy and that is why it works.

Hello

I might have a problem that wasn't mentioned here (if I've overlooked it, then I'm sorry) which looks more severe than how visible it is.

Now, when reading the code, I can locally decide if each bit of code moves or makes a clone. This is moving:

let a = b;

This is making a clone:

let a = b.clone();

This distinction doesn't hold true for Copy types. But that's fine, because for a type to be Copy, it must:

  • Have no custom code inside its cloning
  • Have no destructor

So while I can't decide if I've moved or copied, that doesn't matter because there's no way to observe the difference between the two. So I can actually think I've done both and be happy about that.

This, however, is not true of an Rc that can copy itself implicitly. There's an observable difference between moving and copying an Rc. Let's say I have this code:

let a = Rc::new(42);
let b = a;
println!("{}", Rc::strong_count(&b));

Right now, this code is legal and prints 1, because I've made a move. So this code must still print 1 after introducing the change. Therefore, let b = a; is a move.

Except that the implicit copying would very much like this code to compile:

let a = Rc::new(42);
let b = a;
println!("{}", Rc::strong_count(&b));
// One million lines of code goes here
println!("{}", a);

But for that to compile, the let b = a; must have been a copy. But then, the first println must have printed 2. So, to decide if the first line prints 1 or 2, I have to read to the very end of the function and then I know what operation it has done at the beginning. While the compiler is probably capable of tracking this, I very much have problems with travelling in time while reading the code.
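For reference, here is how the two readings look in today's Rust, where the choice has to be written down. This is only a sketch with hypothetical function names; it uses Rc::strong_count, the standard-library function that reports the number of strong references:

```rust
use std::rc::Rc;

fn count_after_move() -> usize {
    let a = Rc::new(42);
    let b = a; // move: `a` can no longer be used
    Rc::strong_count(&b)
}

fn count_after_clone() -> usize {
    let a = Rc::new(42);
    let b = Rc::clone(&a); // explicit clone: both handles stay alive
    let count = Rc::strong_count(&b);
    println!("{}", a); // the later use of `a` is what requires the clone
    count
}

fn main() {
    println!("{} {}", count_after_move(), count_after_clone());
}
```

The first function returns 1 and the second returns 2; under implicit cloning, the same let b = a; would have to stand for either.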

To add a slightly scarier bit of code:

let a = Rc::new(42);
let b = a;
let x = Rc::try_unwrap(b).unwrap();
// One million lines of code...
#[cfg(feature = "extra_logging")]
debug!("Value is {}", a);

Turning on the extra_logging feature suddenly makes it panic in code that is nowhere close to anything the extra_logging feature adds and that looks completely harmless. Happy debugging.
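The failure mode can be reproduced today by writing the clone out by hand. In this sketch (function names are hypothetical), the explicit Rc::clone stands in for the clone that a later use of a would force under the proposal:

```rust
use std::rc::Rc;

fn unwrap_after_move() -> Result<i32, Rc<i32>> {
    let a = Rc::new(42);
    let b = a; // move: `b` is the only handle
    Rc::try_unwrap(b)
}

fn unwrap_after_clone() -> Result<i32, Rc<i32>> {
    let a = Rc::new(42);
    let b = Rc::clone(&a); // stand-in for a hypothetical implicit clone
    let result = Rc::try_unwrap(b); // Err: `a` still holds a second reference
    drop(a); // `a` is only released afterwards, like the late debug! call
    result
}

fn main() {
    println!("{:?} {}", unwrap_after_move(), unwrap_after_clone().is_err());
}
```

Calling .unwrap() on the second result is what would panic.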

And another thing to consider. While Rc's +1 might be cheap, the distinction between cloning and borrowing an Arc is significant. I've worked on a C++ codebase that heavily used shared_ptrs. We were passing them by value (i.e. making copies). It turned out we were needlessly spending about 5% of our run time in the shared_ptr destructors, and by passing them by reference instead that dropped to about 2%. So while the +1 is cheap, there still should be a way for the programmer to explicitly force the compiler to either move or copy depending on his choice. And it would be quite weird for Rc to be able to auto-clone and Arc not.
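The Rust analogue of that by-value versus by-reference choice might look like the sketch below. The function names are hypothetical, and the sketch only shows the reference-count traffic, not the timing figures quoted above:

```rust
use std::sync::Arc;

// By value: the caller performs an atomic increment to create the argument,
// and the drop at the end of this function performs an atomic decrement.
fn by_value(data: Arc<Vec<u8>>) -> usize {
    Arc::strong_count(&data)
}

// By reference: no reference-count traffic at all.
fn by_ref(data: &Arc<Vec<u8>>) -> usize {
    Arc::strong_count(data)
}

fn observed_counts() -> (usize, usize) {
    let shared = Arc::new(vec![1u8, 2, 3]);
    let during_value_call = by_value(Arc::clone(&shared));
    let during_ref_call = by_ref(&shared);
    (during_value_call, during_ref_call)
}

fn main() {
    println!("{:?}", observed_counts());
}
```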

9 Likes

Thank you for coming up with a very concrete example. Yes, this is exactly my problem with anything that introduces implicitness after the fact: patterns that we have learnt, love, and use, might now suddenly start breaking. That issue is just too underrated (even if I had trouble formulating a specific example myself).

I feel that this discussion is currently way too focused on "But refcounts are cheap!" versus "But they aren't!", and that proponents are largely missing the point that it's not only computational expense that matters. First and foremost, correctness matters, and this is exactly the sort of subtlety that could affect correctness, that is, without the euphemism, introduce bugs.

Incidentally, I don't even want to think about the effects on unsafe code and its consequences. For instance, if you used a similar integer incremented during cloning as an index, with the assumption that you know where copies happen, then newly introduced, accidental clones could even cause OOB memory access.
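A contrived sketch of that hazard, kept entirely in safe code. The Slot type and the extra_clones parameter are invented for illustration; extra_clones stands in for clones the author did not know about:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type: every clone claims the next index from a shared counter.
struct Slot {
    next: Rc<Cell<usize>>,
    index: usize,
}

impl Clone for Slot {
    fn clone(&self) -> Self {
        let index = self.next.get();
        self.next.set(index + 1);
        Slot { next: Rc::clone(&self.next), index }
    }
}

fn lookup_after(extra_clones: usize) -> Option<i32> {
    let buffer = [10, 20]; // sized for the original plus exactly one clone
    let original = Slot { next: Rc::new(Cell::new(1)), index: 0 };
    for _ in 0..extra_clones {
        let _unplanned = original.clone();
    }
    let planned = original.clone();
    // Checked access returns None when the index has run past the buffer;
    // unsafe code using an unchecked access here would read out of bounds.
    let found = buffer.get(planned.index).copied();
    found
}

fn main() {
    println!("{:?} {:?}", lookup_after(0), lookup_after(2));
}
```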

3 Likes

It's not a problem in Swift. It can optimize and remove unnecessary rc += 1 operations.

Rust knows whether a value is used after a move or not. Take let b = a in your first example: would compilation fail without AutoClone? No, so there is no need for an RcAutoClone::clone() call.

About the second example: wouldn't it fail on the first .unwrap() if a is cloned? Anyway, if a is used after the move, it's cloned.

That's exactly what I'm trying to say. The fact that let b = a; acts differently depending on code that comes after it means that the reasoning is not time-linear. The future is influencing the present. That might be fine for the compiler, but it's not fine for my poor brain.

4 Likes

Swift literally has optimization passes baked into their compiler to detect and remove redundant reference-counting manipulation.

Which, now that I think about it, makes me wonder if it would be plausible to make Rc a lang item and do the same thing in Rust, or if it would be out of the question.

2 Likes

Iterator in std::iter - Rust :slight_smile:

6 Likes

You'd run into opposition from the "Rust has no GC" crowd. Jest aside, the Rc and Arc interfaces in stdlib do already strike me as something that you may want to replace in some code, especially as they are a little wasteful when you don't need weak refs. So there would be a need for a more general interface.

2 Likes

Naive AutoClone might actually fly, but baking Rc-elision in as a stable interface means either leaving optimizations on the table in the name of a simple specification, or leaving the specification open to future rc-elision tricks and thus creating a giant back-compat hazard for any data type that opts into this interface.

An interface that allows the compiler to elide unnecessary clones, without actually proving that the elision is unobservable, sounds like the sort of excessively clever pothole that C++ would use.

4 Likes

The autoclone patterns here would be extremely useful for certain non-std library types as well. For example, I'm currently writing a procedural macro which attempts to make clone() calls unnecessary for functions which clone a specific wrapper type which should be as transparent as possible to the wrapped value. I don't doubt that similar tools will proliferate if Rc/Arc get this syntax elision but custom smart pointers can't. If some combination of ideas here gets RFCd, I would hope we decide on making a trait available to implement outside of std.

1 Like

That seems like an orthogonal issue. Rc is not a beginner tool to bypass thinking about borrow checking, it's a tool for use cases when you absolutely need it and can't structure your code to avoid it.

7 Likes

While that's potentially a good idea, it's also a workaround for relying so much on reference counting, and there's a difference between a useful optimization and something practically required to write efficient code.

2 Likes

I agree that Rc should not be the main way to go, but I think it's necessary to cut down the boilerplate around the wrapper: a lot of .clone() calls and Rc<RefCell<…>> in every type and definition. That is why I think a let rc modifier could be good. You use Rust as it is today if you need performance and control, but if you need Rc, which is also a common case, you can add it just by changing let to let rc.

But, again, I absolutely agree that it should not be the main way to do things. A let rc marker could be something like unsafe, which signals that the code could have some special behavior.

I'll avoid retreading the repeated explanations for why .clone() is important, but to comment on the latter: I would never want to see anything hide those types. They're different types, and it should be blatantly obvious when looking at them that they use Rc.
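For context, this is roughly what the explicit form under discussion looks like today. It is only a sketch: the SharedLog alias and the function names are hypothetical. The type names Rc and RefCell outright, and each new owner appears as an Rc::clone:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical alias; the underlying type still spells out Rc and RefCell.
type SharedLog = Rc<RefCell<Vec<String>>>;

fn record(log: &SharedLog, line: &str) {
    log.borrow_mut().push(line.to_string());
}

fn lines_and_owners() -> (usize, usize) {
    let log: SharedLog = Rc::new(RefCell::new(Vec::new()));
    let second_owner = Rc::clone(&log); // the visible point of sharing
    record(&log, "first");
    record(&second_owner, "second");
    let lines = log.borrow().len();
    (lines, Rc::strong_count(&log))
}

fn main() {
    println!("{:?}", lines_and_owners());
}
```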

2 Likes