Rust 2030 Christmas list: Inout methods

Discussion on r/rust

5 Likes

From the post:

my personal view is that if you’re only duplicating one or two lines, adding abstraction to remove the duplication isn’t worth it

But then what do you justify this kind of feature with? Apart from trivial getters/setters (which are non-idiomatic and rare in the first place) and the 3 standard ref-to-ref conversions (AsRef, Borrow and Deref), there doesn't seem to be much space for methods that may be either mutable or immutable and share the same implementation. Mutable references are different enough qualitatively, in their nature, that allowing mutation vs. allowing sharing doesn't usually warrant copy-pasting any non-trivial implementations. And adding a separate language feature for them seems like overkill.

How would they transition to inout mutability?

The simplest solution would probably be to add the inout keyword to the xxx() version, and eventually deprecate the xxx_mut() version

Please don't suggest that. The _mut() suffix is actually a handy mnemonic marker to gently remind the reader that the method may be mutating. Taking this away would decrease the clarity of the code.

15 Likes

I have a parser combinator library, and as part of the design of that library, I had to decide whether to use Fn or FnMut for the closure types. It's a trade-off: FnMut gives convenient variable capturing but makes it impossible to parse in multiple threads, while Fn allows parsing in multiple threads but any captured state that needs to change has to be wrapped in Arc and friends.
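To make the trade-off concrete, here is a minimal sketch in today's Rust. The function names (run_fn_mut, run_fn_parallel and the two count_calls_* helpers) are hypothetical and not taken from any real parser library; the "parser" is just a closure that parses a number.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical `FnMut`-based API: the closure may mutate captured state,
/// so it can only be driven from one place at a time.
pub fn run_fn_mut<F>(mut parser: F, inputs: &[&str]) -> Vec<Option<usize>>
where
    F: FnMut(&str) -> Option<usize>,
{
    inputs.iter().map(|&s| parser(s)).collect()
}

/// Hypothetical `Fn + Sync`-based API: the closure can be shared across
/// threads, but captured state must be synchronized.
pub fn run_fn_parallel<F>(parser: &F, inputs: &[&str]) -> Vec<Option<usize>>
where
    F: Fn(&str) -> Option<usize> + Sync,
{
    thread::scope(|scope| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|&input| scope.spawn(move || parser(input)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// With `FnMut`, a plain local counter can be captured and mutated.
pub fn count_calls_with_fn_mut(inputs: &[&str]) -> (Vec<Option<usize>>, usize) {
    let mut calls = 0;
    let results = run_fn_mut(
        |s| {
            calls += 1;
            s.parse::<usize>().ok()
        },
        inputs,
    );
    (results, calls)
}

/// With `Fn`, the same counter needs interior mutability that is thread-safe.
pub fn count_calls_with_fn(inputs: &[&str]) -> (Vec<Option<usize>>, usize) {
    let calls = AtomicUsize::new(0);
    let parser = |s: &str| {
        calls.fetch_add(1, Ordering::Relaxed);
        s.parse::<usize>().ok()
    };
    let results = run_fn_parallel(&parser, inputs);
    (results, calls.load(Ordering::Relaxed))
}
```

Scoped threads let the sketch get away with an atomic instead of an Arc, but the shape of the trade-off is the same: the Fn version pays for thread support with synchronized state.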

It would be nice if I didn't have to make this tradeoff, and a proposal like this on the surface appears to be trying to solve a problem like that, but then the only motivation presented is to make it trivial to write getters/setters? There are much better ways to trivialize getters (e.g., 'properties') if that were a desirable goal (which I would argue is not.)

But I can't even make sense of the following:

inout types are only allowed in methods, and their mutability must be the same as the self argument.

If you can only mutate with &mut self, and you can only share with &self, what can you do with &inout? It doesn't seem to do anything except be a syntactical inference about a question with only one (obvious) answer in the first place?

And I have to second the point that just removing _mut is not an improvement. In addition to the fact that it is inherently helpful, you'd be asking all Rust programmers to adapt to the fact that they'd have to reinterpret all previously existing code to account for the presence of a new API design possibility. That, while not a breaking change, is still a burden to place on people.

(Addendum: I hope people will take the time to understand this last point, because language stability is not just about semver compatibility. It's also about not burdening developers with new decisions when looking at old code, and it's one of the reasons that new syntax options need strong motivations; having new ways to do old things forces everyone to ask whether the old way or new way is correct for code that is largely 'finished'. It creates churn and debate about things that previously didn't require it, which makes people feel as though they should be waiting for the language to mature even when it is already serviceable.)

11 Likes

I think the point is that if you have e.g.

fn get(&self) -> &Thing;
fn get_mut(&mut self) -> &mut Thing;

they actually get lowered to the same machine code. They have different language semantics, but this doesn't affect execution, it just affects which programs are accepted and rejected (e.g. reject programs with shared mutable references). The function (this one probably gets inlined, but in general) could end up duplicated in machine code, and as the OP says, it leads to a larger vtable, requiring more memory with all the knock-on disadvantages associated.
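As a minimal illustration of such a pair in today's Rust (Holder and its field are made-up names for the example), the two bodies differ only in the kind of borrow they take:

```rust
pub struct Thing(pub u32);

pub struct Holder {
    thing: Thing,
}

impl Holder {
    pub fn new(value: u32) -> Self {
        Holder { thing: Thing(value) }
    }

    /// Shared access: callers can read the `Thing` but not modify it.
    pub fn get(&self) -> &Thing {
        &self.thing
    }

    /// Exclusive access: the same field projection under `&mut` rules.
    pub fn get_mut(&mut self) -> &mut Thing {
        &mut self.thing
    }
}
```

Both methods compute the same address; only the borrow checker treats them differently.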

You could address these issues by looking for duplicate functions, and unifying them (maybe this already happens, in rustc or LLVM), or you can have some way of linking the mutability of arguments to mutability of return types, the same way you do with lifetimes.

I don't know if &inout Thing is the best syntax, but I do think that having the ability to connect the shared/unique nature of refs as well as the lifetime would feel natural to rust programmers and wouldn't be a major extra cognitive burden.

6 Likes

I don't think so (at least by default) because this impairs backtrace quality. The key question: which symbol name should be used if there's only one copy around?

But it'd effectively be limited to what &mut can do (it is not Copy) while at the same time being limited by what & can do (not mutate) within the body of the function. If this is just for these getters, what is really being gained by this? How often do these method pairs show up that they need a new keyword and API design guidelines? That kind of effect sounds like a high bar to me.

Not saying they justify new syntax, but they do show up allll the time.

3 Likes

Basically, it shows up whenever you're "projecting" a reference through an indirection type. Most, but not all, of the cases are covered by either Deref/DerefMut (indirection to a single value) or Index/IndexMut (indirection to a collection of values).
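For instance, a sketch of the Index/IndexMut case, using a hypothetical Grid type: the two impls project through the same offset computation and differ only in mutability.

```rust
use std::ops::{Index, IndexMut};

/// Hypothetical row-major grid, used only to illustrate the impl pair.
pub struct Grid {
    width: usize,
    cells: Vec<i32>,
}

impl Grid {
    pub fn new(width: usize, height: usize) -> Self {
        Grid { width, cells: vec![0; width * height] }
    }

    /// Shared helper: maps an (x, y) position to an offset into `cells`.
    fn offset(&self, (x, y): (usize, usize)) -> usize {
        y * self.width + x
    }
}

impl Index<(usize, usize)> for Grid {
    type Output = i32;

    fn index(&self, pos: (usize, usize)) -> &i32 {
        let i = self.offset(pos);
        &self.cells[i]
    }
}

impl IndexMut<(usize, usize)> for Grid {
    fn index_mut(&mut self, pos: (usize, usize)) -> &mut i32 {
        let i = self.offset(pos);
        &mut self.cells[i]
    }
}
```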

Where being generic over reference type could see actual meaningful reduction in code duplication is for things like MyFancyMap::get/MyFancyMap::get_mut. Unless you're writing collections, though, the amount of times you'll see pairs like this is minimal.

I end up writing an absurd amount of collections for Rust code, and I personally don't really feel the need for this.

2 Likes

Well they're already deduped by LLVM in rustc today: https://rust.godbolt.org/z/fa8e8Kb8K

4 Likes

Same as with generics: in debug builds the two possible mutabilities would be monomorphized into two different functions, with mangled names which include the concrete mutability. In release builds the different copies can be merged into a single symbol, which isn't anything out of the ordinary. It happens with inlining, and would happen with tail call optimization.

I have definitely wanted something like that feature, but imho the presentation in the article is too limited. For example, I would like full genericity over mutability, so that generic mutability could be chained through several method calls, and could be explicitly specified in the generic parameter list (which would deprecate the _mut suffixes, since if you explicitly need the mutable version of the method you can just pass an explicit mut generic parameter). Full genericity would also allow having different generic mutability on different parameters, which is also occasionally useful. E.g. I could write a function

fn foo<T, ''m1: mut, ''m2: mut>(x: &''m1 T, y: &''m2 T, container: Container<T>) -> &min(''m1, ''m2) T { .. }

which would return a &mut T if both arguments are &mut T, and &T otherwise.

We could also write mut-generic structures, e.g.

struct Foo<'a, ''m: mut, T>(&'a ''m T);

For example, mutable and immutable iterators can often use essentially the same data structures and code, but currently must be implemented separately. This leads either to code duplication, or to complicated macros. With mut-generics we could have something like

struct Iter<'a, ''m: mut> {
    iterable: &'a ''m Iterable,
    pos: PosInIterable,
}

impl<'a, ''m: mut> Iterator for Iter<'a, ''m> {
    type Item = &'a ''m Foo;
    fn next(&mut self) -> Option<Self::Item> {
        if self.iterable.size > self.pos { self.pos.increment(); }
        self.iterable.get::<''m>(self.pos)
    }
}

impl<''m: mut> Index<''m, PosInIterable> for Iterable {
    type Output = Foo;
    fn index<''m>(&''m self, idx: PosInIterable) -> &''m Foo {
        self.get::<''m>(idx).expect("no such index")
    }
}
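For comparison, the macro workaround available today can be sketched roughly as follows. Foo, Iterable and get are borrowed from the hypothetical snippet above; the items field, the usize position and the getter! macro are assumptions made for the sketch.

```rust
#[derive(Debug, PartialEq)]
pub struct Foo(pub i32);

pub struct Iterable {
    pub items: Vec<Foo>,
}

// One method body, expanded twice. The optional trailing `mut` token
// selects whether the generated method borrows mutably.
macro_rules! getter {
    ($name:ident, $slice_get:ident $(, $mutability:tt)?) => {
        pub fn $name(&$($mutability)? self, pos: usize) -> Option<&$($mutability)? Foo> {
            self.items.$slice_get(pos)
        }
    };
}

impl Iterable {
    getter!(get, get);              // fn get(&self, ..) -> Option<&Foo>
    getter!(get_mut, get_mut, mut); // fn get_mut(&mut self, ..) -> Option<&mut Foo>
}
```

This removes the duplication, but at the cost of readability, which is roughly the complaint about "complicated macros".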

Similarly, Fn-traits should be mut-generic, so that one could write functions which accept any closure, regardless of its capture mode.

Other types which should really be mut-generic are reference-like types, e.g. Ref and RefMut guards on the RefCell. If my method passes through a reference-like type (e.g. Ref::map or something that uses it), then I often don't really care whether you give me a mutable reference, I'll just pass through whatever Ref(Mut) type you give me.
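A small sketch of the duplication this causes with RefCell guards today. Pair and the two accessor functions are hypothetical; Ref::map and RefMut::map are the real standard-library functions.

```rust
use std::cell::{Ref, RefCell, RefMut};

pub struct Pair {
    pub left: i32,
    pub right: i32,
}

/// Projects a shared guard onto the `left` field.
pub fn left(cell: &RefCell<Pair>) -> Ref<'_, i32> {
    Ref::map(cell.borrow(), |pair| &pair.left)
}

/// Same projection again, spelled out for the mutable guard type.
pub fn left_mut(cell: &RefCell<Pair>) -> RefMut<'_, i32> {
    RefMut::map(cell.borrow_mut(), |pair| &mut pair.left)
}
```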

It is quite common that I write a function which doesn't really need either mutability of its reference type, or the ability to copy references. For example, builder-pattern functions may use interior mutability, so that they can accept and return either &Self or &mut Self. I may go with the fn foo(&self, x: Bar) -> &Self as the more general option (it can be used on both &Self and &mut Self variables), but then the user won't be able to chain that function with some other method fn baz(&mut self, x: Baz) -> &mut Self which requires a mutable reference. I can define foo as fn foo(&mut self, x: Bar) -> &mut Self, which allows nice chaining, but if the consumer wants to call that method on a reference-counted builder, then they are in trouble.
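The chaining problem can be sketched like this, with a hypothetical Builder that has one Cell field (settable through &self) and one plain field (needing &mut self):

```rust
use std::cell::Cell;

#[derive(Default)]
pub struct Builder {
    width: Cell<u32>, // interior mutability: can be set through `&self`
    height: u32,      // plain field: requires `&mut self`
}

impl Builder {
    /// The more general signature: callable on shared and exclusive borrows.
    pub fn width(&self, w: u32) -> &Self {
        self.width.set(w);
        self
    }

    /// Requires exclusive access, so it cannot follow `width` in a chain.
    pub fn height(&mut self, h: u32) -> &mut Self {
        self.height = h;
        self
    }

    pub fn area(&self) -> u32 {
        self.width.get() * self.height
    }
}
```

Here `b.height(3).width(2)` compiles, while `b.width(2).height(3)` does not, because width has already weakened the borrow to &Self. On the other hand, width is still callable on an Rc<Builder>, which a &mut self version would not be.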

2 Likes

I'm really not sure about the usefulness of &inout, but it would give more guarantees. fn foo(&self) -> &inout T guarantees that foo() cannot modify the object, even if the returned reference is &mut. I also realized that when returning &inout, self must be an immutable ref; otherwise there is no reason not to return &mut unconditionally.

And would it be possible/useful to have fn foo(inout self) -> inout T mean that self can be taken by (mutable/immutable) reference or by value depending on the inferred returned type?

1 Like

Without commenting on the usefulness of the interface, I'm wondering how you would implement such a function usefully. The self argument isn't guaranteed to be an owned value, so any potential convenience or efficiency arising out of having ownership goes right into the waste – the compiler can't allow moving out of it, for example. At that point, it looks like it would be an easy-to-misuse interface that can introduce, for example, silent clones and related performance bugs (always cloning can mysteriously transform an O(N) algorithm into an O(N^2) one).

2 Likes

Without commenting either on the usefulness of the interface, the reason to have inout self possibly deducing to a value would be similar to the rationale behind the deducing this proposal for C++23, especially the section on move or copy into parameters.

Say you wanted to provide a .sorted() method on a data structure. Such a method naturally wants to operate on a copy. Taking the parameter by value will cleanly and correctly move into the parameter if the original object is an rvalue without requiring templates.

struct my_vector : vector<int> {
 auto sorted(this my_vector self) -> my_vector {
   sort(self.begin(), self.end());
   return self;
 }
};

Is this useful?

The reasons I’m not sure the rationale for the C++ deducing this proposal applies to Rust are:

  • C++ is copy by default (with explicit move), whereas Rust has an explicit call to .clone() and implicit move. Because of this, creating a copy before sorting is harder in C++ (auto copy = my_vec; copy.sorted();), while it’s a simple my_vec.clone().sort() in Rust.
  • Move in C++ is non-destructive, which means that the caller will unconditionally call the destructor of a moved-from object (and thus a moved-from object must be left in an empty state that is still destructible), while Rust moves are destructive (so the destructor of a moved-from object is never called, and the object can thus be left in an invalid state).
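A rough Rust counterpart of the C++ sorted() example, as a sketch: MyVec is a hypothetical newtype, and the by-value self parameter plays the role of the deduced this parameter.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct MyVec(pub Vec<i32>);

impl MyVec {
    /// Takes `self` by value, sorts it in place, and hands it back.
    /// The caller decides whether to move the original or clone it first.
    pub fn sorted(mut self) -> Self {
        self.0.sort();
        self
    }
}
```

A caller who wants to keep the original writes `original.clone().sorted()`, so the copy is visible at the call site; a caller who no longer needs it writes `original.sorted()` and pays for no copy at all.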

But you may still want to implement something like sorted() on a wrapper type, which always returns by value, either by consuming the wrapper (Fn(Wrapper) -> Value) or by cloning what is needed inside the wrapper (Fn(&Wrapper) -> Value). This construction is only useful if sorted(), when taking an immutable ref, doesn’t need to clone the whole Wrapper object.

I also saw there was a working group for "polymorphization", which I think is targeted at problems like this?

Well, right now I'm working on a library with lots of access patterns and tree visiting (kind of like what @CAD97 mentions), and I have to write a lot of methods like find_child / find_child_mut that have virtually the same code except the mut version uses iter_mut instead of iter, so the code duplication is non-trivial.
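A simplified sketch of what such a pair tends to look like (Node and its fields are made up for the example, not taken from the library in question):

```rust
pub struct Node {
    pub name: String,
    pub children: Vec<Node>,
}

impl Node {
    pub fn new(name: &str, children: Vec<Node>) -> Self {
        Node { name: name.to_string(), children }
    }

    /// Looks up a direct child by name through a shared borrow.
    pub fn find_child(&self, name: &str) -> Option<&Node> {
        self.children.iter().find(|child| child.name.as_str() == name)
    }

    /// Identical logic; only `iter_mut` and the reference types change.
    pub fn find_child_mut(&mut self, name: &str) -> Option<&mut Node> {
        self.children.iter_mut().find(|child| child.name.as_str() == name)
    }
}
```

With a one-line body the duplication is tolerable; it becomes more painful as the traversal logic grows.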

I disagree. There's a lot of collection code where the _mut version of a method does the exact same thing, down to the machine code executed. E.g. the tree visiting mentioned above.

Can you give an example of some of the code that could be improved by having parametric mutability?

Yeah, I did underestimate the "breaking habits" part of this proposal. A more modest proposal would be to leave existing methods as-is but use inout for new methods, but that comes with its own problems.

Not really the same thing, no. Polymorphization is about generating less code before passing it to LLVM.


I think I did a poor job of explaining why I'd want this feature. The goal wouldn't just be deduplication of code. The goal would be to express "this function works the same way internally whether it gets shared references or mutable references, but returns the same type you put in".

In that sense, it would be like lifetimes: a way to express the "logic" is generic over an effect, but the actual code is the same independently of parameters passed.

Except I was kind of trying to express this in a concise way, which probably doesn't work; it's like trying to implement lifetime elision before lifetimes.

4 Likes

I think this would be quite useful if structs can contain inout references:

struct MyIter<'a, I> {
    inner: &'a inout MyCollection<I>,
}

impl<'a, I> Iterator for MyIter<'a, I> {
    type Item = &'a inout I;

    ...
}

So you only need to implement 2 iterators (borrowed and owned) instead of 3.

The difficulty is that borrowed references are either shared or exclusive. A reference can't be shared and exclusive at the same time — it's a logical impossibility. So the compiler would have to figure out when exactly it was meant to have shared semantics, and when it has exclusive semantics.

1 Like

The key insight is that uniqueness is a strong guarantee whereas "sharedness" is more of a limitation as it prevents unsynchronized mutation. Exclusive references are strictly more powerful than shared references, which is evident since &mut T can be coerced to &T, but not the other way around.
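A tiny sketch of that one-way coercion in today's Rust (the function names are only for illustration):

```rust
fn read(value: &i32) -> i32 {
    *value
}

pub fn weaken() -> i32 {
    let mut n = 1;
    let exclusive: &mut i32 = &mut n;
    *exclusive += 41;

    // An exclusive reference can always be weakened to a shared one.
    let shared: &i32 = exclusive;

    // The reverse direction is rejected by the compiler:
    // let back: &mut i32 = shared; // error: mismatched types

    read(shared)
}
```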

Therefore it shouldn't be a problem to treat inout references like shared references. There is just one additional limitation: The borrow checker has to ensure that inout references don't alias, so the uniqueness is preserved. They can still be immutably borrowed though.

This essentially makes inout references the "least common denominator" of shared and exclusive references.

What can it do that a &mut reference can't?

2 Likes

Looking at this code sample from the post:

fn get_value(&self) -> &inout Value {
  self.value
}

Though I think it should be (&inout self) -> &inout Value.

It returns &mut Value only when self is mutably borrowed, otherwise it returns &Value. It's like get_value and get_value_mut in one.

Conceptually, the reference isn't shared and exclusive at the same time.

Rather, the "sharedness" of the reference is more like a generic parameter. The same function is instantiated for mutable and immutable refs. And like with lifetime parameters, the rules are made in such a way that shared and exclusive references use the same monomorphization.