Idea: Experimental attribute-based support for partial borrows

Disjoint borrows exist within functions already, and I don’t think this has been a significant source of issues in itself. I haven’t heard recommendations to move unsafe code to separate methods or closures just to block it. Rather the opposite, I recall issues where a too-broad self loan was making semantics of unsafe code within the function dubious. For example, recently Arc::drop needed changes to language semantics to allow a &self reference to be a dangling pointer.

Please don’t forget that the borrowing limitation exists for API stability, not for safety. You can merge all your safe methods into one spaghetti megamethod, they’ll get disjoint loans, but they won’t be any less safe.

I think it’s worth taking a hard look at which features in Rust merely feel necessary, where the lack of restrictions feels uncomfortable but doesn’t break things in practice. Partial loans still block mutable aliasing, prevent iterator invalidation, enforce immutability and so on. All of Rust’s safety guarantees hold within functions, so they hold with partial loans.
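To make the contrast concrete, here is a minimal sketch of what already works inside a single function body and what stops working once the logic moves into a method. The `Counter` type and its methods are made up for illustration; they do not come from any post in this thread.

```rust
// Hypothetical type, used only to illustrate the point above.
struct Counter {
    names: Vec<String>,
    hits: u32,
}

impl Counter {
    // Takes a loan on all of `self`, even though only `hits` is touched.
    fn bump(&mut self) {
        self.hits += 1;
    }

    // Accepted today: within one body the borrow checker treats
    // `self.names` and `self.hits` as separate places, so the shared
    // loan and the mutation can coexist.
    fn count_inline(&mut self) -> u32 {
        for _name in &self.names {
            self.hits += 1; // disjoint from the loan on `self.names`
        }
        self.hits
    }

    // Rejected today (E0502) if uncommented: `self.bump()` needs a mutable
    // loan on the whole of `self` while `self.names` is still borrowed.
    // fn count_via_method(&mut self) -> u32 {
    //     for _name in &self.names {
    //         self.bump();
    //     }
    //     self.hits
    // }
}
```

Both versions do the same thing and are equally safe; the only difference is that the method signature hides which fields `bump` touches.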


This is why I always end up coming back to some kind of macro fn for this. Because it's not just borrow splitting that I foresee people wanting for private helpers, but also move awareness, type inference, etc.

To me the simple way forward is to have that megamethod, just written in a nicer non-spaghetti way.


I meant: would the borrow checker detect the shared-mutable reference conflict?

I'm not sure why partial borrows must require any new syntax. Can't the compiler store and use that information itself? Partial borrows are quite useful and might be used a lot, but it would be extremely painful to write all of that out every time, especially when you need to change things frequently because you are only prototyping, exploring, or trying things out.

The main argument against that (if allowed for public functions) is that it would be semver-breaking to start using more fields. For crate-level or less public visibility, that would probably be OK. Explicit vs. implicit is the question.

I think at least saying "this is a partial borrow API" when it is accessible from outside the crate is important. Having it be implicit everywhere feels like having semver-breaking caltrops sprinkled around codebases.


One trap you want to avoid is having a function body be todo!(), writing a bunch of other stuff, then going back to fill it in and finding out that everything else only worked because the compiler saw that you never used the borrow in the body at all, and let you do whatever you wanted.

I'm hopeful that this will be like return types. They're not inferred, but if you write -> _ you get a nice structured suggestion so you hit a button and the actual type gets put there.

The same can reasonably happen for partial borrows; you can start using a new part of the struct and the compiler says "well that's not in your borrow, but I can add it to your partial borrow for you" with a structured suggestion if that's what you wanted.


That seems reasonable. The main concern here seems like API stability. Accessing a new field would be a breaking change.

Ah, I think I have an interesting use case here: declaring a partial borrow without actually using the member. See this MR (though in particular this commit). Here, the highlighted commit, instead of making a new type, may have been able to utilize partial borrowing to declare that using the writer is a conceptual &mut borrow of data even though there is no method that uses both syntactically. The point is that even though data is not mentioned, it must be considered borrowed in order to avoid a deadlock because the receiving end of writer needs a write lock on the RwLock<data::Data> to actually consume items from the channel.

To me, this means that there must be a way to explicitly list members that are considered borrowed even if there is support for implicit computation because there may be safety considerations in how members are grouped into types (not that this example is one of Rust safety/soundness guarantees given that the deadlock behavior is Rust-safe, but someone can probably craft such an example).


I'm really fond of this macro fn feature. Since I'm sure some details still need working out, I feel that the best way to promote or refine it is to implement a rough version first. What's the chance that we start an implementation without a rigid RFC? Say I implemented this feature on a fork of rustc: how could I get people to try it out, or get it merged into the main branch (under a nightly feature flag, of course)?


To be honest, there are lots of subtle ways to make breaking changes in your APIs, to the point that I feel relying on cargo-semver-checks to help you catch them is a better avenue than refusing to add features that users can misuse to cause an accidental break. In general we should make doing the right thing easier, but here there is a tension between making two separate things easier: one that users want to do, and another that users should want to avoid. For the first we have no real answer today, and for the second we have a tool that will help.


My old thread on macro fn is the most complete writeup of what they'd entail, I believe. There are various levels of transparency that can be argued for — my original proposal basically wanted to permit _ in the function signature with semantics similar to C++20 abbreviated function templates with auto, meaning field-precise borrowing still wouldn't happen — but I still hold that they're a strong kernel of an idea that I'd like to see developed.

A key bit of functionality of macro fn to me is that unlike macro_rules!, it should instantiate an actual function once per signature; that in every way other than signature transparency it acts like any other function item. I'm looking at it both from the lens of "less constrained function" and "more constrained macro," but imo it should be more like fn than macro since bang macros still exist and work well for what they do. macro fn should be more than just what a carefully written functionlike macro can do, not just the same but easier.

I think the first implementation step would be one of:

  • $:value and/or $:place fragment matchers for macros, which accept the same syntax as $:expr but evaluate the expression before any of the macro's expansion (with any temporary lifetimes treating the macro as a single unit), and the fragment expansion naming the result of that expression evaluation as a value or as a place, respectively.
  • The ability to express and check field-precise borrows across function boundaries in MIR, without any (or with placeholder) surface language syntax.

The former isn't directly useful for macro fn, but it is directly related in that it makes well behaved bang macros more straightforward to write, helping to address a decent portion of the need which is satisfied with each invocation being separately inlined and reanalyzed at each call site. $:value for imitating functions, and $:place to permit partial place usage, including field-precise borrowing. The latter is necessary in order to express what a macro fn is after type inference information from the caller is provided to fill in any type holes, if macro fn is to permit field-precise borrowing.

Although I must say I'm not the most fond of either f(&place) or f(&mut place) at the call site potentially being field-precise borrows. It's a lot easier to accept for method call syntax, extending the existing place autoref to include field-precise partial borrowing of the receiver. But I'm a lot less fond of f(place) potentially receiving place by reference instead of moving (copying) from it. Leave that behavior to syntactically noted bang macros.

This reminds me of a proposal from a good while back for generalizing type ascription in function signatures, allowing you to replace the somewhat repetitive fn f(Newtype(x): Newtype<i32>) with instead just fn f(Newtype(x: i32)), or even leaving type ascription out entirely if the patterns uniquely constrain the argument type.
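For reference, the "somewhat repetitive" form is what compiles today: a function parameter is an irrefutable pattern followed by its full type. A small sketch, where `Newtype` matches the name used above and `double` is a made-up function:

```rust
// A plain tuple-struct wrapper, as in the post above.
struct Newtype<T>(T);

// Valid today: the parameter is the pattern `Newtype(x)`, and the type
// `Newtype<i32>` has to be spelled out again after the colon.
fn double(Newtype(x): Newtype<i32>) -> i32 {
    x * 2
}
```

The proposed `fn f(Newtype(x: i32))` spelling would carry the same information with the type ascribed inside the pattern; that form is not accepted by current Rust.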

Tbh I don't know how well it would interact with default binding modes, since Newtype(x: &i32) could be a pattern for type Newtype<&i32> or &Newtype<i32>. Or with the unique fn(&self) sugar, which isn't a pattern like other function arguments but some special third thing. Yet another difficulty, of course, is that in a named struct pattern Widget { a: b }, a is the field name and b is the pattern.

All this to say I'm not against fn f(Widget { &id, .. }) as a syntax for field-precise argument capture in function signatures by analogy to the &self sugar, but I am at least somewhat wary of it. fn f(Widget { &id, .. }: Widget) is thankfully invalid, needing to be written as fn f(Widget { id: &id, .. }: Widget) instead, but fn f(Widget { ref id, .. }: Widget) is valid. (Which honestly doesn't quite vibe with my lazy mental model for binding modes that wants & to switch from the by-ref binding mode to by-move in the same way ref switches from by-move to by-ref, even though I know it doesn't actually work like that[1][2].)


  1. "Fun" fact! Given struct S { a: &'static i32 } and a pattern scrutinee of type &mut S, a pattern of S { a: x } binds x: &mut &i32, S { a: ref x } binds x: &&i32, S { a: mut x } binds x: &i32, and S { a: &x } binds x: i32. If you peel back the curtain, they actually bind ref mut x: &i32, ref x: &i32, mut x: &i32, and &x: &i32, respectively, so there is straightforward predictable logic driving this, but match ergonomics is certainly relying heavily on deref coercion and never mixing in explicit binding modes for its "just works" target. ↩︎

  2. That a &_ pattern switches the default binding mode back to by-move (but still only is allowed when "destructuring" a reference) whereas named type patterns project the binding mode almost feels like a bug, and my mental model would be happier if &_ universally switched out of a by-ref binding mode if present before attempting to destructure a reference scrutinee. I.e. that default binding modes worked more as if the subpattern scrutinized a temporary value of type &mut Field, instead of still scrutinizing a place of type Field and automatically applying the ref mut mode, but only if no explicit binding mode is used. I realize actually creating a temporary field reference to scrutinize would impact otherwise unscrutinized fields, which shouldn't actually happen, and it shouldn't be possible to bind to the "temporary" by-ref either, but the synthetic field-precise reference scrutinee illustrates how my mental model wants default binding modes to behave. ↩︎
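The binding types from the first footnote can be checked with explicit type annotations. The sketch below uses the default-mode pattern for the first case and the fully explicit "behind the curtain" patterns for all four, since those are accepted in every edition. The mixed forms such as `S { a: ref x }` against a `&mut S` scrutinee are, as far as I know, rejected in the 2024 edition, so they are left out. The function names are made up for illustration.

```rust
struct S {
    a: &'static i32,
}

// Default binding mode: matching `&mut S` against `S { .. }` switches
// the binding mode to `ref mut`, so `x` is a `&mut` to the field.
fn default_mode(s: &mut S) -> i32 {
    let S { a: x } = s;
    let x: &mut &'static i32 = x;
    **x
}

// The same binding, written out explicitly.
fn explicit_ref_mut(s: &mut S) -> i32 {
    let &mut S { a: ref mut x } = s;
    let x: &mut &'static i32 = x;
    **x
}

// `ref x` borrows the field immutably: `x: &&i32`.
fn explicit_ref(s: &mut S) -> i32 {
    let &mut S { a: ref x } = s;
    let x: &&'static i32 = x;
    **x
}

// `mut x` copies the `&i32` out of the field into a mutable local.
fn explicit_mut(s: &mut S) -> i32 {
    let &mut S { a: mut x } = s;
    let before = *x;
    x = &7; // rebinds the local only; `s.a` is untouched
    before + *x
}

// `&x` destructures the field's reference and copies the `i32` out.
fn explicit_deref(s: &mut S) -> i32 {
    let &mut S { a: &x } = s;
    let x: i32 = x;
    x
}
```

If any annotation were wrong, the corresponding function would fail to type-check, which is what makes this a usable check of the footnote's claims.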


I feel like this is missing a nuance, if I read what you are proposing correctly. That is, you want implicit partial borrows, and since we are discussing semver at all, you want this in public functions.

If that were the case, yes, something like cargo-semver-checks could catch it (if the info is included in the rustdoc JSON file that it parses).

However, it also makes it much harder to make changes that are semver-preserving in general. Consider today, if you add a new private field to a struct (as long as it already has private fields or is non-exhaustive) and start using it from two existing functions, that is fine. This will no longer be the case if those two functions were partially borrowing different parts of the structure.

There are many other currently semver safe changes that would no longer be so, along similar lines. This reduces the design space for library authors. For this reason we need to make partial borrowing optional.

The question then is whether it should be opt-in as suggested in this thread, or opt-out (along the lines of non_exhaustive). I would prefer opt-in, for fewer footguns (non_exhaustive is easy to forget today; I would prefer it to be the default, but that can't be changed at this point).

I believe that we are all missing the point proposed in OP though. Let's make an intentionally anti-bikeshed experiment (using rustc attributes) so we can leave this sort of discussion to later once we know that the functionality works, and have some experience with the feature itself.


No, I'm stating that changing explicit borrows is a semver break, but that this shouldn't entirely preclude us from providing them. I very much lean towards explicitness over implicitness in Rust's design, at the cost of ergonomics, and towards clawing some of that back through tooling.

I was one of the people on board with that, but if you have

struct Foo<'a> {
    a: &'a (),
    b: &'a (),
}

and want to write a method that only captures a and not b, then you are still in the same boat as today: their lifetimes are the same and the type syntax won't allow you to differentiate them, so you need a way to be explicit about the lifetime name on the pattern side. For simple cases you don't need that (because you'd write fn foo<'a>(Foo { a, .. }: Foo<'a>) -> &'a () { a }), but for more complex cases you would:

fn foo<'a>(Foo { &'a a, &b }: Foo<'_>) -> &'a () {
    do_something(b); // We capture foo.b only for the current function, but release it after completion
    a // field foo.a is captured beyond the current function, but foo.b can still be accessed after calling this function
}
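For comparison, the simple case mentioned above already compiles on stable Rust, because the struct is destructured by value and the returned reference carries the struct's own lifetime parameter. A minimal sketch, assuming `i32` fields in place of `()` so that the result is observable, and with `take_a` as a made-up name:

```rust
// Same shape as the `Foo` above, with `i32` swapped in for `()`.
struct Foo<'a> {
    a: &'a i32,
    b: &'a i32,
}

// Valid today: `Foo` is taken by value and destructured in the signature.
// The returned reference lives for `'a`, independent of the `Foo` value
// itself, which is consumed by the call.
fn take_a<'a>(Foo { a, .. }: Foo<'a>) -> &'a i32 {
    a
}
```

The `&'a a, &b` form in the snippet above is proposed syntax only. It is meant to express something this version cannot: borrowing the two fields of a borrowed `Foo` for different durations.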

I'm somewhat confused why we are discussing API stability when the proposal explicitly is not allowed in public APIs, to prevent having this exact discussion.

