Rust expression order of evaluation

Therefore I propose making such expressions a hard error in the compiler – this rules out all nondeterminism and naturally leads to more readable code. Failing that, we should at least provide a lint.

Unfortunately, derefs are everywhere and are not required to be pure (but hopefully nobody relies on the order of evaluation of impure derefs).

Until someone shows me a useful expression that cannot be expressed in Rust without relying on the order of evaluation, I stand by my point.

It is my intent that every expression is equivalent to one that is independent of the order of evaluation. However, a sane order-of-evaluation is important for ergonomics (also, not breaking every non-trivial Rust crate in the world).

I don't know if it impacts soundness at all, but my intuitive expectation is that all method calls would be evaluated in the same order, so if it's possible, I think they should be the same.

This can't impact soundness as borrow checking will be done after simplification. I would prefer to do it for by-value method calls too.

Ok, it appears I overstepped here. &mut a[i] += &a[j] should be a lifetime error, though. So if the code after Deref coercion fails to appease borrowck, it won't compile today, no matter the evaluation order.

That is indeed a lifetime error under my rules. &mut a[i] is an rvalue, so it is evaluated in one piece, before &a[j].

Thank you for pointing that out. I don't think it is too late, however, as evaluation order was deliberately unspecified, so code that depends on it is by definition erroneous.

The order of evaluation should be reasonable. I think "left to right, dereferences are evaluated as late as possible" is a reasonable rule.

It's not officially specified, but there exists an order in which the operations are taken to be performed during typeck, borrowck, and code generation (which seem to differ (!)). Lifetime analysis will reach a different result depending on the order of evaluation, so the order of evaluation needs to be specified.


@arielb1 Is there a reason that taking a reference to the self parameter couldn't be delayed for methods as well? Consider this:

    struct Foo(i32);

    impl Foo {
        fn incr(&mut self, n: i32) { self.0 += n }
        fn double(&mut self) { self.incr(self.0) }
    }

There is an overlapping lifetime error in double() even though incr() can't mutate self until after the borrow of self.0 has ended. This is the kind of analysis that makes borrowck seem oppressive and unproductive to new users.
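
For reference, the usual way around this is to read the field into a temporary before making the mutable call. A minimal sketch, where `double_via_temp` is a hypothetical name of mine and not something from the code above:

```rust
struct Foo(i32);

impl Foo {
    fn incr(&mut self, n: i32) {
        self.0 += n;
    }

    // Hypothetical variant of `double`: copy `self.0` out first, so the
    // read is finished before the `&mut self` borrow for `incr` begins.
    fn double_via_temp(&mut self) {
        let n = self.0;
        self.incr(n);
    }
}
```

Calling `double_via_temp()` on `Foo(21)` leaves `Foo(42)`, and the borrows never overlap regardless of how the receiver and the argument are ordered.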

@td_

When non-commutative operations are done, the result of a program’s execution depends on the evaluation order. The lack of “upgrading” borrows does make this a much more annoying issue, as mutable borrows do not commute with the common (pure and therefore commutative) immutably-borrowing query-like actions.

In fact, code like you presented is one of the primary motivations for the change. self is an lvalue. It has no rvalue components (rvalue base and/or indexes), so the entirety of its (trivial) evaluation will occur after the argument’s.

@petrochenkov

We will probably pick a variant of either the rvalue-first or receiver-last evaluation order. @eddyb does not like the receiver-last evaluation order as it makes complicated expressions evaluate in a rather unnatural order (*foo.pop() += foo.len();).

@arielb1 first, thanks for opening this thread. I’ve been meaning to follow up on this topic but it’s nice to have some of the details written up by somebody else. :wink:

My feeling is that we should keep the evaluation order LTR where possible, but overall it should be relatively easy to predict. Thus complex orderings that “mostly” preserve LTR seem (to me) to be overall less good than just saying “assignments are RTL”.

The interaction with overloaded operators is also an important question. It is not clear to me that self.balance -= x and self.balance = self.balance - x should always be equivalent. They are certainly distinct with overloaded operators, since one desugars into an &mut self operator (or at least that is the proposed desugaring). I think we should strive for the builtin operations to be modelable using overloaded versions, which would imply that self.balance -= foo() will “read” the current value of self.balance only after foo() is evaluated (unlike self.balance = self.balance - foo()). Subtle, but there you go. (I imagine that all languages which desugar -= specially can encounter a similar situation.)

Is the “define order in terms of method calls only” proposal (from @Diggsey) still in play? If not, what was the argument against it?

IIRC, the Stability RFC proposed keeping packages on crates.io in desugared UFCS form with all type annotations. In this case, it would be highly desirable for evaluation order to be the same for sugared operator form and desugared UFCS form ("method calls only").

I did say the ‘this’ parameter would always be evaluated last - that would still hold for UFCS.

I'm not sure how that would still hold for UFCS. Are you envisioning that if you call Foo::foo and it is declared with &self, it behaves differently than if it were declared self: &Foo? That is not really possible, because it'd have to be encoded in the fn type and so forth. I'm not really a fan of deviating from LTR in the case of method calls. We can make the borrow checker reason more precisely about method calls to solve the nested argument problem in its most common incarnation.

One could imagine saying that a.method(...) evaluates a last, and that might just mean that transforming to Trait::method(a, ...) form isn't always valid without more complex transformations to preserve execution order. But I find it very surprising that a.foo(..) wouldn't evaluate a first. I think it's because the receiver is so primary in method dispatch.

Assignments are rather different, both because things like vec[i] = vec[j] are pretty common and because the evaluation order is evident from the surface syntax, and there IS a certain logic to "evaluate the value you are going to store first, and then find the place you are going to store it into".

As written, I agree with that instinct. Then again in a call like a.b.f(a.g()) (where a.g() takes &mut self) I'd be surprised that the inner function call can't be executed, because a is already borrowed. Which probably goes to say that at least I personally always expect inner to outer, as well as left to right, with inner to outer having a higher precedence (which is probably derived from math's brackets first rule)…

The interesting thing is more like a.get().f(a.g()) - should get be called before g?
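
To see what the compiler actually does with that shape of expression, one can log the calls. This is only a sketch: the types `A` and `B` and the function `call_order` are hypothetical stand-ins, and everything takes `&self` so that borrowck stays out of the way.

```rust
use std::cell::RefCell;

// Hypothetical types, only here to record the order of calls.
struct A {
    log: RefCell<Vec<&'static str>>,
}

struct B<'a> {
    owner: &'a A,
}

impl A {
    fn get(&self) -> B<'_> {
        self.log.borrow_mut().push("get");
        B { owner: self }
    }

    fn g(&self) -> i32 {
        self.log.borrow_mut().push("g");
        1
    }
}

impl B<'_> {
    fn f(&self, x: i32) -> i32 {
        self.owner.log.borrow_mut().push("f");
        x
    }
}

// Runs `a.get().f(a.g())` and returns the recorded call order.
fn call_order() -> Vec<&'static str> {
    let a = A { log: RefCell::new(Vec::new()) };
    a.get().f(a.g());
    a.log.into_inner()
}
```

With rustc as it stands, `call_order()` returns `["get", "g", "f"]`: the receiver expression runs first, then the argument, then the call itself.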

When user-defined autoderef is involved, it is not exactly possible to just “reason more precisely about method calls” - as autoderef is allowed to e.g. look at the interior of RefCell-s without marking them as ref-ed (and that can be invalidated by &-borrowing code).

“upgradeable” references would solve this issue, but these require lots of work.

bump.

What’s the current state regarding order of evaluation? It is very tempting to write this:

    struct Item {
        a: u32,
        b: u32,
    }

    impl Item {
        fn receive_word(&mut self) -> Result<u32, Error> {
            …
        }

        fn receive(&mut self) -> Result<Item, Error> {
            Ok(Item {
                a: self.receive_word()?,
                b: self.receive_word()?,
            })
        }
    }

The expectation is that first the value a is received, then the value b. But with a non-determinate evaluation order, one has to introduce temporaries.

Even with temporaries, it doesn’t look too bad in my opinion:

    fn receive(&mut self) -> Result<Item, Error> {
        let a = self.receive_word()?;
        let b = self.receive_word()?;
        Ok(Item { a, b })
    }

That code is correct and there is no chance it will change. In fact, I'm more or less of the opinion that the ship has sailed with respect to making changes to order of evaluation, period.

Nonetheless, the cases that were somewhat in question had to do with things like precisely when the index was computed in an expression like this:

    x[x[i]] += x[i]

Here there are some slight inconsistencies between overloaded operators and non-overloaded ones and so forth. But if you're writing "readable code", you'll never notice.

UPDATE: To be clear, that example was from memory, I'd have to go look up the tricky cases...

More to the point, in a struct literal, the fields are evaluated in the order you write them; if a panic occurs before the struct is completely built, the intermediate values are dropped in the reverse order (once the struct is fully built, the fields are dropped in the order they are written in the struct declaration, iirc).
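
A small sketch of the non-panicking half of that, using a hypothetical `Noisy` helper that logs its own creation and drop. The literal deliberately lists the fields in the opposite order to the declaration; the panic case is not covered here.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<String>>;

// Hypothetical helper type: records when it is created and dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Noisy<'a> {
    fn new(name: &'static str, log: &'a Log) -> Self {
        log.borrow_mut().push(format!("new {}", name));
        Noisy { name, log }
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Declaration order: `first`, then `second`.
struct Pair<'a> {
    first: Noisy<'a>,
    second: Noisy<'a>,
}

fn struct_literal_order() -> Vec<String> {
    let log = Log::default();
    {
        // Literal order: `second`, then `first`.
        let _pair = Pair {
            second: Noisy::new("second", &log),
            first: Noisy::new("first", &log),
        };
    } // `_pair` goes out of scope here and its fields are dropped
    log.into_inner()
}
```

The log comes back as `new second`, `new first`, `drop first`, `drop second`: construction follows the order written in the literal, while dropping follows the order of the declaration.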

Hi, are there some docs somewhere on the currently existing evaluation order?

Not that I know of. I would like to get more progress on a reference of this kind. Roughly speaking, the order is left-to-right, though in an assignment l = r, the expression l is evaluated second.
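
A quick way to observe that rule, as a sketch rather than a specification: the closures `index` and `value` and the function `assignment_order` are hypothetical names that just record when they run.

```rust
use std::cell::RefCell;

fn assignment_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    // Each closure records its own name, then yields a value.
    let index = || {
        log.borrow_mut().push("index");
        1usize
    };
    let value = || {
        log.borrow_mut().push("value");
        42
    };

    // Assignment: the right-hand side runs before the index on the left.
    let mut arr = [0; 3];
    arr[index()] = value();
    assert_eq!(arr, [0, 42, 0]);

    // Ordinary binary operator, for comparison: left to right.
    let _sum = index() + value();

    log.into_inner()
}
```

Here the log reads `value`, `index` for the assignment and then `index`, `value` for the addition.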

Filed https://github.com/rust-lang-nursery/reference/issues/248
