Summary
Allow goto 'a
and goto 'a expr
to jump to a labeled block or block with arguments. Allow 'a: |args| { ... };
or 'a: { ... }
to declare a label that can be the target of a jump or jump with arguments.
Motivation
The intent of this feature is to allow control flow patterns that cannot be expressed directly in terms of break 'a value
, continue
, and loops. This is needed when writing code with irreducible control flow, which is common in state machines.
Here is an example where we rewrite a recursive dfs algorithm as an iterative algorithm with a stack. The jumps in the resulting code cannot be expressed in terms of loop and match without added overhead.
fn dfs_recursive(graph: &[&[usize]], mark: &mut [bool], i: usize) {
for &j in graph[i] {
if !mark[j] {
dfs_recursive(graph, mark, j)
}
}
mark[i] = true;
}
fn dfs_iterative(graph: &[&[usize]], mark: &mut [bool], mut i: usize) {
let mut stack = vec![];
'iter: |mut it: std::slice::Iter<'a, usize>| {
if let Some(&j) = it.next() {
if !mark[j] {
stack.push((i, it));
i = j;
goto 'start
} else {
goto 'iter it
}
} else {
goto 'done
}
};
'done: || {
mark[i] = true;
match stack.pop() {
None => return,
Some((j, it)) => {
i = j;
goto 'iter it
}
}
};
'start: {
goto 'iter graph[i].iter()
}
}
This is also useful for syntactic transfomations of other languages which can be expressed as a CFG but not necessarily with reducible control flow to Rust: MIR to Rust, C to Rust, LLVM to Rust. Even in the case of MIR, the printed syntax is very rust-like but has to invent syntax for goto because it doesn't exist in the surface language.
Detailed design
The syntax goto 'a
and goto 'a value
are new expressions, as well as 'a: block
and 'a: |(pat: expr),*| expr
(this builds on the feature for labeled blocks).
-
A labeled block executes immediately, while a labeled closure does nothing but can be used as a target for gotos.
-
Block and closure labels are function scoped, so any goto can target any label in the same function from the point of view of name resolution, but there are scoping restrictions on when such a jump is valid.
-
A
goto 'a
orgoto 'a expr
can target any labeled block, labeled loop or labeled closure provided that the scope of thegoto
expression (in the sense of MIRScope
) extends the scope of the label. That is, backward jumps are okay, as well as forward jumps, provided they do not jump over alet
binding which would put a variable in scope while skipping that variable's initialization.- Variables in the
goto
scope but not the label scope are dropped in reverse order after the goto expression is evaluated but before the label is entered, as with all other non-local control flow.
- Variables in the
-
The expression
goto 'a
targets labeled blocks,loop
,while
, and nullary closures;goto 'a (expr,*)
targets labeled closures only. (for
loops cannot be targeted because they have a loop counter that would violate the scope restriction.) -
Labeled closures act more like blocks than closures, in that
return
targets the containing function,?
jumps out to the function (or enclosingtry
block), and so on. (You can wrap the body of the closure in another closure if you want to redirect these operations.) -
Labeled closures implicitly have return type
()
(and the closure expression itself also has type()
); if control reaches the end of the block, the code after the closure is executed.
This is desugared to control flow on MIR, so in particular this is not an unsafe feature and does not circumvent the borrow checker.
Drawbacks and Alternatives
-
Gotos are considered harmful, after all. However, as they say, "this isn't your grandfather's goto". In particular, the ability to have a fully safe, type- and borrow-checked control flow construct means that this does not have the same dangers as its C namesake.
- That said, wow this construct has a lot of bad press. Perhaps it's worth finding a different name just to get out of the stigma. Alternate suggestions:
jump
,tail_call
,call_cc
, or reusingcontinue
(since the two functions can mostly coexist in the same keyword).
- That said, wow this construct has a lot of bad press. Perhaps it's worth finding a different name just to get out of the stigma. Alternate suggestions:
-
It is possible to get the semantic equivalent to this kind of goto using
loop { state = match state { ... } }
wherestate
is an enum containing all of the labels and their arguments. However:- Currently, this is not optimized very well. #80475 might be able to solve the issues here without language changes, but sometimes the optimizer just doesn't do what we need it to, and if this is a critical loop (like the core loop of an interpreter), then it is better to be explicit about these things.
- The direct goto version is better for borrow checking, which is implemented as a data flow pass and would be able to see through gotos but not state match loops like this. For example, the following code using the
match state
idiom is a borrow check error:
but the equivalent with goto is not:fn main() { let x = (String::new(), String::new()); enum State { A, B } let mut state = State::A; loop { state = match state { State::A => { drop(x.0); State::B } State::B => { drop(x.1); return } } } }
because it's equivalent to straight line code that just doesfn main() { let x = (String::new(), String::new()); 'a: { drop(x.0); goto 'b } 'b: || { drop(x.1); return } }
drop(x.0); drop(x.1);
.
Unresolved questions
-
goto
is not a reserved keyword, so there are the usual issues about keyword reservation, edition changes, etc. Maybe this isn't the final syntax anyway. -
There is a syntactic ambiguity regarding
goto 'a (expr,)
: is this passingexpr
or(expr,)
to the 1-arg closure? My suggestion would be to always treatgoto 'a (expr,*)
as being a list of arguments, andgoto 'a expr
where theexpr
is not a tuple only works with 1-arg closures, even if it has a tuple type. So'a: |a,b,c| ...; let args = (a,b,c); goto 'a args
would not be legal. -
Should labeled closures return
!
instead? The presently described behavior makes labeled nullary closures vs labeled blocks most similar, but perhaps it would be better to enforce that they don't reach the end of the body anyway to avoid surprises about where the control flow goes after that (it certainly can't return to after thegoto
like a regular function call because these are tail calls, they forget their caller).
Future possibilities
This does not cover computed goto, which require some extensions to the type system. (You can simulate them with an array of function pointers but this incurs function call overhead.) Unlike this proposal, computed goto does not have a natural translation to MIR, so would require changes at that level as well.
There may also be a market for unsafe goto similar to C's, but I think that any goto that does not match the safe patterns here will be quite difficult to make not always trivially UB. Plus the goto considered harmful crowd will probably pick that version apart before it gets very far.