I wrote some simple code that accidentally hung the Rust playground, and no timeout ever terminated it:
use tokio::sync::Mutex;

#[tokio::main]
async fn main() {
    let mutex = Mutex::new(0);
    tracing::info!("before");
    let _lock = mutex.lock().await;
    tracing::info!("after first lock");
    let _lock2 = mutex.lock().await;
    tracing::info!("after second lock");
}
This is a deadlock, because _lock is not released until the end of scope — the next }. You're trying to lock mutex twice at the same time, and Mutex never allows that.
The code effectively runs like this:
use tokio::sync::Mutex;

#[tokio::main]
async fn main() {
    let mutex = Mutex::new(0);
    let _lock = mutex.lock().await;
    let _lock2 = mutex.lock().await; // waits forever for _lock to drop
    // Never reached: the guards would only be dropped here, at the end of the scope.
    drop(_lock2); // would be destroyed and unlock the mutex
    drop(_lock); // would be destroyed and unlock the mutex
}
If a deadlock is bypassing the timeout, perhaps it's a difference between wall time and process time? That is, the timeout might only count while the process is using compute (to avoid penalizing jobs that just get unlucky with the scheduler or are waiting in the job queue), which would mean it never fires if the job deadlocks and never gets scheduled.
Also note that for a little while now, the playground does support streaming the output (e.g. here’s a countdown), and live interactions (e.g. this is the guessing game from the book – make sure to enter your guesses with the input field at the very bottom of the execution tab; and FYI, there’s also a “Kill process” option in the ⋮-menu in the corner).
So timing out based on wall time would be “wrong” if the program might be waiting around for a while on purpose, or simply be waiting on additional user input.
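To make the wall time point concrete, here is a rough sketch of a wall-clock watchdog using only the standard library. It is not how the playground works (I don't know its implementation); run_with_wall_timeout is a name I made up, and a sleeping thread stands in for a deadlocked job because it also waits without using CPU. Note that a thread can't be killed from outside, so the sketch only stops waiting; a real sandbox would kill the whole process.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical helper: runs `job` on a worker thread and waits at most
// `limit` of wall-clock time. Returns Some(result) if the job finished
// in time, None if it did not (or if it panicked).
fn run_with_wall_timeout<T, F>(job: F, limit: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore send errors: the receiver may already have given up.
        let _ = tx.send(job());
    });
    // recv_timeout measures elapsed real time, not CPU time.
    rx.recv_timeout(limit).ok()
}

fn main() {
    // A job that finishes quickly.
    let quick = run_with_wall_timeout(|| 21 * 2, Duration::from_secs(1));
    println!("quick job: {:?}", quick);

    // A job that blocks without using CPU, standing in for a deadlock.
    let stuck = run_with_wall_timeout(
        || thread::sleep(Duration::from_secs(2)),
        Duration::from_millis(100),
    );
    println!("stuck job: {:?}", stuck);
}
```

A limit like this catches the blocked job, but it would just as happily cut off a program that is legitimately waiting for user input, which is the trade-off described above.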
I’m not familiar with the implementation of this at all, but I would assume it keeps communicating with the client and thus can eventually kill the process when the user leaves or refreshes the website.
We actually don't do much right now with process time. It will be reported to the UI which will present a dialog box encouraging the user to kill the process, but nothing is enforced for this server-side (beyond the 45 minute timer above).
It should not; this screenshot is from me running that exact code. Oh, right. I forgot that if you don't interact with the dialog, the frontend assumes you walked away and forgot, so it sends the kill command (which the dialog even says... too bad I can't read).
In addition to the above, old-style REST requests have a 10 second timeout.