Continuing the discussion from Docs.rs build prioritization:
Summary
A version of the leaky bucket algorithm for docs prioritization.
Motivation
Someone has been monopolizing the docs.rs build queue. The purpose of this pre-RFC is to come up with an approach to "evening out" a crate owner's use of the docs.rs infrastructure. They can still build as many crates as they want, but their builds are queued up in a "bucket" and dispensed into the main build queue at a fixed rate.
Guide-level explanation
In the docs.rs interface, you'll see a number associated with every crate publisher (which may be a user or an organization) counting their "build minutes." This number may be negative, but it will never go below -15 minutes (unless the user got a special exception). You are gifted one build minute per hour.
You need at least one build minute for any crate to start building; if you have zero or less, your crate will sit in the docs.rs build queue until you have at least one minute, allowing other crates to go first. If you have one build minute in your account, and you run a crate that takes 15 minutes to build, docs.rs will finish the build, but you'll have -14 minutes after it's done, and you'll have to wait 15 hours before your next crate starts building.
Requesting more build minutes would be done roughly the same way requesting a longer build time limit is done, but honestly, I don't think it would often make sense to give someone this. Who's going to be able to make a good case for being given permission to monopolize the docs.rs build queue?
Reference-level explanation
Each crate_owner row adds a column, let's call it "low_water_mark" of type DATETIME NULL DEFAULT NULL
. This is the "stored low_water_mark".
The "true low-water-mark" is calculated roughly like this.
// pseudo-Rust
fn true_low_water_mark(stored: Option<DateTime>) -> DateTime {
let minimum = now_utc() - Duration::minutes(30 * 60);
let stored = stored.unwrap_or(minimum);
if stored < minimum {
minimum
} else {
stored
}
}
When it's time for docs.rs to dequeue an item, it checks if (true_low_water_mark() - now_utc()) >= Duration::minutes(1)
. If it is, then it starts something else (or just waits, if nothing else is in the queue; because docs.rs doesn't cancel builds, starting someone's build when they don't have enough minutes "because there's nothing else to do" risks priority inversion if someone else comes in while it's running).
When a build finishes, the new stored low water mark is set equal to true_low_water_mark + (build.duration * 60)
.
The "build minutes" display is only used for display, to get something more intuitive than the timestamps that are stored in the database. It is calculated with (now_utc() - true_low_water_mark).minutes() / 60
. If the true low water mark is ahead of or equal to the current time, then you have negative (or zero) build minutes in your account, and cannot start a build. Otherwise, you have positive build minutes.
Drawbacks
This is not a simple rate-limiting scheme. It's not exactly the same as how crates.io does it, since crates.io is really rate limiting based on different criteria (we're using time, they're using crates), and since crates.io rejects traffic instead of queueing it.
Also, if you really want to, you can work around the rate limit by sockpuppeting. We might be able to catch people doing it, but do we really want to be stuck policing this sort of thing?
Rationale and alternatives
The choice of a factor of 60 is arbitrary, reasonable-sounding, and will probably be adjusted based on experience. So is the default build budget of 30 (it's enough to build two crates).
The specific design was chosen because it's an approach that is not new, is not exciting, and is probably "good enough". It has good theoretical properties; it's based on time, the unit we're actually trying to conserve, instead of using something like crates as a loose proxy for time. Even if someone gets a higher time limit (which docs.rs already does sometimes), their average resource consumption is controlled; crates that take longer to build cause the owner's budget to go further into the red, preventing them from monopolizing docs.rs's infrastructure.
Prior art
This design was mostly thought up based on the nginx "burst" rate limiter, with how it delays requests and has them actually get executed on a schedule.
This isn't really the same rate limiting algo that crates.io uses. Crates.io uses a token bucket, designed to allow bursts, while preventing bursts is the entire purpose of this docs.rs rate limiter.
Unresolved questions / Future possibilities
This technically allows a single crate owner to create a personal queue that grows forever, since crates.io allows people to publish new crates faster than docs.rs can process them. This is why nginx rate limiters tend to have "nodelay" turned on; it's not really all that good if latency is important to you, and while docs.rs's build queue is not designed for low latency, there does come a point when it gets ridiculous, and we need to start shedding load instead of just allowing the queue to grow forever.
Something should really be done about that. Since there are so many different things that consume the crates.io registry, it is not possible to make crates.io's rate limits suitable for all of them, but it probably would be helpful to at least have docs.rs warn when someone's personal queue gets too big, for potentially sending them an email (or banning them).