My thought wasn't to have popular crates always prioritized, just when all else is equal. Having a build-time-per-user restriction is probably the most accurate, though I'd still like to take popularity into account to an extent. Initial releases are also reasonable to prioritize.
Mainly I was looking to see if this idea is even sensible. There would obviously be a ton of discussion about how to prioritize things.
I'd also give some priority to any release that is semver-incompatible with the previous latest release (if greater than it). Docs for 1.3.0 are "more important" than docs for 1.2.3, since the 1.2.2 docs are (supposed to be) sufficient at the API level by definition.
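One way to read that policy: a release deserves a boost when it can introduce new API surface relative to the previous latest release, i.e. when it bumps more than the patch component, since a pure patch release is supposed to be API-identical. A hypothetical sketch (plain MAJOR.MINOR.PATCH only, no pre-release handling; the function name and exact rule are mine, not anything docs.rs does):

```python
def parse(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH version string."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)

def deserves_boost(new: str, prev_latest: str) -> bool:
    """True when `new` is greater than the previous latest release and
    bumps the major or minor component, so its docs can describe API
    that no existing docs cover."""
    n, p = parse(new), parse(prev_latest)
    if n <= p:              # older/backfilled release: no boost
        return False
    return n[:2] != p[:2]   # major or minor component changed

# The example from the post: 1.3.0 outranks 1.2.3, while 1.2.3 gets no
# boost over 1.2.2.
assert deserves_boost("1.3.0", "1.2.3") is True
assert deserves_boost("1.2.3", "1.2.2") is False
```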
As mentioned, it’s not an empty repository blocking things; it’s just one user with a bunch of (apparently related) crates whose source code does not (appear to) be posted in a repository. (At least there’s no repo linked on crates.io.)
Take a look for crates starting with “surge…” on the first… few… pages of recently released crates, as well as those still in the queue. I count about 51 crates recently built and 42 still in the queue; each seems to have two releases, “0.1.30-alpha.0” and “0.1.29-alpha.0”.
How'd he pull that off??? I thought that docs.rs had to pull from the repo to build it!
Got it, I misinterpreted what @jhpratt was saying. Is it a bad bot that went wild? Maybe prioritization should include how many crates a given user already has in the queue: the nice-ness level is the current number of crates you have in the queue, capped at 19. That should slow this behavior down to something reasonable.
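As a tiny sketch of that rule (purely illustrative; the cap of 19 mirrors Unix nice levels, nothing more):

```python
def nice_level(queued_crates_for_owner: int, cap: int = 19) -> int:
    """An owner's next build runs at a nice level equal to the number of
    crates they already have queued, capped like Unix nice values."""
    return min(queued_crates_for_owner, cap)

assert nice_level(0) == 0    # first crate in the queue: full priority
assert nice_level(42) == 19  # bulk publisher: maximally deprioritized
```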
That is unfavorable to owners who publish a large number of fast-building crates.
Why not track the total build time spent by each owner, and have a priority queue sorting owners by total-build-time-spent-so-far? That way, build time is equally distributed over owners, but owners who need less build time than average get served most promptly.
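A minimal sketch of that owner-fair queue, assuming a single build worker and a known (estimated) cost per build; all names here are made up for illustration:

```python
import heapq
from collections import defaultdict, deque

class FairQueue:
    """Owners sit in a min-heap keyed by total build time consumed so
    far; the owner with the least accumulated time gets the next slot."""

    def __init__(self):
        self.spent = defaultdict(float)    # owner -> seconds used so far
        self.pending = defaultdict(deque)  # owner -> queued (crate, cost)
        self.heap = []                     # (spent, owner)

    def push(self, owner, crate, est_cost):
        if not self.pending[owner]:        # owner not in heap yet
            heapq.heappush(self.heap, (self.spent[owner], owner))
        self.pending[owner].append((crate, est_cost))

    def pop(self):
        while self.heap:
            spent, owner = heapq.heappop(self.heap)
            if spent != self.spent[owner] or not self.pending[owner]:
                continue                   # defensive: skip stale entries
            crate, cost = self.pending[owner].popleft()
            self.spent[owner] += cost      # charge the owner's account
            if self.pending[owner]:        # more work: re-queue the owner
                heapq.heappush(self.heap, (self.spent[owner], owner))
            return crate
        return None
```

With two cheap crates from owner "a" and one expensive crate from owner "b", the queue interleaves them instead of letting either owner starve the other: `a1`, then `b1`, then `a2`.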
I probably would also include some factor of prioritization for popularity, because docs are a service to the users of the crate and not just the authors.
Docs.rs already has support for deprioritizing large projects that could impact the build queue! The docs.rs team monitors the queue, and when we notice a big project causing disruptions we set the default priority for those crates to be -1.
The current way projects are added to the "deprioritized list" is manual: the docs.rs team gets alerts when the build queue gets too long, and if we determine the project is using too many resources we add the relevant rules to the list.
We're using this approach rather than an automatic deprioritization algorithm to avoid false positives. Up until now the approach has worked well, and the problem this time was the docs.rs team missing the alert. Unless this starts happening consistently I'm not sure this warrants a ton of design work around automation.
We're also planning some infrastructure upgrades to better handle the increase in the number of publishes: right now it's kinda hard to scale due to some architectural decisions, but we have a workable plan to be able to scale in the coming months.
Is it possible to get some real data from docs.rs (submission time, latency, build duration, etc.)? If so, this could be used as input to simulate how a proposed prioritization algorithm would behave and gather the same stats from such simulations to compare them. I feel like I've seen enough of these kinds of situations where actually running a simulation of the model tells you more than trying to compare abstract prose descriptions quantitatively (because of the second-order effects that tend to happen).
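Even a tiny single-worker replay would be a starting point for this kind of comparison. The sketch below assumes the data comes as (submission time, build duration) pairs, which is itself an assumption about what docs.rs could publish; FIFO is shown, and a proposed policy would replace the sort order:

```python
def replay_fifo(jobs):
    """Replay jobs on a single FIFO worker.

    jobs: list of (submit_time, duration) pairs.
    Returns per-job latency (finish time minus submit time).
    """
    latencies = []
    clock = 0.0
    for submit, duration in sorted(jobs, key=lambda j: j[0]):
        start = max(clock, submit)   # wait for the worker to free up
        clock = start + duration
        latencies.append(clock - submit)
    return latencies

# Example: a burst of three 10s builds, then one quick 1s build that
# arrives at t=5 and gets stuck behind the burst.
lat = replay_fifo([(0, 10), (0, 10), (0, 10), (5, 1)])
```

Running the same trace through each candidate policy and comparing the latency distributions would surface exactly the second-order effects mentioned above.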
OK, I looked at the documentation for docs.rs, and it looks like each crate is built within its own Docker container. So this actually gives us a really simple procedure for dealing with the issue, one that allows us to reweight the relative performance of each build on the fly:
The discussed monopolization isn't per crate (per docker container); that's already capped to 15 minutes (modulo special exemptions). The problem at hand is one publisher publishing multiple crates, the combination of which monopolizes the queue.
I'm aware of the issue, and I'm also aware of how my original understanding was flawed (see the issue on GitHub). I originally thought that there were multiple containers running concurrently, one for each crate being documented, but that isn't the case; there is currently one container running at a time.
That said, if and only if the docs team thinks it's worth the effort, we can combine approaches to solve the various issues:
All crates that are owned by the same entity are built within the same container. Within a given container crates can be built either concurrently or serially, it doesn't really matter which.
Containers are ephemeral; when the last build process completes, the container is disposed of in the same manner that they are disposed of currently (I don't know how that's currently handled, which is why I'm hand-waving it away). That said, if a given owner is manually pushing crates (so that there is a reasonably large period of time between publications to crates.io), and their associated container is still alive, docs.rs will push the build request to their currently living container and not spin up a new one.
Containers are executed concurrently, and have their relative CPU shares adjusted according to the algorithm I gave above.
The trick is that if a given owner publishes multiple crates at once, and docs.rs is already running a container for that owner, then the build request is passed to the currently running container to process. Since the container isn't terminated until after the last build process completes, it will get progressively less CPU to execute, which will slow a given owner's crates down without affecting other owners. Once the container has executed all build processes, it can be allowed to die naturally. If the owner publishes another crate after their container has shut down, then a new container is spun up with the default CPU share. That way no one is permanently punished, but if you are taking up resources (either because you have a lot of small crates, or one gargantuan one), you only affect your own crates, rather than everyone else's.
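To make the routing and shrinking-share mechanics concrete, here's a hypothetical sketch (docs.rs works nothing like this today; the halving policy is one made-up formula, since the thread doesn't pin down an exact one — only 1024 is real, as Docker's default `--cpu-shares` value):

```python
DEFAULT_SHARES = 1024  # Docker's default --cpu-shares weight

class OwnerContainers:
    """One live container per owner; publishes from the same owner are
    routed to their running container instead of spawning a new one."""

    def __init__(self):
        self.live = {}  # owner -> number of queued builds in container

    def publish(self, owner: str) -> str:
        if owner in self.live:
            self.live[owner] += 1      # reuse the running container
            return "routed to existing container"
        self.live[owner] = 1           # spin up a fresh one
        return "started new container"

    def shares(self, owner: str) -> int:
        # Made-up policy: halve the CPU share per extra queued build,
        # so an owner's backlog only slows that owner's own builds.
        return max(2, DEFAULT_SHARES // (2 ** (self.live[owner] - 1)))

    def build_finished(self, owner: str):
        self.live[owner] -= 1
        if self.live[owner] == 0:
            del self.live[owner]       # container dies naturally
```

An owner who floods the queue sees their share shrink (1024, 512, 256, …), while a later publish after their container has died starts fresh at the default share, matching the "no one is permanently punished" property above.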
Note that the outlined method won't work as docs.rs is currently set up. Since it only executes a single container at a time, adjusting the relative CPU share is meaningless. So before this method could be applied, a great deal of work would need to be done to change the entire architecture. And as @pietroalbini has already said in the other thread, this is pretty much a non-issue for the docs team. In short, I'm 99.99% sure that we are now beating a dead horse.
Tangentially, what about giving some crates.io uploaders the option to offload the generation of the docs to GitHub Actions and its own "page"? I don't have precise ideas, but roughly, instead of the docs.rs-generated docs at docs.rs/that-crate/…/, we'd have a proxy page that would say:
1 perhaps just temporarily: docs.rs would nonetheless be allowed, when idle, to perform an actual generation of the docs and replace the redirection with that.
with the link warning about an external redirect, and with an option, through cookies, to register that you trust specified hosts so as to skip that redirect warning on your local machine and allow it to be performed automatically (potentially with a docs.rs banner at the top reminding you that the site is external?)
In other words, I may be too naïve w.r.t. ways this idea could be exploited by malicious users (although arbitrary JS "injection" into docs.rs-generated docs is trivial, so I don't think a warned external-link redirect would be that much worse); but the core idea is that people such as myself, and I imagine many other Rustaceans, wouldn't mind setting up a GH Action that would render the docs somewhere, and then tell docs.rs that rendering their page is not that urgent or important / can be skipped.