The Rust Project needs much better visibility into important metrics


Ahha, I missed that – looks awesome, well done!

Once we’ve gotten all the info on tabs that we want, then we can maybe work out what should go on the front page (as a kind of “overview”).


Once we’ve gotten all the info on tabs that we want

That could be a little while :) – I’ve got 37 issues open in the repo, and ~18 of them will require adding new data sources or analysis in addition to the presentation work. Several more will just need new views of existing data. An initial guess at what would be most useful for teams to have on the front page would be a very good way to help me decide which data sources & analyses to prioritize (hint hint).

EDIT: And thanks!


I read through the issues and prioritized them per my own preferences.

These are high-priority and small:

  1. User-defined date range. Completing existing features.
  2. Link to day’s build failures. Ditto.
  3. Release cadence. Just an easy thing to do, and an easy way to answer a common question.

These are high-priority and big:

  1. Assignees tab. Useful for project management and for users.
  2. Hot issues. Relatively easy to implement.
  3. Feature pipeline dashboard. This is probably what many of us want the most, but it’s quite involved.


Hi! Sorry for the drive-by comment, but I’m fascinated by all things in the metrics world. Most of the stuff I’ve seen is based on some combination of Grafana + a time-series DB (like InfluxDB). Do you think this combo might be useful here? That could mean anything from just looking at how Grafana implemented things, to actually using Grafana as a frontend and possibly joining forces with them on improving the graphing capabilities if needed.

Grafana allows a specified group of users to edit the dashboards, which might spread the load of making changes across several people. I’m not sure whether all the other requirements mentioned are met (e.g. histograms still seem to be in the works).

Hope this helps more than disrupts your flow.


Sounds like a plan, and thanks for taking the time to go over the issues!

I’m getting some help from @chriskrycho on how to wire up the date pickers in Ember, should have that done soon. In the meantime:

  • Added a list of recent buildbot failures (scroll down, at the bottom of the Buildbots page). Definitely needs a nicer way to present it, but this is a start.
  • Fixed “deep” links in the nginx configuration
  • Deployed the “Triage” tab with the GitHub search links, will expand this with new metrics soon
  • Added benchmarks game link to useful links page.
  • Merged a PR from @chriskrycho to start the process of putting in the tests which I’ve very negligently not-built while putting this together.


Grafana does look super slick! I will have to take a closer look, so thank you for sharing.

I see three main issues with adopting something like Grafana:

  1. Their SQL support doesn’t seem to be done yet (but I may be wrong here). Postgres performance, deployment, backup, etc. are just so stable, well understood, and supported on cloud platforms like AWS, and SQL is a second tongue for any data junkie. Having to move away from an RDBMS (basically Postgres) would be a very tough sell in my opinion, not just because of familiarity, but also because the data we’re collecting isn’t purely time series.
  2. It turns out that throwing some graphs up is the easy part of this project! I do think that having community-editable visualizations would be awesome (and I have an open pipe-dream issue to build something rudimentary along those lines). But, as you can see in this thread, the most dire needs are for visualizations/reports which combine time series data with non-time-series relational analysis (see @brson’s issue about a “hot” issues tab or a feature pipeline tracker). Which really makes the work here spill outside the traditional definition of “dashboard metrics.”
  3. Rust is really quite good at a lot of this work (a bit verbose for some, yes), and building something in Rust means that a) it can more easily get contributors from its own community, and b) we have yet another showcase of an area where Rust can excel. Both of which are great value-adds for a project like this on top of the useful information it can provide.

All that said, I’m not ideologically opposed to Grafana or any other off the shelf tool for that matter, I just haven’t seen one which meets all of the needs here.


Indeed, the focus is on non-RDBMS, time-series databases.

Cool :)

Rust is really quite good at a lot of this work (a bit verbose for some, yes), and building something in Rust means that a) it can more easily get contributors from its own community, and b) we have yet another showcase of an area where Rust can excel.

I wholeheartedly agree.

Thanks a lot!


From IRC, regarding the build failure list:

“exception interrupted” is how buildbot stops everyone else when one fails, so they should never be listed


Thanks for posting this here so I didn’t forget. Just deployed a new version which ignores failed builds with messages matching LIKE "%xception%nterrupted". The tweaked version is live (ping @brson).
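For anyone curious what that pattern accepts: in SQL, `%` in a `LIKE` pattern matches any run of characters, and there is no trailing `%` here, so the message has to end in "nterrupted". Dropping the first letter of each word presumably lets either capitalization match. Below is a minimal Rust sketch of the same matching rule, assuming only `%` wildcards (no `_`, no escapes). The name `like_match` is made up for illustration and is not taken from the dashboard code.

```rust
/// Returns true if `text` matches `pattern`, where `%` in the pattern matches
/// any (possibly empty) run of characters. `_` and escapes are not handled.
/// Hypothetical helper for illustration only.
fn like_match(pattern: &str, text: &str) -> bool {
    let parts: Vec<&str> = pattern.split('%').collect();
    // No wildcard at all: the pattern must equal the text exactly.
    if parts.len() == 1 {
        return pattern == text;
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    // The text must start with the first literal and end with the last one,
    // and the two must not overlap.
    if text.len() < first.len() + last.len()
        || !text.starts_with(first)
        || !text.ends_with(last)
    {
        return false;
    }
    // The remaining literals must appear, in order, in what is left between.
    let mut rest = &text[first.len()..text.len() - last.len()];
    for &part in &parts[1..parts.len() - 1] {
        match rest.find(part) {
            Some(idx) => rest = &rest[idx + part.len()..],
            None => return false,
        }
    }
    true
}
```

With this rule, `like_match("%xception%nterrupted", "exception interrupted")` and the capitalized "Exception Interrupted" both match, while a message such as "failed compile" does not.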


Links to build failures (for nightlies on that tab and for auto builders on the buildbots tab) are done, as is a simple release cadence tracker on the landing page. Fresh code is deployed. @chriskrycho has promised to help me grok Ember a bit better so I can properly implement user-defined date ranges.

The scraper will take some time to grab all the data from the nightly builders, so that tab should be accurate by the morning.


Oh man I love seeing new progress on the dashboard.

The nightly failure links are perfect.

The failed build metrics on the buildbot page seem suspect. Only 4 in 24 hours? The graphs (particularly linux) look like there may be some gaps in the data.


I’m pretty sure that 4 overall failures isn’t too unreasonable. There have been ~7 PRs merged in the last 24 hours, so it seems OKish to me that there are only 4 root build failures (excluding any builds that were cancelled before they finished because of a failed build elsewhere in the cluster).

@retep998 brought up on IRC extending the window to 48 hours, what do you think about that?
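
To make the counting rule concrete, here is a rough Rust sketch of "root failures in the last N hours". The `Build` struct, its field names, and the check for "interrupted" in the message are all assumptions made for illustration; the real scraper's schema may look quite different.

```rust
/// A finished build as a scraper might record it (hypothetical shape).
struct Build {
    finished_at: u64, // Unix timestamp in seconds
    successful: bool,
    message: String,
}

/// Counts failed builds that finished within the last `window_hours`,
/// skipping builds that were only interrupted because another build failed.
fn root_failures_in_window(builds: &[Build], now: u64, window_hours: u64) -> usize {
    let cutoff = now.saturating_sub(window_hours * 3600);
    builds
        .iter()
        // Keep only builds inside the time window.
        .filter(|b| b.finished_at >= cutoff && b.finished_at <= now)
        // Keep only failures.
        .filter(|b| !b.successful)
        // Drop cancelled builds (assumed to carry "interrupted" in the message).
        .filter(|b| !b.message.to_lowercase().contains("interrupted"))
        .count()
}
```

Switching from a 24-hour to a 48-hour window is then just a matter of passing `48` as `window_hours`.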


I had 3 rustbuild failures, and each of them took about 2 hours.


OK by me.


Looking at the buildbot failures just made me realize that the commit links for the cargo builders still lead to the rust-lang/rust repo which is definitely wrong for cargo.
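
One way to fix this would be to pick the repository from the builder name when building the link. The Rust sketch below assumes that cargo builders can be recognized by a `cargo` name prefix, which is a guess and not something confirmed in this thread; `repo_for_builder` and `commit_url` are hypothetical names.

```rust
/// Picks the GitHub repository a builder's commits belong to.
/// Assumption: cargo builders have names starting with "cargo".
fn repo_for_builder(builder: &str) -> &'static str {
    if builder.starts_with("cargo") {
        "rust-lang/cargo"
    } else {
        "rust-lang/rust"
    }
}

/// Builds the commit link shown next to a build failure.
fn commit_url(builder: &str, sha: &str) -> String {
    format!(
        "https://github.com/{}/commit/{}",
        repo_for_builder(builder),
        sha
    )
}
```

If the builder names don't follow a prefix convention, a small lookup table from builder name to repository would do the same job.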


And your comment made me realize that I need to either update the graphs to include non-auto builders, or I need to update the title for the charts. Thoughts, @brson?


The other builders aren’t as important. It might be good to know the time it takes to build nightlies though.


Sounds good.


Update (doing some more work tonight, but this is where the dashboard is right now):

  • All rust-lang, rust-lang-nursery, and rust-lang-deprecated repositories are scraped for issues, PRs, and comments. All of the GitHub-derived charts on the dashboard now show items from all repositories. This was the main obstacle to implementing some of the fancier features suggested.
  • Nightly builder times are shown on a graph on the Nightlies tab.
  • Recent “@bors retries” (last 7 days) are listed on the buildbots page.
  • Switched CI failures to a scatter plot to better display builders which have several days in between build failures (HighCharts was inferring a line between the various data points)

@brson, to start working on the “nag” tab(s), I was wondering if you and/or other team members could generate some example issues/PRs on a couple of repos with the appropriate labels, comment contents, etc? That way I can properly track that the feature is working through scraping -> persistence -> query -> display.


Yes, I can generate some examples.