The Rust Project needs much better visibility into important metrics


#21

@brson:

I’ve started prototyping with ember and highcharts:

It ain’t pretty yet, but it’s a start. I’m hoping to have all of the core graphing in place sometime this weekend, time permitting.

Do you have a sense for how much data is being transferred in the scrapes? Do subsequent scrapes retransfer all the same data or are they incrementalized somehow?

The scrapes of GitHub and the nightly releases are incremental, and the incremental scrapes are pretty quick. GitHub provides a since keyword on most of their query strings, so I can just search the database for the most recent update, and ask for any updates since then. It works OK, although if the scraper daemon is killed in the middle of a GitHub scrape, then it’s necessary to run the bootstrap command to get all data since the last known good scrape (that command takes a date, so it’ll grab some extra, but that’s OK too – it’ll just update the existing data in place). The nightly releases are super simple to check, so each scrape just looks for a YYYY-MM-DD/index.html dated after the last successful release. The downside there is that if a release is yanked after the scraper has already found it, the scraper won’t pick up the yank, but that’s the cost of incremental updates I think, and probably not that big a deal.
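For illustration, here is a minimal sketch of the two incremental checks described above. It is not the actual scraper code. It assumes the GitHub REST issues endpoint and its `since` parameter; the function names and the nightly base URL are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone
from urllib.parse import urlencode

# Hypothetical base URL; the real location of the nightly index pages may differ.
NIGHTLY_BASE = "https://example.invalid/dist"


def issues_since_url(repo: str, last_update: datetime) -> str:
    """Build a GitHub issues query that only returns items updated since `last_update`."""
    query = urlencode({
        "state": "all",
        "since": last_update.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    return f"https://api.github.com/repos/{repo}/issues?{query}"


def nightly_urls_to_check(last_release: date, today: date) -> list[str]:
    """List the YYYY-MM-DD/index.html pages dated after the last successful release."""
    days = (today - last_release).days
    return [
        f"{NIGHTLY_BASE}/{(last_release + timedelta(days=n)).isoformat()}/index.html"
        for n in range(1, days + 1)
    ]
```

Here `last_update` would be the newest update timestamp already in the database, so a restart after a crash only needs an earlier date to catch up.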

The buildbot scraping is very much not ideal – it needs to go to each builder and ask for all builds on record, and unfortunately it doesn’t look like the current API supports anything more granular. This takes about 5-15 minutes per run. I’m currently running it once every ~90 minutes. @alexcrichton kept an eye on the build cluster when I first ran it a little while ago, and it didn’t seem like the API (despite taking forever) puts a lot of load on the build machines. But that may be worth revisiting. Also, to make things more interesting, the 0.9 series of buildbot (IIRC Rust is running 0.8.10) will remove the JSON API that I’m currently using and replace it with something much better. But that module will require a rewrite if/when the Rust CI system is updated past the 0.8.* series of buildbot.
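Since every buildbot scrape returns all builds on record, one way to keep repeated full scrapes harmless is to key stored builds by builder and build number. The sketch below uses a plain dict as a stand-in for the database; the `number` field and the builder names are assumptions, not the real schema.

```python
def merge_builds(store: dict, builder: str, scraped: list[dict]) -> int:
    """Merge a full scrape for one builder into `store`; return how many builds were new."""
    new = 0
    for build in scraped:
        key = (builder, build["number"])
        if key not in store:
            new += 1
        # Overwrite in place so re-scraped builds pick up any changed fields.
        store[key] = build
    return new
```

Running this twice on the same scrape leaves the store unchanged, which is the same property as updating existing rows in place.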

It remains to be seen how incremental I can make the scrapes for the other data sources. I decided that the 3 I’ve got were enough to build the API server and web frontend, so I haven’t read up much yet on the other sources discussed.

How large is the summary data now?

If I’m reading curl’s output correctly, it looks like it’s about 66KB that’s currently transferred from the summary endpoint for 120 days of data. I’m in the process of reworking the database functions so that they return data in a format that’s directly graphable without transformation on the front-end, so that number could go up or down by a bit.

$ curl http://localhost:8080/summary > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65882  100 65882    0     0  1452k      0 --:--:-- --:--:-- --:--:-- 1462k
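As a rough sanity check on those numbers, the sketch below works out the per-day cost and extrapolates to a longer window. It assumes payload size scales linearly with the number of days, which nothing here confirms.

```python
TOTAL_BYTES = 65882   # from the curl output above
DAYS = 120            # window covered by the summary endpoint

bytes_per_day = TOTAL_BYTES / DAYS


def estimated_bytes(days: int) -> float:
    """Estimate payload size for a window of `days`, assuming linear scaling."""
    return bytes_per_day * days


print(f"{bytes_per_day:.0f} bytes/day")                     # prints "549 bytes/day"
print(f"{estimated_bytes(365) / 1000:.0f} kB for a year")   # prints "200 kB for a year"
```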

That’s so cool you are using Diesel!

It’s been a bit of a learning curve, but I think that it’s been the best example of Rust’s special sauce that I’ve used yet. Strong typing with clean abstractions and screaming performance.

The data you’ve collected already has been interesting and useful!

Glad to hear it!


#22

Thanks for the details. The example looks good.

This is great. I can tolerate a lot bigger transfers than that to get pretty graphs.


#23

A quick update! I haven’t touched this in almost a month because:

  1. Finals are a thing along with multiple deadlines at work.
  2. Suffered a bit of an accident which has kept me from coding for a couple of weeks while I recover.

I’m starting work on the project again today, and will hopefully post back with an update in the next few days.


#24

I posted this screenshot to IRC, and then I realized @brson wasn’t online, so I’m including it here. AFAICT, the only metric that is currently scraped but not yet displayed is the number of @bors retries per PR, and I still don’t know how to visualize that well. Anyways, here’s the current state of the dashboard (my portrait monitor isn’t tall enough to include the CI Failures section, but it’s below CI Times):

The next step is to move this off of my desktop to somewhere online. I know that hosting it on AWS was previously discussed, but in the interim I’m going to be hosting it on a Digital Ocean VM since it’s convenient and cheap. I’ll post back once it’s online.


#25

Holy cow that looks awesome @dikaiosune! Can’t wait to get to play around with it :slight_smile:


#26

OK, initial version is live at http://rusty-dash.com/ (ping @brson @alexcrichton). @ubsan suggested on IRC to make the grid of nightly release builds link to the relevant CI build pages, so I opened a GitHub issue.

Need to enable HTTPS soon, and then I’ll carve out some time to make the date range selectable after the initial page load. After that, I’ll start tackling other data sources and visualizations for them.


#27

That’s so cool! :slight_smile:

A couple of suggestions:

  • The two first sections (Nightly Releases & Issues) look ridiculously large compared to the rest. The blocks could be smaller, maybe have a calendar like the GitHub activity graph if you have enough data.
  • Is it possible to display legends for the graphs? I saw that they would appear on hover, but it would be nice if they could be shown by default.
  • The graphs could have a little more spacing in my opinion. I found margin-top: 40px looks better.
  • A countdown (in days) until next planned release with the current and next release numbers? I am not sure if it belongs on this page, but I think it is a nice bit of info to display.

Also since you are using a JavaScript Framework, I think it might be interesting to explore a design with “tabs” instead of displaying everything on one page where you have to scroll down. It could look a bit like what I used in this web app (the sidebar tabs).


#28

This looks fantastic @dikaiosune. Thank you so much.

The nightly releases grid is great. It looks like it may get the wrong value for the present day (since it hasn’t been built yet), but that’s not a big deal.

Some further ideas:

  • It would be great if one could expand the date range on the charts so we could see the trends up to a year in the past.
  • Some way to get direct links to the build failures for a specific day.

I’ll bring this up in various meetings this week to see what others think.


#29

@Azerupi:

I do have plenty of data for the nightlies (going back to 5/15/15). I would need to refactor the summary endpoint to accept multiple different date ranges. But that gets to another of your points…
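A minimal sketch of what accepting a date range could look like, assuming hypothetical `start` and `end` query parameters in YYYY-MM-DD form and a default of the 120 days the summary currently covers:

```python
from datetime import date, timedelta

DEFAULT_WINDOW_DAYS = 120  # matches the window the summary currently covers


def resolve_range(params: dict, today: date) -> tuple[date, date]:
    """Turn optional `start`/`end` strings (YYYY-MM-DD) into a validated date range."""
    end = date.fromisoformat(params["end"]) if "end" in params else today
    if "start" in params:
        start = date.fromisoformat(params["start"])
    else:
        start = end - timedelta(days=DEFAULT_WINDOW_DAYS)
    if start > end:
        raise ValueError("start must not be after end")
    return start, end
```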

I agree that a tabbed interface is where it should go. The current UI is what I believe those startup folks would call an MVP. I’m thinking it’ll be worth it to break it out into more pages once more data is added…but I do view it on a portrait monitor normally so maybe I’m biased.

I looked at the highcharts docs and didn’t immediately see a way to force the legend always on, but I will do some more digging. The Linux CI builders could be a bit of a mess though. Maybe the CI graphs should get broken down a little more by 32 vs 64bit?

A release countdown ain’t a bad idea. I could just manually plot out the 6-week interval in the database for the next 10 years. How are weekdays vs. weekends accounted for with the release cadence?
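Rather than storing ten years of dates, the countdown could be computed from one known release date. The sketch below assumes a strict 6-week cadence; the anchor date in the usage example is an arbitrary illustration, not a verified release date.

```python
from datetime import date, timedelta

CADENCE = timedelta(weeks=6)


def next_release(anchor: date, today: date) -> date:
    """Return the first release date on or after `today`, stepping 6 weeks from `anchor`."""
    if today <= anchor:
        return anchor
    elapsed = (today - anchor).days
    # Ceiling division: number of whole cadences needed to reach or pass `today`.
    steps = -(-elapsed // CADENCE.days)
    return anchor + steps * CADENCE


today = date(2016, 2, 1)
upcoming = next_release(date(2016, 1, 21), today)   # illustrative anchor only
print(upcoming, (upcoming - today).days)            # prints "2016-03-03 31"
```

Because 42 days is a whole number of weeks, every computed date falls on the same weekday as the anchor, so weekends only matter if the anchor itself lands on one. A delayed release would still need a manual override.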

I’ll add some spacing to the graphs (and enlarge the titles, per feedback on IRC) tonight if all goes well.

@brson: The nightly release grid now has a green square for today. I can always refresh that more often; I think it’s every 6 hours right now. That said, since there doesn’t appear to be a way to differentiate between “nightly build failed” and “no nightly yet” without going to the CI, I’m not sure what the best course of action is.
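One possible interim approach is a third state for the current day, so that a nightly that hasn’t been published yet is not drawn as a failure. This is only a sketch with hypothetical names, and it doesn’t solve the underlying ambiguity for past days.

```python
from datetime import date
from enum import Enum


class NightlyStatus(Enum):
    SUCCESS = "success"
    PENDING = "pending"   # today's nightly may simply not exist yet
    MISSING = "missing"   # a past day with no release: failed or yanked


def classify(day: date, today: date, index_found: bool) -> NightlyStatus:
    """Classify one grid cell from whether that day's index page was found."""
    if index_found:
        return NightlyStatus.SUCCESS
    if day >= today:
        return NightlyStatus.PENDING
    return NightlyStatus.MISSING
```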

Adjustable date range is on my radar – going to work on it after Let’s Encrypt. HTTPS all the things!

Do you mean build failures for a specific nightly release? Or all auto-build failures for a day? For the former, which builders are used for those? For the latter, that shouldn’t be a problem.

The more feedback the better! As a mostly-lurker of the Rust project, I’m sure others are better prepared than I am to assess what is or isn’t useful.


#30

Alternatively there could be a way to hide/show certain graphs. This could go hand in hand with always-on legends. For example, if you click on one of the names in the legend, it gets hidden/shown. However I doubt that this functionality is provided by the charting library. It would probably require a bit of custom work.

Yes, that’s what I thought of too. The only concern is if, for whatever reason, a release gets delayed or the interval between releases changes you would have to modify it manually.

Another idea that might be worth exploring is an online calendar you can subscribe to.
I am not sure if the Rust team has an official calendar people can follow? They could mark the releases on the calendar and allow people to subscribe to it. I am not sure how it works exactly but my university has the schedules available as a calendar that you can subscribe to. They provide a link to a .ics file and you can add it to Google Calendar for example. I suppose you could parse those files to extract important dates, like release dates.
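To give an idea of the parsing involved, here is a minimal sketch that pulls dates and titles out of an .ics file. The sample calendar and event names are made up, and the parser deliberately ignores line folding, time zones, and recurrence rules, so a real calendar would likely need a proper iCalendar library.

```python
from datetime import date, datetime

SAMPLE_ICS = """\
BEGIN:VCALENDAR
VERSION:2.0
BEGIN:VEVENT
DTSTART;VALUE=DATE:20160303
SUMMARY:Hypothetical release A
END:VEVENT
BEGIN:VEVENT
DTSTART:20160414T170000Z
SUMMARY:Hypothetical release B
END:VEVENT
END:VCALENDAR
"""


def release_dates(ics_text: str) -> list[tuple[date, str]]:
    """Extract (date, summary) pairs from VEVENT blocks; ignores folding and time zones."""
    events, current = [], None
    for line in ics_text.splitlines():
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            if "DTSTART" in current:
                events.append((current["DTSTART"], current.get("SUMMARY", "")))
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            key = name.split(";", 1)[0]  # drop parameters such as ;VALUE=DATE
            if key == "DTSTART":
                # Keep only the YYYYMMDD part; any time-of-day is discarded.
                current[key] = datetime.strptime(value[:8], "%Y%m%d").date()
            elif key == "SUMMARY":
                current[key] = value
    return sorted(events)
```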

It’s probably way more difficult to implement, but it has the advantage of not having to modify it manually if the release dates ever change. I am not sure it is worth going through all this trouble, but I thought I would share it anyways :stuck_out_tongue:

and enlarge the titles, per feedback on IRC

Personally, I would have reduced the text size to fit the text labels on the graphs :slight_smile:


#31

All the build failures for the day.


#32

hide/show certain graphs

Absolutely. I’m assuming you mean individual series here within a graph though? Because tabs would hide/show whole graphs nicely I think. With Highcharts (could always swap out the plotting library) controlling individual series is very doable. Not very nice to do, but doable.

provide a link to a .ics file

@brson is there a publicly visible place (like a google calendar) where the release cadence is tracked?

reduced the text size

Sure, that’s doable too. I erred on the side of readability, and also the CSS starting point I used encourages larger text. Could definitely go smaller though.


#33

Yes that’s what I meant :slight_smile:


#34

@dikaiosune

Some more ideas based on recent conversations.

What are the prospects for expanding our data collection to rust-lang projects beyond rust-lang/rust? We put comparatively little effort into these repositories, so making it easier to see what’s going on in them at a glance could help us identify projects that need attention.

As an example we might create a page that graphs the open PRs in each project. Other metrics that could be helpful are open issues, untriaged issues, issues/PRs that haven’t been updated since X days.
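A minimal sketch of how those counts could be derived from already-scraped records. It assumes GitHub-style issue fields (`state`, `labels`, `updated_at`) and, as a guess at the intent, treats “untriaged” as “open with no labels”.

```python
from datetime import datetime, timedelta, timezone


def triage_metrics(issues: list[dict], now: datetime, stale_days: int) -> dict:
    """Count open, unlabeled, and stale items from GitHub-style issue records."""
    cutoff = now - timedelta(days=stale_days)
    open_items = [i for i in issues if i["state"] == "open"]
    return {
        "open": len(open_items),
        # Assumption: "untriaged" means an open item with no labels at all.
        "untriaged": sum(1 for i in open_items if not i["labels"]),
        "stale": sum(
            1 for i in open_items
            if datetime.fromisoformat(i["updated_at"].replace("Z", "+00:00")) < cutoff
        ),
    }
```

Running this once per repository would give the per-project numbers for a comparison page.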

Another thing I’m interested in is who is working on what. We’ve always been terrible at tracking resource allocation (knowing what people are and should be working on). If we had the existing information displayed in a more useful way I think we could potentially be better at it:

Imagine a page that displays information about all issues (across the entire org) that have assignments. It shows a list of all users (and their avatars) that have assignments, and next to that shows the linked issue number and description. That’s the whole thing.
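The data behind such a page is essentially a group-by on the assignee. A sketch, assuming GitHub-style issue records with an `assignee` object and counting only open issues (that restriction is an assumption); the logins in the example are made up:

```python
from collections import defaultdict


def issues_by_assignee(issues: list[dict]) -> dict[str, list[tuple[int, str]]]:
    """Group open, assigned issues by assignee login as (number, title) pairs."""
    grouped = defaultdict(list)
    for issue in issues:
        assignee = issue.get("assignee")
        if issue["state"] == "open" and assignee:
            grouped[assignee["login"]].append((issue["number"], issue["title"]))
    return {login: sorted(items) for login, items in sorted(grouped.items())}


example = [
    {"number": 7, "title": "Fix widget", "state": "open", "assignee": {"login": "user-b"}},
    {"number": 3, "title": "Add docs", "state": "open", "assignee": {"login": "user-a"}},
    {"number": 9, "title": "Unowned", "state": "open", "assignee": None},
]
print(issues_by_assignee(example))
```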

From there you could get a strong sense of what the project is working on. Of course to start it would not be great because our data isn’t good and we don’t use assignments effectively. But once we have the visualization to monitor we’ll have a lot more motivation to use issue assignments. Somebody (I volunteer) could go through periodically and just review that all the assignments are still fresh. For moco employees this would be yet another attempt to track what we are working on - just seeing that everybody has some assigned task is :cool:. cc @aturon.

Another idea. There are lots of ways to configure github to display useful aggregations of information, but they are pretty hard to discover and remember. It would be great to have these cryptic formulas in one place, and linking them from our metrics site seems like it.
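As a sketch of what keeping those formulas in one place might look like, the table below maps a label to a GitHub search query and turns it into a link. The qualifiers are standard GitHub issue-search syntax as far as I know; the labels and the exact URL form are assumptions.

```python
from urllib.parse import urlencode

# Labels are hypothetical; queries use GitHub's issue-search qualifiers.
SAVED_SEARCHES = {
    "Open PRs across the org": "org:rust-lang is:pr is:open",
    "Oldest open issues": "org:rust-lang is:issue is:open sort:created-asc",
    "Least recently updated PRs": "org:rust-lang is:pr is:open sort:updated-asc",
    "Open issues with no assignee": "org:rust-lang is:issue is:open no:assignee",
}


def search_url(query: str) -> str:
    """Turn a search query string into a GitHub search link."""
    return "https://github.com/search?" + urlencode({"q": query, "type": "issues"})


for label, query in SAVED_SEARCHES.items():
    print(f"{label}: {search_url(query)}")
```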


#35

We should also think about integrating http://perf.rust-lang.org/index.html, if not importing the data and re-visualizing it, then at least linking to it.


#36

Just wanted to link to:

It’s a suggestion to use bitrust’s data to display information about how frequently breaking changes occur.

What are the prospects for expanding our data collection to rust-lang projects beyond rust-lang/rust?

It’d require some refactoring and a DB migration or two, but certainly not a big deal. Which repos would be good to track? What about nursery repos?

open issues, untriaged issues, issues/PRs that haven’t been updated since X days

Like a triage dashboard, not just a metric dashboard? I can see that being very useful.

Imagine a page that displays information about all issues (across the entire org) that have assignments. It shows a list of all users (and their avatars) that have assignments, and next to that shows the linked issue number and description.

Another idea. There are lots of ways to configure github to display useful aggregations of information, but they are pretty hard to discover and remember. It would be great to have these cryptic formulas in one place, and linking them from our metrics site seems like it.

The average issue/PR age numbers link to pages like this, sorted by age (maybe should change to last activity?). The issue label counts link to pages like this. What other types of searches do you see being useful? I’ll try to include them wherever I see them as I’m adding things, but if there are specific ones it isn’t hard to throw them on there.

http://perf.rust-lang.org/index.html


So, at current count there are 30 issues open on the repo, which doesn’t include a few deployment things like HTTPS and maybe eventually a better backup plan than Digital Ocean’s weekly plan. @brson, any interest in helping me triage the priority a bit? It’s not that much stuff, but my time is unfortunately a bit constrained, so it might be a few weeks before I get traction on more than a few of these suggestions. Which are the highest priority to work on right now?


#37

Good idea.

All repos in rust-lang, rust-lang-nursery and rust-lang-deprecated.

Yes, though I see it as metrics about triage.

I was particularly thinking of aggregations of PRs and issues across all org projects. Open PRs especially.

@brson, any interest in helping me triage the priority a bit? It’s not that much stuff, but my time is unfortunately a bit constrained. So, it might be a few weeks before I get traction on more than a few of these suggestions, which are the highest priority to work on right now?

I will put it on my triage schedule. After using rustc-perf this morning I think it probably doesn’t make sense to try to re-visualize it; it makes more sense to just fix the UI problems it has. Still, I think it should be linked from here.

I’ve done some solicitation for more ideas and have a bunch more important stuff I’ll write up soon. So excited.

I think we’re going to need to make this site multiple pages, and try really hard to make the front page contain just a few really important visualizations.


#38

Here’s another one that would make a huge impact


#39

And another:


#40

Just deployed an updated version to http://rusty-dash.com/. Nothing too exciting here: paying down a little debt, converting the UI to tabs to make new additions cleaner, and merging a PR from @chriskrycho (yay!).

Now that the main front-end blocker to adding new metrics has cleared, I need to take care of a few things on the backend (multiple repo support is needed for many of the new ideas), and hopefully I’ll be back to pushing features later this week.