Rust 2019: Address the "Big" Problem

Tame Complexity through Community Involvement Tools

Edit: I'd just like to say that I think 2018 has been a great year for Rust. Rereading this, I realize it may have come off as a little negative. End edit.

Rust has a "big" problem. Our feedback loops have gotten big because we have gotten big (as a community). We've started to counsel together at the end of the year, and part of that is reaching consensus on the current state of Rust. I think that's actually good, but we need more things like it - I'm not suggesting we call for even more blog posts, but that we start tailoring activities to address the "big" problem.

The "big" problem is growth (and increased community size) - it's a good problem to have, but it does cause issues.

Rust has a lot of gaps caused by its explosive growth; the more prominent ones have already been covered:

These are all artifacts of growth. It's gotten harder to hold everything about Rust in your head. The tools the Rust community has been using (mostly GitHub-provided ones) were good at small sizes, but don't scale well. They're leaky abstractions: pros-and-cons analysis with consensus represented by a linear discussion, state scattered across the web, etc. All of it is harder to follow, harder to moderate, and harder to contribute to.

It's harder to:

  • track things you care about (I see this as a large contributing factor in the impl Trait return position & website controversies)
  • see the overall status of Rust
  • see where things need help

Proposal

We should build something to fix that.

We should build Rust-specific infrastructure to manage the complexity that comes with community sprawl (the sprawl is not bad). Organization-specific infra is a thing (I'm looking at you, task-cluster that replaced Jenkins).

What will it look like? I'm not sure, but I suspect each team will need something different - form following domain.

I have a couple of ideas after reading some other Rust2019 posts of what said central, easy to find tooling should include:

RFC Stuff

We should probably get an RFC tracker. A real tracker, not GitHub issues (can you imagine if https://caniuse.com/ used GitHub issues/PRs for its UI?). Let's track by time and state, category and syntax impact (and more?!). There is so much metadata about RFCs that would be useful if it was understandable by computers. Lay people could get a better idea of what's going on, and contributors could see what's happening at a glance.
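To make "understandable by computers" a bit more concrete, here is a minimal sketch of what a tracked RFC record might look like. Everything in it is an assumption for illustration: the `RfcState` variants, the `RfcRecord` fields, and the two query helpers are hypothetical and are not taken from any existing Rust tooling.

```rust
use std::collections::BTreeMap;

/// Lifecycle states an RFC might move through (hypothetical; the real
/// process may use different stages).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RfcState {
    Draft,
    FinalCommentPeriod,
    Accepted,
    Rejected,
}

/// Machine-readable metadata for one RFC. All field names are illustrative.
#[derive(Debug, Clone)]
struct RfcRecord {
    number: u32,
    title: String,
    state: RfcState,
    category: String,     // e.g. "lang" or "libs"
    changes_syntax: bool, // the "syntax impact" mentioned above
    opened_day: u32,      // days since an arbitrary epoch; stands in for a real date type
}

/// Return the RFCs in a given state, oldest first.
fn in_state(rfcs: &[RfcRecord], state: RfcState) -> Vec<&RfcRecord> {
    let mut hits: Vec<&RfcRecord> = rfcs.iter().filter(|r| r.state == state).collect();
    hits.sort_by_key(|r| r.opened_day);
    hits
}

/// Count RFCs per category, for an at-a-glance overview.
fn count_by_category(rfcs: &[RfcRecord]) -> BTreeMap<&str, usize> {
    let mut counts = BTreeMap::new();
    for r in rfcs {
        *counts.entry(r.category.as_str()).or_insert(0) += 1;
    }
    counts
}
```

With records like these, "track by time and state" becomes a call such as `in_state(&rfcs, RfcState::Draft)`, and a status page could be rendered from `count_by_category`. A real tracker would presumably need a proper date type and a richer state machine.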

In addition to the overview/tracking tool, we need a better tool for contributors of RFCs to use. Discussion is the big one. Can we build a tool for structured discussions - pros/cons, links to other RFCs, a view of changes over time, and hidden votes (because while things aren't a popularity contest, it's probably still a good idea to let people express themselves, even if only so the moderators / leaders can see)? There are also proposals for limiting throughput, a more distinct staged process, needing a team champion, etc. We could integrate that too.

Docs

I heard the docs need love. So it'd be great to have a list of things that need doing. Sections to proofread, sections that need feedback (e.g. I'm trying to accomplish x, but it's not working / coming off right), things to write (cookbook), brainstorming structure/approach ideas, etc.

It'd also be nice to have one big index to rule them all, covering the various docs, what's next on the doc team's plate (we're hoping to improve x, other big initiatives, etc.), and so on.

Misc

The embedded working group could probably use a page listing all the platforms, and what their status is, along with a list of drivers. It actually probably exists. Somewhere.

There are probably other things for other teams that would be great to put into this one, mythical centralized location. An improved team index would be nice (e.g. direct links to the discord / other place of preferred chat, links to the team specific areas, etc.)

If we were to open infrastructure support to contributors (I have a perfectly good machine lying around that I'd be willing to run a CI runner on 24/7), we'd probably want a public dashboard of some sort listing current capacity and a way to volunteer your computer. Barring that, we'd probably want a graph of average wait time on compiler PRs (which is measured in days!).

There is surely a lot more than this to what we'd want (I think) - I'm not really familiar enough to say though. Heck, half of these things, if not more, are from what I've read in the 2018 posts.

In Closing

The Rust community has been great. In addition to the great language, the people are great, and they're trying hard things - transparent, consensus-based (well, taking input into account) decision making. It's been harder to do that at scale (it either exacts a big toll or breaks down). I think we should fix that.

This is a call to trade generic tools for hand-crafted tools. To trade in a little team personality (each team does stuff differently), for less friction (and more accessible to new people).

Custom tooling is a big effort. But I think it's the best way to support a growing community long term without giving up the great things that characterize Rust's governance and growth.

FYI a graph like this is available in https://rust-lang-nursery.github.io/rustc-pr-tracking/. You could find this link from forge.rust-lang.org.

Those are nifty graphs. I’ll take those :slight_smile:

I think that in terms of gathering support for expanding infrastructure, a better graph would be of the average wait time of things that are S-waiting-on-bors (and perhaps S-waiting-on-crater). That would really drive home the infrastructure woes. The other graphs are useful as well though because they highlight the people component (review/author/team/bikeshed). I wouldn’t want to lose any of the current graphs.
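As a rough sketch of the number such a graph would plot, the snippet below averages the time PRs spend under each status label. The `LabelSpan` struct and its fields are hypothetical; a real dashboard would have to derive the spans from label-change events, and that part is not shown here.

```rust
use std::collections::BTreeMap;

/// One closed interval during which a PR carried a status label.
/// The struct and its fields are hypothetical.
struct LabelSpan {
    label: String,   // e.g. "S-waiting-on-bors"
    start_hour: u64, // hours since an arbitrary epoch
    end_hour: u64,
}

/// Average hours spent under each label across all spans.
fn average_wait_hours(spans: &[LabelSpan]) -> BTreeMap<String, f64> {
    // label -> (total hours, number of spans)
    let mut totals: BTreeMap<String, (u64, u64)> = BTreeMap::new();
    for s in spans {
        let entry = totals.entry(s.label.clone()).or_insert((0, 0));
        // saturating_sub guards against a span whose end precedes its start
        entry.0 += s.end_hour.saturating_sub(s.start_hour);
        entry.1 += 1;
    }
    // every label present has at least one span, so the division is safe
    totals
        .into_iter()
        .map(|(label, (sum, n))| (label, sum as f64 / n as f64))
        .collect()
}
```

Grouping by label keeps the infrastructure wait (S-waiting-on-bors) separate from the people-driven waits, which is the distinction being argued for here.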

As someone who’s built a few pieces of custom tooling for Rust including rfcbot and a now-defunct dashboard with many of the metrics you’re describing, IMO the bottleneck on these ideas is availability and prioritization of effort/time/labor/work/etc. This kind of work doesn’t typically carry lots of prestige (one of the main motivators for contributions), often requires lots of information gathering if you’re not building tooling for your own processes, and is often hard to prioritize against other “direct impact” work. To build things like those described, we either need to find people already in the community with the right desires/skills/experience/etc, attract people to the community who would like to contribute to projects like these, or fund their design and construction via someone’s full-time work.

The first option seems to be hard because existing contributors are often furiously working away on the topics which originally captured their attention. I can see how some existing proposals to “do less” might free individuals up for some meta-work, but most proposals I’ve seen come with other priorities that those contributors might adopt to pay down organizational debt within each of the teams. There have been a few attempts in the past to rally contributions around these kinds of custom infra applications (the bors2 conversations in the community team a little while ago spring to mind) and I at least am not aware of any of them taking off. A new effort in that direction would, I hope, address typical pitfalls for coordinating that kind of work.

Aside: which team would lead the design of an RFC tracker? Core? Infra? A new one? Questions about the governance of tools like the ones proposed have already come up for rfcbot and figuring out some answer for their home seems like a good idea.

The second option to get people working on these is an interesting one but still reduces to a pretty similar problem: who among the existing community will lead the design and implementation of the new tools? What tools will they have to succeed?

In terms of full-time funding, I’m not sure who would fund this work – Mozilla? They seem a bit capped out on the Rust initiatives they’re funding (although I don’t have any special info there). Some large tech companies have expressed interest in funding work on Rust, but AFAICT they’re mostly focused on specific things they need to help it succeed in their environments. Perhaps there’s another angle to play there?

EDIT: I worry that I enumerated a bunch of obstacles without expressing enough enthusiasm – this is a great set of ideas and some combination of them should definitely happen! We’ve talked about trying to make rfcbot into various forms of this for a little while (@nikomatsakis and I talked about a more general-purpose RFC tracker and workflow organizer at rustconf…2016?) and have been mostly blocked on my time and a slow trickle of contributions. Arguably rfcbot could have been the wrong choice for where to build this (the code isn’t exactly polished lol), arguably my lack of time has meant that mentoring new contributors hasn’t happened when it could have, etc. Not saying that any of this is impossible but that the lack of these tools today seems to me more due to the underlying question of Who? than whether they’d be desirable.

I think we’ll end up with a couple of different groups of contributors in this:

  • Disgruntled people who have a hard time following things, who want to see it better
  • People who finish whatever they’ve been enthusiastically doing (either completing or burning out/turning over the reins)
  • People who love stats & transparency :smiley: … maybe.

I’d think at the very least we’d want a working group for this, probably with people from each team at least for coordination. Maybe a proper team? It seems like it’d be in the purview of both Core and Infra, and yet maybe a big enough undertaking to be separate.

I didn’t even think about funding, but that’s a really good point - it will take a lot of bandwidth. I do think that the easiest thing to secure funding for would be for more infrastructure (if we have nice graphs and projections showing potential impact of contributions on turn-around time).

It’s funny that this is meant to be a way to help solve coordination issues, and the biggest threat to its success is probably coordination issues.

I think the following will be pivotal to getting things off the ground and successful:

  • Getting a small core established - sort of like how tower-web and warp were worked on for a bit before being released. It’s important to get a core established to have something to fit the pieces of the full implementation into. And we don’t want a bunch of bike-shedding to prevent the groundwork from happening.
  • Having a coordinator. Once we get far enough, I imagine that there will be lots of little things that could be worked on in parallel. Maybe not though.
  • Emphasis on a couple of highly visible things for the core - PR wait-time and RFC tracking perhaps? Something that’s not too hard, and yet would be liked by a lot of people. I think average turn-around time and RFC tracking bring a lot of easy-to-see transparency and benefit.
  • Official backing - namely a place to live (both domain and server wise), and guidance in terms of extracting / integrating with the current crop of rust … stuff. Data sources and points of interaction.

A large part of me thinks we’d do well to make it modular so other things could reuse it as necessary. Sort of a code level modular Drupal / CMS thing? (disclaimer: I’ve never touched PHP/Drupal). In that linked thread you can see that the RFCBot is being used outside of Rust.

Anyway, I think these are good points that need to be brought up and figured out - thanks for sharing them. If I had to choose between hearing about obstacles and hearing only enthusiasm, I’d choose to hear the obstacles any day :slight_smile:.

Hopefully putting an emphasis on such tooling, and discussing it, will help get the “who” question answered.

Is forge linked to from the main site? Writing from mobile.

Regarding RFCs, I’d like to see a collaborative mind-mapping tool where people could “upvote” branches to make them appear stronger, and hide (or, for mod team members, even remove) them. This could be a great help to map out the design space.

Regarding docs, I held the docs workshop at the RustFest 2019 and I think doing more of such workshops off- and online would be a good start to improve matters.

I really like the idea of rust specific infrastructure, mostly because i was thinking about it too :stuck_out_tongue:

I’d volunteer time for that and i actually view it as high impact work, at least in the long run.

I also prefer starting with a very small scope and slowly building features out from there, feeling out what is really needed and accepted by the rust community while we’re going.

But without fulltime work, this is a really big project. It’d be a big project for a fulltime team…

Exactly. Is it completely naïve to think about pooling resources with other languages such as Swift? After all, the problems being solved are very similar, both in the domain and in the social dynamics. I just don’t know if the required coordination wouldn’t outweigh the advantages.

I think git is a suitable tool for tracking PR discussion.

@Soni Sorry to be pedantic, but, do you mean:

  • Using github PRs directly, one per RFC?
  • Using git repos, one per RFC (suggested in earlier proposals), with multiple PRs to manage various stages?
  • Using git repos, storing the discussion directly in them?

I don’t think GitHub PRs scale well. The linear discussion model breaks down easily when the PR gets lots of traffic, and it’s harder to track things programmatically.

You can improve the trackability with something like the RFC bot, which maintains its own machine-readable records (well, in a db that’s searchable somewhere better than the GitHub issue search). That’s still a little brittle in my opinion because of the opportunity for operator error. I think the staged RFC proposal (already linked in my initial post) outlines some good ideas in this regard.

The discussion nut is a little harder to crack. I do like @llogiq’s suggestion of mind-mapping. I think we still need a discussion area, but a nice mind-map would probably go a long way to calming the discussion stream, especially if we can take comments after the fact and pin them to parts of the mind-map (e.g. for people who come in, don’t read the full thing but want to make a point known - and I think that input is still valuable because sometimes it’s people who feel like they are at the margin, and want to make some lesser point known). There is another proposal for rate-limiting RFCs, which is an approach I don’t like as much for tackling this. It makes some good points (Table of Contents) and would probably be workable, but sub-par in my opinion.

I’m not opposed to keeping Git as a storage mechanism, as it does have great history tools.

The biggest drawback of custom infra would of course be the cost. The second would be learning a custom UI.

None of the above. Instead, all of these:

  1. One git repo for ALL RFCs.
  2. One branch per RFC.
  3. Forks of branches to add comments. Comments can be merged back upstream and git has tools to browse a specific uh, thingy of commits. (e.g. merging commits A and B from different branches requires a merge commit C. ignoring the merge commit, git lets you pick one of those branches to view the commit sequence within it.)
  4. This means you just branch a branch to comment on an RFC, and at the end it all gets merged back into the RFC’s history. More importantly, it’s possible to attach RFC changes to specific discussions, something we can’t do today. (except through linking to a comment, but that has issues like being hard to follow.)

Git does everything we want, including the ability to branch off discussions and create arbitrary new discussion bases on the same repo. It’d be kinda like letting internals (this forum) users create arbitrary categories and subcategories, I think. But it directly integrates with the RFC system and history.
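For readers who want to see the shape of the idea without git itself, here is a small in-memory sketch of a git-like comment DAG. It is an assumption-laden illustration and not how git stores anything: `DagComment`, its fields, and `first_parent_history` are hypothetical names. The walk only follows the first parent of each comment, which is roughly what `git log --first-parent` does for commits.

```rust
use std::collections::HashMap;

/// A comment in a git-like DAG. `parents` is empty for a root comment,
/// has one entry for a normal reply, and two or more for a "merge" that
/// brings several branches of discussion together. Names are hypothetical.
struct DagComment {
    parents: Vec<u32>,
    body: String,
}

/// Walk from `tip` back to a root, always following the first parent.
/// This views one branch's history while skipping whatever was merged in.
/// Assumes the graph is acyclic, as a git history is.
fn first_parent_history(comments: &HashMap<u32, DagComment>, tip: u32) -> Vec<u32> {
    let mut chain = Vec::new();
    let mut current = Some(tip);
    while let Some(id) = current {
        match comments.get(&id) {
            Some(comment) => {
                chain.push(id);
                current = comment.parents.first().copied();
            }
            // unknown id: stop rather than panic
            None => break,
        }
    }
    chain
}
```

If comment 4 merges branches 2 and 3, both replies to root comment 1, then walking from 4 yields 4, 2, 1 and leaves 3 out of that particular view.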

I do believe we need better tools to discuss RFCs and make decisions.

Right now, RFCs often get so many comments that GitHub collapses several hundred of them. However, RFCs are discussed on other forums, too. This means that nobody has time to read all the discussions on all forums.

I’m proposing a tool to file RFCs, discuss them, and reach a consensus. This tool could also be used for everything else that requires a decision by the community. In this tool, an RFC consists of three parts: The wiki, the feelings section, and the discussion. The wiki describes the RFC in detail and is editable by anyone.

In the feelings section, users can express their feelings (e.g. “this is confusing” or “this adds unnecessary complexity” or “nobody needs this” or “this would help me a lot” or “I prefer the syntax with the question mark”). Feelings consist of one sentence and can be voted on (totally agree / somewhat agree / somewhat disagree / totally disagree). The feelings are then sorted by how many users agree with them. They basically act as a survey that points out the most frequent concerns.
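A minimal sketch of how the feelings ranking could work, under the assumption that "agree" means either of the two agreeing options. The `Vote` and `Feeling` types and the `rank_by_agreement` function are hypothetical names made up for illustration.

```rust
/// The four vote options described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Vote {
    TotallyAgree,
    SomewhatAgree,
    SomewhatDisagree,
    TotallyDisagree,
}

/// A one-sentence feeling plus the votes cast on it.
struct Feeling {
    text: String,
    votes: Vec<Vote>,
}

impl Feeling {
    /// Number of voters who agree at all (totally or somewhat).
    fn agree_count(&self) -> usize {
        self.votes
            .iter()
            .filter(|v| matches!(v, Vote::TotallyAgree | Vote::SomewhatAgree))
            .count()
    }
}

/// Sort feelings so the ones most users agree with come first.
fn rank_by_agreement(feelings: &mut [Feeling]) {
    // sort_by_key is stable, so ties keep their original order
    feelings.sort_by_key(|f| std::cmp::Reverse(f.agree_count()));
}
```

Counting raw agreement is only one possible ranking. Weighting "totally" above "somewhat", or subtracting disagreement, would be equally easy to swap in.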

The comments section should be a tree (reddit-like) to allow structuring long discussions.

I’d like to know what you think about this, and what else you think is important. I’d volunteer to create this tool (but hopefully with some input and help).

The comments section should be a DAG (git-like).

@Soni, while the comments being attached to changes is cool, I don’t see how it addresses the following issues (which I think are the most pressing issues):

  • Comments being disorganized (the same point made repeatedly)
  • Comments being widely-scattered (in multiple locations)

It does raise the bar for adding comments, which maybe reduces both issues… but I don’t think curtailing feedback is a good way to manage community involvement.

I do think it works great for when you have a small number of contributors for any given area of a repository, but the issue that we’re suffering from (but it is a good issue to have!) is the large number of participants in individual RFCs.

I find myself firmly in the camp that a centralized solution is the best for managing RFCs - perhaps you could explain how the RFC process would benefit from the decentralized DAG format you’re suggesting?


@Aloso, I think those are great ideas. It’d probably be cool to be able to branch discussions off of the feelings points. We probably want some way to let mods modify the discussion (with open history - the point of the modifications wouldn’t be to censor, but to organize). My specific inspiration for this is that in Zulip (chat program), all messages in a room need a topic, and if you forget, another person can change the topic for said message so it ends up in the right thread.

Of course we could do something similar. I like the idea of adding tags to comments. Then someone can view all comments with a specific tag. With a tree-like discussion, we could also encourage people to start one top-level discussion for each topic, so it's structured automatically.
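Viewing all comments with a specific tag in a tree-like discussion could look roughly like this. It is only a sketch: `TreeComment` and `ids_with_tag` are hypothetical, and a real tool would presumably query a database instead of walking an in-memory tree.

```rust
/// A comment in a reddit-like tree. The names are hypothetical.
struct TreeComment {
    id: u32,
    tags: Vec<String>,
    replies: Vec<TreeComment>,
}

/// Collect the ids of every comment carrying `tag`, depth first,
/// so results come out in the order a reader would scroll past them.
fn ids_with_tag(thread: &[TreeComment], tag: &str, out: &mut Vec<u32>) {
    for c in thread {
        if c.tags.iter().any(|t| t.as_str() == tag) {
            out.push(c.id);
        }
        // descend into replies whether or not the parent matched
        ids_with_tag(&c.replies, tag, out);
    }
}
```

Each element of the top-level slice plays the role of one top-level discussion, so the structure comes from the tree and the tags only add a cross-cutting view.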

True, true. I was mostly thinking for people who come and make a comment without reading all the threads. Of course if we do have threaded discussion, that’s a lot less likely.

I would like to link a post with my draft ideas regarding a service for improving the RFC process:

Of course it will be hard to build such a service from the ground up, so I do not propose to start building it right now, but think of it as a general direction in which we should probably move.

It literally attaches parents and children to comments, and you can have multiple branches going on at once. How does that not solve the disorganization problem? (each branch is a separate topic, sort of. you can even merge topics or address many of them at once!)

As for comments being widely-scattered, GitHub can actually display these quite well, if with a few filtering issues. (click the number beside “fork” on a repo. the UI is mostly already there, so they have very little work to do on their end to make the experience better for us.)

@Soni How are the parent-child comments stored - as empty commits and it’s the message? In a text file? In a different text format (json, yaml, maybe folders?) to handle nesting? Anything flat and you’ll probably have tons of merge conflicts.

I consider the current GitHub UI for managing forks, as it relates to using it to explore and get a good overview of branched changes in this scenario, pretty atrocious, no matter how little work it would take for GitHub to cater to our needs (which I think would be a lot, and even if it was a little, we shouldn’t base our workflow on a potential change another organization hasn’t yet made).

Your approach comes off (to me) as “this is a good idea, and so we should make it work”. I’d like to approach the issue from more of a “what do we want, and what’s the best way to implement it?” angle. If we can find a way to make it fit into Git after the fact, after we’ve gathered all the “requirements” (quotes because it’s more like aspirations than requirements, and I think not everything will make the cut), then I’m all for it. I do worry that by committing (heh, I love puns) to Git early we’ll leave features off the table that could be useful. I think that at the very least, in order to use Git we’d have to roll our own UI as a frontend for it, so the format stays standardized.