Crates.io Discoverability and Engagement: starting the conversation


#1

crates.io should serve as a hub for community involvement.

After the discoverability portion of @aturon’s talk at RustConf 2016, my head has been swirling with ideas for improving crates.io. Imagine if the site…

  • helped guide you to the “best” choice for your goal, be it a fast track to an MVP, ease of use for beginning developers, or contributing to or otherwise improving community projects,
  • suggested community efforts that need or want help in some area, and
  • highlighted high-quality crates without marginalizing those in need of love (arguably in line with the core community pillar of inclusiveness)

all in an easy-to-use, browsable interface that encourages participation. I was impressed by RustConf’s speakers’ emphasis on community, and I’d like to see that extended to the (currently rather impersonal) process of crate selection.

Engaging the community prevents fragmentation and increases cooperation and productivity.

As a community (yes, there’s that word again), we have a mandate to not only guide users to the best choice for them, but also to direct focus wherever it would most benefit the ecosystem as a whole; we should probably discuss the correct course of action should these goals ever conflict.

How many times have you given up finding the perfect crate on crates.io and begun writing your own, simply because it was too hard to find what you wanted or too difficult to determine which of the available crates were actually worth investigating? I’ve experienced this at least once in the short time I’ve been using Rust, and I doubt I’m alone. It probably encourages “Not Invented Here” syndrome, which most people should and do regard with contempt.

Fragmentation is such a concern that we may wish to set up domain-specific groups of people to curate and influence development of crates for a particular domain; I’ll defer to discussion for the details of this.

The proposed features aren’t just vacuous marketing speak: let’s brainstorm and collect data.

As we should be well aware, “data-driven” doesn’t mean “unopinionated”; I propose that we now discuss and brainstorm the

  • features we want from such a community-focused reboot of crates.io, and
  • metrics that could help automate the description of a crate’s attributes, from “availability of documentation and third-party guides” to “openness to contributions”, plus anything else we might want to know about a crate and its micro-community.

Please try to avoid bikeshedding in this thread; I’d rather do that later when it’s clear what everyone’s version of this proposal looks like.


#2
  • I think it’s also useful to list things that we don’t want to see on Crates.io. My first thought when I hear “community engagement” is (for some reason) “adding social features”. I’m not assuming that anybody is actually proposing this idea, but just to be clear, I wouldn’t like to see a rating system like the ones in Google Play or WordPress Plugins on Crates.io. Those examples in particular only “engage” commenters who are unlikely to participate in the community in a constructive way. Even with the free-form text field removed (leaving just the star rating), I don’t think there’s actually anything to learn from those other than to never look at comment sections. On a related note, I would say that crate authors have the “right” to define their own communities, and that Crates.io should avoid interfering with that by forcing its own community on authors when all they want to use is the package index.

    I guess the reason such things are implemented in the first place is to give stats about which crates are popular/“any good” and which are not. As far as I’m concerned the download count has done this job well enough for Python’s package index PyPI and Crates.io. It would be nice to count multiple downloads from the same IP address as a single download, to account for automated builds. Breakdown by country, Rust version…

  • What I’ve also found interesting are regression reports (example). For some reason we have the resources to just build almost all of Crates.io. I wonder if this data could be fed back into Crates.io, to determine which crates still build under the latest Rust version and which do not. On a related note I’d like to see in Crates.io which crates require nightly features and which don’t.

  • When I determine whether a project is still alive, I find GitHub’s “Pulse” stats very useful. I wonder if this data can be put onto Crates.io as well.

  • npm has pretty informative statistics (including some data from GitHub). I also find the week/month/year breakdown much more “useful” (for some definition of that word) than the graph Crates.io currently has. I suspect there’s more to steal from npm too.

  • Reverse dependencies are not exposed in Crates.io’s UI. I would love to have a list of the most popular (most downloaded) reverse dependencies of a package.

  • https://libraries.io/, a package-index-index, has some useful search filters (search by license, keyword, …). Also shows reverse dependencies!
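
To make the "one download per IP address" idea above concrete, here is a minimal sketch (in Python, purely for brevity). The record shape, the sample data and the function name are all assumptions made up for illustration; none of this reflects how Crates.io actually stores or counts downloads.

```python
from collections import defaultdict


def unique_downloads(records):
    """Count downloads per crate, treating repeats from one IP as one download.

    `records` is an iterable of (crate_name, ip_address) pairs. This shape is
    hypothetical, not the real Crates.io log format.
    """
    seen = defaultdict(set)  # crate name -> set of IPs that downloaded it
    for crate, ip in records:
        seen[crate].add(ip)
    return {crate: len(ips) for crate, ips in seen.items()}


# Made-up sample log: the same machine fetches "serde" twice.
log = [
    ("serde", "203.0.113.7"),
    ("serde", "203.0.113.7"),   # e.g. a CI machine building again
    ("serde", "198.51.100.2"),
    ("rand", "203.0.113.7"),
]

print(unique_downloads(log))
```

One caveat with this approach: many distinct users behind a shared address (an office NAT, for example) would also collapse into a single download, so it trades over-counting for some under-counting.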


#3

For example:

Feature: automatically-curated list of links to crate resources

Overview

We could use the HTTP Referer header on incoming requests to identify pages that link to a particular crate; a supervised-ML classifier could then categorize each referring page and potentially list it as a source repository, API documentation, third-party overview, tutorial, in-depth guide, or other resource.

Implementation

This would require a way to aggregate Referer values on requests for particular crates, a moderate to large amount of hand-labeled data for training the classifier, and an implementation of the classifier itself.
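
As a rough sketch of the aggregation half (in Python, for brevity): the code below tallies referring URLs per crate and tags each one with a category. The `categorize` function is only a hand-written placeholder standing in for the trained classifier, and its rules, the category names, the request shape and the sample URLs are all assumptions for illustration.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def aggregate_referers(requests):
    """Tally referring URLs per crate.

    `requests` yields (crate_name, referer_or_None) pairs; this shape is
    hypothetical. Requests without a Referer header are skipped.
    """
    tallies = defaultdict(Counter)
    for crate, referer in requests:
        if referer:
            tallies[crate][referer] += 1
    return tallies


def categorize(url):
    """Placeholder heuristic standing in for the proposed ML classifier."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    path = parsed.path.lower()
    if host in ("github.com", "gitlab.com"):
        return "source repository"
    if host == "docs.rs":
        return "API documentation"
    if "tutorial" in path or "guide" in path:
        return "tutorial or guide"
    return "other"


# Made-up sample requests for one crate.
requests = [
    ("serde", "https://github.com/serde-rs/serde"),
    ("serde", "https://example.com/blog/serde-tutorial"),
    ("serde", "https://example.com/blog/serde-tutorial"),
    ("serde", None),  # no Referer header sent
]

tallies = aggregate_referers(requests)
for url, hits in tallies["serde"].most_common():
    print(hits, categorize(url), url)
```

A real version would swap `categorize` for the trained model and would probably want a minimum hit count before listing a link, so that one-off referrers don't show up on the crate page.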


#4

There’s a link on a crate page under the “links” section that goes to “dependent crates” – now, they’re sorted alphabetically and not by downloads, but does this not count as exposed in the UI? If not, why not?


#5

I completely overlooked that, yes, it does count! I might not have expected it there since all other links in that section are external and author-provided.


#6

The problem here is that I’d never heard of libraries.io until now; IMO it’s best if the official site is a strong contender rather than needing third-party websites to fill in the gaps. (To clarify, I think we should collaborate with these third-party sites rather than start from scratch.)