Follow up: the Rust Platform

My post earlier this week generated a lot of feedback, much of it negative. I really appreciate the detailed thoughts that people gave, which drew out many drawbacks and outlined some simpler alternatives. I want to summarize some of that feedback, then revisit what I see as the core goals. I’ll catalog some of the alternative ideas that people have proposed, as well as a couple I’ve been mulling over.

First, to recap the original proposal: the main thrust was to develop a kind of “extended standard library” made up of curated crates from the ecosystem. The platform would be available as a single “Cargo metapackage” snapshotting a collection of compatible versions, enabling integration testing and documentation cross-referencing. The hope was to provide the experience of a large standard library, without the pitfalls that tend to accompany it.

The feedback

Here are some of the concerns people raised:

  • Let’s start with lower-hanging fruit. A common refrain from the Rust community was: we love the combination of Rust’s minimal core, plus incredibly easy access to the ecosystem through Cargo. If discoverability, integration, or quality are issues, let’s address them by making our existing tools better, e.g. by improving search or ranking within crates.io.

  • Picking winners has significant downsides. The platform would introduce a sharp cliff between the crates it includes and the rest of the ecosystem. That introduces several risks. First, in precisely the cases where it’s most valuable – where there is no clear “winner” in an area – it risks ratcheting up the pressure and contention by trying to centrally decide on one. Second, it reinforces first-mover advantage, which could lead to the same kind of stagnation one sees in large standard libraries. Finally, it’s time consuming, and that time might be better spent working to improve the libraries themselves.

  • The similar Haskell platform is not a rousing success. Many from the Haskell community brought up the recent transition toward Stackage and general feeling that the Haskell Platform had a lot of significant problems. However, there are fundamental differences in the way that Rust’s dependency management works that make it hard to know what lessons to draw. As one Haskell user wrote: “the problem that Haskell Platform created was that only a single version of critical packages could exist in the central store at a time.” Cargo doesn’t impose any such restrictions, and indeed a lot of the details of my original post were precisely about the ability to easily juggle versions. Regardless of what direction we take, though, we should learn where we can from Haskell’s experience.

With those concerns in mind, I want to revisit what we’re trying to achieve, and think broadly about how we get there.

Revisiting the goals

Starting at a high level, the goal in my mind is for the Rust community to:

  • Provide commonly-desired libraries and tools;
  • Make them easy to find, acquire, and use;
  • Make them high quality;
  • Make them work well together.

These are all fairly obvious things to work toward, but I want to underscore that lack of maturity in these areas was a major theme in the 2016 Survey, and is also something the core team hears constantly when talking to existing and potential Rust users.

Now, work toward this goal is and will continue to happen organically over time. Maturity takes time, and it’s still early days.

But I think it’s worth looking at the big picture to see how we can rally as a community around these goals, both in terms of building global infrastructure, and in terms of lending muscle to particular tools/libraries. The platform proposal was one attempt to do this, but perhaps an over-reaching one.

So I want to try to focus in on three important things that the platform proposal was trying to do, to look at how else we might get there.

Improving discoverability

“Discoverability” is not just about finding any crate that covers a need. You want to be reasonably confident that the crate is robust, maintained, documented, and that it will work well with the rest of the ecosystem. “Blessing” a small set of crates with these attributes is one way to provide such confidence, but by no means the only one.

Many people responding to the original post proposed improving discoverability on crates.io (or some other site) by providing clearer ranking, and better categorization and search. An example site that already does this for Rust is Awesome Rust.

Of course, finding good heuristics for ranking is a hard problem, and risks cementing first movers. The number of github stars on a project doesn’t necessarily tell you whether it plays well with the ecosystem or is well-maintained. But it’s also conceivable to include a lot of information about a package, e.g. in the form of badges for platform compatibility, or even information about compatibility with other libraries. There’s a spectrum between fully automatic ranking and pure curation, and we can iterate until we find the right balance. Ember Observer was raised as a good example of this kind of mix – though it’s worth noting that the project is an enormous effort.

Improving discoverability on crates.io is forward-compatible with having a small set of “blessed crates” should we decide to go in that direction later on. But it’s a more conservative place to start, and one that benefits the whole ecosystem right away. I think we should do it!

Improving quality

Another aspect of the platform idea was improving quality by focusing attention on a small number of core crates, for which we can provide integration testing, integrated documentation, standardization, and a high level of review/scrutiny.

In at least some of these cases, we might take steps in this direction in an ecosystem-wide way:

  • chriskrycho points out that ongoing work to globally host documentation for crates.io has the potential to provide an integrated docs experience, ecosystem-wide. Ideally, when an API in one crate uses a type from another, the documentation makes that type a link to the other crate. More generally, highest leverage comes from focusing on tools that improve quality across the board.

  • For basic quality measures, like having docs, cross-platform compatbility, CI, and so on, we can use badges or some other way of flagging quality within a crate discovery system, much like Ember Observer.

  • There’s a lot we can do to standardize things like API conventions, documentation, or even the use of Cargo feature flags for crate integration. We did a lot of this kind of work when stabilizing the standard library for 1.0, but there’s no reason we can’t continue to push on it in a more global way. And in fact, there have been recent RFCs doing exactly that for crate documentation. We should be on the lookout for emerging idioms that need standardization, and we should revive efforts to gather guidelines in a central place.

The Rust Ecosystem as a product

One goal of the platform idea was to emphasize that Rust is much more than rustc, Cargo and std. The arrival of a new async IO library can have as much impact as shipping a new language feature. As a community, we need to make sure we’re thinking clearly about this bigger picture, and organizing our efforts accordingly.

I’ve got some thoughts on what that means and how to do it without the “platform” concept, but they really belonged as part of the roadmap discussion. So I’ve added a comment over there.

Where to go from here?

This post was mainly meant to recognize the negative reception, and go back and look at the goals/constraints, exploring some other ideas. I’m hoping we can continue iterating in this design space; I continue to think there’s really important work to be done here.

35 Likes

One heads up: I’m going to be away for the next 1.5 weeks, but I wanted to get out a response to all the feedback before I left. I’m looking forward to jumping back into this discussion soon!

This looks really great @aturon. Thanks for staying on top of the feedback and keeping the conversation moving and focused.

3 Likes

and we should revive efforts to gather guidelines in a central place.

How about we add a conventions directory to src/doc for the Rust repository?

I've already put a loose hanging rustc-ux-guidelines.md in src/doc, so it's not without precedent to have them stored there.

Where does stdx fit in with all of this? Maintenance seems to have stagnated, but it shares a number of goals with the proposed “platform”. Is there some fundamental reason why that is nonviable?

On an unrelated note, I would love central doc hosting. Documentation hosting is really a pain to setup right now. I’m going with the auto-upload gh-pages route, but I decided to have a different upload key for each repo and have them upload to their own gh-pages branches instead of a single docs repository. It seemed like a more robust solution at the time but it’s a pain to set up right now because setting up a repo entails generating a new RSA key, encrypting it with Travis, and modifying the scripts to use it. I could probably write a shell script to automate it but it generally feels like it shouldn’t be a permanent solution.

I've brought this up before, but I think there are language changes that can make this part of the goal easier to accomplish. My goto example here is something like Serde or Diesel deserializing dates. chrono has become the defacto standard for this, but right now the languages forces unnecessary dependencies/coupling there, because of the orphan rule. An implementation of Deserialize or FromSql<DateTime> for chrono::NaiveDateTime has to be in either the crate providing the trait, or in chrono itself. There's no reason for a deserialization or persistence library to care about which lib the community uses for dates. There's certainly no reason for a date time library to care about these concerns. The ability to have a serde-chrono or diesel-chrono library as the glue layer separately would vastly ease this capability, but would require having an escape hatch for the orphan rule.

9 Likes

Seems reasonable (though personally I'd like to see user-facing docs moving out of the main repo). The most difficult part about standardizing conventions is the politics though.

It's conceptually a precursor. At the moment it can't be used as an extended standard library because suitable technology doesn't exist in the stack. Needs cargo metapackages. But it or something like it could fill the same role as the proposed platform metapackage, unofficially.

There's a promising, completely automatic solution for this in the wings by the author of crates.fyi. Not sure if it's been announced publicly yet, but it's going to be pretty nice.

3 Likes

I mean, someone has to care. If we were to say that that someone should be a third party, we run into huge problems where you and I can both act as that third party, splitting the ecosystem, because our libraries are totally incompatible (you can't even build a crate relying on both of them, with no code!) This is even worse than the situation we have right now.

Like always, I think modifying the coherence rules to enable more blanket impls is the solution here. We should have a base library (std presumably) defining the trait DateTime, and serde should define how to Serialize any T: DateTime, operating against that abstract interface. The most effective strategy for this I know of is mutually exclusive traits.

4 Likes

That’s slightly off-topic, but in my opinion it is not a good idea to have a crate expose types that will be used for communications between libraries, or you open the gates to semver hell. See this post which came after half of the ecosystem broke because of this exact problem.

Anything that is part of the public API of a library should be either defined by the library or part of the stdlib. About the situation you exposed, I’m in favor of putting DateTime in the stdlib.

A big +1 - I don’t have much to say (other than liking the crates.io improvement and roadmap ideas, along with related thoughts) because the new proposal is a huge pivot and much leaner. So I want to register my thanks to @aturon for continuing the rust subteam trend of taking feedback on board wholeheartedly and giving a demonstration that I (as a mostly hands-off follower of rust language development) can remain hands off and trust in the subteams.

1 Like

While i agree with this statement, i'd like to point out the following: When comparing the ecosystem of Python and Ruby, i'd like to point out that it is a huge advantage for python to have a good and extensive standard library that fits 90% of small usecases. It's the reason python is often chosen over ruby on projects. You can't always access third party packages easily or have to rely on the distributions packages and having a good and curated standard library in official distribution repositories is a big advantage for many usecases. I'm not saying Rust should ship with third party libraries, but i can imagine to have a rust-stdx package which would potentially become a defacto standard and be distributed by Debian/CentOS/Ubuntu in addtion to rust would be a but plus for rust in such restricted environments.

A bit of a meta-comment: when talking about large standard libraries, people sometimes point to Python as an example of how a large standard library boost adoption (“batteries included”), but one thing people rarely seem to bring up is that it only gained a packaging system fairly late in its lifetime (Python began development in 1991, and the first package-management tool was probably easy_install.py in 2005). People clamoured to add things to the standard library because despite all the downsides, it was still less painful than trying to distribute packages any other way. And sure enough, as Python’s packaging tools have matured, there’s been less pressure to include things in the standard library (although even today, they’re still less mature than, say, Cargo).

As @aturon says, the important things are the existence and discoverability of high-quality third-party libraries. For Python, reaching that goal involved having a large, standardised collection of packages, but Rust+cargo is in a very different place, and that goal may lie in a very different direction.

10 Likes

A amendment to meta-comment: also in Python you must distribute deps to the end user of your program, and that would make a cost of non-standard library dependency higher even if Python had perfect packaging tools.

I understand where you come from, but... how do two libraries communicate with each others then?

By your statement, the only way two libraries can talk to each other is either:

  • exposing std types
  • having the user write scaffolding to translate from one library type to the other library type

The first one is extremely restrictive, the second one is extremely boiler-platey (wasteful, error-prone, and a performance issue).

I am afraid that having a library accept, in its public API, a type created by another library (such as a Matrix) cannot be avoided. To push the reasoning to an extreme, proprietary code will not come put its XyConsumer type in the standard library, and yet many of the libraries defined in their repositories might need to access such a consumer.

Composition of 3rd-party libraries is necessary.


That being said, circumventing the orphan rule may not be the best solution. After all, Rust has 0-cost newtypes.

Today automatic derivation does not work for such newtypes, because it expects the inner type to implement the trait whereas the lack of implementation is precisely the reason why the newtype is necessary in the first place. Still, that's just a short-coming of the derivation today, and maybe it could be solved generically.

(After, there's still the issue of the newtype require wrapping/unwrapping, Deref, AsRef, From, etc... can probably help here)

3 Likes

In terms of improving Cargo search, it may be worthwhile to consider whether a crate includes a repository link, description, or documentation when sorting search results.

1 Like

Java is very successful (especially in the enterprise world) at applying @withoutboats’s suggestion.

J2EE is set of additional industry standards that go beyond the regular SDK (language + stdlib). This contains standardized APIs for many global concerns that would require 3rd party libs to communicate such as - logging, DB access, serialization, dependency injection, etc.

Look at DI for example - first came Spring and than other containers followed, than a common practice and idiom was established and was dimmed worthy of standardization. Java does not actually provide any DI container, but it does define an industry standard API (via its JCP) that 3rd party vendors (Spring) conform to.

Rust could follow a similar path where common interfaces, abstraction and idioms are derived from existing best practices and added to the Rust distribution. I’d also like to adopt the distinction between the stdlib that is the most minimal basic building blocks and abstractions and a more encompassing set of interfaces as a “Rust platform”. In any case, the implementations should still be provided by 3rd party crates and the "Platform " will contain only the trait definitions that are the required glue for libs to communicate.

1 Like

Thank you for really taking the time and effort to incorporate all the feedback with an open mind. This followup post and proposal really solidify my trust in the people behind rust, and the future of the platform.

6 Likes

Not sure if this has already been suggested, but what about having a set of quality criteria for packages to be included in ‘The Platform’ which can be automatically tested?

For example:

  • a minimum level of API coverage
  • a README and Users Guide
  • unit tests
  • standalone sample code
  • integration tests (*)
  • more…?

There should be enough metadata to process this automatically, and crates.io could provide feedback to developers as to which areas need improvement before they can be eligible for inclusion.

This avoids the problem of having to bless a single ‘category leader’ (ie. particular database bindings) and puts all packages on an equal footing. The more these things can be automated, the easier it will be to detect regressions and let them compete on merits.

As for integration testing, any package declaring a dependency on another should not only declare the required version but actively test for compatibility. For example, if chrono depends on serdes then it should test the features it requires.

An automated build system can then track the most recent version for which a build works, and not update the version in the Platform set of packages until all dependent packages build successfully. Obviously this is a non-trivial problem (!) but certainly doable, and doesn’t rely on manual curation or testing.

1 Like

I think we all have to collect some deep statistics about dependencies inside our (not open source) software written in Rust. Which crates uses together? Which versions? Numbers replies to us that such a platform.

I didn’t say this before: I think the Python’s batteries are not great, because some APIs are extremelly awful. Let’s look at Tk, logging, etc. Users was forced to carry it.

I like metapackages idea, It’s cool and retains choice. Often I need something simple like crate reexport feature:

[dependencies]
libc = { version = "...", pub = true }

Yeah, and this is one of many things the platform metapackage could solve. If we end up not providing it in the default installation, this may be one specific problem we do not solve - you'll need to do at least one more step to get the common components necessary to do simple things. But that step might be as simple as adding a one liner to your Cargo.toml pulling in an unofficial collection of common crates. It's not as simple as it could be, but still an improvement over the status quo.

Shipping a platform metapackage by default in some distributions (Debian) but not others (rustup) seems ineffective since nobody will be able to rely on it being available.

This is one avenue yeah, but even if we go this sort of minimal, interface-heavy route, I am always inclined not to literally put (big) new things directly in std. For example, I don't want anything related to databases in std. If we were to define e.g. an ODBC-like interface for Rust and distribute it to everyone, I would want it in a different crate. At the same time, the way we distribute std is basically an artifact of its develeopment process and that it has unstable hooks directly into the compiler. Any crates 'above' std I strongly prefer to distribute through our standard mechanism (cargo).

Yeah, this is something we can do, and part of the intent - having a path from 'new crate' to 'crate listed in an official directory' to 'crate that is an official part of rust' gives us increasing leverage over various quality issues, improves the quality of the whole ecosystem. And as objective criteria it shouldn't cause too much teeth-gnashing about favoritism.

Lots of great feedback here. Thanks, all.

2 Likes