FOSS hell -> std lib evolution


#1

As a hobby project, I have developed a work-related tool in Rust, and my company is now looking into the possibility of shipping it to our customers.

However, there is a major issue with it, something we call the “FOSS hell”. My company has a policy that, for each dependency and for Rust std lib itself, it needs a FOSS analysis and approval to use, which involves checking the license, checking the source code to make sure it does not break export restrictions in regards to strong encryption etc.

More than that, the approval to use is valid only for a specific version (e.g. 1.1) and has to be repeated for new major or minor versions (e.g. 1.2).

This takes a huge amount of time, mainly due to the number of small crates that are very common in the Rust world (I pasted the list of dependencies my tool has). I think this will become a major problem for Rust in the enterprise area.

I think the std lib has to grow faster by incorporating 3rd party libraries, e.g. an HTTP client, SSL support, logging, regex, and a good serialiser/deserialiser, with long term support plans etc.

What is the long term plan for Rust in regards to std lib evolution?

Here is my list of dependencies:

unicase v1.1.0
threadpool v0.1.4
unicode-width v0.1.3
pkg-config v0.3.6
httparse v1.1.0
strsim v0.3.0
matches v0.1.2
language-tags v0.2.0
regex-syntax v0.2.2
gcc v0.3.20
typeable v0.1.2
libc v0.2.2
memchr v0.1.7
ansi_term v0.7.0
libz-sys v1.0.0
openssl-sys-extras v0.7.1
cmake v0.1.11
bitflags v0.1.1
winapi-build v0.1.1
aho-corasick v0.4.0
kernel32-sys v0.2.1
envvar v0.1.2
libssh2-sys v0.1.34
rustc-serialize v0.3.16
openssl-sys v0.7.1
regex v0.1.43
unicase v1.1.0
winapi v0.2.5
time v0.1.34
advapi32-sys v0.1.2
traitobject v0.0.1
lazy_static v0.1.15
ws2_32-sys v0.2.1
strsim v0.4.0
log v0.3.4
bitflags v0.3.3
openssl v0.7.1
vergen v0.0.16
rand v0.3.12
env_logger v0.3.2
num_cpus v0.2.10
hpack v0.2.0
solicit v0.4.4
ssh2 v0.2.10
docopt v0.6.78
uuid v0.1.18
num v0.1.28
url v0.5.0
cookie v0.2.2
serde v0.6.6
mime v0.1.1
hyper v0.7.0

#2

If you combine multiple crates into one, you’re not actually reducing the amount of code, so the effort required to review the combined crate should be no more than reviewing the individual crates. The same holds for integrating crates into std.


#3

First, those crates (regex, etc…) aren’t part of libstd. Currently, libstd doesn’t depend on anything outside of the core rust repository.

As for licenses and export restrictions, you’re just going to have to check every time. Hopefully we can eventually get some form of cargo ianal to check whether the claimed licenses are compatible, but that doesn’t guarantee that the authors of any crates haven’t ripped off someone else’s code, and it won’t tell you whether or not the BIS has been notified of any contained crypto (note: open source projects only need to notify the BIS that they are using crypto; they don’t need to register or get permission).
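
To make the cargo ianal idea a bit more concrete, here is a minimal sketch of the kind of check such a tool might perform. No such tool is assumed to exist; `Declared`, `flag_unapproved` and the allowlist are hypothetical names made up for illustration. It only compares the license strings that crates claim, assuming `A/B` or `A OR B` means either license may be chosen, so it says nothing about copied code or crypto notification.

```rust
use std::collections::BTreeSet;

/// A crate name paired with the license string it declares.
/// (Hypothetical type, for illustration only.)
struct Declared<'a> {
    name: &'a str,
    license: &'a str,
}

/// Returns the names of crates for which none of the declared license
/// alternatives appears in `allowed`.
/// Only the simple "A/B" and "A OR B" forms are split into alternatives;
/// any other string is compared as a single identifier.
fn flag_unapproved<'a>(crates: &[Declared<'a>], allowed: &BTreeSet<&str>) -> Vec<&'a str> {
    crates
        .iter()
        .filter(|c| {
            // A crate passes if at least one alternative is on the allowlist.
            !c.license
                .split('/')
                .flat_map(|part| part.split(" OR "))
                .any(|id| allowed.contains(id.trim()))
        })
        .map(|c| c.name)
        .collect()
}
```

Called with an allowlist of `MIT` and `Apache-2.0`, a crate declaring `MIT/Apache-2.0` passes and a crate declaring only `GPL-3.0` is returned for manual review.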

As a matter of fact, splitting things into multiple crates makes things easier because:

  1. You don’t need to re-verify shared dependencies.
  2. On update, you only need to re-verify updated dependencies.
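
A rough sketch of point 2, assuming dependency listings in the `name vX.Y.Z` shape pasted in the first post and an approval that is tied to crate name plus major.minor version, as the policy there describes. The function names (`parse_line`, `keys`, `needs_review`) are hypothetical, not part of Cargo or any existing tool.

```rust
use std::collections::BTreeSet;

/// Approval key: crate name plus major and minor version, because the
/// policy described in this thread re-approves on major or minor changes.
type Key = (String, u64, u64);

/// Parses a line such as "regex v0.1.43" into an approval key.
/// Returns None for lines that do not have this shape.
fn parse_line(line: &str) -> Option<Key> {
    let mut parts = line.split_whitespace();
    let name = parts.next()?;
    let version = parts.next()?.trim_start_matches('v');
    let mut nums = version.split('.');
    let major: u64 = nums.next()?.parse().ok()?;
    let minor: u64 = nums.next()?.parse().ok()?;
    Some((name.to_string(), major, minor))
}

/// Collects the approval keys found in a dependency listing.
fn keys(listing: &str) -> BTreeSet<Key> {
    listing.lines().filter_map(parse_line).collect()
}

/// Keys present in `current` that are missing from the approved listing,
/// i.e. the only crates that would need a fresh review.
fn needs_review(approved: &str, current: &str) -> Vec<Key> {
    let approved = keys(approved);
    keys(current)
        .into_iter()
        .filter(|k| !approved.contains(k))
        .collect()
}
```

With `regex v0.1.43`, `log v0.3.4` and `serde v0.6.6` already approved, a made-up update to `regex v0.1.44` and `log v0.4.0` would send only `log` back for review, since the patch bump of `regex` keeps its major.minor key.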

#4

I agree the amount of code does not change but the perception does change. It’s one thing to use 50+ dependencies and ask the FOSS team to check them all and another thing to use 1 or 2 dependencies.

Also, one dependency (i.e. the std lib) means one license so there is no risk that e.g. some GPL code is used in a ‘sub-dependency’ even though an Apache license is used in the dependency.


#5

That policy may work for your legal team but it’s not true in general. Debian, for instance, tends not to take an upstream at their word about the license, and checks each file in case someone copied a random helper library with a conflicting license into the project (which definitely happens frequently enough to make this worthwhile!). And many other companies’ legal teams don’t need to re-check each minor version, or more commonly don’t need to go through this process at all.

Personally, I suspect that this sort of thing is going to be best handled by a company who can bundle up a bunch of dependencies and sell you a support contract. They can also watch for stability, which is the primary reason these crates aren’t in libstd, and handle security updates. If you or other companies would be interested in paying for this, you might be able to encourage such a product to exist. But it seems like bundling this into libstd wouldn’t make sense for a long while, and I expect the Rust open-source project’s priorities are not well-aligned with building a non-libstd bundle of common stuff right now, either. (Though perhaps eventually there might be something like the Haskell Platform.)


#6

This really seems to be an issue with your policy (though I agree that we could have better tooling around this) rather than an issue with the stdlib. IIRC Rust wants to keep its stdlib lean (even having “official” stdlib-ish crates like regex on crates.io); this doesn’t seem to be enough justification for it not to.


#7

On the topic of crate sizes, there is a separate but related social issue of how Conway’s law affects crates. Currently crates are very small, which has advantages but means that crates tend to be managed by one person in their spare time. Or are crates small because it’s mostly individual people managing the code? There appear to be precious few highlight projects to gather around (rustc and servo, neither of which uses cargo at the top level afaik: autotools for rustc, Mach for servo). Each crate potentially has its own coding standards, issue tracker, etc. And a lot of the bugs will be in the interactions between crates. So with the Cambrian explosion of crates, it gets harder to nail down bugs caused by those interactions.

One example: I had a bug in an iron crate where the HEAD HTTP request on a static file was returning the body of the document. Because it was an issue with static files, iirc I submitted the issue there; but on further analysis the owner of the crate said the problem was somewhere else, so he would appreciate it if I resubmitted the ticket. Thankfully they were both crates by the same person. If it were a problem between two crates whose owners have different ideas of how things should be done, with no guiding project, then there’s no reason for them to move forward together, since each project ‘works’. (cf systemd reimplementing the universe because nothing ‘works’ the way they need).

I think the pain felt by @themax is another part of the story (though I disagree with the term FOSS hell and would call it ‘compliance hell’), but the solution to both is probably a way to promote teamwork between crates so people are working together more (or at least in a way that’s more obvious to newcomers). It’s probably a social issue that most people don’t feel because they are strong independent developers, but maybe that is just survivor bias. Maybe a technical solution can bring people and crates together (chicken) or a social structure can, through Conway’s law, imply that appropriate tooling is built (egg).

(Sent from phone so please accept my apologies for spelling issues and autocorrect foibles.)


#8

Servo does use Cargo and has lots of little crates.io dependencies; it just wraps Cargo in Mach.

I am not sure that bugs spread across multiple small crates are much worse than bugs in a big, partially maintained conglomerate library, but I think @reem only asked you to repost your bug because he owned both relevant crates.

Still, an official set of curated crates is an item the library team is working on, and that should bring some official order to the “crate wild west”.


#9

However, there is a major issue with it, something we call the “FOSS hell”. My company has a policy that, for each dependency and for Rust std lib itself, it needs a FOSS analysis and approval to use, which involves checking the license,

Rust’s license doesn’t change. At most this should be a one-time event for each dependency, plus every time a license changes (a rare event).

checking the source code to make sure it does not break export restrictions in regards to strong encryption etc.

This is bizarre. Unless your company is exporting to “rogue states” or making military C&C software, no US export restrictions apply. But even if they did, it’s just another one-time license.

And you only need to register if you actually use cryptography in your end application, which every product that uses TLS technically does. So another one-time event.

More than that, the approval to use is valid only for a specific version (e.g. 1.1) and has to be repeated for new major or minor versions (e.g. 1.2).

It sounds like this policy exists mostly to give your FOSS review team jobs. It’s a waste of time and money, so complain to your manager and get credit for cost-cutting.