Rustc dependency policy

nikomatsakis · January 11, 2018, 6:02pm

So, thanks to all the hard work on ./x.py, we have the ability to pull in crates from crates.io and elsewhere. This is wonderful. However, we currently impose some pretty minimal requirements on said crates. As far as I know, we just check that they have suitable licenses.

Recently, rayon encountered some problems due to its use of compiletest-rs, which turns out to be depending (on windows) on the dbghelp library. This is because of miri which uses backtrace. This strikes me as a kind of warning: our set of dependencies is rapidly growing and I don’t know that we are paying any sort of attention. This increases brittleness.

I am wondering if we should consider a whitelist on the transitive crates we employ. I feel like adding a new dependency to rustc should be a “more momentous” (but not necessarily very momentous) decision – perhaps one that we FCP.

Thoughts? Am I over-reacting?

UPDATE: @retep998 opened this issue describing the particular problem rayon hit. To be clear, though, I’m not saying that this problem itself is necessarily a problem, but it does seem to me like the set of dependencies is growing sort of rapidly. Not sure if this is necessarily a problem though.

phaylon · January 11, 2018, 6:06pm

The “dependency allowed to exist” part of this recent post seems relevant. At least in the sense that there is certainly a use-case for this level of control.

bluss · January 11, 2018, 6:09pm

compiletest_rs is just a dev-dependency of rayon, and just intuitively, I don’t think it should be pulled in, in any way.

dan_t · January 11, 2018, 6:21pm

Thoughts? Am I over-reacting?

I don't think so. Every dependency might cause work in the future.

Trying to keep dependencies low certainly is a good idea, but perhaps even more important to ensure that they're maintained.

nikomatsakis · January 11, 2018, 9:15pm

yes, it’s a dev-dependency. The problem we were encountering was travis failures. We use it to check that things are failing to compile when they should. I’m interested in other solutions to that problem, but that’s kind of a separate question.

Mostly what I’m trying to ask here is whether we should try to at least throw up some roadblock to adding new dependencies, so that we have a chance to look at them as they go in. I think we don’t have to be sticklers about it, but I suspect many of them are kind of “accidental”.

In this case, for example, miri depends on backtrace, but that’s not really needed from within the compiler; I think it’s used for prettier errors. We could have potentially made it an optional feature.

(Should we have? Not sure.)

alexcrichton · January 11, 2018, 9:18pm

I would personally be a fan of trying out a whitelist for compiler dependencies (false positives through cargo/rustfmt/rls/etc shouldn’t affect this). The compiler dependencies end up being much trickier in practice I think than rls/cargo/etc because we continue to use them to generate programs, whereas cargo/rls are just standalone programs.

I’m also starting to get worried about our build times for the compiler itself. Whenever we put more dependencies in the compiler those are crates we have to build and it can take longer and longer to do that. I’ve just done some analysis which shows that we’re clearly regressing on how long it takes to bootstrap rustc over time, it’s just not clear to me what’s actually causing the problem. I don’t think pulling in tons of crates from crates.io help much, however.

Could we perhaps try an explicit whitelist for awhile of allowed crates? See how it goes? If it’s too onerous we can reconsider and otherwise I’d agree that having a double check for “are you sure you wanted to add this dependency?” would be a good idea.

nikomatsakis · January 12, 2018, 9:29am

I'm in favor of trying it out. Is it hard to do? =) I'd hate to add more onto your plate personally, I'm not really following who hacks around the build system though.

alexcrichton · January 12, 2018, 3:18pm

Hm it won’t be too hard but I think it will be nontrivial.

We’ve already got dependency checking which verifies license, so it should be mostly just adding an array next to EXCEPTIONS for a whitelist (and then checking the names of directories and such). The trick is that we only want to verify the dependencies of rustc itself, so we’d have to run cargo metadata to learn about the dependency graph. The build system already does that though so the logic could perhaps just be moved there.

mark-i-m · February 22, 2018, 4:48pm

Coming back to this: I think this is a huge security hole in the style of Reflections on Trusting Trust.

nikomatsakis · February 22, 2018, 5:03pm

Would you be up for adding a whitelist, @mark-i-m? @alexcrichton laid out some directions here.

mark-i-m · February 22, 2018, 5:11pm

I guess it’s more of a question of vetting. I don’t think a whitelist is sufficient. Before we upgrade any dependency we would want to make sure nothing fishy was added. Moreover, we don’t want to transitively trust huge numbers of crates, so we would need to then vet dependencies of dependencies, and so on (i.e. the whole DAG)…

That sounds like a huge effort, but it does come with a lot of benefits:

we know exactly what we depend on and what we don’t
we might get ideas for where to trim some of the excess weight
it incentivizes not adding dependencies -> faster builds, smaller binaries

I’m happy to try to start this, but my bandwidth is a bit low ATM, so it might be kind of slow…

nikomatsakis · February 22, 2018, 5:15pm

With respect, that sounds like a whitelist. =) Yes, the whitelist is not just the name of the crate, but the revision. And yes, it is transitive.

nikomatsakis · February 22, 2018, 5:16pm

Well, I guess – there is some debate about how to set it up

But at minimum it does seem like it should include the revision

mark-i-m · February 23, 2018, 2:01am

mark-i-m · February 24, 2018, 12:00am

Wow… we have a lot of dependencies:

* advapi32-sys 0.2.0
* aho-corasick 0.5.3
* aho-corasick 0.6.4
* alloc 0.0.0
* alloc_jemalloc 0.0.0
* alloc_system 0.0.0
* ansi_term 0.10.2
* ar 0.3.1
* arena 0.0.0
* atty 0.2.6
* backtrace 0.3.5
* backtrace-sys 0.1.16
* bin_lib 0.1.0
* bitflags 0.7.0
* bitflags 0.9.1
* bitflags 1.0.1
* bootstrap 0.0.0
* borrow_error 0.1.0
* bufstream 0.1.3
* build-manifest 0.1.0
* build_helper 0.1.0
* byteorder 1.2.1
* cargo 0.26.0
* cargo_metadata 0.2.3
* cargo_metadata 0.4.0
* cargotest 0.1.0
* cargotest2 0.1.0
* cc 1.0.4
* cfg-if 0.1.2
* chrono 0.4.0
* clap 2.29.0
* clippy 0.0.186
* clippy-mini-macro-test 0.2.0
* clippy_lints 0.0.186
* cmake 0.1.29
* coco 0.1.1
* commoncrypto 0.2.0
* commoncrypto-sys 0.2.0
* compiler_builtins 0.0.0
* compiletest 0.0.0
* compiletest_rs 0.3.6
* completion 0.1.0
* core 0.0.0
* core-foundation 0.4.6
* core-foundation-sys 0.4.6
* crates-io 0.15.0
* crossbeam 0.2.12
* crossbeam 0.3.2
* crypto-hash 0.3.0
* curl 0.4.11
* curl-sys 0.4.1
* deglob 0.1.0
* derive-new 0.5.0
* diff 0.1.11
* dlmalloc 0.0.0
* docopt 0.8.3
* dtoa 0.4.2
* duct 0.8.2
* either 1.4.0
* endian-type 0.1.2
* enum_primitive 0.1.1
* env_logger 0.3.5
* env_logger 0.4.3
* env_logger 0.5.3
* error-chain 0.11.0
* error-chain 0.8.1
* error_index_generator 0.0.0
* failure 0.1.1
* failure_derive 0.1.1
* features 0.1.0
* filetime 0.1.15
* find_all_refs_no_cfg_test 0.1.0
* find_impls 0.1.0
* flate2 1.0.1
* fmt_macros 0.0.0
* fnv 1.0.6
* foreign-types 0.3.2
* foreign-types-shared 0.1.1
* fs2 0.4.3
* fuchsia-zircon 0.3.3
* fuchsia-zircon-sys 0.3.3
* futures 0.1.17
* getopts 0.2.15
* git2 0.6.11
* git2-curl 0.7.0
* glob 0.2.11
* globset 0.2.1
* graphviz 0.0.0
* hamcrest 0.1.1
* handlebars 0.29.1
* hex 0.2.0
* hex 0.3.1
* home 0.3.0
* idna 0.1.4
* if_chain 0.1.2
* ignore 0.3.1
* infer_bin 0.1.0
* infer_custom_bin 0.1.0
* infer_lib 0.1.0
* installer 0.0.0
* is-match 0.1.0
* itertools 0.6.5
* itertools 0.7.6
* itoa 0.3.4
* jobserver 0.1.9
* json 0.11.12
* jsonrpc-core 8.0.1
* kernel32-sys 0.2.2
* languageserver-types 0.30.0
* lazy_static 0.2.11
* lazy_static 1.0.0
* lazycell 0.5.1
* libc 0.0.0
* libc 0.2.36
* libgit2-sys 0.6.19
* libssh2-sys 0.2.6
* libz-sys 1.0.18
* linkchecker 0.1.0
* log 0.3.9
* log 0.4.1
* log_settings 0.1.1
* lzma-sys 0.1.9
* matches 0.1.6
* mdbook 0.1.2
* memchr 0.1.11
* memchr 2.0.1
* miniz-sys 0.1.10
* miow 0.2.1
* miri 0.1.0
* multiple_bins 0.1.0
* net2 0.2.31
* nibble_vec 0.0.3
* nix 0.8.1
* num 0.1.41
* num-bigint 0.1.41
* num-complex 0.1.41
* num-integer 0.1.35
* num-iter 0.1.34
* num-rational 0.1.40
* num-traits 0.1.41
* num_cpus 1.8.0
* open 1.2.1
* openssl 0.9.23
* openssl-probe 0.1.2
* openssl-sys 0.9.24
* os_pipe 0.5.1
* owning_ref 0.3.3
* panic_abort 0.0.0
* panic_unwind 0.0.0
* parking_lot 0.5.3
* parking_lot_core 0.2.9
* percent-encoding 1.0.1
* pest 0.3.3
* pkg-config 0.3.9
* proc_macro 0.0.0
* profiler_builtins 0.0.0
* pulldown-cmark 0.0.15
* pulldown-cmark 0.1.0
* quick-error 1.2.1
* quine-mc_cluskey 0.2.4
* quote 0.3.15
* racer 2.0.12
* radix_trie 0.1.2
* rand 0.3.20
* rayon 0.9.0
* rayon-core 1.3.0
* redox_syscall 0.1.37
* redox_termios 0.1.1
* reformat 0.1.0
* reformat_with_range 0.1.0
* regex 0.1.80
* regex 0.2.5
* regex-syntax 0.3.9
* regex-syntax 0.4.2
* remote-test-client 0.1.0
* remote-test-server 0.1.0
* rls 0.125.0
* rls-analysis 0.11.0
* rls-blacklist 0.1.0
* rls-data 0.15.0
* rls-rustc 0.2.1
* rls-span 0.4.0
* rls-vfs 0.4.4
* rustbook 0.1.0
* rustc 0.0.0
* rustc-ap-rustc_cratesio_shim 29.0.0
* rustc-ap-rustc_data_structures 29.0.0
* rustc-ap-rustc_errors 29.0.0
* rustc-ap-serialize 29.0.0
* rustc-ap-syntax 29.0.0
* rustc-ap-syntax_pos 29.0.0
* rustc-demangle 0.1.5
* rustc-main 0.0.0
* rustc-serialize 0.3.24
* rustc_allocator 0.0.0
* rustc_apfloat 0.0.0
* rustc_asan 0.0.0
* rustc_back 0.0.0
* rustc_binaryen 0.0.0
* rustc_borrowck 0.0.0
* rustc_const_eval 0.0.0
* rustc_const_math 0.0.0
* rustc_cratesio_shim 0.0.0
* rustc_data_structures 0.0.0
* rustc_driver 0.0.0
* rustc_errors 0.0.0
* rustc_incremental 0.0.0
* rustc_lint 0.0.0
* rustc_llvm 0.0.0
* rustc_lsan 0.0.0
* rustc_metadata 0.0.0
* rustc_mir 0.0.0
* rustc_msan 0.0.0
* rustc_passes 0.0.0
* rustc_platform_intrinsics 0.0.0
* rustc_plugin 0.0.0
* rustc_privacy 0.0.0
* rustc_resolve 0.0.0
* rustc_save_analysis 0.0.0
* rustc_trans 0.0.0
* rustc_trans_utils 0.0.0
* rustc_tsan 0.0.0
* rustc_typeck 0.0.0
* rustdoc 0.0.0
* rustdoc-themes 0.1.0
* rustdoc-tool 0.0.0
* rustfmt-nightly 0.3.8
* same-file 0.1.3
* same-file 1.0.2
* schannel 0.1.10
* scoped-tls 0.1.0
* scopeguard 0.1.2
* scopeguard 0.3.3
* semver 0.6.0
* semver 0.8.0
* semver 0.9.0
* semver-parser 0.7.0
* serde 1.0.27
* serde_derive 1.0.27
* serde_derive_internals 0.19.0
* serde_ignored 0.0.4
* serde_json 1.0.9
* serialize 0.0.0
* shared_child 0.2.1
* shell-escape 0.1.3
* shlex 0.1.1
* smallvec 0.6.0
* socket2 0.3.0
* stable_deref_trait 1.0.0
* std 0.0.0
* std_unicode 0.0.0
* strsim 0.6.0
* syn 0.11.11
* synom 0.11.3
* synstructure 0.6.1
* syntax 0.0.0
* syntax_ext 0.0.0
* syntax_pos 0.0.0
* syntex_errors 0.52.0
* syntex_pos 0.52.0
* syntex_syntax 0.52.0
* tar 0.4.14
* tempdir 0.3.5
* term 0.0.0
* term 0.4.6
* termcolor 0.3.3
* termion 1.5.1
* test 0.0.0
* textwrap 0.9.0
* thread-id 2.0.0
* thread_local 0.2.7
* thread_local 0.3.5
* tidy 0.1.0
* time 0.1.39
* toml 0.2.1
* toml 0.4.5
* toml-query 0.6.0
* unicode-bidi 0.3.4
* unicode-normalization 0.1.5
* unicode-segmentation 1.2.0
* unicode-width 0.1.4
* unicode-xid 0.0.3
* unicode-xid 0.0.4
* unreachable 1.0.0
* unstable-book-gen 0.1.0
* unwind 0.0.0
* url 1.6.0
* url_serde 0.2.0
* userenv-sys 0.2.0
* utf8-ranges 0.1.3
* utf8-ranges 1.0.0
* vcpkg 0.2.2
* vec_map 0.8.0
* void 1.0.2
* walkdir 1.0.7
* walkdir 2.0.1
* winapi 0.2.8
* winapi 0.3.4
* winapi-build 0.1.1
* winapi-i686-pc-windows-gnu 0.4.0
* winapi-x86_64-pc-windows-gnu 0.4.0
* wincolor 0.1.4
* workspace_symbol 0.1.0
* ws2_32-sys 0.2.1
* xattr 0.1.11
* xz2 0.1.4
* yaml-rust 0.3.5

cuviper · February 24, 2018, 12:04am

Are we trying to control that whole list, or just those that are specifically built into rustc?

mark-i-m · February 24, 2018, 12:29am

That’s the full list of dependencies from cargo metadata --manifest-path rust/src/Cargo.toml…

cuviper · February 24, 2018, 12:43am

Yeah, I figured it was something like that, but that includes dependencies of every in-tree tool too. For instance, xz2 is only used by the installer to generate tarballs; it’s not in anything we ship to users.

Maybe we do want to track all that fully – that’s what I’m asking.

mark-i-m · February 25, 2018, 1:15am

Hmm… that’s a great question. @nikomatsakis @alexcrichton ?

This does raise an interesting question: is xz2 actually ok to use in the compiler? The whitelist prototype based on cargo metadata can’t differentiate…

matklad · February 25, 2018, 6:20am

You can use “resolve” field of metadata to lerarn about precise dependencies of each package

Topic		Replies	Views
Warning: dependencies! language design	9	1425	December 18, 2020
@notriddle's Rust 2020 wishlist, or, Rust 2021: Integration community	10	2044	February 3, 2020
More useful errors when no_std crate depends on std compiler	2	328	September 12, 2024
Idea: Light-weight reusable dependencies tools and infrastructure	13	1198	October 30, 2022
Compiler and language stability guarantees instead of LTS	26	1930	August 19, 2024

Rustc dependency policy

Related topics