Typecheck all the things?


#1

TL;DR: Could the compiler typecheck, borrowcheck, … all the code sections, even those which are cfg’ed away and will not make it into the final binary?


State of the art

The compiler parses all the code; however, as far as I know, it only performs type unification, borrow checking, etc. on the code that will make it into the final binary.

This is trivially demonstrated by using a #[cfg(test)] section: cargo build will not report errors in the test code, but cargo test will.
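A minimal sketch of that behaviour follows. It uses `#[cfg(any())]`, which is always false, as a stand-in for any configuration that is currently disabled (the post talks about `#[cfg(test)]`; the stand-in just keeps the example independent of how it is built). The function names `broken` and `answer` are made up for illustration.

```rust
// `#[cfg(any())]` never holds, so this item is stripped right after parsing.
// It only has to be syntactically valid; the type mismatch inside it is
// never seen by the type checker and no error is reported.
#[cfg(any())]
fn broken() -> u32 {
    let n: u32 = "not a number";
    n
}

// This function is always compiled and therefore always type checked.
fn answer() -> u32 {
    42
}

fn main() {
    println!("{}", answer());
}
```

Changing the attribute to a configuration that is active (or removing it) should make the mismatched-types error appear, which is the effect described above for cargo build versus cargo test.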


The problem

When writing platform-specific code, this code is only verified when compiling for that specific platform. When writing feature-gated code, this code is only verified when compiling with that specific feature enabled.

This means that simply checking that all the bits of code are correct (as far as the compiler is concerned) requires compiling for every combination of platform and feature.


The question

Could the compiler perform all the type checking, borrow checking, etc… on all the code?

This would of course introduce some overhead, however:

  • this would only introduce overhead for those hacking on projects which use cfg, and only in direct proportion to the size of cfg’ed code,
  • this overhead is less than independent invocations of the compiler for each combination of platform and feature,
  • it is my understanding that type checking and borrow checking are not the bottleneck in the compilation process anyway.

Note: it might even solve the issue of double-reporting of errors when invoking cargo test.

Thus, would it be possible? And if possible, does anyone else find it desirable?


#2

This would be nice.

What do you think it should do for a crate with unstable features built using a stable compiler? Currently the following can be built with a stable compiler as long as the unstable feature is not selected. It seems undesirable that this would only compile if it matches the way specialization worked at the time of the stable Rust release you are using.

#![cfg_attr(feature = "unstable", feature(specialization))]

trait Trait {
    fn f();
}

impl<T> Trait for T {
    #[cfg(feature = "unstable")]
    default fn f() {}

    #[cfg(not(feature = "unstable"))]
    fn f() {}
}

fn main() {}


#3

As described, I don’t see how this can work. Even if cfg stripping is delayed until typechecking while compiling a crate, dependencies (already-compiled crates) will have been narrowed down to one particular configuration. For example, all std APIs that are not available on the platform you’re compiling on presumably won’t be in the metadata, and thus won’t be available for name resolution and type checking. (This is a mechanical limitation which may be solvable in principle, but would increase the implementation cost further.)

I also have another concern: different cfgs are not just a matter of having additional items defined; there are often multiple colliding definitions guarded by mutually exclusive cfgs. For example,

#[cfg(windows)]
fn create_temp_file(path: &OsStr) -> RawHandle { ... }
#[cfg(unix)]
fn create_temp_file(path: &OsStr) -> RawFd { ... }

Not only are there name collisions, the signatures are also completely different (note that OsStr is also defined differently depending on the OS). So you need to figure out how to do name resolution in the presence of such collisions, and how to pick the right variant in each context. Note that there might be complicated non-local dependencies, for example there might be a non-cfg'd function that does use_temp_file(create_temp_file(get_os_str())) where some of the called functions have different definitions depending on cfg.
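To illustrate the non-local dependency, here is a small self-contained sketch. The function names and return types are hypothetical stand-ins (plain integers rather than the real RawHandle/RawFd types) so that it builds on any platform.

```rust
// Hypothetical per-platform definitions: same name, different return types.
#[cfg(unix)]
fn create_temp_handle() -> i32 {
    3
}

#[cfg(windows)]
fn create_temp_handle() -> usize {
    0x10
}

// Fallback so the sketch also builds on targets that are neither.
#[cfg(not(any(unix, windows)))]
fn create_temp_handle() -> u8 {
    0
}

// This function carries no cfg attribute, yet the type of `handle` (and so
// the result of type checking this body) differs per configuration.
fn describe_temp_handle() -> String {
    let handle = create_temp_handle();
    format!("handle {}", handle)
}

fn main() {
    println!("{}", describe_temp_handle());
}
```

A compiler that checked every configuration at once would have to type check `describe_temp_handle` once per variant of `create_temp_handle`, even though nothing in its source mentions cfg.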

While I’ve seen too much to claim this is impossible, it seems like a massive complication of all parts of the compiler (that have to interact with this issue) without any neat way to compartmentalize it. A cross-cutting concern, so to speak.


#4

That’s a good point.

Given the unstable nature of unstable features (by definition) I think it would be necessary to have a way to either denote them (so that errors become warnings) or explicitly exclude them from the checks.

Yes, this means that multiple versions of the clients of such mutually exclusive definitions would have to be typechecked.

This is exactly why I’m wondering if this is even possible to start with.

Beyond speed, massively complicating the compiler means increasing maintenance effort and slowing down development of any further feature.

I am not sure how complex, or how easy to compartmentalize this is. It just seems to me that massive gains could be achieved compared to separate compilations simply because much shouldn’t change and therefore could be shared.

I only know it would seem sweet from a user POV though; so maybe it’s just way too complicated in the current rustc architecture, or even just too complicated in general.


#5

The space of possible configurations blows up exponentially with respect to the number of flags, and moreover libraries generally aren’t designed to cope with certain mutually exclusive flags, so you have to manually specify the cfgs to test.

Maybe it could be added as a cargo subcommand:

cargo check --fake-cfg=windows,foo
cargo check --fake-cfg=unix,!foo

and then users can run them in their CI with some selection of cfgs. They are still run separately, so it’ll be slow, but at least you don’t need a real Windows/*nix machine.
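To make the exponential growth concrete, here is a rough sketch that enumerates every on/off combination for a list of flags, writing a disabled flag as `!flag` in the same style as the commands above. The helper `cfg_combinations` and the flag names are purely illustrative; this is not an existing cargo or rustc facility.

```rust
/// Returns every on/off assignment for the given flags.
/// A disabled flag is written as `!flag`.
fn cfg_combinations(flags: &[&str]) -> Vec<Vec<String>> {
    // Each flag doubles the number of configurations: 2^n in total.
    let total = 1usize << flags.len();
    (0..total)
        .map(|mask| {
            flags
                .iter()
                .enumerate()
                .map(|(i, flag)| {
                    // Bit `i` of `mask` decides whether flag `i` is enabled.
                    if mask & (1 << i) != 0 {
                        flag.to_string()
                    } else {
                        format!("!{}", flag)
                    }
                })
                .collect()
        })
        .collect()
}

fn main() {
    let combos = cfg_combinations(&["windows", "foo", "bar"]);
    for combo in &combos {
        println!("{}", combo.join(","));
    }
    println!("{} combinations", combos.len());
}
```

Three flags already give eight configurations, and many of them (for example anything combining mutually exclusive platform flags) would not be meaningful, which is why a hand-picked selection seems more practical than exhaustive enumeration.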


#6

This PLDI 2013 paper addresses a similar problem: Statically analyzing software product lines in minutes instead of years. Certainly not a mature, “rusty” idea, but as research, why not?