Prevent source parse with proper bytecode

hydroper1 · March 4, 2023, 12:33pm

Every time a Cargo project is first built, all of its dependencies have to be compiled from source to an internal format. Subsequent builds then don't need the dependencies built again (a.k.a. incremental compilation).

What if Rust supported its own bytecode format, one that retains all the language semantics, allowing Cargo to skip building dependencies from source by sharing bytecode through crates.io? Unlike LLVM bitcode or WebAssembly, Rust's bytecode format would store conditional compilation attributes (#[cfg]), env!(...), features, macros (including their inner concat!, env!, include!, include_bytes!, etc.), module items... everything from Rust. The build phase could then import and export bytecode.
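
To make the idea of "storing #[cfg] unresolved" concrete, here is a minimal sketch. Everything in it is hypothetical (the `Cfg`, `BuildConfig`, and `StoredItem` types and the `active_items` function are invented for illustration; no such format exists in rustc or Cargo). It only shows that a pre-parsed item could carry its predicate along and have it evaluated once the features and target are known:

```rust
use std::collections::HashSet;

/// A conditional-compilation predicate kept unresolved in the
/// hypothetical bytecode (invented for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Cfg {
    Feature(String),
    TargetOs(String),
    Not(Box<Cfg>),
    All(Vec<Cfg>),
    Any(Vec<Cfg>),
}

/// What is only known when the dependent project is actually built.
struct BuildConfig {
    features: HashSet<String>,
    target_os: String,
}

impl Cfg {
    /// Evaluate the stored predicate against the real build configuration.
    fn eval(&self, cfg: &BuildConfig) -> bool {
        match self {
            Cfg::Feature(f) => cfg.features.contains(f),
            Cfg::TargetOs(os) => cfg.target_os == *os,
            Cfg::Not(inner) => !inner.eval(cfg),
            Cfg::All(ps) => ps.iter().all(|p| p.eval(cfg)),
            Cfg::Any(ps) => ps.iter().any(|p| p.eval(cfg)),
        }
    }
}

/// A pre-parsed item together with its (optional) #[cfg] gate.
struct StoredItem {
    name: String,
    cfg: Option<Cfg>,
}

/// Names of the items that survive conditional compilation.
fn active_items<'a>(items: &'a [StoredItem], cfg: &BuildConfig) -> Vec<&'a str> {
    items
        .iter()
        .filter(|item| item.cfg.as_ref().map_or(true, |c| c.eval(cfg)))
        .map(|item| item.name.as_str())
        .collect()
}
```

The same stored items give different results for different consumers, which is why the predicates could not be resolved before publishing.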

So supporting a proper Rust bytecode would speed up the first build. I'm not sure how much that'd help, because in some ways the bytecode has to be verified just like the sources... but there are many differences: it'd avoid parsing the Rust syntax, would strip indentation characters, and more (though it would maybe retain documentation comments for use by IDEs and rustdoc).

build.rs

If a crate has a build.rs, that build.rs should execute before the crate's bytecode is reused. The bytecode must be able to reuse build.rs artifacts, so include! and several other macros have to stay unresolved at the bytecode level.

When macros clearly don't rely on build.rs artifacts, they can be resolved at the bytecode level...
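
A rough sketch of how that split could be decided, under a deliberately conservative assumption: only concat! and stringify! are treated as safe to resolve early, and anything else (env!, include!, user macros, ...) is deferred, as is any call that contains a deferred call. The `Expr` type and `must_defer` function are hypothetical names made up for this example, not part of rustc:

```rust
/// A tiny model of a macro invocation tree (hypothetical, for illustration).
enum Expr {
    /// A plain literal argument such as "OUT_DIR" or "/generated.rs".
    Lit(String),
    /// A macro call like concat!(...), written here without the `!`.
    Call { name: String, args: Vec<Expr> },
}

/// Returns true if the expression has to stay unresolved in the bytecode
/// because it may depend on build.rs artifacts or the build environment.
fn must_defer(expr: &Expr) -> bool {
    match expr {
        Expr::Lit(_) => false,
        Expr::Call { name, args } => {
            // Assumption: only these built-ins never look at the environment
            // or the file system, so only they may be resolved early.
            let resolvable_early = matches!(name.as_str(), "concat" | "stringify");
            !resolvable_early || args.iter().any(must_defer)
        }
    }
}
```

With this rule, concat!("a", "b") could be folded before publishing, while the common include!(concat!(env!("OUT_DIR"), "/generated.rs")) pattern stays unresolved until build.rs has run.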

This doesn't prevent compilation exactly

Dependency crates are still incrementally compiled; only their source isn't parsed. Maybe only build.rs would still be parsed.

Other uses

Importing and exporting a proper bytecode has another advantage: it allows Rust programs to be manipulated in other ways, including embedding them in other software without going through low-level WebAssembly or LLVM bitcode.

It isn't clear whether the main point of this idea, preventing source recompilation, is worthwhile; it might improve parsing speed if the bytecode format is more compact and easier to parse than the Rust syntax.

Nemo157 · March 4, 2023, 2:36pm

The parsing phase does not take much time. Testing on tokio (75 kLoC, 35 kSLoC):

> cargo rustc -- -Ztime-passes
time:   0.001; rss:   40MB ->   43MB (   +3MB)  parse_crate
time:   0.106; rss:   48MB ->   97MB (  +49MB)  expand_crate
time:   0.106; rss:   45MB ->   97MB (  +52MB)  macro_expand_crate

Maybe the first two steps could be done before publishing; I'm not entirely sure what expand_crate does. The third step is already going to be applying conditional compilation, so it must be delayed until the actual build. So that's either 1 ms or 107 ms that could be saved (less whatever the load time for the custom serialization is).

bjorn3 · March 4, 2023, 6:04pm

expand_crate is a subpass of macro_expand_crate. That is, the code structure is roughly sess.time("macro_expand_crate", || { /* stuff */ sess.time("expand_crate", expand_crate); /* stuff */ }); where both /* stuff */ parts are very quick.

github.com/rust-lang/rust/blob/276b75a843af8822ffe4e395266d9445679a57a4/compiler/rustc_interface/src/passes.rs#L191

          pre_expansion_lint(sess, lint_store, resolver.registered_tools(), &krate, crate_name);
          rustc_builtin_macros::register_builtin_macros(resolver);
          
          krate = sess.time("crate_injection", || {
              rustc_builtin_macros::standard_library_imports::inject(krate, resolver, sess)
          });
          
          util::check_attr_crate_type(sess, &krate.attrs, &mut resolver.lint_buffer());
          
          // Expand all macros
          krate = sess.time("macro_expand_crate", || {
              // Windows dlls do not have rpaths, so they don't know how to find their
              // dependencies. It's up to us to tell the system where to find all the
              // dependent dlls. Note that this uses cfg!(windows) as opposed to
              // targ_cfg because syntax extensions are always loaded for the host
              // compiler, not for the target.
              //
              // This is somewhat of an inherently racy operation, however, as
              // multiple threads calling this function could possibly continue
              // extending PATH far beyond what it should. To solve this for now we
              // just don't add any new elements to PATH which are already there

github.com/rust-lang/rust/blob/276b75a843af8822ffe4e395266d9445679a57a4/compiler/rustc_interface/src/passes.rs#L239

              trace_mac: sess.opts.unstable_opts.trace_macros,
              should_test: sess.opts.test,
              span_debug: sess.opts.unstable_opts.span_debug,
              proc_macro_backtrace: sess.opts.unstable_opts.proc_macro_backtrace,
              ..rustc_expand::expand::ExpansionConfig::default(crate_name.to_string())
          };
          
          let lint_store = LintStoreExpandImpl(lint_store);
          let mut ecx = ExtCtxt::new(sess, cfg, resolver, Some(&lint_store));
          // Expand macros now!
          let krate = sess.time("expand_crate", || ecx.monotonic_expander().expand_crate(krate));
          
          // The rest is error reporting
          
          sess.parse_sess.buffered_lints.with_lock(|buffered_lints: &mut Vec<BufferedEarlyLint>| {
              buffered_lints.append(&mut ecx.buffered_early_lint);
          });
          
          sess.time("check_unused_macros", || {
              ecx.check_unused_macros();
          });
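
The nesting described above explains why the two 0.106 lines in the -Ztime-passes output are not additive: the outer timer includes the inner one. A small standalone sketch of the same shape follows. The `time` and `run_passes` helpers are made up for this example and are not rustc's `sess.time`, and the sleep is only a stand-in for real work:

```rust
use std::time::{Duration, Instant};

type Log = Vec<(&'static str, Duration)>;

/// Runs `f`, then records how long it took under `label`.
/// The log is passed into `f` so that nested passes can record themselves.
fn time<T>(label: &'static str, log: &mut Log, f: impl FnOnce(&mut Log) -> T) -> T {
    let start = Instant::now();
    let out = f(log);
    log.push((label, start.elapsed()));
    out
}

/// Mimics the structure: an outer pass wrapping one expensive inner pass.
fn run_passes() -> Log {
    let mut log = Log::new();
    time("macro_expand_crate", &mut log, |log| {
        // quick setup would go here
        time("expand_crate", log, |_| {
            // stand-in for the actual macro expansion work
            std::thread::sleep(Duration::from_millis(5));
        });
        // quick error reporting would go here
    });
    log
}
```

The inner entry is logged first because it finishes first, and the outer duration is always at least the inner one. Applied to the numbers above, the total for both lines together is about 106 ms, not 212 ms.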

