Finding where lexing, AST and HIR happen in rustc

Context: trying to learn as much of Rust as I can to help the community one day :slight_smile:

I'm trying to understand how Rust compilation works for a source file. I've read the lexer crate, the AST crate and the HIR crate, and stopped there. Now I want to know where the source code is lexed, then transformed into an AST and then lowered to HIR.

I traced rustc's main function to this part of interest:

In rust/lib.rs at e00893b715ee8142123d8eb2c39050a89bae0e63 (rust-lang/rust on GitHub) I found:

let linker = compiler.enter(|queries| {
    let early_exit = || sess.compile_status().map(|_| None);
    queries.parse()?;

    if let Some(ppm) = &sess.opts.pretty {
        if ppm.needs_ast_map() {
            queries.global_ctxt()?.peek_mut().enter(|tcx| {
                let expanded_crate = queries.expansion()?.take().0;
                pretty::print_after_hir_lowering(
                    tcx,
                    compiler.input(),
                    &expanded_crate,
                    *ppm,
                    compiler.output_file().as_ref().map(|p| &**p),
                );
                Ok(())
            })?;
        } else {
            let krate = queries.parse()?.take();
            pretty::print_after_parsing(
                sess,
                &compiler.input(),
                &krate,
                *ppm,
                compiler.output_file().as_ref().map(|p| &**p),
            );
        }
        trace!("finished pretty-printing");
        return early_exit();
    }

    if callbacks.after_parsing(compiler, queries) == Compilation::Stop {
        return early_exit();
    }

    if sess.opts.debugging_opts.parse_only
        || sess.opts.debugging_opts.show_span.is_some()
        || sess.opts.debugging_opts.ast_json_noexpand
    {
        return early_exit();
    }

    {
        let (_, lint_store) = &*queries.register_plugins()?.peek();

        // Lint plugins are registered; now we can process command line flags.
        if sess.opts.describe_lints {
            describe_lints(&sess, &lint_store, true);
            return early_exit();
        }
    }

    queries.expansion()?;
    if callbacks.after_expansion(compiler, queries) == Compilation::Stop {
        return early_exit();
    }

    queries.prepare_outputs()?;

    if sess.opts.output_types.contains_key(&OutputType::DepInfo)
        && sess.opts.output_types.len() == 1
    {
        return early_exit();
    }

    queries.global_ctxt()?;

    // Drop AST after creating GlobalCtxt to free memory
    {
        let _timer = sess.prof.generic_activity("drop_ast");
        mem::drop(queries.expansion()?.take());
    }

    if sess.opts.debugging_opts.no_analysis || sess.opts.debugging_opts.ast_json {
        return early_exit();
    }

    if sess.opts.debugging_opts.save_analysis {
        let crate_name = queries.crate_name()?.peek().clone();
        queries.global_ctxt()?.peek_mut().enter(|tcx| {
            let result = tcx.analysis(LOCAL_CRATE);

            sess.time("save_analysis", || {
                save::process_crate(
                    tcx,
                    &crate_name,
                    &compiler.input(),
                    None,
                    DumpHandler::new(
                        compiler.output_dir().as_ref().map(|p| &**p),
                        &crate_name,
                    ),
                )
            });

            result
        })?;
    }

    queries.global_ctxt()?.peek_mut().enter(|tcx| tcx.analysis(LOCAL_CRATE))?;

    if callbacks.after_analysis(compiler, queries) == Compilation::Stop {
        return early_exit();
    }

    queries.ongoing_codegen()?;

    if sess.opts.debugging_opts.print_type_sizes {
        sess.code_stats.print_type_sizes();
    }

    let linker = queries.linker()?;
    Ok(Some(linker))
})?;

I believe this is where the most crucial parts happen, because right after it linker.link() is called. Ignoring the pretty-printing block:

if let Some(ppm) = &sess.opts.pretty {

we still have some code left to analyze.

I think queries plays an important role here. However, I could not find a good description of what a Query is in rust/queries.rs at d261df4a72e60e8baa0f21b67eba8f7b91cc2135 (rust-lang/rust on GitHub).

Where would the lexing, AST and HIR steps happen?

PS: I'm using IntelliJ IDEA's Ctrl+B shortcut to jump between definitions. I don't know if it's the best way to read the source.

Lexing and parsing happen in queries.parse(). Macro expansion of the AST happens in queries.expansion(). Lowering the AST to HIR happens when creating the TyCtxt in queries.global_ctxt().
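To make the three stages concrete, here is a toy sketch (not rustc's code) of source text going through lexing, parsing into an AST, and lowering into a simpler HIR-like tree. Every name in it (Token, Ast, Hir, lex, parse_expr, lower) is made up for illustration. The only idea borrowed from rustc is that the AST stays close to the source syntax (it keeps parentheses), while the lowered form drops purely syntactic nodes. Real lowering does far more than this.

```rust
#[derive(Debug, PartialEq)]
enum Token { Num(i64), Plus, LParen, RParen }

/// Lexing: characters -> tokens.
fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            '0'..='9' => {
                // Consume a run of digits into one number token.
                let mut n = 0i64;
                while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                    n = n * 10 + d as i64;
                    chars.next();
                }
                tokens.push(Token::Num(n));
            }
            '+' => { tokens.push(Token::Plus); chars.next(); }
            '(' => { tokens.push(Token::LParen); chars.next(); }
            ')' => { tokens.push(Token::RParen); chars.next(); }
            _ => { chars.next(); } // skip whitespace and anything unknown
        }
    }
    tokens
}

/// The AST mirrors the source, so parentheses are still visible.
#[derive(Debug, PartialEq)]
enum Ast { Num(i64), Add(Box<Ast>, Box<Ast>), Paren(Box<Ast>) }

/// Parsing: tokens -> AST, for the grammar `expr := atom ('+' atom)*`.
fn parse_expr(tokens: &[Token], pos: &mut usize) -> Ast {
    let mut lhs = parse_atom(tokens, pos);
    while tokens.get(*pos) == Some(&Token::Plus) {
        *pos += 1;
        let rhs = parse_atom(tokens, pos);
        lhs = Ast::Add(Box::new(lhs), Box::new(rhs));
    }
    lhs
}

/// `atom := NUM | '(' expr ')'`
fn parse_atom(tokens: &[Token], pos: &mut usize) -> Ast {
    match tokens.get(*pos) {
        Some(Token::Num(n)) => { *pos += 1; Ast::Num(*n) }
        Some(Token::LParen) => {
            *pos += 1;
            let inner = parse_expr(tokens, pos);
            *pos += 1; // skip the closing ')' (no error handling in this toy)
            Ast::Paren(Box::new(inner))
        }
        other => panic!("unexpected token: {:?}", other),
    }
}

/// The lowered tree has no `Paren` node: grouping is already encoded in its shape.
#[derive(Debug, PartialEq)]
enum Hir { Num(i64), Add(Box<Hir>, Box<Hir>) }

/// Lowering: AST -> HIR-like tree.
fn lower(ast: &Ast) -> Hir {
    match ast {
        Ast::Num(n) => Hir::Num(*n),
        Ast::Add(l, r) => Hir::Add(Box::new(lower(l)), Box::new(lower(r))),
        Ast::Paren(inner) => lower(inner),
    }
}

fn main() {
    let tokens = lex("1 + (2 + 3)");
    let ast = parse_expr(&tokens, &mut 0);
    let hir = lower(&ast);
    println!("{:?}\n{:?}\n{:?}", tokens, ast, hir);
}
```

Printing all three values side by side shows how each stage throws away a bit more of the surface syntax.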

The idea with queries inside rustc_interface/rustc_driver is that each query will perform a compilation step if it hasn't been done yet and returns the result of said step. Each query will also invoke all queries corresponding to compilation steps that need to be done before the query itself.
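A rough sketch of that idea, assuming a cache cell built on RefCell<Option<T>>. The Query and Queries types below are simplified stand-ins written for this post, not the real definitions from rustc_interface, and the strings only stand in for the AST and HIR. Asking for the last step pulls in the earlier ones, and asking twice does not rerun anything.

```rust
use std::cell::RefCell;

/// A toy query cell: runs its computation at most once and caches the result.
struct Query<T> {
    result: RefCell<Option<T>>,
}

impl<T: Clone> Query<T> {
    fn new() -> Self {
        Query { result: RefCell::new(None) }
    }

    /// Run `f` only if nothing is cached yet, then return a copy of the cached value.
    fn compute(&self, f: impl FnOnce() -> T) -> T {
        if self.result.borrow().is_none() {
            let value = f();
            *self.result.borrow_mut() = Some(value);
        }
        self.result.borrow().clone().unwrap()
    }
}

/// Hypothetical set of compilation steps; `log` records which ones actually ran.
struct Queries {
    parse: Query<String>,
    expansion: Query<String>,
    global_ctxt: Query<String>,
    log: RefCell<Vec<&'static str>>,
}

impl Queries {
    fn new() -> Self {
        Queries {
            parse: Query::new(),
            expansion: Query::new(),
            global_ctxt: Query::new(),
            log: RefCell::new(Vec::new()),
        }
    }

    fn parse(&self) -> String {
        self.parse.compute(|| {
            self.log.borrow_mut().push("parse");
            String::from("ast")
        })
    }

    /// Depends on `parse`, so it invokes that query first.
    fn expansion(&self) -> String {
        self.expansion.compute(|| {
            let ast = self.parse();
            self.log.borrow_mut().push("expansion");
            format!("expanded {}", ast)
        })
    }

    /// Depends on `expansion`, which in turn depends on `parse`.
    fn global_ctxt(&self) -> String {
        self.global_ctxt.compute(|| {
            let ast = self.expansion();
            self.log.borrow_mut().push("global_ctxt");
            format!("hir lowered from {}", ast)
        })
    }

    fn steps(&self) -> Vec<&'static str> {
        self.log.borrow().clone()
    }
}

fn main() {
    let queries = Queries::new();
    queries.global_ctxt(); // triggers parse and expansion on the way
    queries.global_ctxt(); // served from the cache, nothing reruns
    println!("{:?}", queries.steps());
}
```

This toy clones the cached value on every call to keep the borrowing simple.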

Have you checked the rustc dev guide?

Just found queries.parse()?; and learned a lot. However, the result is discarded, and I could not find any place in the code where self is updated with the parsed crate. Do you know where this happens?

The idea with queries inside rustc_interface/rustc_driver is that each query will perform a compilation step if it hasn't been done yet and returns the result of said step. Each query will also invoke all queries corresponding to compilation steps that need to be done before the query itself.

I don't understand why the result is discarded (unless it's an error)

I think then I can follow queries.expansion by myself.

The dev guide has good information about macro expansion and HIR, but almost nothing about the initial parsing and the AST :c

(I guess)

queries.expansion() invokes queries.parse() too (through queries.register_plugins()) and uses its result. As queries cache their results, this doesn't cause the crate to be parsed twice.
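Here is a small sketch of why discarding the return value of queries.parse()?; is harmless: the call still fills the cache, and a later step picks the result up from there. The struct, its fields and the parse_runs counter are assumptions made up for this example, with plain Option fields standing in for whatever rustc really stores.

```rust
/// Toy stand-in for the query struct: each step stores its result in an
/// `Option` the first time it runs. All names here are illustrative.
struct Queries {
    source: String,
    parsed: Option<Vec<String>>,
    registered: Option<Vec<String>>,
    expanded: Option<Vec<String>>,
    parse_runs: u32, // counts how often the parsing work was really done
}

impl Queries {
    fn new(source: &str) -> Self {
        Queries {
            source: source.to_owned(),
            parsed: None,
            registered: None,
            expanded: None,
            parse_runs: 0,
        }
    }

    fn parse(&mut self) -> Vec<String> {
        if self.parsed.is_none() {
            self.parse_runs += 1;
            let krate: Vec<String> =
                self.source.split_whitespace().map(str::to_owned).collect();
            self.parsed = Some(krate);
        }
        self.parsed.clone().unwrap()
    }

    /// Needs the parsed crate, so it asks `parse` for it.
    fn register_plugins(&mut self) -> Vec<String> {
        if self.registered.is_none() {
            let krate = self.parse();
            self.registered = Some(krate);
        }
        self.registered.clone().unwrap()
    }

    /// Reaches `parse` indirectly, through `register_plugins`.
    fn expansion(&mut self) -> Vec<String> {
        if self.expanded.is_none() {
            let krate = self.register_plugins();
            let expanded = krate.into_iter().map(|t| format!("expanded({})", t)).collect();
            self.expanded = Some(expanded);
        }
        self.expanded.clone().unwrap()
    }
}

fn main() {
    let mut queries = Queries::new("fn main ( ) { }");
    queries.parse(); // return value dropped by the caller, but kept in the cache
    let expanded = queries.expansion(); // reuses the cached parse result
    println!("parse ran {} time(s), {} items expanded", queries.parse_runs, expanded.len());
}
```

The first call exists only for its side effect of populating the cache (and for reporting errors early), which is why the driver code can ignore what it returns.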