Good place to start a new codegen

I want to try making a new platform target: .NET (ECMA Common Language Runtime), preserving as much high-level information from Rust as possible (e.g. types, including traits, lifetimes, etc.) in the metadata. Would writing a new codegen be an appropriate approach, or does codegen already operate on a level of IR that is too low for that purpose?

A follow-up question (if a new codegen is the right way): what is a good starting point for a new codegen? Ideally, I'd like to see a simple toy codegen that performs no optimizations whatsoever and produces either text assembly or a really simple low-level intermediate bytecode, which could be forked and adapted for the above purpose. Failing that, the smallest existing functioning codegen and some documentation on its implementation details would do.

Lifetimes are always erased at codegen level. A backend will also have to monomorphize unless the target can represent the entire type system of Rust. .NET can't: even something as simple as associated types fails. Additionally, rustc's layout calculations only work on monomorphic types, and you have to match exactly what rustc does to make const eval work. Preserving types for locals is likely also not what you want, as memory in Rust is untyped, unlike in C/C++.

What I think you can do is monomorphize everything as usual in Rust and treat each type as a fixed-size list of bytes. Then, for the external interface with C#, define a .NET type for each used monomorphic instance of a Rust type, with a list of bytes as its internal representation, and expose getters and setters for all public fields on this type.
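To make the "list of bytes plus getters and setters" idea concrete, here is a minimal sketch written in Rust rather than C#. The type `PointBytes`, the constants, and the layout numbers are all hypothetical: they stand in for whatever rustc's layout computation would report for one monomorphic instance (here, an imagined `struct Point { x: u32, y: u64 }`), and a real backend would query rustc for them instead of hard-coding them.

```rust
// Hypothetical layout for one monomorphic instance. These numbers are an
// assumption for illustration; a real backend must take them from rustc.
const POINT_SIZE: usize = 16;
const POINT_X_OFFSET: usize = 0;
const POINT_Y_OFFSET: usize = 8;

/// Exported wrapper: internally just a fixed-size list of bytes.
pub struct PointBytes {
    bytes: [u8; POINT_SIZE],
}

impl PointBytes {
    /// Create an instance with every byte set to zero.
    pub fn zeroed() -> Self {
        PointBytes { bytes: [0; POINT_SIZE] }
    }

    /// Getter for the public field `x`: decode 4 bytes at its offset.
    pub fn x(&self) -> u32 {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&self.bytes[POINT_X_OFFSET..POINT_X_OFFSET + 4]);
        u32::from_ne_bytes(raw)
    }

    /// Setter for `x`: encode the value into the same 4 bytes.
    pub fn set_x(&mut self, value: u32) {
        self.bytes[POINT_X_OFFSET..POINT_X_OFFSET + 4]
            .copy_from_slice(&value.to_ne_bytes());
    }

    /// Getter for the public field `y`: decode 8 bytes at its offset.
    pub fn y(&self) -> u64 {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&self.bytes[POINT_Y_OFFSET..POINT_Y_OFFSET + 8]);
        u64::from_ne_bytes(raw)
    }

    /// Setter for `y`: encode the value into the same 8 bytes.
    pub fn set_y(&mut self, value: u64) {
        self.bytes[POINT_Y_OFFSET..POINT_Y_OFFSET + 8]
            .copy_from_slice(&value.to_ne_bytes());
    }
}
```

The accessors never interpret the bytes between fields, so padding stays opaque, which fits the view of memory as untyped.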

If you want to start from the ground up, you can make a dylib crate with the following template as a start and work your way up from there.

// Required to link against rustc's internal crates (nightly only).
#![feature(rustc_private)]

extern crate rustc_codegen_ssa;

// This prevents duplicating functions and statics that are already part of the host rustc process.
#[allow(unused_extern_crates)]
extern crate rustc_driver;

use rustc_codegen_ssa::traits::CodegenBackend;

struct DotNetCodegenBackend;

impl CodegenBackend for DotNetCodegenBackend {
    // ...
}

/// This is the entrypoint for a hot plugged rustc_codegen_dotnet
#[no_mangle]
pub fn __rustc_codegen_backend() -> Box<dyn CodegenBackend> {
    Box::new(DotNetCodegenBackend)
}

and then use the various helpers in rustc_codegen_ssa to save yourself time. If you want to start from a template there are 3 existing backends I can recommend:

I hope this helps. If you have any questions about how certain things work, feel free to ask them on the rust-lang Zulip (https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp). I'm @bjorn3 on there as well.

Sorry, just clarifying:

Lifetimes are always erased at codegen level

You mean that codegen input contains lifetimes, and codegen output generally does not, right?

I am actually most interested in traits: my biggest grudge is the inability to invoke them from debuggers.

The codegen input is monomorphized MIR. Monomorphization erases all lifetimes.
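A small way to see this from ordinary Rust, as a rough illustration rather than a look at actual MIR: a generic function gets one instance per concrete type, while two references that differ only in lifetime land on the same instance. The helper names `instance_name` and `lifetime_is_erased` are made up for this sketch, and it relies on `std::any::type_name` not printing lifetimes, which is current behavior rather than a hard guarantee.

```rust
use std::any::type_name;

/// Reports the concrete type this generic function was instantiated with.
fn instance_name<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

/// Calls `instance_name` with a `&'static str` and with a shorter-lived
/// `&str`, and returns the type name reported for each call.
pub fn lifetime_is_erased() -> (&'static str, &'static str) {
    let static_str: &'static str = "hello";
    let owned = String::from("hello");
    let local_str: &str = owned.as_str();
    (instance_name(&static_str), instance_name(&local_str))
}
```

Both names returned by `lifetime_is_erased()` compare equal, whereas `instance_name(&1u32)` and `instance_name(&1u64)` report different types, since those really are separate instances.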

You could create a class for each trait and make the exported .NET types derive from it if the trait is implemented in the local crate. This only works in a subset of cases (for example, no associated types or generics on trait members), and you can't directly use it within the generated MSIL/CIL for the Rust functions; it is only for C# and debugger usage. Another issue is that traits can be implemented for foreign types, and I don't think .NET allows adding a superclass to existing types.
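The restrictions above could be encoded as a simple check that a backend runs per trait impl before deciding whether to emit the base-class relationship. This is only a sketch: `TraitImpl`, `Export`, and `plan_export` are hypothetical names, and a real backend would derive these flags from rustc's own data structures.

```rust
/// Hypothetical summary of one trait impl, as a backend might record it.
pub struct TraitImpl {
    pub trait_has_associated_types: bool,
    pub trait_has_generic_members: bool,
    pub implementing_type_is_foreign: bool,
}

/// What to emit on the .NET side for this impl.
#[derive(Debug, PartialEq)]
pub enum Export {
    /// Emit a class for the trait and derive the exported type from it.
    BaseClass,
    /// Do not expose the impl; the string names the reason.
    Skip(&'static str),
}

/// Applies the restrictions in order and reports the first one that fails.
pub fn plan_export(imp: &TraitImpl) -> Export {
    if imp.trait_has_associated_types {
        return Export::Skip("trait has associated types");
    }
    if imp.trait_has_generic_members {
        return Export::Skip("trait has generic members");
    }
    if imp.implementing_type_is_foreign {
        return Export::Skip("cannot add a base class to a foreign type");
    }
    Export::BaseClass
}
```

Returning a reason instead of a bare `false` makes it easier to tell users why a given trait is not callable from C# or the debugger.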

You might be interested in Llama:

https://ericsink.com/entries/llama_rust_013.html

As I understand it, it isn't open source.