Moving WebAssembly support forward

Everything you’ve just said equally applies to traditional ISAs. And yet, we still use separate object files and linkers. Sometimes compilation speed is more important than execution speed. Other times, the reverse is true, and that’s when you’ll turn to whole program optimization.

LLVM bitcode is not always the best format for object files. For one thing, it is not compatible between LLVM versions. For another, WebAssembly toolchains need not all be LLVM-based. I’m sure that GCC will soon implement a wasm backend as well. Surely, we’ll want to be able to mix toolchains and languages in the same program.

Well, there are various reasons why traditional ISAs use (or must use) object files that aren’t an issue for wasm, but this is starting to get far afield :laughing:

So, I’m pretty sure that Rust stores bitcode and HIR/MIR in .rlibs for each crate. As I said, this together with the final compilation stage should be sufficient to generate a single wasm output (while also enabling incremental compilation), without the need for a linker.

The only scenario where I imagine a linker would be useful is wasm generated from a native C library, which obviously can’t be included in the compiler IR.

Eventually I would like to be able to have Rust and JS interact through Rust’s FFI functionality, but there’s some design work to do to determine what the JS glue code should look like. For example, there’s a trade-off between how much setup work the emitted code does and how flexible its interface is. This is complicated by the fact that there is no standard mechanism for dynamically linking wasm modules yet. However, some of this design work needs to be done before we even start thinking about FFI, because JS will be needed to provide even more basic functionality required by the compiler intrinsics and std. I will write up a proposal for this design (as an RFC?) once I am able to experiment with the possibilities locally.

@m4b you can find more information about WebAssembly object files and linking here: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md

Interesting, thanks for the link! I think this is why I couldn’t find it on webassembly.org or in the spec: it’s not official, and it lives in a different repo.

So I had no idea they were adding relocations and other stuff; will be interesting to see how they mess it up and the world has another broken relocation system :wink:

This paragraph causes me some alarm though:

These conventions are not part of the WebAssembly standard, and are not required of WebAssembly-consuming implementations to execute WebAssembly code. Tools producing and working with WebAssembly in other ways also need not follow any of these conventions.

It’s an interesting avenue for sure though, we’ll see how it turns out.

My comments about not needing a wasm linker for most cases still stand though, I think.

Also, if you do go the object-file-everything route, will std for the target ship as a wasm object file? And how is this going to work with generics?
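To make the generics question concrete, here is a minimal sketch. The function `largest` is a hypothetical stand-in for any generic library function, not something taken from std. Rust monomorphizes generics, so the machine code for each instantiation is generated where a concrete type is chosen, which is usually the crate that calls the function rather than the crate that defines it.

```rust
/// Hypothetical generic library function (illustration only).
/// No single piece of machine code exists for `largest` itself; code is
/// generated separately for each concrete `T` that some caller uses.
pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    // An empty slice has no largest element.
    let first = iter.next()?;
    // Keep the greater of the running best and each remaining element.
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}
```

A downstream crate calling `largest(&[1, 5, 3])` and `largest(&[0.5, 2.5])` needs two distinct instantiations, one for `i32` and one for `f64`. A precompiled object file for the defining crate cannot contain every instantiation in advance, so the library would presumably still need to carry some higher-level representation of its generic code alongside any object file.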

Awesome summary @tlively. Thank you.

I think we can start working toward distributing lld with Rust, and using it for an experimental wasm backend. This is a long-term goal for many reasons (cc @japaric). I’d suggest we name it rust-lld, distribute it with every host configuration (even when unused) and put it in /bin, though I know @alexcrichton is inclined to put it in rustlib/$target/bin like we do with gcc on Windows.

And we can go ahead and initiate the LLVM upgrade at any time. We have to agree with @kripken on an LLVM commit and have him do the emscripten upgrade while we start the rust LLVM upgrade. I’ll send an email to both of you to make sure he’s aware of this thread.

@tlively How close is your port to being feature complete? After thinking a bit more, I’m a little concerned that it could be premature to upgrade LLVM right now. I have the impression that the wasm backend and lld are not fully working yet, so it seems quite likely we will be in the position of doing another LLVM upgrade in short order before users can really use the target. Do you think that would be the case?

@brson Yes, I think that would definitely be the case. I think it would be best for me to work locally and find any obvious bugs before we go through all the work of upgrading LLVM upstream. This will give @sbc100 time to upstream his work on lld as well. I think we will be in a better place to do the upgrade in about two weeks.

That being said, perhaps we can start an upgrade now, because subsequent upgrades would only be small incremental changes. That might be too much overhead, though.

I think that might be worth clarifying. Emscripten contains a bunch of things, one of which is fastcomp and the asm.js backend there. Maybe that’s what you mean by “emscripten” in that sentence?

Moving Rust to use the wasm backend can avoid the fastcomp dependency (which would certainly be nice!) but I’d recommend you still use emscripten to drive the wasm backend, as it provides a lot of things (system libraries, Web API integration, etc.) that otherwise you’d need to do all yourself.

Btw, while getting rid of the fastcomp dependency is nice, getting rid of the LLVM dependency might be even nicer :wink: which is what mir2wasm does. That should be even simpler than the wasm backend (fewer and smaller dependencies), and as a bonus should easily win on compile times. Would be nice to see experimentation in both.

Back on topic, for the shorter-term issue here: if Rust wants to keep support for the asm.js backend while adding wasm backend support, then we’d need to update LLVM in fastcomp. I can assist there, but I don’t have time to do the LLVM upgrade itself (unless this can wait a while until I do).

Actually, it looks like there is. I don’t know if quality of the emitted code is comparable to emscripten, but maybe it’ll be enough to fill the gap for now?

Another thing we could do is to have a special build of rustc based on emscripten’s LLVM, while the regular build uses newer LLVM and supports asm.js via wasm2asm. Unfortunately, this would put even more load on Rust’s CI infrastructure.

Sadly that wasm2asm tool is not in a usable state. It was an experiment which ran into some issues, and we’ve hoped someone would have time to work on it some more, but that never happened.

I agree, emscripten provides a lot of stuff for free. However, using emcc as a linker requires us to have bitcode-level compatibility with emscripten, which restricts which LLVM version we can use. Things would be easier if emscripten grew support for accepting wasm modules as inputs. Is that feasible?

The other approach we could take is to leverage emscripten at toolchain build time to create wasm versions of libcompiler-rt, libc and libm, but take care of linking the final executable ourselves (by directly invoking lld). This way end users wouldn’t need to install emscripten at all, as long as they compile pure-Rust programs. Well, I am sure I am glossing over some important details here, like “Who’s going to emit the .js bootstrapper for wasm modules?”, so maybe we’ll need to include parts of emscripten too.

Yes, I think that’s the right approach, and sbc100 is working on it; see https://github.com/kripken/emscripten/pull/5313

Why not link it into rustc, like we do with LLVM? It seems somewhat nicer to have the entire thing in one executable: for one, there are no issues with escaping, command line lengths, …

Exactly. If we used stock lld, then that might make sense. But if we are linking the library anyways, yeah just skip the boilerplate of multiple processes.

If we do go the separate-process route, it absolutely should not be in a target-specific directory, as all things LLVM are multi-target.
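To make the separate-process option concrete, here is a minimal sketch of how a compiler driver might assemble an lld invocation for wasm objects. The helper names `wasm_link_command` and `rendered_args` and all file names are hypothetical, and the `-flavor wasm` spelling is an assumption about how the shipped binary would select its wasm driver. The command is only built here, never run.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical helper: build (but do not spawn) a linker invocation that
/// combines wasm object files into a single output module.
fn wasm_link_command(linker: &Path, objects: &[PathBuf], output: &Path) -> Command {
    let mut cmd = Command::new(linker);
    // Assumption: the generic lld binary picks its driver via `-flavor`.
    cmd.arg("-flavor").arg("wasm");
    // Each object is passed as its own argument, so no shell quoting or
    // escaping is involved.
    for obj in objects {
        cmd.arg(obj);
    }
    cmd.arg("-o").arg(output);
    cmd
}

/// Collect the arguments of a built command, e.g. for logging or inspection.
fn rendered_args(cmd: &Command) -> Vec<OsString> {
    cmd.get_args().map(|a| a.to_os_string()).collect()
}
```

Passing arguments one at a time through `Command` sidesteps shell escaping, but the operating system’s limit on total command-line length still applies to a separate process. Linking lld into rustc would avoid that limit as well.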

@kripken, is there a way to coax emcc into compiling its runtime libraries for wasm without giving it any source or bitcode files? I think I’d rather invoke lld for wasm objects myself, but would like to reuse the work you’ve done around runtime libs.

Doesn’t the lld version need to match the clang version in order for clang’s LTO to work for C/C++ parts of a project?

You can use the embuilder.py tool to build system libraries. However, those are bitcode or wasm files, so you also need to build the bitcode ones into wasm and generate the JS runtime support for them. Emscripten support for wasm input files will make that part possible too. Basically, you’ll be able to build wasm however you want (e.g. using lld), then invoke emcc on the wasm to generate all the system lib and runtime support you need.

How do I use embuilder?

“WASM=1 embuilder.py build wasm_compiler_rt wasm-libc” says “WARNING:root:wasm_compiler_rt not built when using JSBackend”

“EMCC_WASM_BACKEND=1 embuilder.py build wasm_compiler_rt wasm-libc” errors out:

CRITICAL:root:WebAssembly set as target, but LLVM has not been built with the WebAssembly backend, llc reports:
===========================================================================

    js     - JavaScript (asm.js, emscripten) backend
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64

===========================================================================

Also, where does it output the files it builds?

Any idea what the timeline for this is? I’d like to experiment with the wasm backend in parallel with lld’s wasm support being finalized.

I don’t think WASM=1 in the env has an effect, but maybe I forgot something (is it mentioned in the docs somewhere?). The warning is saying that that library is not needed when using the asm.js backend.

That env var means it tries to use the wasm backend, but the wasm backend is not built with emscripten’s default LLVM (since it isn’t stable yet). Instead, you’d need to build LLVM yourself with that backend enabled (I think it is not enabled by default, since it’s experimental) and point emscripten at that build (e.g. by setting the LLVM env var).

After it builds, it should say where it placed the outputs. They will be in the cache dir, usually something like ~/.emscripten_cache/...
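As a sketch of how a build script might drive this, the following assembles the embuilder invocation quoted earlier in the thread with the two environment variables mentioned here. The helper name `embuilder_command` is hypothetical, and the exact value the LLVM variable expects (assumed below to be a directory of a custom LLVM build) is an assumption rather than documented behavior. The command is only built, not run.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: prepare an embuilder run that targets the wasm
/// backend using a custom LLVM build.
fn embuilder_command(llvm_dir: &Path) -> Command {
    let mut cmd = Command::new("embuilder.py");
    // Ask emscripten to use the wasm backend instead of the asm.js one.
    cmd.env("EMCC_WASM_BACKEND", "1");
    // Assumption: LLVM points emscripten at a build with the wasm backend enabled.
    cmd.env("LLVM", llvm_dir);
    // Library names are the ones used earlier in the thread.
    cmd.args(&["build", "wasm_compiler_rt", "wasm-libc"]);
    cmd
}
```

Calling `.status()` on the returned command would run the tool. According to the reply above, embuilder reports where it placed the outputs once it finishes.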

I’m not sure, might want to ask in that PR.