Moving WebAssembly support forward

I agree, emscripten provides a lot of stuff for free. However, using emcc as a linker requires us to have bitcode-level compatibility with emscripten, which restricts which LLVM version we can use. Things would be easier if emscripten grew support for accepting wasm modules as inputs. Is that feasible?

The other approach we could take is to leverage emscripten at toolchain build time to create wasm versions of libcompiler-rt, libc and libm, but take care of linking the final executable ourselves (by directly invoking lld). This way end-users wouldn't need to install emscripten at all, as long as they compile pure-Rust programs. Well, I am sure I am glossing over some important details here, like "Who's going to emit the .js bootstrapper for wasm modules?", so maybe we'll need to include parts of emscripten too.

Yes, I think that's the right approach, and sbc100 is working on it; see "Add initial support for using lld rather than s2wasm in the wasm backend" by sbc100 · Pull Request #5313 · emscripten-core/emscripten · GitHub

Why not link it into rustc, like we do with LLVM? Seems like it's somewhat nicer to have the entire thing in one executable, for one, no issues with escaping, command line lengths, ...

Exactly. If we used stock lld, then that might make sense. But if we are linking the library anyways, yeah just skip the boilerplate of multiple processes.

If we do go the separate process route, it absolutely should not be in a target-specific directory as all things llvm are multi-target.

@kripken, is there a way to coax emcc into compiling its runtime libraries for wasm without giving it any source or bitcode files? I think I’d rather invoke lld for wasm objects myself, but would like to reuse the work you’ve done around runtime libs.

Doesn’t the lld version need to match the clang version in order for clang’s LTO to work for C/C++ parts of a project?

You can use the embuilder.py tool to build system libraries. However, those are bitcode or wasm files, and you also need to build the bitcode ones into wasm, and also need to generate the JS runtime support for them. Emscripten support for wasm input files will make that part possible too, basically you'll be able to build wasm however you want (e.g. using lld), then invoke emcc on the wasm to generate all the system lib and runtime support you need.

How do I use embuilder?

"WASM=1 embuilder.py build wasm_compiler_rt wasm-libc" says "WARNING:root:wasm_compiler_rt not built when using JSBackend"

"EMCC_WASM_BACKEND=1 embuilder.py build wasm_compiler_rt wasm-libc" errors out:

CRITICAL:root:WebAssembly set as target, but LLVM has not been built with the WebAssembly backend, llc reports:
===========================================================================

    js     - JavaScript (asm.js, emscripten) backend
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64

===========================================================================

Also, where does it output the files it builds?

Any idea what's the timeline for this? I'd like to experiment with wasm backend in parallel with lld's wasm support being finalized.

I don't think WASM=1 in the env has an effect, but maybe I forgot something (is it mentioned in the docs somewhere?). The warning is saying that that library is not needed when using the asm.js backend.

That env var means it tries to use the wasm backend, but the wasm backend is not built with emscripten's default LLVM (since it isn't stable yet). Instead you'd need to build LLVM (with that backend enabled, I think they don't have it enabled by default, it's experimental), and point emscripten to use that (e.g. by setting the LLVM env var).
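To check up front whether a given LLVM build has the backend, you can look at the "Registered Targets:" section that `llc --version` prints. Here is a minimal sketch of that check; the helper names are mine (hypothetical, not part of emscripten), and the sample text just reuses the three targets from the error output quoted above.

```python
import subprocess

def registered_targets(llc_version_output):
    """Return the target names listed under 'Registered Targets:'."""
    targets = []
    in_section = False
    for line in llc_version_output.splitlines():
        if line.strip().startswith("Registered Targets:"):
            in_section = True
            continue
        if in_section:
            if not line.strip():
                break  # a blank line ends the target list
            # Entries look like "    x86-64 - 64-bit X86: EM64T and AMD64".
            targets.append(line.split(" - ", 1)[0].strip())
    return targets

def has_wasm_backend(llc_version_output):
    """True if any registered target name starts with 'wasm' (e.g. wasm32)."""
    return any(t.startswith("wasm") for t in registered_targets(llc_version_output))

def llc_version_output(llc="llc"):
    """Run `llc --version`; pass a full path to test a specific LLVM build."""
    return subprocess.run([llc, "--version"], capture_output=True, text=True).stdout

# Illustrative input only: the targets reported in the error message above.
SAMPLE = """\
  Registered Targets:
    js     - JavaScript (asm.js, emscripten) backend
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
"""

print(has_wasm_backend(SAMPLE))  # False
```

On a real install you would call `has_wasm_backend(llc_version_output("/path/to/llvm/bin/llc"))` against the LLVM you intend to point emscripten at.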

It should say after it builds where it placed the outputs, they will be in the cache dir, usually something like ~/.emscripten_cache/...

I'm not sure, might want to ask in that PR.

Ah, I didn’t think of that. I assume lld doesn’t link against clang, and it’s just the bitcode that must match? Do we need rust-clang and rust-lld?

I like having it in a separate process because for debugging it's sometimes useful to drive the compiler and linker separately, but then again I'm rarely doing this kind of debugging myself, so if rustc devs want it to be in-process then I don't care much.

Note that we already run the assembler in-process instead of running it separately. In theory, -C no-integrated-as runs the assembler separately, although I am not sure how well it works… We could have -C no-integrated-ld too.

Hi everyone, I just wanted to share an update on the status of this work.

Wasm support in lld is still not complete or stable, though it is much further along than before. As such, it hasn’t made sense to start the LLVM upgrade that would be a prerequisite for emitting wasm native object files from rustc.

I’ve also been thinking more about how to add in the JavaScript glue and libc support and everything else emscripten provides, and it seems to me that for users who need a full POSIX environment with a file system and everything else, emscripten will continue to be the most complete solution and we should still support it. As such, once lld stabilizes more, I plan to update wasm32-experimental-emscripten to emit native wasm object files and throw those over the fence to emscripten, treating emcc like a normal linker that also happens to add any necessary JavaScript glue. This will make wasm32-experimental-emscripten work the same as any other native target, except that it will use emcc instead of cc for the final linking. I have this working locally, so once LLVM is upgraded this will be a quick update. Another benefit of this approach is that lld does not need to be in the Rust tree, since it will be distributed with and driven by emscripten.

However, I do still think that having the toolchain depend on emcc even for users who don’t depend on full POSIX support on the web is overkill. @vadimcn has been working on pulling the essential JavaScript libraries out of emscripten so they can be distributed statically as part of the WebAssembly rustc component. We don’t yet have a solid story for how or where to emit the JS glue, but I would like to see that functionality be as small and simple as possible. This type of support for wasm would require lld to be in the Rust tree in one form or another.

In the meantime, I am going to be working on getting the test suite to pass on the current wasm32-experimental-emscripten. Now that https://github.com/rust-lang/rust/pull/43175 has landed, you can run the disabled wasm-exp builder locally to see what’s passing or failing. Make sure to have experimental-targets = "WebAssembly" in the [llvm] section of your config.toml if you try this.

FWIW, I concur with @brson: it is much easier to debug linking problems when you can run the linker as a separate tool.

Also, what would integration of linker buy us?
In the case of the assembler, the benefit is that we can avoid serializing the in-memory representation of machine instructions into a text format only to have the assembler immediately convert it back into a similar form. For linking, most of the inputs have already been serialized to files, so we wouldn’t be gaining nearly as much.

Thanks for the update @tlively. I agree with your assessment that we should continue deferring to emcc for projects that want to target an emulated POSIX environment.

Tangentially, I’ve heard from @wycats that he needs to be able to generate no_std wasm binaries that are very minimal and contain no more binary gunk than strictly necessary, and has had problems in the past doing so. ISTM that it should be possible to do a no_std build against emcc without pulling in any more of the emscripten runtime and glue than necessary, but I don’t know. A non-emscripten wasm target would also be good for that use case.

It's awesome to hear that there is interest in doing this! I'd personally love to see WebAssembly move from primarily porting Linux C++ apps to writing components natively for the Web.

I'd also be interested in generating minimal, no_std wasm binaries. I agree that a non-emscripten target seems sensible for this -- what steps are needed to get this working? @tlively I'd be happy to help out if there's anything I can do to help here.

The best way to generate wasm from Rust without emscripten right now is to use rustc to generate LLVM bitcode files, use llvm-link to link them together into one big bitcode file, use llc to compile that to the wasm .s format, then translate that to a wasm binary with Binaryen’s s2wasm. This is exactly the process that emscripten drives in the current wasm32-experimental-emscripten target, but doing it manually doesn’t get you any of libc, other fundamental functions, or JS glue. Although no_std code shouldn’t need libc, it will still need some of the other stuff that Emscripten provides.
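As a rough illustration of those four stages, here is a dry-run sketch that only builds the command lines and executes nothing. Everything in it is an assumption on my part rather than the exact invocation emscripten uses: the helper name is hypothetical, `--emit=llvm-bc` and `-march=wasm32` are the flags I would expect for rustc and llc, and the s2wasm output format depends on your Binaryen version, so check `s2wasm --help` before relying on it.

```python
import os

def manual_wasm_pipeline(crate_sources, out_stem, final_output,
                         target="wasm32-experimental-emscripten"):
    """Return argv lists for the rustc -> llvm-link -> llc -> s2wasm stages.

    Nothing is executed; the flags are assumptions to adapt to your toolchain.
    """
    commands = []
    bitcode_files = []
    for src in crate_sources:
        stem = os.path.splitext(os.path.basename(src))[0]
        bitcode = stem + ".bc"
        # Stage 1: have rustc emit LLVM bitcode instead of a linked artifact.
        commands.append(["rustc", "--target", target, "--emit=llvm-bc",
                         "-o", bitcode, src])
        bitcode_files.append(bitcode)

    linked = out_stem + ".linked.bc"
    # Stage 2: merge every bitcode file into one big module.
    commands.append(["llvm-link", "-o", linked] + bitcode_files)
    # Stage 3: lower the merged module to the wasm .s text format.
    commands.append(["llc", "-march=wasm32", "-o", out_stem + ".s", linked])
    # Stage 4: hand the .s file to Binaryen's s2wasm.
    commands.append(["s2wasm", out_stem + ".s", "-o", final_output])
    return commands

for argv in manual_wasm_pipeline(["hello.rs"], "hello", "hello.out"):
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the tools are on your PATH. Note that this covers only your own crates; as said above, it does not supply libc, the other runtime pieces, or any JS glue.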

I know @vadimcn is working on pulling out this critical runtime into a standalone package that would be usable without emscripten, but his work is based on the newer process that uses the LLVM backend and lld. Since this process is not yet stable and upstreaming a target that uses it would be a lot of work as long as Rust needs Fastcomp, none of this work is immediately upstreamable.

A standalone package of the necessary runtime components based on the current s2wasm process would satisfy this use case but would become obsolete as soon as lld wasm support is stabilized.

For this reason I plan to work in the meantime on getting wasm2asm working so Rust can support asm.js without depending on Fastcomp. This will make upstreaming the lld-based target and @vadimcn’s work easier.

For now I would continue using wasm32-unknown-emscripten for all your Rust to wasm needs. Emscripten should be pretty good about eliminating all unnecessary code from a no_std binary.

Hi everyone! My internship is over, so I just wanted to leave an update on the state of WebAssembly support in Rust and what I’ve been up to for the past few weeks.

Last time I left an update, I was ready to have the wasm32-experimental-emscripten target emit native WebAssembly object files, but there were a lot of things that needed to happen first. LLD’s WebAssembly support needed to stabilize, and Emscripten and Rust needed to upgrade to a recent version of LLVM. Although I had a prototype working locally, I couldn’t ask for upgrades to be merged in until LLD had stabilized. I was stuck, so I pivoted to working on something else. Specifically, I’ve been working on the wasm2asm tool in Binaryen. This tool translates WebAssembly to asm.js and will allow Emscripten to target asm.js using the LLVM WebAssembly backend instead of Fastcomp. This would break Rust’s dependency on Fastcomp, making it much easier to upgrade Rust’s LLVM in the future while still allowing rustc to target asm.js. The work is unfortunately not finished, but it’s much farther along than it was, and I’d like to keep working on it if I have time.

Meanwhile, progress has been made on other fronts. Rust updated to LLVM 5 while I was working on wasm2asm, and WebAssembly support in LLD is being actively discussed and reviewed. Some of the folks working on GHC, the Haskell compiler, have also been able to extract some of the JS runtime out of Emscripten. Overall, things are moving slowly but surely where we want them to go.

With that, I thought I would leave instructions for using the wasm32-experimental-emscripten target in case anyone wants to play around with it or hack on it. The instructions are a bit complicated and you will need to build Rust from source because it depends on functionality that is not built into LLVM by default and because Emscripten is highly sensitive to changes in your environment. These instructions are only known to work on Linux.

Build Rust

  1. Clone the Rust repo and check out master

  2. Write the following config.toml in the Rust top-level directory. The important parts are the wasm32-experimental-emscripten build target and the experimental-targets = "WebAssembly" LLVM option. You should probably set codegen-units equal to however many cores you have on your machine.

[build]
target = ["x86_64-unknown-linux-gnu", "wasm32-experimental-emscripten"]

[rust]
codegen-units = 48
debug-assertions = true

[llvm]
assertions = true
experimental-targets = "WebAssembly"
optimize = false
ccache = true
  3. Compile Rust (as quickly as possible). Or do a full build if you don’t mind waiting.
cd rust
./x.py build --stage 1 src/libtest
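Since the config above hard-codes codegen-units = 48, here is a small sketch that renders the same config.toml with the core count of whatever machine it runs on. The function name and its parameters are hypothetical; the keys and values are copied from the config above.

```python
import os

def render_config(host="x86_64-unknown-linux-gnu", cores=None):
    """Render the config.toml shown above, sizing codegen-units to this machine."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    lines = [
        "[build]",
        'target = ["%s", "wasm32-experimental-emscripten"]' % host,
        "",
        "[rust]",
        "codegen-units = %d" % cores,
        "debug-assertions = true",
        "",
        "[llvm]",
        "assertions = true",
        'experimental-targets = "WebAssembly"',
        "optimize = false",
        "ccache = true",
        "",
    ]
    return "\n".join(lines)

print(render_config())
```

Redirect the output into config.toml in the Rust checkout, or write it out with `open("config.toml", "w")`.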

Set up environment

  1. Download the last known good build of the wasm binaries by using or following this script:
BUILDNO=$(curl -fL https://storage.googleapis.com/wasm-llvm/builds/linux/lkgr.json | jq '.build | tonumber')
curl -sL https://storage.googleapis.com/wasm-llvm/builds/linux/$BUILDNO/wasm-binaries.tbz2 | tar xvkj

This will ensure you have Emscripten and all the LLVM tools it needs to target wasm using the native LLVM backend by downloading them directly from the WebAssembly testing infrastructure.

  2. Install a wasm-ready Node.js, such as Node 8

  3. Update PATH

export PATH=/path/to/node-v8.1.2-linux-x64/bin:/path/to/wasm-install/emscripten:/path/to/wasm-install/bin:$PATH

It is important that you have wasm-install/emscripten before wasm-install/bin in your path, otherwise Emscripten will not pick up the correct configuration file.

  4. Write the emscripten configuration file ~/.emscripten. You must use full paths. Note: only the wasm32-experimental-emscripten target will work properly with this emscripten config, not wasm32-unknown-emscripten or asmjs-unknown-emscripten. This file points Emscripten to all of the other tools it depends on.
EMSCRIPTEN_ROOT = '/path/to/wasm-install/emscripten'
NODE_JS = '/path/to/node-v8.1.2-linux-x64/bin/node'
LLVM_ROOT = '/path/to/wasm-install/bin'
BINARYEN_ROOT = '/path/to/wasm-install'
COMPILER_ENGINE = NODE_JS
JS_ENGINES = [NODE_JS]
  5. Populate the emscripten caches
echo 'int main(){}' > a.c
emcc a.c -s WASM=1
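Two of the setup steps above are easy to get subtly wrong: the order of the PATH entries and the requirement that ~/.emscripten use full paths. Below is a sketch of a sanity check for both. The function names are hypothetical, and the second one relies on ~/.emscripten being a Python file, so it executes the config text; only run it on a config you wrote yourself.

```python
import os

def emscripten_precedes_bin(path_value, emscripten_dir, bin_dir):
    """True if emscripten_dir appears before bin_dir in a PATH-style string."""
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    try:
        first = entries.index(os.path.normpath(emscripten_dir))
        second = entries.index(os.path.normpath(bin_dir))
    except ValueError:
        return False  # one of the two directories is missing from PATH
    return first < second

def relative_config_paths(config_text):
    """Names of string settings in the config that are not absolute paths."""
    settings = {}
    exec(config_text, {}, settings)  # the config file is plain Python
    return sorted(name for name, value in settings.items()
                  if isinstance(value, str) and not os.path.isabs(value))

if "PATH" in os.environ:
    ok = emscripten_precedes_bin(os.environ["PATH"],
                                 "/path/to/wasm-install/emscripten",
                                 "/path/to/wasm-install/bin")
    print("PATH order ok:", ok)
```

To check the config, pass the file contents, for example `relative_config_paths(open(os.path.expanduser("~/.emscripten")).read())`; an empty list means every string setting is an absolute path.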

Compile and Run

  1. Compile a Rust program, hello.rs, to wasm
rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc --target wasm32-experimental-emscripten hello.rs
  2. Finally, run it with node
node hello.js

It will fail due to LLVM bug #33824, which affects LLVM bitcode produced by rustc but not clang, and which I never had time to fix. This bug will become a non-issue once native WebAssembly exception handling is implemented.

Thanks so much for all your work on this @tlively! I look forward to the day that we no longer have to pull in Emscripten and we can get even leaner wasm executables!

Out of curiosity, is there a good forum/issue/etc to follow for all this? I’d be curious to kick some tires when things are “ready” but I’m not sure how to get notified when things have reached such a state.
