State of WebAssembly and Rust?

The three most important things it does:

  1. Sets up emscripten on linux
  2. Generates the still-necessary wrapper js/html to load the rust code in a browser or node.
  3. Using (2), provides cargo test and cargo start (roughly cargo run).

I've only used stdweb as a consumer, so I don't know that I can say exactly what would improve it. However, compare it to rust-webplatform, which uses closures over JSRefs to avoid clones and unnecessary serialization, but is generally ergonomically unusable and not very rusty. I'm not sure what the solution is, but it seems like opaque types and some way to tell rust that a value is garbage collected could improve the ergonomics/speed/safety trade-off substantially.

The biggest improvement would be to get a standard JSObject ABI, but that's out of Rust's hands.
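To make the "opaque types" idea a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: `JsHandle`, `HandleTable` and their methods are made-up names, and the table is a plain Rust `Vec` standing in for a JS-side array of live objects. It only illustrates passing an index around instead of cloning or serializing the value; it does not address garbage collection.

```rust
/// Opaque reference to a value that lives on the JS side.
/// Hypothetical: Rust only sees an index, never the value itself.
#[derive(Debug, PartialEq)]
pub struct JsHandle(u32);

/// Stand-in for a JS-side table of live objects (hypothetical).
/// Strings play the role of arbitrary JS values here.
pub struct HandleTable {
    slots: Vec<Option<String>>,
}

impl HandleTable {
    pub fn new() -> Self {
        HandleTable { slots: Vec::new() }
    }

    /// Stores a value and hands back an opaque handle to it.
    pub fn insert(&mut self, value: String) -> JsHandle {
        self.slots.push(Some(value));
        JsHandle((self.slots.len() - 1) as u32)
    }

    /// Looks a value up by handle without cloning it.
    pub fn get(&self, handle: &JsHandle) -> Option<&str> {
        self.slots.get(handle.0 as usize)?.as_deref()
    }

    /// Consumes the handle, so it cannot be used again after release.
    pub fn release(&mut self, handle: JsHandle) {
        if let Some(slot) = self.slots.get_mut(handle.0 as usize) {
            *slot = None;
        }
    }

    /// Number of values that have not been released yet.
    pub fn live(&self) -> usize {
        self.slots.iter().filter(|s| s.is_some()).count()
    }
}
```

Because `JsHandle` is neither `Clone` nor `Copy`, `release` taking it by value means the compiler rejects any later use of that handle.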


Ok, thanks for the info! So it sounds like we basically need to enable some mechanism which translates to the --pre-js argument to emscripten, and this will help populate emscripten’s table of “functions you can call via asmjs”, right?

If so, would something like this work?

#[link(name = "foo", kind = "js")] // searches for `foo.js`
extern {
    pub fn foo();
}

When compiling that it’ll eventually translate to --pre-js foo.js passed to the emscripten linker? (bundling the js in the rlib if it’s an intermediate artifact).

One other question I’d have, what does this look like on the JS side? Is it as simple as:

function foo() {
    // ...
}

or do you need other fancy pieces?

Also, it sounds like this would be an asmjs-only feature for now? Or does the emscripten runtime/compiler handle --pre-js in both wasm/asmjs modes?

The "real things" documentation is mostly lacking. On the Rust side, there are these commands to get Emscripten and compile "hello world". But these commands seem to be copied everywhere more or less. In this thread further above someone mentioned that one can build Emscripten with LLVM in release mode. I didn't know that despite having spent quite some time on the general topic (I assumed it is already built in release mode). Thus I always ended up with those 25 gigs of artifacts.

But this also extends to general documentation about the topic. I read the wasm specs and knew about how imports/exports work in theory. I was even able to compile a minimal library with minimal wasm output. But with this foundation knowledge I wasn't really able to understand what the Emscripten runtime is doing. "How is stuff done in real life?"

Exactly, me neither!

By runtime, I mean the JS part (and I think that this is the common term for it). The JS part is responsible for loading the wasm module, preparing an instantiation environment (which defines all external references/symbols) and instantiating the module (plus optionally calling the main function). If I remember correctly, I wasn't able to reduce the size of that JS file below 1 MB (trying various compiler flags I randomly found on the internet or just guessed).

And again: I only have a vague idea what the Emscripten runtime is doing. As Tomaka said, it emulates a lot of features to make many programs "just work". So the "minimal Rust runtime" would just get rid of all that. We can just require by default that the Rust program doesn't use features not available on the web. Hence we don't have to emulate them and increase the runtime's size. (Having an option to opt-in would be nice, but is not as important I think). Also, partly due to Emscripten's huge runtime, instantiating a wasm module actually takes really long. I can't quite remember, but I think it was at least in the three digit milliseconds. Maybe a second or more?

The "libstd with I/O returning errors" thingy is not part of the runtime (as I (and hopefully everyone) use the term). It's just part of the compilation and linking process which ends up as wasm (not js).


That would be good I think.

I think it's as simple as that.

If I remember correctly, passing parameters is not totally straight-forward, but is not really complicated either.

Note that we may want to ask kripken before doing anything. Maybe I'm lacking some knowledge here.

I think that --pre-js will work because for now the .wasm file you obtain when compiling for wasm is strongly tied to the .js file that is created alongside with it. But I'm really not sure at all about that.

Again it is still blurry as to how wasm will properly handle WebIDL in the future.


The default emscripten runtime is definitely optimized for C/C++ code and in particular POSIX code, so you may be getting a bunch of code you don’t need for Rust.

But it’s also modular. If we can say that Rust doesn’t need filesystem emulation, and doesn’t need streaming I/O (i.e. it can avoid the libc I/O methods and the streaming syscalls they are built on), then most of the default runtime would not be included.

If we can go further and say that Rust doesn’t need libc or other C/POSIX/etc. stuff at all, that would be even better and smaller. A core primitive emscripten provides in C/C++ are EM_ASM blocks, which contain JS code (that the compiler hooks up so that they are called where the EM_ASM is, and params are passed to and back, much like inline assembly, but it isn’t literally inline). If rustc had something like EM_ASM, then all the glue interaction with the page could be done through that, without the need for C system libraries and their glue. This could be super-compact :slight_smile:

I think a big question is what Rust wants to do here. For C/C++ we support POSIX and files etc. by default, given the need to port existing large apps. The sentiment in this thread suggests in Rust maybe the focus should be on a minimal runtime instead? Easy to do on the emscripten side once we define what we want.


I’ve been using webassembly and rust for a couple months now, I’ll explain my usecase and some annoyances.

Currently, the plan of what I am doing is to run web assembly inside a worker, which handles encryption to idb, indexing of data and basically most of the logic of the web app.

The way I do this is strange. I essentially have a web assembly module with a global singleton upon which functions can be called. The javascript calls the functions on the singleton, which return commands (io or other) the javascript should execute, and then the javascript can call the web assembly again after executing the commands.

The reasoning behind this approach is that handling indexeddb, fetch, websockets and worker messages with webassembly seems implausible at this time. Also, a huge benefit is testability, because you can mock anything. This also keeps web assembly as small as possible.
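The shape of that pattern can be sketched roughly as follows. This is not the poster's code: `App`, `Event`, `Command` and their variants are hypothetical names chosen for illustration. The point is that the wasm side never does I/O itself; it only turns incoming events into commands for the javascript to run.

```rust
/// I/O requests the wasm side hands back to javascript (hypothetical set).
#[derive(Debug, PartialEq)]
pub enum Command {
    Fetch { url: String },
    Store { key: String, bytes: Vec<u8> },
}

/// Things javascript reports to the wasm side (hypothetical set).
#[derive(Debug, PartialEq)]
pub enum Event {
    SyncRequested { url: String },
    FetchDone { url: String, body: Vec<u8> },
}

/// State that would sit behind the global singleton.
#[derive(Default)]
pub struct App {
    pub fetched: Vec<String>,
}

impl App {
    /// Takes one event and returns the commands javascript should execute next.
    pub fn handle(&mut self, event: Event) -> Vec<Command> {
        match event {
            // A sync request turns into a fetch that javascript performs.
            Event::SyncRequested { url } => vec![Command::Fetch { url }],
            // A finished fetch is recorded and its body is handed back for storage.
            Event::FetchDone { url, body } => {
                self.fetched.push(url.clone());
                vec![Command::Store { key: url, bytes: body }]
            }
        }
    }
}
```

Since `handle` touches nothing outside `App`, a test can feed it events and compare the returned commands directly, with no browser or mock server involved.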

The way I communicate is via a JsBytes struct, and serializing with a custom binary type to rust and from rust.

So far, this has worked well and the dom can actually treat the worker like a server, which is fantastic, and since the worker is implemented in web assembly, any computation heavy operations, like compression, decompression, encryption (which I do all except encryption and decryption so far) are fast.

The annoyances and pet peeves have been so far:

  • Actually sending data between webassembly and js (or ReasonMl for me). After figuring out and getting the pattern down it is nice, however initially figuring it out was a pain.
  • Startup time of web assembly, which takes several hundred milliseconds.
  • Needing the file system when using just the snappy compression lib (idk why)
  • Compile times on the first compile (70 seconds), rest of the time it is 7ish seconds.
  • Using multiple node versions (see: this), because I use a node dev server and emscripten
  • Maybe more.

Overall however, after figuring out how to transfer data between the two, it isn’t too bad and I get many benefits with the web assembly and running it in the worker.

One nice thing is that I created a custom binary serialization format, for ReasonMl and Rust, it takes some rust definitions of structs and enums and creates a reason and rust file that will serialize to and from those structs. Reason/Ocaml and Rust both have nominal typings and similar enums, so they are a great fit.

For example defining this in rust:

enum Pet {
  Cat(String),
  Dog(String)
}

struct Person {
  name: String,
  age: i32
}

Will generate a module in reason:

module Pet = {
  type t =
    | Cat string
    | Dog string;
  let write v w => ...
  let toBytes v => ...
  let read r => ...
  let fromBytes arr => ...
};

module Person = {
  type t = {
    name: string,
    age: int,
  }
  let write v w => ...
  let toBytes v => ...
  let read r => ...
  let fromBytes arr => ...
}

This will also generate rust types which have similar methods.
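As a rough idea of what such generated Rust methods could look like for Person, here is a sketch. The byte layout is an assumption made up for this example (u32 little-endian name length, UTF-8 name, i32 little-endian age); the actual format used by the tool is not described in this thread.

```rust
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
pub struct Person {
    pub name: String,
    pub age: i32,
}

impl Person {
    /// Assumed layout: u32 LE name length, UTF-8 name bytes, i32 LE age.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(4 + self.name.len() + 4);
        out.extend_from_slice(&(self.name.len() as u32).to_le_bytes());
        out.extend_from_slice(self.name.as_bytes());
        out.extend_from_slice(&self.age.to_le_bytes());
        out
    }

    /// Returns None on truncated input or invalid UTF-8.
    /// Trailing bytes after the age field are ignored.
    pub fn from_bytes(bytes: &[u8]) -> Option<Person> {
        let len_bytes: [u8; 4] = bytes.get(0..4)?.try_into().ok()?;
        let len = u32::from_le_bytes(len_bytes) as usize;
        let name_end = 4usize.checked_add(len)?;
        let name = std::str::from_utf8(bytes.get(4..name_end)?).ok()?;
        let age_end = name_end.checked_add(4)?;
        let age_bytes: [u8; 4] = bytes.get(name_end..age_end)?.try_into().ok()?;
        Some(Person {
            name: name.to_string(),
            age: i32::from_le_bytes(age_bytes),
        })
    }
}
```

A round trip through `to_bytes` and `from_bytes` should give back an equal `Person`, and the resulting `Vec<u8>` is the kind of payload that would then be handed across the boundary.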

For transferring the bytes that get generated, I used JsBytes which is here.

That makes it pretty simple to transfer data, without the context switch of a completely different type system.

That's all.


Thanks for the info @kripken! This is indeed on my ever-expanding list of things I’d love to explore :slight_smile:

I think for Rust we’ll definitely have a different focus than C/C++ in the sense that I doubt anyone’s porting massive existing codebases in Rust over to the web. Rather I’d expect it to be much more common that new codebases are written for the web! Along those lines I think our main goal in Rust should get “most code” working “most of the time”. That, to me at least, means that libstd compiles and can be linked against and looks like the Linux standard library (mostly).

Note that the standard library here is in contrast to our core library which is the lowest-level library in Rust with various primitives. Things that the standard library has which core doesn’t are:

  • Collections - vectors, hash maps, etc.
  • Pointers - Box, Arc, Rc, etc
  • OS integration - filesystems, TCP, UDP, threads, etc.

I think it’s critical to get collections/pointers working for “most code” to work as-is with the web. The OS integration, however, I think is fine to exist but otherwise just return errors (aka thread::spawn panics). I’d hope that all we need from the emscripten runtime is the ability to allocate/deallocate memory (but we could bring our own if we needed).
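A small sketch of what "exists but just returns errors" could mean in practice. `read_config` is a hypothetical stand-in for an OS-backed call, not an actual libstd function, and the choice of `ErrorKind::Unsupported` is an assumption for illustration.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical stand-in for a filesystem call on a target with no filesystem:
/// it compiles and links, but always reports an error at run time.
pub fn read_config(_path: &str) -> io::Result<String> {
    Err(io::Error::new(
        io::ErrorKind::Unsupported,
        "no filesystem on this target",
    ))
}

/// Collections only need an allocator, so code like this works unchanged.
pub fn count_words(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Portable code can treat the error like any other I/O failure.
pub fn load_or_default(path: &str) -> String {
    read_config(path).unwrap_or_else(|_| String::from("default"))
}
```

The upside of this approach is that existing crates keep compiling; callers that already handle `io::Error` degrade gracefully instead of failing to build.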

Does that make sense in the world of emscripten? Are there perhaps flags we could be passing to emscripten to enable something like this? Or for others reading this, are other parts of the standard library critical to get working?

In terms of EM_ASM, that’s a pretty neat trick! I wonder, is this built with the same facilities as the js! macro mentioned in the stdweb crate above? Looking at that crate it looks like it’s calling functions like emscripten_asm_const_int, which I guess are “emscripten specific intrinsics”?

If what stdweb is doing isn’t the same, how is EM_ASM implemented in C/C++? I’m curious how we might best implement it in Rust. Presumably it’s something we could add to crates.io or even put in libstd!
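For readers wondering what calling an Emscripten-provided function from Rust looks like at all, here is a sketch. Note that it deliberately uses a different function than the one discussed above: `emscripten_run_script_int` from Emscripten's emscripten.h, which evaluates a JS string at run time, rather than the `emscripten_asm_const_int` intrinsic behind EM_ASM. The wrapper `eval_js_int` and the non-Emscripten fallback are assumptions added so the snippet also builds on other targets.

```rust
use std::ffi::CString;

#[cfg(target_os = "emscripten")]
mod sys {
    use std::os::raw::{c_char, c_int};

    extern "C" {
        // Declared in Emscripten's emscripten.h; evaluates a JS string.
        pub fn emscripten_run_script_int(script: *const c_char) -> c_int;
    }
}

/// Evaluates `script` as JavaScript and returns its integer result.
/// Returns None when not built for Emscripten, or when `script`
/// contains an interior NUL byte.
pub fn eval_js_int(script: &str) -> Option<i32> {
    let c_script = CString::new(script).ok()?;
    run(&c_script)
}

#[cfg(target_os = "emscripten")]
fn run(script: &CString) -> Option<i32> {
    // Safety: `script` is a valid NUL-terminated string for the whole call.
    Some(unsafe { sys::emscripten_run_script_int(script.as_ptr()) })
}

#[cfg(not(target_os = "emscripten"))]
fn run(_script: &CString) -> Option<i32> {
    // No JS engine to talk to on other targets.
    None
}
```

Unlike EM_ASM, this passes the JS source as a runtime string, so it is closer to `eval` than to inline assembly; it is shown only because its signature is simple.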


Emscripten tries very hard to look like a linux machine. It actually migrated from musl being the compatibility layer, to the linux syscall interface being the compatibility layer. So things are implemented in terms of file descriptors etc! As @kripken notes, you could do away with a lot of this and end up with just the core stuff. Indeed, this quote is exactly the question I was raising:

After thinking about it a bit, I think there are two distinct use-cases. First is compiling existing applications to the web, possibly interacting with C code and so on, "the web as a desktop" - I believe this is best served by emscripten itself, mainly because this is such a messy area and you'd end up just recreating the entirety of emscripten. Second is 'greenfield' development, where you're targeting "the web as the web". This is where Rust can win I think, with a libwebstd, possibly the one linked upthread. I would like to move away from libstd though since it has a lot of ties to desktop platforms (a minimal libwebstd would probably make libcollections and liballoc available though) and end up with effectively zero support code. The flipside is you lose support of a ton of Rust libraries, so this is tricky.

For people unfamiliar with emscripten, you can see an example of EM_ASM here and the results you can get from such a simple layer by clicking the top 'Run' button on this page (I confess, this is the only C->JS experience I have, my work has been mostly the other way, from the "web as a desktop" perspective).


I was porting a program that used rusqlite, and thereby needed libsqlite3.

This sounds like we want two different targets. Something like the already existing wasm32-*-emscripten targets plus an additional target with a lightweight/minimal runtime.

With my limited understanding of the ecosystem, I think there are three main parts to this (please correct me if I'm wrong!):

  1. Codegen (Rust to WASM code): can be done by the Emscripten asm.js2wasm transpiler or by the LLVM backend. I think everyone agrees that we want to use the latter in the future (right?). There is already a new experimental target in rustc to use the LLVM backend.
  2. Linking (different modules which come out of codegen, including libcore/libstd): can also be done by Emscripten or by the LLVM linker. The latter is much less usable than the wasm backend, I heard.
  3. Runtime: the JS part which defines the environment for the generated code. Here it seems like the user wants to choose between two alternatives: the "full emulated desktop environment" (which Emscripten provides) or the "minimal browser environment".

I hope all of the above is correct and that it is actually useful to this thread :scream:

Yes, if all you need from system C libraries is malloc/free, then it should already just include that (and not libc or anything else - it only pulls those in if it sees unresolved symbols for something in libc).

Note though that you almost certainly do need some of emscripten's JS runtime code, to do things like load the wasm file, allow running in node.js and not just on the web, set up the stack, set up threads (in asm.js for now, in wasm when it gets threads), JS string utilities, etc. - things which wasm can't do yet. We can make sure that's as minimal as Rust wants (using options like EXPORTED_RUNTIME_METHODS we can set different defaults than C/C++'s).

Rust can also roll its own JS code if it wants for all those, up to you. (but I'd recommend reusing code, e.g. for loading the wasm module it's beneficial to use streaming when available and emscripten's JS does that for you, etc.)

Yeah, I didn't realize it before, but indeed it does look like it's providing the same basic functionality. And those emscripten intrinsics are how we implement EM_ASM internally. Nice :slight_smile:


@bwasty began writing a glTF validator that compiles to WebAssembly. It works OK.


gltf-validator-web

Live demo


That's nice to know. My complaint above is not justified, then.

I'm working on a Tokio / Hyper based on stdweb + http + futures right now.

Here's an early version:

Also I've been working with Rust's emscripten backends for over a year now and have ported our speedrun timer to TypeScript + React + Rust which is quite a large project by now. Interestingly, I haven't really experienced any problems with it whatsoever. This is probably mostly because in the Rust library I don't actually use any emscripten APIs at all for now. I merely expose a C API there from which I auto generate a high level TypeScript binding (and actually high level bindings for 10 other languages) that I then use with the rest of the code. This works extremely well. Unfortunately this approach has the problem that an object getting collected in JavaScript doesn't clean up its native Rust counterpart, so it's fairly easy to leak memory this way.
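The leak described above follows from how a C API typically hands out objects. A minimal sketch, with `Timer` and the three functions being hypothetical names rather than the actual library's API:

```rust
/// Hypothetical stand-in for a library object exposed through a C API.
pub struct Timer {
    splits: Vec<f64>,
}

/// Allocates a Timer and gives ownership to the caller as a raw pointer.
/// A real export would also need a no-mangle attribute so the symbol
/// keeps its name; it is left out here to keep the sketch edition-neutral.
pub extern "C" fn timer_new() -> *mut Timer {
    Box::into_raw(Box::new(Timer { splits: Vec::new() }))
}

/// Records a split and returns how many splits exist so far.
/// Safety: `timer` must come from `timer_new` and not be freed yet.
pub unsafe extern "C" fn timer_split(timer: *mut Timer, seconds: f64) -> usize {
    let timer = unsafe { &mut *timer };
    timer.splits.push(seconds);
    timer.splits.len()
}

/// Frees the Timer. Nothing calls this automatically: if the JS wrapper
/// object is garbage collected first, the Rust allocation is leaked.
/// Safety: `timer` must be null or an unfreed pointer from `timer_new`.
pub unsafe extern "C" fn timer_drop(timer: *mut Timer) {
    if !timer.is_null() {
        drop(unsafe { Box::from_raw(timer) });
    }
}
```

The generated high-level binding has to call the equivalent of `timer_drop` explicitly (for example from a `dispose` method), because collecting the JS wrapper does not run any Rust destructor.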

Since my codebase now needs to dispatch some HTTP requests, I've been working on this Tokio / Hyper for the browser so I can make my code stay portable across the browser and other more native platforms.

I believe in the long run it may make sense to merge the more common parts of stdweb into std::os::web (or maybe std::os::emscripten) where the compiler can maybe help provide some even better APIs.


For me, it's just how to get a "hello world" running. (I bet I also have questions about "how to do real things" but I've not gotten there yet). I'll give two concrete examples:

After getting emsdk-portable installed, I had to manually modify my $PATH to make 'emcc' available to rust. In the past, I used the copy in emsdk-portable/emscripten/incoming/, but that yielded this error from emscripten:

ERROR:root:Emscripten, llvm and clang repo versions do not match, this is dangerous (1.37.22, 1.37.13, 1.37.13)

This wasn't a terribly hard problem to fix, but I didn't really like having to guess how to modify my $PATH, and it made me wonder if there is some other tool I should be using to manage this (maybe cargo-web does this?)

The second example comes after I got a "hello world" compiled with the wasm target, and I tried to run it:

$ node ./target/wasm32-unknown-emscripten/debug/wasm-test.js

failed to asynchronously prepare wasm: Error: ENOENT: no such file or directory, open 'wasm_test-31e4389890229959.wasm'

At this point, I feel like I'm stumbling in the dark and unsure of the problem (user error? wrong version of node? problem with emscripten? problem with my rust code?) and unsure where to get help.


That's a cargo bug (which according to Alex seems to have been fixed a couple of days ago). Your wasm file is in the target/wasm32-unknown-emscripten/release/deps folder. It doesn't get copied out properly due to the hash being there.


I also was frustrated with Emscripten installation, especially the weird bugs that sometimes crop up on macOS, and so just published the rust-web-inspired wargo for macOS and Linux. It automatically checks dependencies, installs emcc, configures environment variables, fixes some of those weird bugs, and can run Rust tests in real browsers using Sauce Labs or Selenium!


Hi,

I looked at using Wasm a few months ago but was not even able to properly install Emscripten. After seeing this post I tried again. First of all, I agree with the messages above: the main pain points for me are the installation of the environment and lack of documentation. I have trouble even running my project so higher level APIs or code size are not a concern for me yet.

My use case is to use Rust to compile a library usable in Javascript, both in Node and the browser.

The Medium post Get Started with Rust, WebAssembly, and Webpack helped me but the Emscripten instructions did not work for me (emcc -v reported errors about fastcomp and a wrong version of clang, it seemed to pick my system clang instead of the one it had built). Even the instructions on the Emscripten website did not seem to be complete. They say to install and activate latest and then configure the environment, but after this step emcc wasn’t even in my PATH (there was no compilation, emscripten wasn’t in the list of installed tools when running emsdk list). I guess that latest referred to the SDK “installer” instead of the actual tools, but it’s confusing.

By mixing the instructions both in the article and Emscripten website, I managed to install Emscripten with:

# Make sure that the `emsdk-portable` directory is in the $PATH before running these commands
emsdk update
emsdk install latest
emsdk activate latest
emsdk install sdk-incoming-64bit
# Activate was missing in the article
emsdk activate sdk-incoming-64bit
# This command was also missing, not sure if it's needed, it seems to just set the env variable `EMSDK`
source emsdk_env.sh

Once this was configured, I was able to compile and run the Hello world example of the article as well as the Wargo loader example (it’s almost the same thing).

Starting from there, I wanted to experiment with building a small library. My goal is to port my SWF parser. I have both a Typescript and Rust implementation, I’d like to use my Rust implementation in Node or the browser using Wasm, with the same API as the Typescript version. Basically, I want to learn how to write a library able to swap a JS implementation for a Wasm implementation and not break its public interface. In my example, I’d just like to start with a JS interface exposing a single async function parse that takes an ArrayBuffer and returns a JSON string representing the AST. I’d expect it to be not too hard but have no idea how to do it: it seems just a level above the add function in the hello-world example. I am not using any system-dependent features, the API only uses buffers and strings.
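On the Rust side, a "buffer in, string out" function over a C ABI could look roughly like the sketch below. This is an illustration under assumptions, not a tested recipe for Emscripten: `parse_to_json` is a placeholder that only reports the input length instead of producing a real AST, `free_string` is a hypothetical companion function, and the JS glue that copies the ArrayBuffer into wasm memory and wraps the call in an async function is not shown.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Placeholder for the real parser: reports only the input length as JSON.
fn parse_to_json(input: &[u8]) -> String {
    format!("{{\"bytes\":{}}}", input.len())
}

/// C-ABI entry point. The caller passes a pointer and length for the input
/// bytes and receives a NUL-terminated JSON string it must later free.
/// Safety: `ptr` must be null or point to `len` readable bytes.
pub unsafe extern "C" fn parse(ptr: *const u8, len: usize) -> *mut c_char {
    let input: &[u8] = if ptr.is_null() || len == 0 {
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(ptr, len) }
    };
    CString::new(parse_to_json(input))
        .expect("generated JSON contains no NUL byte")
        .into_raw()
}

/// Gives the string back to Rust so its memory can be released.
/// Safety: `s` must be null or a pointer previously returned by `parse`.
pub unsafe extern "C" fn free_string(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}
```

The pairing of `parse` with `free_string` is the design choice worth noting: the string is allocated by Rust, so it has to be released by Rust as well, and the JS wrapper would need to call both.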

Here are the questions I have regarding this sort of project, the Medium (and others) example and Wasm in general.

  1. How do I build a Wasm library? I just want a lib.rs instead of a main.rs.

    If I rename main.rs, remove the main function and update "index.js", cargo does not even compile any js file. cargo build --target=wasm32-unknown-emscripten does not even produce a .wasm or .js file. This is maybe related to this issue.

    A pattern I’ve seen on a few websites is to use a main.rs but with an empty main function and the other public functions in this file. This seems like a hack but at least it allows me to expose functions. Still, when running it in my browser I get a warning:

    exit(0) implicitly called by end of main(), but noExitRuntime, so not exiting the runtime (you can use emscripten_force_exit, if you want to force a true shutdown)
    

    This seems to be caused by {noExitRuntime: true} but there’s no explanation about this option. If I remove it, it crashes. Does main have to run to invoke the other exported functions?

  2. What does the webpack’s loader do? How can I use wasm without it? How can I use wasm in Node?

    Why does Webpack run the cargo compilation? Can’t I compile my project to wasm beforehand and then use const wasm = require("./my_lib.wasm")? I know that it needs a JS glue for the conversions, but can’t it just let me import the .wasm files and inject the glue automatically? What if I have multiple Wasm modules? Do I need a JS wrapper for each wasm file? Is it possible to load a Wasm file manually without the JS wrapper? I’ve seen WebAssembly.instantiate and new WebAssembly.Module on MDN but it’s lacking complete examples. What does the importer argument for WebAssembly.Module even do? How do I import a Wasm module in Node, do I still need Webpack?

  3. How to structure my code?

    In my ideal world, I would like to publish my library to npm and let others use it without even knowing that it uses Wasm. Did somebody ever do this? How do you add a fallback to asm.js? Does it play well with bundlers used by the consumers of the library? Are there some gotchas when designing the JS API such as memory leaks or initialization that must be done by the consumer? What about module formats, commonJs and ES2015 modules?

  4. How do I maintain Typescript definitions? Typescript is a god-send for JS. Is it possible to emit type definitions for Wasm modules? @CryZe mentioned that some autogeneration is possible. Are there some examples? Is there some documentation?

I don’t expect answers for all these questions. I just wrote them down so you can get an idea of what kind of problems I encounter as a newcomer to Wasm. I really miss an official source documenting how to use Rust for Wasm. I feel as if all we have to work with are some hello-world examples spread across the internet. A Github repo with a book and examples that could be maintained by the community would be great.


Hello world went well once emscripten compiled, but for me, Rust in WebAssembly can’t really shine until we can use multiple threads. That’s Rust’s killer feature, and bringing it to the web will be awesome.

Yep, the two things I mentioned there are live. We spoke at RustFest about parity-wasm which is semi-related: the WASM interpreter is functional, although still WIP. Minimal examples of smart contracts written in Rust seem to work, although we haven’t yet tried much more complicated.
