Wasm llvm upstream target: Implementing stubbed out functionality inside libstd


#1

Meta: This is not an RFC, as I don’t believe the change needs one. Instead it is just a proposal, and if the necessary parties agree on it, we should implement it. If you disagree, see this as a pre-RFC. I’m open to opening an RFC :).

cc @koute @alexcrichton

Motivation

The current standard library for the wasm32-unknown-unknown target has most of its functionality stubbed out. Wasm has no pthreads support yet, so some functionality has to remain stubbed out for an unknown period of time. But it is possible to provide support for other functionality, like println!, panic messages, or obtaining the time, with current Web APIs. All of these require us to call into JS, however, and here is the problem: the unknown target has no official mechanism for crates to say “these js functions need to be provided as imports”.

The js! macro by the stdweb crate provides the needed functionality: it allows Rust code to specify the js code it needs. cargo-web then combines all the js code of the used Rust crates.

As I understand it, stdweb achieves this by putting the js code as string literals into the generated web assembly file; cargo-web then parses the web assembly file and extracts the literals.

Resulting story for users

println! and other basic functionalities provided by std will work like on other targets, you can just use them.

Maybe in the future cargo itself can provide support, but I want to keep my proposal minimal, so for the time being you’d be required to use cargo-web if you want to use std together with the wasm32-unknown-unknown target. cargo doc, cargo check etc. should all work without cargo-web integration.

If you know of any use cases where you want std but don’t want it to have working functionality, please speak up! Then we can push the current stubbed-out but dependency free std to crates.io.

This move will help to establish cargo-web as a standard tool. This allows third party OS abstraction crates like rand or glutin to rely on its presence and use the js! macro themselves to get e.g. seeding from js. I think this is the most important benefit of my proposal.

Detailed changes

Basic idea: std should be using the js! macro itself to provide support for println! etc.

Changes for stdweb:

The main functionality of the js macro seems to be provided through the webcore module. Therefore, I think the webcore module should be factored out of stdweb into a separate crate/module that only depends on std itself. stdweb itself would then depend on that crate.

Changes for upstream Rust

There are two alternatives here: either add webcore to the sysroot as a full crate and do extern crate webcore; in std’s lib.rs, or use it as a non-public submodule of std. I guess with both approaches you’d include webcore as a submodule of the tree.

The two approaches each have their own advantages and disadvantages.

  • extern crate webcore; requires webcore to be #![no_std] and webcore is not able to use functionality from std.
  • the submodule approach is a bit more ugly

If it’s possible to make webcore not use std, one should go with the extern crate route.

Note that webcore would always remain an implementation detail of the standard library: you would still have to do webcore = ... or stdweb = ... inside your Cargo.toml if you wanted to use the js macro.

Once this is done, the test runners should be updated to use cargo-web instead of cargo.

Creation of embedding specific targets

As @rpjohnst and @shepmaster suggest, we should create several targets based on wasm32 for different environments that each differ in their libstd but are otherwise unchanged. E.g. wasm32-rust-web for client side usage in web browsers, wasm32-rust-node for usage with node.js. “Non-web embeddings” (plugin engines for example) could continue to use the unknown-unknown target.

I suggest we start with a -web target.

Use by js templating tools

There are templating tools like webpack in the js world you can use to create client side artifacts. These tools should now use cargo-web instead of cargo and include the js emitted by cargo-web into their process.


#3

I see hooks like panic as a better way to provide the functionality.

The js! macro looks minimal to use; the code in stdweb+cargo-web to get it working is far from it. If you don’t write custom JavaScript code you get a sub-optimal solution. The code is rewriting the generated wasm to inject a memory growth handling call. I wonder if something similar could be done for stubbed out functions.

println! should allow customisation on the web, i.e. console.log is not always the only answer.


#4

Thanks for the proposal @est31! That said, I must agree with @jonh in that the code to get the js! macro to work is far from simple (it’s far more complex than I’d like it to be!), and I think that having std depend on it as-is is not the way to go. (There’s just too much stuff and complexity in there.)

Yes, std itself should use something like the js! macro, but in a significantly more cut-down fashion. (So even more minimal than what you propose.) Basically, I think something like this would perhaps be reasonable:

  1. Compiler would supply a builtin basic_js! (or something like that) akin to the js! macro in embed_js - basically a minimal macro which corresponds 1-1 to what WebAssembly supports. (So no string serialization, etc.) Every call to basic_js! would generate a unique function import in the resulting WebAssembly code.
  2. Compiler would define a very minimal ABI (although it’s not really binary, so maybe API would be a better word?) which it would export and/or import. (Basically, export malloc and free, export main instead of setting it as a start function, import a hook into the memory growth, etc.)
  3. Along with the .wasm file the compiler would dump every JS snippet from every basic_js! call into a separate file.
  4. Then an external tool (e.g. my cargo-web, or even cargo itself in the future, or something else entirely) could take that bundle of JS snippets and generate a .js file, without having to touch the .wasm at all.

Basically, the idea would be to make it as minimal as possible and as unopinionated as possible (different people have different ideas on how the resulting .js file should look and act).
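
To make step 4 a bit more concrete, here is a minimal sketch of what such an external tool might do with the dumped snippets. Everything in it is an assumption made for illustration: the JsSnippet type, the generate_js function, the __basic_js_0 style import names and the env module name are hypothetical and not part of any existing compiler or cargo-web interface.

```rust
/// One JS snippet as a hypothetical `basic_js!` call might record it.
/// All names here are illustrative, not an existing compiler interface.
struct JsSnippet {
    /// Unique import name the compiler would assign to this call site.
    import_name: String,
    /// Parameter names of the JS function.
    params: Vec<String>,
    /// Body of the JS function, copied verbatim from the macro call.
    body: String,
}

/// Builds the text of a `.js` file that exposes every snippet as a field of
/// one import object. It never reads or modifies the `.wasm` file.
/// Assumption: the imports live in a module called `env`.
fn generate_js(snippets: &[JsSnippet]) -> String {
    let mut out = String::from("const imports = { env: {\n");
    for s in snippets {
        out.push_str(&format!(
            "  {}: function({}) {{ {} }},\n",
            s.import_name,
            s.params.join(", "),
            s.body
        ));
    }
    out.push_str("} };\n");
    out
}
```

A different tool could emit an ES module or a node.js wrapper from the same input, which is the point of keeping the compiler side unopinionated.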


#5

Going even more minimal is definitely a good idea; maybe I have underestimated the complexity of the js! macro. That being said, I never wanted to expose js! inside the standard library, only to use it as an implementation detail. Your proposal seems to fit better though. I agree with it! Implementing the macro in the compiler allows us to do things like proper name mangling for the function name and similar.

Compiler would define a very minimal ABI

integers, floats, and raw pointers, does that sound good?

Along with the .wasm file the compiler would dump every JS snippet from every basic_js! call into a separate file.

Good idea! I suggest json as a format, with each function as a string literal. That’s easiest to parse for everyone. It is a bit tricky though, as we need to put the exported functions for any non-final crate into the generated metadata.
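
As a rough sketch of the json idea, the snippets could be written as one object that maps import names to JS source strings. The layout, the function names and the __basic_js_0 import name below are assumptions for illustration only; no such format has been agreed on.

```rust
/// Escapes a string so it can be placed inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters need a \uXXXX escape in JSON.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serializes (import name, JS source) pairs as a single JSON object.
/// Hypothetical layout: {"<import name>":"<js source>", ...}
fn snippets_to_json(snippets: &[(&str, &str)]) -> String {
    let fields: Vec<String> = snippets
        .iter()
        .map(|&(name, js)| format!("\"{}\":\"{}\"", json_escape(name), json_escape(js)))
        .collect();
    format!("{{{}}}", fields.join(","))
}
```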

take that bundle of JS snippets and generate a .js file, without having to touch the .wasm at all.

Linkers as well as llvm can apply optimisations that eliminate dead calls. So you still should touch the wasm, in order to check that the imported function is still being called, and only emit those functions in the end. Just saying.
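
A sketch of that filtering step, assuming some other code has already read the import section of the final .wasm file and collected the import names that survived. The retain_live function and the import names are hypothetical.

```rust
use std::collections::HashSet;

/// Keeps only the snippets whose import name still appears in the final
/// module. `live_imports` is assumed to come from a separate step that reads
/// the import section of the linked `.wasm` file.
fn retain_live<'a>(
    snippets: &[(&'a str, &'a str)],
    live_imports: &HashSet<&str>,
) -> Vec<(&'a str, &'a str)> {
    snippets
        .iter()
        .filter(|&&(name, _)| live_imports.contains(name))
        .copied()
        .collect()
}
```

Reading the import section is enough for this; the tool still does not have to rewrite the .wasm file.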

The important thing is to get all the frameworks to work together: wasmblock, embed_js, stdweb/cargo-web. Nobody should have to invoke all three of the tools these frameworks provide because different crates in the dependency chain use different frameworks. Instead, each crate should be free in its chosen framework while having one unified tooling story, allowing the tools by the frameworks to understand each other.


#6

I see hooks like panic as a better way to provide the functionality.

Hooks require manual involvement. In order to provide a minimal friction experience, we should give reasonable defaults and an easy way to override those defaults. Hooks sound like a nice idea, but creating hooks for everything would be quite tedious, I think. I don’t think they should be included in this minimal proposal. They fit better into a separate discussion about how to make the default choices as easy to override as possible.
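
For reference, this is roughly what the manual involvement looks like with the one hook std already has, std::panic::set_hook. The sketch is not wasm specific: the install_capturing_hook helper and the shared buffer are made up for illustration, and on a web target the closure would presumably forward to some JS logging function instead.

```rust
use std::panic;
use std::sync::{Arc, Mutex};

/// Installs a panic hook that appends each panic message to `sink` instead
/// of printing it to stderr. The sink is a plain shared buffer so that the
/// sketch runs on any target.
fn install_capturing_hook(sink: Arc<Mutex<Vec<String>>>) {
    panic::set_hook(Box::new(move |info| {
        // The hook argument implements `Display` (message plus location).
        sink.lock().unwrap().push(info.to_string());
    }));
}
```

Every application has to call something like this itself, once per hook, which is the friction described above.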


#7

I don’t agree that the burden should be on Rust. Wasm needs a libc one way or another, just to support the use of C / C++. I think Rust’s standard library should rely on libc like it does on other platforms, rather than having a whole disjoint implementation just for wasm. The work should be put into implementing libc, not providing a special Rust implementation. This is especially useful to help facilitate porting other languages to wasm.


#8

What if, instead of using a macro, we had a small number of pre-defined js functions that std uses? We could include them conditionally based on the wasm module imports.


#9

Hi, I don’t know much about the Rust internals, but I did run into this issue with wasm as a user and would love to see it resolved. Aside from println! and obtaining the time, I ran into a problem with math functions not being found for the wasm32-unknown-unknown target (e.g. sin, cos, exp, atan2). Even the modulus % operator doesn’t work with floats. According to the WebAssembly FAQ:

WebAssembly doesn’t include its own math functions like sin, cos, exp, pow, and so on. WebAssembly’s strategy for such functions is to allow them to be implemented as library routines in WebAssembly itself (note that x86’s sin and cos instructions are slow and imprecise and are generally avoided these days anyway). Users wishing to use faster and less precise math functions on WebAssembly can simply select a math library implementation which does so.

It would be nice if these functions were available like they are for other targets.
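
To illustrate what “implemented as library routines” could mean for the float remainder, here is a naive sketch. It assumes the target can truncate floats natively (wasm has an f64.trunc instruction), and the fmod_naive name is made up. It is not how a real math library does it: the formula loses accuracy when x / y is very large, so an actual implementation would use an exact algorithm.

```rust
/// Remainder with the sign of the dividend, computed as
/// `x - trunc(x / y) * y`. Naive: inexact for very large quotients.
fn fmod_naive(x: f64, y: f64) -> f64 {
    x - (x / y).trunc() * y
}
```

For small, exactly representable inputs this agrees with the built-in operator, e.g. fmod_naive(5.5, 2.0) and 5.5 % 2.0 both give 1.5.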