Need help with emscripten port


#113

Disclaimer: I’m not very au fait with unwinding, exceptions etc.

Is supporting unwinding actually something we want for rust->asm.js? Exception handling in C++ is pretty bad for asm.js performance and I don’t know if ‘just’ unwinding is any better - you still need to know that you’re out of the normal flow of code and impose the overhead/bloat on every function call. What about this suggestion:

i.e. the emscripten target just doesn’t support unwinding at all. Unlike C++ (where exceptions can be used as control flow) that’s very discouraged in rust, so it feels much more reasonable to make aborting the default and only option?


#114

@rschulman As we discussed on IRC getting the wasm32 target successfully running tests is a good task. Right now the way compiletest launches node, the emscriptened javascript entry point fails to find the wasm file (https://github.com/kripken/emscripten/issues/4542). The fix for this is to have compiletest change the directory before running the test, but this requires snaking a bunch of arguments through the compiletest codebase and ensuring that all the paths involved in the spawned process still resolve correctly.

Another useful task would be to get the asmjs target working without the big LLVM fastcomp backend (https://github.com/rust-lang/rust/issues/36356). We actually don’t need the pnacl legalizer or the asmjs code generator, since we just pass bitcode to emcc which then uses those backends to generate asmjs. But rustc does currently call the LLVM TargetRegistry, passing the asmjs triple, so that LLVM can tell us the right TargetMachine to create. It’s not clear that rustc even needs to create a TargetMachine at all for what we are doing. There are two possible solutions here: delete all the LLVM fastcomp code except that which supports the TargetRegistry; make rustc work without instantiating a TargetMachine.

Myself, I’m still poking at the unwinder (https://github.com/rust-lang/rust/issues/36514), and in turn enabling errors on unimplemented symbols (https://github.com/rust-lang/rust/issues/36515). I pushed a patch to the PR that implements unwinding on top of the C++ APIs that emscripten already supports and it works in cursory testing. I also have an action item to get the automation set up in our dev environment, but haven’t started.

What does it mean for exception handling performance to be bad in asm.js? Is it the runtime performance of unwinding and catching exceptions or the bloat associated with the landing pads? Does unwinding impose overhead on the non-unwinding path? I imagine it’s similar to the native case; unwinding is just always expensive in code size and run time, but I have seen some decompiled asm.js that looks like there may be extra book-keeping on the non-unwinding path, which would certainly be unattractive.

Unwinding is part of the Rust semantics and we need to support it for the test suite to work in any reasonable fashion if nothing else (otherwise every test would simply abort on failure, which makes for an unpleasant debugging experience when there are multiple test failures). I expect -C panic=abort to work for asmjs targets similarly to others so those that don’t want unwinding don’t need to pay the cost, and I’d expect most projects that want to deploy to the web to use it for the performance / bloat wins.
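To make the test-suite argument concrete, here is a minimal sketch of how a harness can keep going after a failing test when unwinding is available. The names `run_one`, `failing_test`, `Guard` and `DROPS` are made up for illustration; this is not compiletest or libtest code.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many `Guard` values have been dropped (illustration only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in for a failing test body.
fn failing_test() {
    let _guard = Guard; // dropped while the panic unwinds through this frame
    panic!("assertion failed");
}

/// Runs one test and reports whether it passed.
/// With panic=unwind the panic is caught here and the caller can run the
/// next test; with panic=abort the process ends inside `test` instead.
fn run_one(test: fn()) -> bool {
    panic::catch_unwind(test).is_ok()
}
```

With unwinding, `run_one(failing_test)` returns `false` and the guard's destructor has run, so the next test can start from a clean state.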


#115

Oh, I haven’t mentioned on thread here but the jemalloc problems are fixed in my PR as well.

Other good things to work on: fixing test cases, creating compelling demos.


#116

I looked into the impact of unwinding on emscripten-compiled code and indeed for both asm.js and wasm32 it is terrible. Every call that may unwind makes a round-trip through a JS try/catch block.


#117

Other than javascript exceptions and explicit branching, it should be possible to implement unwinding manually: i.e. maintain our own unwind info in a separate stack, and use that to run destructors without actually changing the control flow.

Javascript exceptions would still be used to jump out, but try/catch would only be necessary when catch_unwind is called, since those are the only places that execution can resume.
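A rough sketch of the separate-stack idea, written as ordinary Rust rather than as anything the compiler would actually emit. `CleanupStack`, `mark` and `unwind_to` are hypothetical names, and a real scheme would store compiler-generated unwind info rather than boxed closures.

```rust
/// A shadow stack of cleanup actions, kept apart from the call stack.
struct CleanupStack {
    actions: Vec<Box<dyn FnOnce()>>,
}

impl CleanupStack {
    fn new() -> CleanupStack {
        CleanupStack { actions: Vec::new() }
    }

    /// Registers a destructor-like action for a live value.
    fn push<F: FnOnce() + 'static>(&mut self, action: F) {
        self.actions.push(Box::new(action));
    }

    /// Records the current depth; a catch point would remember this.
    fn mark(&self) -> usize {
        self.actions.len()
    }

    /// Runs and removes every action registered after `mark`, newest first.
    /// This is the "run destructors" step; jumping back to the catch point
    /// is a separate concern and is not modelled here.
    fn unwind_to(&mut self, mark: usize) {
        while self.actions.len() > mark {
            if let Some(action) = self.actions.pop() {
                action();
            }
        }
    }
}
```

The point of the sketch is only that the cleanup order can be recovered from the side stack, so no per-call try/catch is needed to find the destructors.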


#118

We could definitely implement a different unwinding scheme on the Rust side. Another variation is return-based unwinding, where every call returns an implicit flag indicating the unwinding path. That would not use the js unwind machinery at all. I’ve always wanted such an alternate unwinding scheme anyway, though I can’t remember the other use-cases offhand… certainly it’s useful in places where zero-cost unwinding doesn’t exist (like emscripten).
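For readers who have not seen return-based unwinding, a hand-written approximation might look like the following. The compiler would insert the flag and the checks implicitly; here they are spelled out with `Result`, and `Unwinding`, `Resource`, `inner` and `outer` are illustrative names only.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<&'static str>>;

/// Marker meaning "this call is returning along the unwinding path".
#[derive(Debug, PartialEq)]
struct Unwinding;

/// A value with a destructor, so the cleanup order is observable.
struct Resource<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Drop for Resource<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn inner(log: &Log, fail: bool) -> Result<u32, Unwinding> {
    let _r = Resource { name: "inner", log };
    if fail {
        return Err(Unwinding); // stands in for a panic
    }
    Ok(1)
}

fn outer(log: &Log, fail: bool) -> Result<u32, Unwinding> {
    let _r = Resource { name: "outer", log };
    // The check a compiler would insert after every call that may unwind.
    let value = match inner(log, fail) {
        Ok(v) => v,
        Err(flag) => return Err(flag), // locals are dropped by the plain return
    };
    Ok(value + 1)
}
```

Destructors run through ordinary returns, so no JS exception machinery is involved; the cost moves to a branch after each call.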


#119

PR #36794 will allow setting a default panic strategy for a target. I suggest that the default panic strategy of emscripten and webasm becomes abort (because of unwind’s terrible performance), but that unwind still works if the user chooses to explicitly enable it.


#120

but that unwind still works if the user chooses to explicitly enable it.

This will not work without on-the-fly (re)compilation of std. If you use -C panic=unwind via the command line, your top crate, the executable, will be compiled with panic=unwind but it will link against a std that was compiled with panic=abort (our binary release), and that combination doesn’t work. (The other way around, where the executable is abort and std is unwind, does work.) The only way around that is to recompile std with panic=unwind.


#121

What we could do is set panic=abort in the target but provide a binary release of std compiled with panic=unwind. Then executables would “abort” by default and ‘-C panic=unwind’ would also work. The downside is that whatever’s the equivalent of landing pads for emscripten/wasm would still be present in std and the final executable for either panic profile.
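Since this part of the thread hinges on which strategy each crate was built with, it may help to know that a crate can inspect its own setting through the `panic` cfg predicate (available in recent Rust releases, well after this thread). A minimal sketch; `panic_strategy` is a hypothetical helper, and it reports only the strategy of the crate it is compiled into, not that of std.

```rust
/// Returns the panic strategy this crate was compiled with.
/// Says nothing about how the std it links against was built.
fn panic_strategy() -> &'static str {
    if cfg!(panic = "abort") {
        "abort"
    } else {
        "unwind"
    }
}
```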


#122

I was playing around with this and js-interop to see how much currently was doable. https://gist.github.com/Thinkofname/17e8369b5fcd1d307f5d98dfda9266a6 This actually works but with a few issues.

  • This only works correctly in release mode. I suspect this is because the strings only end up in a format emscripten can find them in release mode due to optimizations
  • I’d prefer js_raw over js, since not all JavaScript is valid enough Rust for the macro system to handle, but I hit an issue where emscripten breaks if that string has a newline in it
  • This relies on an internal symbol emscripten_asm_const_int that is only meant to be used by the EM_ASM macro.

Not a major issue, as emscripten_run_script works, but it has a performance hit due to using eval behind the scenes. Edit: Or using EMCC_CFLAGS="--js-library lib.js" to go via that route to avoid the eval.
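For anyone wanting to try the emscripten_run_script route, a small wrapper could look like this. It assumes the C declaration `void emscripten_run_script(const char *script)` from emscripten.h; `run_js` and `eval` are made-up helper names, and on non-emscripten targets the sketch deliberately does nothing so that it still compiles natively.

```rust
use std::ffi::{CStr, CString, NulError};

#[cfg(target_os = "emscripten")]
extern "C" {
    // Assumed to match emscripten.h: void emscripten_run_script(const char *script);
    fn emscripten_run_script(script: *const std::os::raw::c_char);
}

/// On emscripten targets, hands the script to the JS side (eval under the hood).
#[cfg(target_os = "emscripten")]
fn eval(script: &CStr) {
    unsafe { emscripten_run_script(script.as_ptr()) }
}

/// Fallback for every other target: there is no JS engine to call into.
#[cfg(not(target_os = "emscripten"))]
fn eval(_script: &CStr) {}

/// Converts `script` to a C string and evaluates it where possible.
/// Fails if the script contains an interior NUL byte.
fn run_js(script: &str) -> Result<(), NulError> {
    let c_script = CString::new(script)?;
    eval(&c_script);
    Ok(())
}
```

Something like `run_js("console.log('hello from Rust')")` would then go through eval on each call, which is the performance hit mentioned above.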


#123

I don’t know anything about wasm since I’ve not played with it yet, but just to add some additional info for any readers - asm.js cannot contain try/catch, so the round trip takes the form of calling out of asm.js to a normal js function containing the try/catch, and then back in.

This aside, I think there’s an argument to be made for having at least one first-class platform abort-on-panic by default as a general discouragement of using panics as exceptions. That emscripten would be a bad platform for unwinding as-is and somewhat-less-bad-but-still-not-great if effort was invested to change unwinding seems like a happy coincidence.


#124

There’s a significant setback to landing the emscripten PR. The mingw build is creating some suspicious linker errors. As far as I can tell the symbols it is complaining about exist, so I have no idea what is going on. My best wild guess is some bug in the mingw linker due to the increased binary size from the fastcomp backend.

Unless anybody has other ideas, one thing we might do to attempt to address this is to make the port work without the fastcomp backend. As I mentioned previously it shouldn’t actually be required in order for us to output the correct bitcode for emcc to use.


#125

I’ve started ripping code out of the fastcomp patch to see how small it can be while still generating the correct IR.


#126

thanks for keeping this thread updated on the status. the amount of time and effort that’s gone into this is exceedingly non-trivial, so thanks to brson and everyone else who have been steadily working on this over the past many months

:star2:


#127

The PR landed and there’s some discussion on reddit: https://www.reddit.com/r/rust/comments/55c9el/working_asmjs_and_wasm_targets_merged_into_master/

Thanks everyone for the help. This is a great milestone but there’s more to do yet.


#128

Here are two more important improvements to emscripten:

  • https://github.com/rust-lang/rust/issues/36899. This is about delegating all optimization to emcc. Right now we do our own optimization on LLVM IR and call emcc with the default arguments. emcc though can do better optimizations than rustc here so we should instead have it do all optimizations. This will involve understanding emcc’s optimization options and hacking the rustc backend to pass the right options.
  • https://github.com/rust-lang/rust/issues/36900. Emscripten can remove unwinding from LLVM IR itself, so we can use this feature to remove landing pads from the standard library, so -C panic=abort will produce optimal code. This will involve modifying the panic_abort crate to pass the appropriate linker options to emcc.

#129

My notes on AsmJs so far:

  • Inline asm!("JavaScript here") is badly needed. Currently the JS is placed in an additional lib.js and the path to it is passed in #[link_args]. This turns from “works” to “ugly” once dependencies are introduced.
  • Exporting functions via #[link_args] is already bad for a single binary, and horrible with dependencies. It requires the dependency to publicly export all FFI methods and the final crate to import them and declare them in #[link_args]. I guess an attribute on exported functions would be a clean solution for this: #[export, no_mangle] fn test_function(ptr: *const u8, len: usize).
  • Alternatively a way to turn Box<FnMut> into JS objects could be found, eliminating the need to call into AsmJs by exported functions.
  • All my code contains fn main() {}.
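For reference, this is roughly what the Rust side of such an exported function looks like today, without the proposed #[export] attribute (which does not exist). The name and parameters come from the note above; the `u32` return value and the byte-summing body are additions so the sketch has something observable to do. The symbol still has to be kept alive on the emcc side, which is the link-args pain described above.

```rust
use std::slice;

/// Exported with an unmangled name and the C ABI so that JS glue can look
/// it up. The body (summing the bytes) is purely illustrative.
///
/// Safety: `ptr` must be null or point to `len` readable bytes.
#[no_mangle]
pub unsafe extern "C" fn test_function(ptr: *const u8, len: usize) -> u32 {
    if ptr.is_null() {
        return 0;
    }
    // Rebuild a slice from the raw pointer and length passed across the FFI boundary.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u32::from(b)).sum()
}
```

Called from Rust with a three-byte buffer `[1, 2, 3]`, this returns 6; from JS the same pointer/length pair would refer to a region of the emscripten heap.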

#130

What do you need inline Javascript asm for that can’t be solved via a call to emscripten_asm_const_int (like this)?


#131

This doesn’t look too bad. Definitely need to try this out! Thanks @cramertj.


#132

Glad I could help!