Enabling threads on wasm32-unknown-emscripten?

I've recently learned that the wasm32-unknown-emscripten target doesn't (yet) support threads. Since Emscripten as a platform does support threads, I thought I might poke around rustc to see what it would take to get it working. I could use a hint or two about a good starting point, though. I'm not sure whether most of the effort would be on the rustc side of things, or whether something would also be needed on the LLVM side.

Here is a brief overview of what I've done so far. I cloned the Rust git repository, ran the x.py build script as a shakeout, and it built without issue. Then I went and haphazardly changed the file:

rust/compiler/rustc_target/src/spec/wasm32_unknown_emscripten.rs

...adding a line to the TargetOptions to set "singlethread: false" (see here), overriding the option set in wasm_base.rs, just to see what might happen.

    // Wasm doesn't have atomics yet, so tell LLVM that we're in a single
    // threaded model which will legalize atomics to normal operations.
    singlethread: true,

That comment about atomics is interesting, since I need rustflags to include "target-feature=+atomics,+bulk-memory" for the target program to get as far as it does in the compilation process.
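For anyone following along, the override I'm attempting can be modelled in a few lines. This is only a sketch: the three-field `TargetOptions` struct and the `wasm_base_options` / `emscripten_options` functions are hypothetical stand-ins, not rustc's real definitions, and it assumes the target file layers its own fields over the base options with struct update syntax.

```rust
// Hypothetical, heavily reduced stand-in for rustc's TargetOptions.
#[derive(Debug, Clone, PartialEq)]
struct TargetOptions {
    os: String,
    singlethread: bool,
    has_elf_tls: bool,
}

// Plays the role of the shared options in wasm_base.rs.
fn wasm_base_options() -> TargetOptions {
    TargetOptions {
        os: "unknown".to_string(),
        singlethread: true,
        has_elf_tls: true,
    }
}

// Plays the role of the Emscripten target file: start from the base
// options and override selected fields.
fn emscripten_options() -> TargetOptions {
    TargetOptions {
        os: "emscripten".to_string(),
        singlethread: false, // the one-line change described above
        ..wasm_base_options()
    }
}

fn main() {
    println!("base:       {:?}", wasm_base_options());
    println!("emscripten: {:?}", emscripten_options());
}
```

Fields that are not named in the target-specific literal, such as `has_elf_tls` here, keep whatever the base set.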

I then built the compiler again with the x.py script, and (unfortunately?) it built without problems. An error might have given something to look at next. So then I tried to see if the setting change "took" by invoking:

rustc -Z unstable-options --target=wasm32-unknown-emscripten --print target-spec-json

...using the new executable that ended up in my:

rust/build/x86_64-unknown-linux-gnu/stage1/bin

...directory. Now the "singlethread:" option is missing from the report, which I take as a good sign. Anyway, when I then try to use the new compiler to build a simple threaded program, I get link errors about thread-local storage (same as with the stock compiler), with wasm-ld complaining about:

relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol 'std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY::h776cf75763f0fad1

...etc. wasm_base.rs has a section in TargetOptions:

        // When the atomics feature is activated then these two keys matter,
        // otherwise they're basically ignored by the standard library. In this
        // mode, however, the `#[thread_local]` attribute works (i.e.
        // `has_elf_tls`) and we need to get it to work by specifying
        // `local-exec` as that's all that's implemented in LLVM today for wasm.

        has_elf_tls: true,
        tls_model: TlsModel::LocalExec,

...which doesn't strike me as obviously incorrect. There is more information about the linker problem described here. If you have any advice on documentation to read, pointers on where to look next, or a better forum or list to post to, I'd appreciate it.

Thanks!

Interestingly enough, the asmjs-unknown-emscripten target fails with the exact same linker message. I see that the sys/unix/threads.rs already has some configurations specific to emscripten. I wonder if threads were working at one time, and there was a regression that broke things.

Over in the compiler code gen, I see the function set_thread_local_mode(). It looks like this has exactly one caller, over at compiler/rustc_codegen_llvm/src/consts.rs. The comment there doesn't describe exactly the same case as the link error described further up the thread, but it may be close, especially if this is the only place where the "right thing" can happen. That assumes, of course, that the issue lies with something in Rust and is not a bug in a downstream part of the toolchain.

    // Thread-local statics in some other crate need to *always* be linked
    // against in a thread-local fashion, so we need to be sure to apply the
    // thread-local attribute locally if it was present remotely. If we
    // don't do this then linker errors can be generated where the linker
    // complains that one object files has a thread local version of the
    // symbol and another one doesn't.
    if fn_attrs.flags.contains(CodegenFnAttrFlags::THREAD_LOCAL) {
        llvm::set_thread_local_mode(g, self.tls_model);
    }

...I've also macro-expanded thread_local! in a simple program compiled against wasm32-unknown-emscripten:

    use std::cell::Cell;
    use std::thread;

    thread_local! { static VAR1: Cell<i32> = Cell::new(1); }

    fn xs() {
        for _ in 0 .. 10 { println!("X"); }
        println!("VAR1 in thread: {}",VAR1.with(|v| {v.get()}));
    }

    fn main() {

        println!("VAR1 in main before: {}",VAR1.with(|v| {v.get()}));

        let t1 = thread::spawn(xs);
        VAR1.with(|v| {v.set(2)});
        println!("VAR1 in main after: {}",VAR1.with(|v| {v.get()}));

        t1.join().unwrap();
    }

...where the expansion of note looks like:

const VAR1: ::std::thread::LocalKey<Cell<i32>> =
    {
        #[inline]
        fn __init() -> Cell<i32> { Cell::new(1) }
        #[inline]
        unsafe fn __getit() -> ::std::option::Option<&'static Cell<i32>> {
            #[thread_local]
            #[cfg(all(target_thread_local,
                      not(all(target_family = "wasm",
                              not(target_feature = "atomics"))),))]
            static __KEY: ::std::thread::__FastLocalKeyInner<Cell<i32>> =
                ::std::thread::__FastLocalKeyInner::new();

            #[allow(unused_unsafe)]
            unsafe { __KEY.get(__init) }
        }
        unsafe { ::std::thread::LocalKey::new(__getit) }
    };

...which I am interpreting as relying on the #[thread_local] attribute (as opposed to a few other options in std/src/thread/local.rs).
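To check which side of that cfg a given build lands on, something like the following could be compiled for the target in question. It is a minimal sketch: `wasm_without_atomics` is a name I made up, and it mirrors only the wasm-related half of the predicate in the expansion, leaving out `target_thread_local` because that cfg is not usable on a stable compiler.

```rust
// Mirrors the inner `all(target_family = "wasm", not(target_feature = "atomics"))`
// part of the cfg predicate from the expansion. `wasm_without_atomics` is a
// hypothetical helper name, not something from std.
fn wasm_without_atomics() -> bool {
    cfg!(all(target_family = "wasm", not(target_feature = "atomics")))
}

fn main() {
    if wasm_without_atomics() {
        println!("wasm without atomics: the #[thread_local] key is compiled out");
    } else {
        println!("not wasm, or wasm with atomics: the #[thread_local] key is eligible");
    }
}
```

Building this for wasm32-unknown-emscripten with and without "target-feature=+atomics" in rustflags should flip the branch, since `cfg!` is evaluated at compile time for the selected target.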

Hi, so is there much point really? If atomics are not supported?

My understanding is that wasm does support atomics. The overall goal is to support threads, which doesn't seem like it should be an insurmountable task, considering that Emscripten supports pthreads. Here is an example in C of atomics working with Emscripten. Note the different sums for the atomic vs. non-atomic variable.

/*
compile with:
    emcc example3.c -o example3_js -pthread -s PROXY_TO_PTHREAD -s ASYNCIFY
run:
    node --experimental-wasm-threads --experimental-wasm-bulk-memory example3_js
example output:
    foo_atomic: 20000000, bar_non: 15482599
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>  
#include <threads.h>
#include <stdatomic.h>

#define ITER_LIM 10000000

atomic_int foo_atomic = 0;
int bar_non = 0;

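// thrd_start_t is int (*)(void *), so the thread function returns int
// and can be passed to thrd_create without a cast.
int test(void *id)
{
    for(int i = 0; i<ITER_LIM; ++i)
    {
        ++foo_atomic;
        ++bar_non;
    }
    return EXIT_SUCCESS;
}

int main()
{
    thrd_t thread_id;
    thrd_create(&thread_id,test,NULL);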

    for(int i = 0; i<ITER_LIM; ++i)
    {
        ++foo_atomic;
        ++bar_non;
    }

    thrd_join(thread_id,NULL);

    printf("foo_atomic: %d, bar_non: %d\n",foo_atomic, bar_non);
    return EXIT_SUCCESS;
}

...and the assembly output includes:

i32.atomic.rmw.add	foo_atomic

...so I believe that comment about atomics not being supported in wasm may be outdated.
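For comparison, here is roughly what the atomic half of that C program looks like in Rust. It is a sketch of the kind of program I would eventually like to build for wasm32-unknown-emscripten, not something that links there today; `count_with_two_threads` is just an illustrative name. There is no counterpart to `bar_non`, because safe Rust does not allow an unsynchronized shared counter.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

// Two threads each add 1 to a shared atomic counter `iterations` times.
// SeqCst matches the default ordering of `++` on a C11 atomic_int.
fn count_with_two_threads(iterations: i32) -> i32 {
    let counter = Arc::new(AtomicI32::new(0));
    let worker_counter = Arc::clone(&counter);

    let worker = thread::spawn(move || {
        for _ in 0..iterations {
            worker_counter.fetch_add(1, Ordering::SeqCst);
        }
    });

    for _ in 0..iterations {
        counter.fetch_add(1, Ordering::SeqCst);
    }

    worker.join().unwrap();
    counter.load(Ordering::SeqCst)
}

fn main() {
    let total = count_with_two_threads(1_000_000);
    println!("foo_atomic: {}", total);
}
```

On a native target, the total is always twice the iteration count, which is the same guarantee the C version shows for `foo_atomic`.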

Just another little update on this issue. I created a small example which doesn't call threads from Rust; instead it links to a C library which does create a thread, and prints out some status to the console. This works as a native x86_64 program on Linux, but has the same wasm-ld link errors when trying with wasm32-unknown-emscripten (not too surprising):

relocation R_WASM_MEMORY_ADDR_TLS_SLEB cannot be used against non-TLS symbol `std::sys_common::thread_info::THREAD_INFO::__getit::STATE::h23a43961a5b1f8ea

...complaining about various symbols not being part of thread-local storage. This confirms that the error doesn't depend on starting threads on the Rust side of things (though the program does, of course, need to link against thread-safe libraries). This example only starts and uses threads on the C side.

There is also some interesting information in this comment about the same link errors in the Emscripten issue tracker.