I am working on Rust support for Sulong, a high-performance LLVM bitcode interpreter written in Java that runs on Linux or Mac OS X on x86 64-bit systems (Sulong on GitHub). Sulong can already run some of the benchmarks from The Computer Language Benchmarks Game, such as certain versions of fasta, nbody, pidigits and a single-threaded fannkuch (Sulong does not yet support multithreading).
However, Sulong is struggling with lang_start/lang_start_internal. On Ubuntu 64-bit, for example, calls to libstd are supported via native calls to libstd-*.so. For some reason, once a native call to lang_start_internal is executed and the callback to the main function defined in bitcode happens, Sulong always aborts the execution of main and exits with exit code 245 (SIGSEGV) after some time (the exact point in time varies from execution to execution). In the long term, the idea is to run the whole bitcode version of libstd and its dependencies on Sulong, because we cannot do optimizations like inlining across native boundaries. This will most likely require us to come up with a customized libstd anyway. In the course of this, we will hopefully also be able to get rid of whatever in lang_start is responsible for crashing Sulong.
The currently implemented workaround for this problem is to intrinsify the call to lang_start_internal. In fact, all this intrinsic does is call the given main closure, which is primarily responsible for executing the call to the actual main function. When I first executed a Rust program that relies on libstd with that intrinsic enabled, I expected it to crash as soon as the first call to __rust_alloc happened, since I thought that lang_start would be responsible for some essential initialization work regarding the heap. To my surprise, this simple intrinsic apparently does not cause any such problems and even enables Sulong to run complex benchmarks. I am aware that unwinding and backtrace support gets lost by bypassing lang_start. I am also aware that command line args will not be stored in that “squirreled away location” and therefore env::args and env::args_os will not be able to retrieve them, but this in particular should not be a big deal, because Sulong can provide its own builtins for reading command line args. I am unsure whether the missing stack guards pose a problem.
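To make the workaround concrete, here is a rough model of what such an intrinsic amounts to, written in Rust purely for illustration (the real intrinsic lives inside Sulong, not in Rust). The function name lang_start_intrinsic and the exact parameter list are assumptions; the parameters only approximate what lang_start_internal takes, which is a libstd implementation detail and may differ between Rust versions.

```rust
// Hypothetical stand-in for the intrinsic described above (name and
// signature are assumptions, not Sulong's or libstd's actual code).
// It ignores argc/argv entirely, which is why env::args would not
// see any command line arguments, and it installs no panic handling
// or stack guards.
fn lang_start_intrinsic(
    main: &dyn Fn() -> i32,
    _argc: isize,
    _argv: *const *const u8,
) -> isize {
    // Just run the closure that wraps the user's main and forward
    // its result as the process exit code.
    main() as isize
}
```

Calling lang_start_intrinsic(&|| 0, 0, std::ptr::null()) simply runs the closure and yields its return value, with none of the setup that lang_start would normally perform.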
My questions concerning this matter:
- Am I overlooking any hidden problems (undefined behaviour etc.) related to skipping the execution of lang_start_internal through the current workaround?
- If there are problems related to it, can these problems be mitigated by doing some manual initialization work outside lang_start, like defining a custom global allocator?
- Is there a way to initialize backtrace and unwinding support manually outside lang_start (e.g. using panic::catch_unwind and Co.)?
- Am I missing a middle ground between no_std Rust and Rust with libstd which would allow a Rust program to use libstd without limitations, but at the same time offers a way to replace lang_start with some sort of customized runtime initialization?
- Is there any information on how the rt module and lang_start in particular will evolve in the future? Will it ever vanish completely like the rest of the early Rust runtime?
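For reference, the custom global allocator mentioned in the second question could look roughly like the sketch below. It uses the std::alloc::GlobalAlloc trait and the #[global_allocator] attribute; the names CountingAlloc and ALLOCATED are made up for this example, and whether such an allocator behaves correctly under Sulong without lang_start is exactly the open question, not something this sketch establishes.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical allocator for illustration: forwards to the system
// allocator and counts the number of bytes requested.
struct CountingAlloc;

static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        // Delegate the actual allocation to the platform allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Registers CountingAlloc as the allocator behind Box, Vec, String, etc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```

Registration happens statically through the attribute, so it does not depend on any code running before main.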
It would be nice if a more sophisticated version of the current workaround could be established as a solid alternative to a custom libstd and no_std in the long run.
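As a sketch of what a slightly more sophisticated variant might do, the function below wraps the main closure in panic::catch_unwind and maps a panic to exit code 101, which is the code a Rust program normally exits with when its main thread panics. The name custom_start is hypothetical, and the sketch assumes that unwinding itself works in the hosting environment, which is not a given for an interpreter.

```rust
use std::panic::{self, RefUnwindSafe};

// Hypothetical replacement entry point (name is an assumption).
// Runs the main closure and converts a panic into an exit code
// instead of letting it unwind past the entry point.
fn custom_start(main: &(dyn Fn() -> i32 + RefUnwindSafe)) -> isize {
    match panic::catch_unwind(main) {
        // main returned normally: forward its exit code.
        Ok(code) => code as isize,
        // main panicked: report failure with code 101.
        Err(_) => 101,
    }
}
```

The default panic hook still prints the panic message to stderr before catch_unwind returns, so diagnostics are not lost by this wrapper alone.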
Crossposted to: https://www.reddit.com/r/rust/comments/8o9w42/sulong_an_llvm_bitcode_interpreter_written_in/