I am working on the Rust support for Sulong, a high-performance LLVM bitcode interpreter written in Java running on Linux or Mac OS X on x86 64-bit systems (Sulong on GitHub). Sulong can already run some of the benchmarks from The Computer Language Benchmarks Game, such as certain versions of fasta, nbody, pidigits and a single-threaded fannkuch (Sulong does not yet support multithreading).
However, Sulong is struggling with
lang_start_internal. On Ubuntu 64-bit, for example, calls to libstd are supported via native calls to libstd-*.so. For some reason, once a native call to
lang_start_internal is executed and the callback to the main function defined in bitcode happens, Sulong aborts the execution of main after some time (the exact point varies from run to run) and exits with exit code 245 (SIGSEGV). In the long term, the idea is to run the whole bitcode version of libstd and its dependencies on Sulong, because we cannot perform optimizations such as inlining across native boundaries. This will most likely require us to come up with a customized libstd anyway. In the course of this, we will hopefully also be able to get rid of whatever in
lang_start is responsible for crashing Sulong.
The currently implemented workaround for this problem is to intrinsify the call to
lang_start_internal. To tell the truth, all this intrinsic does is call the given main closure, which is primarily responsible for executing the call to the actual main function. When I first executed a Rust program that relies on libstd with that intrinsic enabled, I expected it to crash as soon as the first call to
__rust_alloc happens since I thought that
lang_start would be responsible for some essential initialization work regarding the heap. To my surprise, this simple intrinsic apparently does not cause any such problems and even enables Sulong to run complex benchmarks. I am aware that unwinding and backtrace support are lost by bypassing
lang_start. I am also aware that command line args will not be stored in that “squirreled away location” and therefore
env::args_os will not be able to retrieve them, but this in particular should not be a big deal, because Sulong can provide its own builtins for reading command line args. I am unsure whether the missing stack guards pose a problem.
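To make the workaround concrete, here is a minimal Rust-level sketch of what the intrinsic does. The real intrinsic lives inside Sulong, not in Rust; the name `lang_start_intrinsic` is hypothetical, and the parameter list is an assumption modelled on my reading of `lang_start_internal` in libstd, so it may not match every Rust version.

```rust
use std::panic::RefUnwindSafe;

/// Hypothetical stand-in for the intrinsic: it performs none of the usual
/// runtime setup (no args storage, no stack guard, no panic handling) and
/// simply calls the main closure, returning its exit status.
/// The signature is an assumption, not a confirmed libstd API.
fn lang_start_intrinsic(
    main: &(dyn Fn() -> i32 + Sync + RefUnwindSafe),
    _argc: isize,            // ignored: args are not stored anywhere
    _argv: *const *const u8, // ignored for the same reason
) -> isize {
    main() as isize
}
```

Calling it with a closure that returns a status code, for example `lang_start_intrinsic(&|| 0, 0, std::ptr::null())`, just forwards that code to the caller.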
My questions concerning this matter:
Am I overlooking any hidden problems (undefined behaviour etc.) related to skipping the execution of
lang_start_internal through the current workaround?
If there are problems related to it, can these problems be mitigated by doing some manual initialization work outside
lang_start, like defining a custom global allocator?
Is there a way to initialize backtrace and unwinding support manually outside
Am I missing some middle ground between
no_std-Rust and Rust with libstd which would allow a Rust program to use libstd without limitations, but at the same time offers a way to replace
lang_start with some sort of customized runtime initialization?
Is there any information on how the
lang_start in particular will evolve in the future? Will it ever vanish completely like the rest of the early Rust runtime?
It would be nice if a sophisticated version of the current workaround could be established as a solid alternative to a custom libstd and
no_std in the long run.
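For reference, the kind of "custom global allocator" I have in mind in the second question could look roughly like the sketch below. It uses the `GlobalAlloc` trait and `#[global_allocator]` attribute from `std::alloc`; the names `CountingAlloc`, `ALLOC_CALLS` and `allocation_count` are hypothetical, and forwarding to `System` is only a placeholder for whatever Sulong would actually provide.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: forwards every request to the system allocator
/// and counts how many allocations were made.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        // Placeholder: a Sulong-specific allocator would go here.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Registers the allocator for the whole program, independent of lang_start.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of allocations observed so far.
fn allocation_count() -> usize {
    ALLOC_CALLS.load(Ordering::Relaxed)
}
```

Since the allocator is registered statically, it needs no initialization call at startup, which is why it seemed like a candidate for setup work done outside the normal entry point.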