Feature Name: Darkroom (cos it's a studio ...but you can't see anything)
The following information comes from a combination of somewhat scarce docs and my own research. It’s known to be incomplet and incorrekt, and it has lots of bad formatting.
Motivation
To build a Rust program for native Windows (the "MSVC toolchain"), one needs to download the Microsoft C++ Build Tools, which are part of Visual Studio and therefore require a Visual Studio licence. It is currently impossible to compile Rust for native Windows without a VS licence or without embedding (embedding, not just dynamically linking) proprietary Microsoft code.
We would like to remove this requirement so that a VS licence is no longer needed to compile Rust for Windows MSVC. This would also potentially let people cross-compile for native Windows from other platforms.
Summary
Diagrams rendered in GitHub MD with text
The Rust standard library depends on several components from Microsoft. This is where we are now (graph stolen & adapted from Microsoft's C++ STL repository, Apache License v2.0 with LLVM Exception):
This is what we need to do:
This is the end result. Windows SDK is available on its own for free with a far less restrictive licence. In any case we have to use it to interact with the OS:
Below is a summary of the things that I think need to happen before we can get to the shiny future. I am very much not an expert in this topic, so please point out as many inaccuracies as you can find!
Explanations
Use lld
By default Rust uses `link.exe` on Windows, which is Microsoft's proprietary linker shipped with Visual Studio. But rustup already ships `rust-lld`, which can be used by configuring `.cargo/config` (NOT `Cargo.toml`). There is already an ongoing effort to make LLD the default on Windows: Use lld by default on x64 msvc windows · Issue #71520 · rust-lang/rust · GitHub.
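As a rough illustration of the configuration mentioned above, a `.cargo/config` along these lines is the commonly used way to opt in. This is a minimal sketch: the target triple `x86_64-pc-windows-msvc` is an assumption (adjust it to your own target), and it relies on the `rust-lld` binary that rustup ships being picked up by name.

```toml
# .cargo/config (NOT Cargo.toml)
# Assumption: building for the 64-bit MSVC target.
[target.x86_64-pc-windows-msvc]
linker = "rust-lld.exe"
```

This only swaps the linker. The `.lib` files discussed below still have to come from somewhere.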
`.lib`s Rust pulls in from Visual Studio
You can pass in `RUSTFLAGS=--print=link-args` to see the command line used to invoke the linker. Overall, the Rust std on Windows pulls in the following libraries:
Library | Used by (in Rust std) | VS or Win SDK | Additional compile-time dependencies | Runtime dependencies on top of base Windows | Symbols imported from VS |
---|---|---|---|---|---|
advapi32.lib | std::sys::windows | Windows SDK | None | None | |
userenv.lib | std::sys::windows | Windows SDK | None | None | |
kernel32.lib | std::sys::windows | Windows SDK | None | None | |
ws2_32.lib | std::sys::windows | Windows SDK | None | None | |
bcrypt.lib | std::sys::windows | Windows SDK | None | None | |
msvcrt.lib | Through libc crate | VS | ucrt.lib & vcruntime.lib | None (Note: this is NOT the import library for msvcrt.dll) | None, the lib exports no symbols, it only contains CRT initialisation and termination code. |
ucrt.lib | Pulled in by msvcrt.lib | Windows SDK | None | UCRT (Shipped in base Windows since 10) | |
vcruntime.lib | Pulled in by msvcrt.lib | VS | None | Visual C++ Redistributable | _CxxThrowException __C_specific_handler __CxxFrameHandler3 __current_exception __current_exception_context memcmp memcpy memmove memset |
Two libraries are shipped exclusively with VS, msvcrt.lib and vcruntime.lib, and these are the dependencies we need to remove. Both are part of what Microsoft calls "VCRuntime", even though one of its subcomponents is also called vcruntime.
The source code of VCRuntime is shipped with VS, available under `C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\[version]\crt\src\vcruntime`. But if you want to reimplement any of its functions, you are strongly advised not to read it, because this code is "All rights reserved". The intention may have been to make it harder for anyone to claim that their reimplementation is clean-room, since they cannot prove that they haven't seen the source code. A company reportedly used this technique by printing its source code in its manuals, so anyone trying to reimplement the software would likely have seen the source. I read this story somewhere but couldn't find it again, please comment if you know what I'm talking about!
Using `compiler-builtins` for memory functions
The Rust std depends on 4 primitive memory manipulation functions: `memset`, `memcpy`, `memmove`, and `memcmp` (but not `memchr`, because this is implemented in core). Currently the std gets them from the libc crate, which links against `msvcrt.lib` and in turn `vcruntime.lib`. These functions are also provided by the compiler-builtins crate, so we could use those instead.
Note that the libc crate exposes virtually all standard libc functions, but the vast majority of them can be found in `ucrt.lib`, so it's really only the above 4 we need to replace.
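To make concrete how little these four functions actually do, here is a byte-at-a-time sketch in Rust. It is not how compiler-builtins implements them (real implementations are optimised and exported under the C names). The `sketch_` prefix is a made-up naming choice so that nothing here clashes with the real symbols.

```rust
/// Fill `n` bytes starting at `dest` with the low byte of `c`.
pub unsafe fn sketch_memset(dest: *mut u8, c: i32, n: usize) -> *mut u8 {
    let byte = c as u8;
    for i in 0..n {
        // SAFETY: the caller promises `dest` is valid for `n` writes.
        unsafe { dest.add(i).write(byte) };
    }
    dest
}

/// Copy `n` bytes from `src` to `dest`; the ranges must not overlap.
pub unsafe fn sketch_memcpy(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    for i in 0..n {
        // SAFETY: the caller promises both ranges are valid and disjoint.
        unsafe { dest.add(i).write(src.add(i).read()) };
    }
    dest
}

/// Copy `n` bytes from `src` to `dest`; the ranges may overlap.
pub unsafe fn sketch_memmove(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    if (dest as usize) <= (src as usize) {
        // Destination starts before the source: copying forwards never
        // overwrites a byte that still has to be read.
        for i in 0..n {
            // SAFETY: the caller promises both ranges are valid.
            unsafe { dest.add(i).write(src.add(i).read()) };
        }
    } else {
        // Destination starts after the source: copy backwards instead.
        for i in (0..n).rev() {
            // SAFETY: the caller promises both ranges are valid.
            unsafe { dest.add(i).write(src.add(i).read()) };
        }
    }
    dest
}

/// Compare `n` bytes; negative, zero or positive like C's `memcmp`.
pub unsafe fn sketch_memcmp(a: *const u8, b: *const u8, n: usize) -> i32 {
    for i in 0..n {
        // SAFETY: the caller promises both ranges are valid for `n` reads.
        let (x, y) = unsafe { (a.add(i).read(), b.add(i).read()) };
        if x != y {
            return x as i32 - y as i32;
        }
    }
    0
}
```

The only subtle one is `memmove`, which has to pick a copy direction so that overlapping ranges are handled correctly.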
Write our own entry point
This section is currently only applicable to executables. I haven't looked into libraries yet.
The Windows entry point function is called `mainCRTstartup` by default. However, this function is not generated by Rust. The first function created by rustc is simply called `main` (which is not the `pub fn main()` you wrote).
If you compile a Hello World program and disassemble it with `dumpbin /DISASM`, you can see that the entry point calls two functions, `__security_init_cookie` and `__scrt_common_main_seh`, which are statically linked from `msvcrt.lib`.
`__scrt_common_main_seh` calls the `main` generated by the Rust compiler, which then calls `lang_start` -> some Rust panic wrappers -> `crate::main`. Prior to calling main, the entry point needs to initialise all the global objects if no one else has done it (? who else can do it). It walks through a list of function pointers to C and C++ initialisers and calls them. The function pointers can be located through linker subsections. It also initialises TLS variables. Interestingly, the global object initialisation is synchronised using a pointer-sized variable `__scrt_native_startup_lock` compiled into `msvcrt.lib`, but I don't think this is ever shared with another process because `msvcrt.lib` is always statically linked.
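The "walk a list of function pointers and call them" step can be modelled in a few lines of Rust. This is only a sketch of the idea. The `Initialiser` type and `run_initialisers` function are hypothetical names, the table is passed in as a slice rather than located through linker subsections, and treating empty slots as skippable is an assumption about how such tables are laid out.

```rust
/// One slot in an initialiser table. Empty slots are modelled as `None`
/// (an assumption: a real table would hold raw, possibly null, pointers).
pub type Initialiser = Option<fn()>;

/// Call every non-empty initialiser in order and return how many ran.
pub fn run_initialisers(table: &[Initialiser]) -> usize {
    let mut called = 0;
    for slot in table {
        if let Some(init) = slot {
            init();
            called += 1;
        }
    }
    called
}
```

A real entry point would run something like this once for the C initialisers and once for the C++ ones before handing control to `main`.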
We need to write our own `mainCRTstartup` (we could call it something else too). I don't think we need any of the global object initialisation stuff, so unless UCRT needs it, it should be fairly bare bones.
GS cookies
`__security_init_cookie` unsurprisingly initialises the security cookie, which is used to detect stack buffer overruns (the /GS check). The source of this is shipped with VS but is proprietary. All it does is set the global variable `__security_cookie` to a random value if it hasn't been initialised (it hasn't been initialised if its value is the default, which is `3141592653589793241 >> 16` on x86_64 or 3141592654 on x86), and the new random value must not equal the default. Another variable, `__security_cookie_complement`, is set to the bit-wise complement of `__security_cookie`.
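The behaviour described above is small enough to sketch as a pure function. This is written from the description in this post only, not from Microsoft's source. The `SecurityCookie` struct and `init_cookie` function are hypothetical names, the random value is passed in by the caller, and bumping the value by one when the random draw happens to equal the default is an assumption about how to satisfy the "must not equal the default" rule.

```rust
/// Default (uninitialised) cookie value on x86_64, as quoted in the text.
pub const DEFAULT_COOKIE: u64 = 3141592653589793241 >> 16;

/// The pair of globals the startup code has to set up.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SecurityCookie {
    pub cookie: u64,
    pub complement: u64,
}

/// Return the cookie state after initialisation.
///
/// `current` is the existing value of the cookie and `random` is a value
/// obtained from some entropy source chosen by the caller.
pub fn init_cookie(current: u64, random: u64) -> SecurityCookie {
    let cookie = if current != DEFAULT_COOKIE {
        // Already initialised by someone else: leave it alone.
        current
    } else if random == DEFAULT_COOKIE {
        // Assumption: nudge an unlucky draw so it differs from the default.
        DEFAULT_COOKIE.wrapping_add(1)
    } else {
        random
    };
    SecurityCookie { cookie, complement: !cookie }
}
```

In a real entry point the two fields would be written to the `__security_cookie` and `__security_cookie_complement` globals, and the entropy would come from the OS.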
The rest: unwind our own stack
The other functions the Rust std depends on from vcruntime.lib are to do with exception handling, specifically Structured Exception Handling (SEH), which is a Microsoft C/C++ extension. Rust does not have exceptions, of course, but we do need to unwind the stack during a panic. I know very little about stack unwinding, but from the comments in the std source code I think that we call `_CxxThrowException` on panic, and Windows then unwinds our stack until it reaches Rust's landing pad.
We have two choices here: either ditch SEH entirely and implement our own stack unwinding mechanism, or keep letting Windows unwind our stacks but clean-room reimplement the functions that trigger and catch the exceptions.
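Whichever option is chosen, the behaviour visible from Rust has to stay the same: a panic runs destructors on the way up and stops at a landing pad. The sketch below shows that contract using only the public `std::panic::catch_unwind` API. It does not touch `_CxxThrowException` or any SEH machinery directly, and the `Guard` type and `unwind_demo` function are made-up names for illustration. It assumes the default `panic = "unwind"` strategy.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to true when `Guard` is dropped during unwinding.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

/// A value whose destructor must run while the stack is being unwound.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panic while a `Guard` is alive on the stack.
fn panics_with_guard() {
    let _guard = Guard;
    panic!("boom");
}

/// Returns (panic was caught, destructor ran during unwinding).
pub fn unwind_demo() -> (bool, bool) {
    CLEANED_UP.store(false, Ordering::SeqCst);
    // `catch_unwind` is the landing pad: unwinding stops here.
    let result = panic::catch_unwind(|| panics_with_guard());
    (result.is_err(), CLEANED_UP.load(Ordering::SeqCst))
}
```

Any replacement for the vcruntime exception functions would need to keep a program like this working unchanged.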
Alternatives
Ask Microsoft to release VCRuntime under a non-restrictive licence so we can compile & ship our own `msvcrt.lib` and `vcruntime.lib`.
Unresolved questions
- Do we need to do any SEH setup for UCRT or does it do it on its own?
- How much CRT startup and termination do we need to do for UCRT, or does `ucrt.dll` handle it on its own?
- How does Go deal with interacting with UCRT? Or does it avoid it entirely?