Synchronized FFI access to POSIX environment variable functions

While that's true, if another process creates a thread in your process all bets are off; you no longer have control over your own address space, and the process breaking yours gets to keep all the pieces. The same applies if another process ptraces yours on Linux, or uses process_vm_writev to modify your memory.

(Also, apparently on Windows the environment functions are thread-safe.)

3 Likes

Something that hasn't been pointed out yet is that in Windows multiple versions of the C runtime library can be loaded in the same process, and it appears that each will create a separate shadow environment which is then not updated after the initial copy (although setenv will also call SetEnvironmentVariable and thus change the process environment as well).

So if, on Windows, one is setting environment variables for the purposes of having them seen by a getenv() call, then it's important to call setenv() from the same C runtime library that the library calling getenv() is linked to (if that C runtime library has already been loaded).

This constraint might be helpful to focus on a specific solution, consisting of multiple functions to set variables:

  • An unsafe fn set_var_for_getenv to set variables to be read by getenv() as well as processes spawned by Rust and non-Rust code: this would take a parameter with a CRT type that would be returned by CRT::from_library by looking up the CRT for a given library or CRT::current to use the CRT the binary is linked to and would set the variable for that CRT in addition to the process environment
  • An unsafe fn set_var_for_process function to set variables for processes spawned by Rust and non-Rust code and libraries using non-CRT facilities to read environment variables, with platform-dependent effects on getenv() calls: this calls setenv() on Linux and SetEnvironmentVariable() on Windows
  • Optionally, a safe fn set_var_for_rust function to set variables for Rust get_var and processes spawned by Rust code, with platform-dependent effects on processes spawned by non-Rust code and no effect on getenv() calls: this would use a shadow environment on Linux (which also needs to be updated by the other functions and read by get_var) and SetEnvironmentVariable() on Windows
  • Deprecating the current function and fixing all published crates that use it

A new function to get an environment variable from a specific CRT should also be added.

A single-thread-asserting variant seems problematic because there's actually quite a risk that some software on Windows might be injecting DLLs in all processes and either starting threads from the DLL's DllMain or via CreateRemoteThread (this could also be done on Linux via LD_PRELOAD, although it's likely more rare on Linux), which would break such assertions. The "copy shadow environment to the real one with pthread_atfork()" also doesn't work because it doesn't cover vfork+exec and posix_spawn.

1 Like

pthread_atfork is unreliable to the point where IIRC it's being discussed for withdrawal from POSIX.

No matter how it happens, I consider the mere existence of multiple copies of the environment [EDIT: within a single process] to be a bug — perhaps, as on Windows, a bug in the underlying OS, but a bug nonetheless. If Rust needs to grow scar tissue around OS bugs that aren't likely ever to be fixed, so be it, but let's be careful not to introduce new bugs ourselves.

2 Likes

I you still allow for per-process environments? Because the environment differing between processes is a major design feature in both nix and dos.

Bash
RUST_BACKTRACE=1 cargo run
PowerShell
$Old = $env:RUST_BACKTRACE
$env:RUST_BACKTRACE = 1
cargo run
$env:RUST_BACKTRACE = $Old

Out of interest, what happens when you link multiple libcs on unixes where you can get that to happen?

I'm not zackw but I'm pretty sure they meant "multiple copies of the environment" to be "multiple copies of the environment within the same process". Certainly different processes have different environments.

Yes, I meant "within a single process". Sorry for the confusion.

Out of interest, what happens when you link multiple libcs

I'm not aware of any Unix on which this is even possible. On glibc-using Unixes, you can get multiple copies of glibc loaded into a single process under exotic conditions, but it's a hack (an officially supported hack, but nonetheless not what you would call reliable) and very bad things will happen if the copies aren't all exactly the same version.

One of the things the hack does is make all but one copy throw away all their globals and use the primary copy's globals instead, so there's still only one heap and one environment block.

3 Likes

Bikeshed:

Is it actually practical to locate and make calls into the CRT used by some other DLL or EXE in the process? If so, are there common libraries that expect their clients to do this? I'm not a Windows expert, but I'd expect the answer to be no to at least the second question and probably the first as well. Especially since that DLL or EXE may have statically linked its CRT.

When it comes to the similar issue of different CRTs having different malloc heaps, if you as a DLL want your clients to be able to free data that you malloced, I've seen various suggested approaches. They include switching to a non-CRT allocator, or writing a wrapper for free (e.g. mylib_free) and exporting that to your clients. But having clients reach into your CRT directly is not one of the suggestions.

So perhaps Rust does not need to support this, which would (for better or worse) also simplify the API for users on other platforms.

If there is to be a set_var_for_rust, I think it should always use a shadow environment. With the way you have it, if you call set_var_for_rust then spawn a new process from C, the new process will see the environment change on Windows but not other platforms. I think that would be surprising and an obstacle to portability. The name set_var_for_rust also suggests it's affecting Rust-specific state.

1 Like

Do we have consensus that std::env::set_var should not exist in its current form at should be deprecated?

  • Yes, deprecate std::env::set_var
  • No, do not deprecate std::env::set_var

0 voters

This is, is course, not binding.


I personally think that it should be deprecated due to its unsoundness. I believe it's worthwhile to deprecate even before a replacement is agreed upon. Users interested in setting environment variables can still use libc::set_env (which is unsafe) in the meantime.

1 Like

I agree that it should be deprecated, but I do think we need the unsafe replacement stabilized at the same time, because libc::setenv is not a portable replacement (e.g. Windows). We need something that's guaranteed to affect the same environment that env::var reads.

(Ditto env::remove_var & unsetenv.)

7 Likes

Yeah, consider the poll to include std::env::remove_var — I just focused on the most common method.

I hadn't realized that libc::setenv isn't fully portable. I agree that there should be something portable prior to or at the same time as deprecation. Perhaps we could just use the existing implementation? Then all that would need to be decided is a name.

1 Like

It's practical although it needs some code: you can call GetModuleHandle/LoadLibrary and the result gives you base address of the mapped library binary, and from there you can parse the PE headers, find the imports, find the getenv import, then call GetModuleHandleEx with GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS to get the CRT DLL module handle on it and finally resolve setenv from it with GetProcAddress.

DLLs designed for Windows don't expect you to do that, but libraries designed for POSIX systems compiled for Windows with minimal porting effort may need it, although it's probably rare.

It's of course possible to not actually provide this in std but leave it for other crates, but the documentation for the variable setting functions in std should mention that you can't expect getenv() to see the variable unless you use a function that does this.

The one issue I have with just deprecating set_var without replacement is backtraces.

The behavior of std::backtrace captures is

The Backtrace::capture function might not actually capture a backtrace by default. Its behavior is governed by two environment variables:

RUST_LIB_BACKTRACE - if this is set to 0 then Backtrace::capture will never capture a backtrace. Any other value this is set to will enable Backtrace::capture.

RUST_BACKTRACE - if RUST_LIB_BACKTRACE is not set, then this variable is consulted with the same rules of RUST_LIB_BACKTRACE.

If neither of the above env vars are set, then Backtrace::capture will be disabled.

The backtrace crate doesn't use this behavior, but most common ways of capturing backtraces (i.e. anyhow and eyre) replicate this behavior even with stable backtrace.

Many applications want backtraces by default. This means setting RUST_BACKTRACE at startup if it's not already set.

If you're using std::backtrace directly you can force_capture. If you're not, and using it indirectly though e.g. anyhow, you have to use the environment as the global communication side channel.

Deprecating the ability to set environment variables effectively means deprecating the ability to capture error backtraces by default without setting env outside the process.

A similar pattern is often used for RUST_LOG, but at least that's avoidable at log initialization time by going through the programmatic configuration rather than env configuration, if the env doesn't exist.

1 Like

As I said in my most recent comment, I hadn't realized the libc equivalent was not portable; I do think there should be something portable, whether it be in stdlib or elsewhere (most likely in stdlib).

The linked method in std is nightly, so breakage is allowed (even if not technically breakage in this case). That said, why does a crate need to set RUST_LIB_BACKTRACE or similar after startup? I've never done anything like this for any error handling situation.

If a crate is using anyhow::Error, it captures backtraces according to the rules that std::backtrace uses. If an application wants to capture backtraces on error by default (that is, if neither RUST_BACKTRACE nor RUST_LIB_BACKTRACE are set by the calling process, capture backtraces (e.g. most contexts when a program is invoked)), then it must do so by setting one of the two control environment variables.

I find it quite nice to have applications capture backtraces without user intervention to turn them on.

How about this then?

  • Yes, deprecate std::env::{set_var, remove_var} and add an unsafe versions of the same functions under new names to std::env
  • No, don't do that

0 voters

Instead of envrionment variables, there should be a proper API to handle this. I don't think we should settle with "well, the env is the only way this has worked before, so it's the only way it can work". It should have a way to explicitly indicate "my choice overrides the env" and "my choice defers to the env if already set" as well since it seems that programs making that choice for users is a thing.

5 Likes

Is it really less troublesome, though?

The spec page you linked envisions that if libc wants to "maintain a separate copy of the environment in a data structure", then it could have getenv and friends check on every call whether environ has been modified, and, if so, "reinitialize based on the new environment", which "may include copying the environment strings into a new array and assigning environ to point to it".

But if we want this to all be thread-safe, what happens if C code in one thread modifies environ and then another thread (without synchronization) calls getenv?

We could make the assumption that the write to environ will be atomic in practice. But there's still no memory barrier. There's a good chance that the code wrote to the new environ array, and/or the strings it points to, immediately before writing to environ, and those writes could be reordered by the CPU or compiler.

libc could avoid CPU reordering with membarrier, but it's non-portable. We could assume compiler reordering probably won't happen in practice in this situation. Is that combination good enough? It seems iffy.

At least overwriting environ seems to be pretty rare compared to setenv, and mostly done by code that runs shortly after startup or fork, when there hopefully aren't any other threads.

2 Likes

Always have it point to a PROT_NONE page and introduce synchronization with a segfault handler? That'll only protect rust threads. But if C threads are updating the pointer without knowing what other threads are doing then it would already have been in UB-land.

No, no it isn't. I thought it might be feasible to specify a RCU protocol for it -- "if you're running multithreaded then you have to use compare-and-swap to change the pointer and also adhere to these constraints." But when I actually sat down to spec it I realized it gets so messy, considering all the other issues you mention, that it'd be better to package the implementation and say "direct modification is deprecated, and UB when running multithreaded, use this library function instead."

3 Likes

As it appears we have rough consensus on deprecating the current methods and introducing an unsafe replacement at the same time, I suppose it's bikeshedding time.

What names do we want to use for the unsafe equivalents of std::env::set_var and std::env::remove_var? My initial thought is to add sys in (as as set_sys_var), but that sort of implies that we're going to provide a shadow environment in the future.