Synchronized FFI access to POSIX environment variable functions

@Nokel81 It's mentioned here.

@carbotaniuman I do believe File::open("/proc/self/mem") is pretty suspicious for most people, but std::env::set_var() at the middle of the code isn't without much context.

I would advocate for a solution that allows the current safe std::env::set_var to keep working in all idiomatic usage of the API while providing an unsafe std::env::set_var_unchecked for those who for some reason need it.

This solution is quite simple: in the implementation of std::thread we add a single global AtomicBool, which by default is set to false. This boolean is set to true when any thread is spawned by the std::thread APIs.

In the implementation of std::env::set_var, we panic if this boolean has been set, indicating threads might be in use. It is then also easy to add a std::env::try_set_var function, although it might not be necessary to begin with.

To facilitate interoperability with other languages, we additionally add to std::thread the following functions (caveat bikeshedding on the naming):

  • fn set_threads_spawned(): allows writers of safe wrappers around C libraries and the like that internally spawn threads to indicate so. This sets the flag internally.
  • fn get_threads_spawned() -> bool: allows user code to query if the flag is set.
  • unsafe fn reset_threads_spawned(): unsafe function that allows resetting the flag. Should only be used when no other threads are truly active, and re-allows use of the std::env::try_set_var API (and any other MT-unsafe APIs). This function is not necessary for this approach to work, but will allow any other sound usages of the setenv API that are disallowed by the restrictiveness of this proposal.

Due to the nature of the problem, all operations on the AtomicBool can actually use Relaxed ordering, meaning that on basically all platforms they end up just compiling down to a simple non-atomic read/write, so performance overhead should be negligible.

The caveats of this approach is that it is slightly more restrictive than necessary (it doesn't automatically re-allow usage of the setenv API when all threads have joined, but this use case is very rare and we can provide an unsafe function that does allow it) and it doesn't automatically guard against threads that are created outside of the rust standard library. This defect is mitigated by the fact that calling into any other language that creates threads will have to happen through unsafe bindings, which using the provided APIs above can still be safely wrapped. So this approach should at least eliminate the unsoundness of safe rust code.

1 Like

I may be in the minority here but I'm not convinced that we should change this. The standard library already uses a lock for get/set, so in a pure-Rust program set is entirely safe even with multiple threads. It's only when you add C to the mix that it becomes unsafe, but then using C is always unsafe, that's why you need unsafe to call an external C function. Any C code which accesses the environment while it could be being modified is either broken or is being misused. It's the responsibility of C library authors to be clear about when/if they access the environment so that people can use the library correctly, and it's the responsibility of Rust users to make sure that they're using these libraries correctly when they write the unsafe code that binds to it. Granted, this means it's impossible to create a correct, safe wrapper library around any C library that accesses the environment, but we're stuck in this reality until at least 2027 since it'll take two editions to deprecate and then remove set.

Earlier @josh suggested that we try to push the environment lock upstream, ie. by convincing the standards bodies and libc authors to put a lock on getenv/setenv or add thread-safe alternatives to these functions. This seems like the correct solution. And once that's done there won't be any need to deprecate set anymore (since the standard library will just use the libc lock instead). So maybe we should just push hard on that instead?

Edit: Actually it is still possible to create correct safe wrapper libraries around getenv-using C libraries so long as you statically link them. You just need to override the symbols for {get,set,unset}env so that they point to versions of these functions implemented in Rust, rather than the libc versions.

3 Likes

I don't think that's a viable solution. For example, what if my Rust library is being called from a C program? There could be zero unsafe in my Rust library, and accessing std::env::set_var could still be UB because the host C program may have created threads. Rust cannot detect that, and thus cannot make safety claims.

Here is a pure-Rust program that segfaults because of this problem. The thing is that there is no such thing as pure-Rust program (maybe with the exception of no_std). The standard library itself got bitten by this problem because of getaddrinfo reading the environment. And even if std works around this by taking the ENV lock, authors of other crates that link to C do not have access to that API.

If C code only reads the environment then it not misbehaving because the standard says that setenv should not be called in multithreaded programs, or, if it is called it should be synchronized across all threads. It is Rust's assumption that there are no C spawned threads that is false.

Honestly this sounds like a fine outcome since we have tools like rust security advisories, documentation, and compiler and clippy warnings that together can nudge the majority of users to make safe use of the environment.

1 Like

Rust cannot detect that, and thus cannot make safety claims.

That is true, but is hardly the only way a program calling a rust lib can trigger something bad in the rust lib. You can do a lot of stupid stuff in C to ruin the environment for library calls. You can mprotect random memory, you can spawn threads that access any non-reentrant functions in libc and any locks on the rust side around those won't do anything, you can write to the rust stack directly from other threads, in the end if you end up calling a rust lib from C you have to ensure you don't violate the environment guarantees rust needs to provide its safety.

I submitted [dontmerge]: Change `setenv` on Linux to panic if threaded by cgwalters · Pull Request #90799 · rust-lang/rust · GitHub just to try to gauge potential fallout.

3 Likes

OK I tried to collate the discussion options so far...does this seem accurate?

Rust: setenv and threads options

try_setenv: Add a new API which errors if threaded

See try-setenv-rs. Basically we introduce pub fn try_set_env_var<K: AsRef<OsStr>, V: AsRef<OsStr>>(key: K, value: V) -> Result<(), Error>.

In this path we could choose to deprecate set_env (but only on Unix?), or not.

Pros:

  • Does not break existing working code
  • The dynamic checking makes it very likely that any race conditions would be detected at runtime

Cons:

  • Is known to fail today for apps using e.g. [tokio::main] which spawn threads; these apps would need to stop using the macro, or async runtimes like tokio would need to be refactored to only spawn threads when a future is run.
  • A lot of potential churn in the ecosystem to change to try_set_env().unwrap().

Change POSIX.1 standard to have internal locking and add getenv_r

First, see this comment from zackw.

(And get patches into glibc, musl, bionic, etc.)

Pros:

  • Fixes the problem for all languages, including plain old C

Cons:

  • See linked issue above that notes that C programs can already access environ and the 3rd argument to main directly outside of an API
  • We would have to wait a long time before this propagates far enough that we could rely on it
  • Some people on standards body already opposed this and took the "just don't setenv() with threads" stance.
  • Only fully mitigates the problem if everything in the process uses getenv_r.

Only operate on Rust-specific copy of the environment

This is what Go does: https://cs.opensource.google/go/go/+/master:src/syscall/env_unix.go;l=39?q=Setenv&ss=go%2Fgo

Pros:

  • Does not break existing working code.
  • Can be implemented without special platform capabilities (such as "check if there are other threads running") or introducing new requirements for the Rust runtime as a whole (such as informing it when C code spawns a thread).
  • Gives the ecosystem the opportunity to slowly migrate each use of set_env to one of the two new functions.
  • If the two new operations are not sufficient for some code, we will hear about that and can consider adding more options.

Cons:

  • Will actively break any code that is today calling set_env() but directly invoking e.g. fork()+execve() and relying on set_env() to propagate into the child process. For example, a container runtime in Rust might use set_var() but directly call Linux clone3() etc. These would all need to be fixed to call e.g. libc::setenv().
  • A lot of churn since every use of set_env needs to be changed. But honestly I don't think this can be avoided, those all need to be audited.
  • Having such a shadow environment could cause confusion. But Rust is not alone here, and the shadow-env-mutating variant of set_env could have a suitable name to make this clear ( set_env_rust , for example, with set_env_process for the unsafe one).
  • Provides no safe way to mutate the process environment. But I am not sure if we have consensus that this is even possible on all POSIX platforms, so possibly it makes sense to delegate a safe try_set_env_process to a user crate for those platforms where checking the existence of threads is possible. [/quote]

Note that because Go has a runtime, it already doesn't work to call fork() directly. But Rust is a systems programming language and should support it.

Mark get_var/set_var unsafe and just have everyone try to deal with it

Pros:

  • Raises visibility of the problem to all the crate authors who aren't reading this thread
  • Maybe some crates figure out how to avoid setenv

Cons:

  • This doesn't really help knowing that we've fixed the problem
5 Likes

In practice, I think the try_ variant will generate frequent errors in practice in cases where unsafe { set_var(...) } would work perfectly fine, such as at the start of async fn main in a program that uses a threaded runtime, or after fork or similar when the program is no longer threaded but Rust can't necessarily know that.

I would favor the short-term solution of marking set_var unsafe, together with the long-term solution of improving the C library interfaces to make them thread-safe.

I'm happy to spend time making the case for this. "Just don't" is not usefully actionable advice, because people do have legitimate reasons to do this. Most notably, it makes software more composable, by allowing programs and libraries to write code without worrying about other parts of the program using threads.

7 Likes

Right, we discussed this before, but I've now added this one as a Con to the try_ path.

This case however will work: Test calling try_set_env() after `fork()` · cgwalters/try-setenv-rs@da96f01 · GitHub That's the advantage of using the kernel-level "reflection" from reading /proc/self/task - we handle threads being spawned by non-Rust too. (Edit: and to be clear the reason this works is fork() is defined to change the process to being single threaded)

OK. In the end, these choices aren't actually mutually exclusive. You (and others who feel the same) should obviously feel free to campaign for setenv_r/getenv_r(). If that happens, great. What would help is some sort of positive preliminary signal from one or more people on the standards groups or libc maintainers. But I think the issue here is important enough that we shouldn't block on that approach.

2 Likes

Does this work on freebsd where I believe setenv/getenv is not threadsafe but there is no /proc by default?

2 Likes

You might also want to comment on this issue, which contains a similar summary but without the try_setenv option:

I feel like I've said this several times already but: it is not possible to define a locking scheme internal to the existing set of POSIX environment variable access APIs that will make everything fully safe. This is primarily because environ and the third argument to C main allow applications to bypass whatever locking may exist internally, and secondarily because getenv doesn't copy the string it returns (I presume that's what getenv_r is meant to address).

To fix the problem "properly", the C library would have to expose the pthread_rwlock_t object that it uses internally, and document that all operations on environ (except for passing it to exec* functions after fork, because then you're single threaded anyway) must take the lock appropriately. Exposure of internal C library lock objects is unprecedented. I don't even know how to speculate how the committees would react.

5 Likes

I've updated that section to link to your comment and also mention that issue. That said...while I am strongly in the "add try_setenv" camp, I am doubtful that very much software uses environ or the 3rd main argument instead of calling getenv(). But still that leads to the next problem:

Right. And I've now updated the Cons section for this to note that adding getenv_r only incrementally helps because all existing code in the address space (including C libraries) would need to be updated to use it.

Anyways I know you and I are in agreement on this space, but I do want to try to accurately capture the tradeoffs of this option in the list since there are proponents of it and we should evaluate it carefully.

For what it's worth, I don't think getting thread-safe environment access into libc/POSIX is a viable strategy. We'd still have to find a solution for older libcs, and from what I've seen libc developers have consistently indicated that they consider a multithreaded process modifying its own environment to be an antipattern. A patch for this was even submitted to glibc in 2007 and rejected as a WONTFIX: 4350 – Reentrancy problem between strftime() and setenv()

If the POSIX consensus is that setenv is unsound to call in the presence of threads, then the Rust library has two options:

  • Don't call setenv in the presence of threads
    • Maintain a Rust-local replacement mapping on top of the OS env
    • Fail to set_env when the program is multithreaded
    • Other solutions
  • Be unsound when the program is multithreaded

There is no third option where we make setenv and threads, beyond changing/extending POSIX.

Because POSIX says that calling setenv in the presence of threads is unsound, we don't have to worry about C code calling setenv in a racy manner; such code is already broken. What we have to do is not call setenv in a manner that is POSIX MT-unsafe (i.e. in the presence of threads).

Disclaimer: I don't know POSIX very well. I am going off of what people have said previously in this thread.

Personally, I think the only resolution which both keeps set_env working for pure-Rust uses of the env and conforms to POSIX is to keep a process-global replacement map for get_env. Then we can provide an unsafe way to flush that replacement map to the OS, which requires the process to be single threaded, and potentially an unsafe way to directly setenv (or tell people to use libc for that).

If we want to keep Command propogating set_env, then we "just" add the global env replacement map as a prefix to the replacement map it already applies.

6 Likes

I'd like to re-raise an idea I proposed above: give the Rust standard library a mechanism which tracks — as much as possible — whether there are no other threads. (If on a given platform it is possible to introspect whether there are any threads, it can be fully sound and automatic by checking before each relevant library/system call, since no threads means nothing that can create one can intervene; on others, it requires an unsafe claim-of-no-threads by the user code.)

If POSIX says that some things are not thread-safe, and we want those operations to be not too difficult to use from Rust in a safe way, then Rust needs to have a concept of “there are currently no other threads”.

1 Like

If we are doing a fork() while there are multiple threads, wouldn't environment be already broken? Another thread might be in the middle of a setenv(), and the new process will have a broken environment, no? Or does glibc do some magic to prevent this from happening?

Now, I am wondering, wouldn't this be a problem for Command::spawn() ?

1 Like

try_set_env() won't set any env var if another thread is running, so there can be no env var setting at the point of the fork() if only try_set_env() is used.

1 Like

What happens when /proc isn't mounted, such as in a locked-down container?