Synchronized FFI access to POSIX environment variable functions

cgwalters · January 3, 2022, 8:18pm

I think that logic may have come from Make putenv() fully conforms to Open Group specs Issue 6 · freebsd/freebsd-src@d851040 · GitHub which is about supporting putenv allowing callers to mutate the string later.

But there are two allocations here; the pointer array and the individual strings. Much like a Vec<String>. At a quick look at the source code it does realloc the pointer array, which means a concurrent getenv() can still read a freed pointer array.

parasyte · January 5, 2022, 3:46am

Much of the discussion mentioned making set_env an unsafe method, but I don't recall seeing anyone recommend making get_env unsafe as well. I am just not convinced that making the write-side unsafe is enough. This would be akin to making read-only access to static mut safe, which is also not a suggestion I've ever seen. Doing that would make static mut even worse than it already is ^[1].

One can imagine several situations like the following:

An application is written with unsafe set_env() and it ensures the invariants are upheld to call this function safely.
In a future change, safe code is added before the unsafe invocation that somewhere in the call stack (perhaps in another crate) spawns a thread that calls into ffi that uses getenv.
We've now introduced UB just by adding safe code that violates the invariants of unsafe code somewhere else in the application.

This seems quite concerning to me. I can't think of any other cases in Rust that have this "safe to use unless unsafe is included anywhere at all" behavior. I feel this is mostly because verifiably sound use of unsafe uses encapsulation to prevent this kind of spooky action at a distance.

Am I wrong? Is this just the expected nature of the beast? Are there other situations where this kind of spooky action happens commonly?

What I'm getting at is that I can't see a way to provide a safe abstraction for getenv, just as I can't see a way to provide safe read-only access to static mut T.

Consider deprecation of UB-happy static mut · Issue #53639 · rust-lang/rust (github.com) ↩︎

carbotaniuman · January 5, 2022, 6:40am

You touch on an interesting point, and I think it's really just the design of the environment API. I think the only equal would be the other single-threaded POSIX stuff like forks/pre_exec, and command API (?).

The intended use case (ignoring any possible changes) for environment variables is as inheritable config that should only be set at the start of main. There should be no spooky action at a distance here because there's should be nothing before a set_env. Basically, and this is the way it is in C too, if your library calls setenv at any point when other threads are doing stuff, it has UB.

Of course, enforcing that is another matter. Async runtimes that spawn threads may be workable depending on the exact implementation (which is nontrivial to get right) and stuff like backtraces and logging will likely be fine. Many patterns will work, but there are definitely those that have come to rely on thread-safe env setting. For instance, the test runner and proc macros probably need a deeper look.

This is getting rambly, but I think between the 5? GH issues and this thread we've discussed a lot. I do wonder if a project or group or something that could push some of the non-controversial changes out and do some experimentation on new APIs/changes to core stuff might be beneficial.

bascule · January 5, 2022, 5:25pm

In this regard, I had an idea for a hypothetical new libc API which could potentially address this which I'm not sure has been considered yet: adding a freezeenv() function which makes the environment immutable:

@richfelker as long as we're spitballing about hypothetical new libc features, how about a feature that opts-in to actually enforcing this, i.e. a way to freezeenv the environment within a particular process such that calls to setenv / unsetenv return an error at the libc level?

If that were possible, it seems like the safety errors around getenv would just go away without the need for synchronization (besides a process-wide "is environment frozen?" atomic flag).

Of course, instead the setenv / unsetenv callers would have to deal with a runtime error, but based on the above conversation it sounds like there's no truly sound solution to fix these safety issues without ensuring that the callers simply don't use these functions inside of programs where multiple threads access the environment.

From this thread:

github.com/rust-lang/rust

set_var/remove_var are unsound in combination with C code that reads the environment

opened 10:24PM - 23 Aug 15 UTC

bluss

A-concurrency P-medium I-unsound C-bug E-needs-mcve T-libs O-unix

Like the documentation for our `set_var` (setenv) says, care must be taken with …mutating environment variables in a multithreaded program. See [this glibc bug #13271](https://sourceware.org/bugzilla/show_bug.cgi?id=13271) that says getaddrinfo may call getenv. It looks like we have an unsynchronized call to getaddrinfo and this may cause trouble with glibc/Linux. Seeing glibc's attitude to setenv in multithreaded programs, `set_var` seems like a big hazard in general(?). Discovered as an issue tangential to #27966 cc @alexcrichton

bascule · January 5, 2022, 5:52pm

This is unfortunately a problem that deliberately resists such encapsulation, because it would require some sort of agreed upon locking convention or new libc features.

That said, I have definitely seen these sorts of issues making FFI bindings to libraries, especially when trying to add some sort of locking model to a library which isn't designed to be thread safe, and exposing unsafe power-user functions that require awareness of that locking model and that its invariants are upheld.

I think get_env is safe in that if set_env (or a hopefully-soon-to-be-introduced-replacement) were unsafe, there is no way to make get_env misbehave from safe code.

While encapsulating unsafe code is something Rust tries to make as easy as possible and aspires to, the real source of the unsafety is the unsafe function invocation, and if subsequent introduction of get_env violates the safety invariants which are hopefully documented at the unsafe call site, then greater care should be taken to ensure those invariants are upheld and aren't broken by future changes, or the overall program needs better encapsulation of environment access in general. Perhaps clippy lints could be helpful here?

Or, ideally, new synchronization libraries or future libc APIs can address this problem in perpetuity.

mathstuf · January 5, 2022, 6:17pm

Other non-Rust code on other threads could manipulate the environment. For example, env::var could have validated some portion of the value as UTF-8, get interrupted and another thread go and make some of the already-validated part not UTF-8. Upon rescheduling, it finishes validating the rest of the value and returns non-UTF-8 in a String. How is this situation going to be encapsulated in a safe env::var call?

bascule · January 5, 2022, 6:38pm

That code is invoked using unsafe

mathstuf · January 5, 2022, 7:19pm

Not necessarily. Rust may have been dynamically loaded. Think a Python module written in Rust. There is no unsafe Rust code, but env::var is still a hazard.

bascule · January 5, 2022, 7:36pm

I think we're getting in the weeds here. That use case requires #[no_mangle], which doesn't literally use the unsafe keyword but still very much has unsafe behavior, and it's one of many such attributes with unsafe behaviors. I would not consider code using such attributes to be "safe".

In fact, you can't author such a program with #![forbid(unsafe_code)], as there's now a lint which would catch it.

When people talk about "safe Rust" programs it's generally in the context of programs written 100% in the safe subset of Rust (stdlib aside), not some polyglot project where a C program is calling Rust or a Rust program is calling C via FFI. Clearly all sorts of things can go wrong in such programs.

In the context of programs written in 100% safe Rust, where the only unsafe code resides in core/std/etc, get_env is safe to use (or would be if set_var and remove_var were made unsafe, and therefore couldn't be used by definition in such programs).

mathstuf · January 5, 2022, 8:07pm

I was not aware that "unsafe only when non-Rust code is involved" versus "safe when only Rust code is involved" was a distinction that was made. Where is this rule that allows such a distinction written down? Is that missing from the Rustonomicon?

Even if it is a rule, I don't see how I can actually do anything based on it. How can I mark my crate as "not allowed to be used by an unsafe-using crate"? How about "not dynamically loaded"?

The Rustonomicon states:

If env::var is safe, this guarantee is not sustainable, so which do we lose?

bascule · January 5, 2022, 8:21pm

If you believe get_var is unsafe, can you write a self-contained Rust program annotated with a #![forbid(unsafe_code)] attribute which demonstrates the unsafety, without using set_var or remove_var? (which we mostly agree should be marked unsafe)

mjbshaw · January 5, 2022, 8:28pm

If we're considering multi-language contexts, then I don't think the claim* "If all you do is write Safe Rust, you will never have to worry about type-safety or memory-safety" is tenable. After all, the following code uses only safe Rust, but we can still trigger UB:

#[export_name = "set_to_zero"]
pub extern "C" fn set_to_zero(var: &mut i32) { *var = 0; }

If non-Rust code invokes this with a null pointer, we have UB.

IMO it's fine for Rust's env::get_var to be safe. The guarantee you quote is never sustainable when mixing languages.

_{*I interpret "If all you do is write Safe Rust" to mean that mixed-language scenarios are excluded from any safety guarantees. The Rust code is safe from the Rust perspective.}

bascule · January 5, 2022, 8:32pm

Yes, and also note that that program is rejected under the #![forbid(unsafe_code)] lint due to the use of unsafe attributes:

error: declaration of a function with `export_name`
 --> src/lib.rs:3:1
  |
3 | #[export_name = "set_to_zero"]
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
note: the lint level is defined here
 --> src/lib.rs:1:11
  |
1 | #![forbid(unsafe_code)]
  |           ^^^^^^^^^^^
  = note: the linker's behavior with multiple libraries exporting duplicate symbol names is undefined and Rust cannot provide guarantees when you manually override them

mjbshaw · January 5, 2022, 8:39pm

I'm afraid this might be getting off track, so I intend for this to be my last post regarding this, but even #![forbid(unsafe_code)] cannot guarantee no UB when mixing languages. If we remove the attribute on the function and instead have the C code call the mangled function name, we still have the same issue despite #![forbid(unsafe_code)] not complaining.

parasyte · January 6, 2022, 9:07am

The issue I brought up is one that is known: Composition of unsafe abstractions.

For instance, I can use #!forbid(unsafe_code) in my crate, but I depend on one or more crates that use unsafe code. I can still invoke UB in my crate if such unsafe abstractions do not play nicely together. This is what I was trying to define, but I think the blog post (and this old thread) do a better job than I could.

kornel · January 6, 2022, 11:35am

You could always have some foreign code in your process that sprays memory with 0[rand()] = rand(), or even a completely external process writing to your /proc/mem. Consideration of such impossible to prevent problems is simply unhelpful for evaluating (un)safety that Rust can reasonably control. Let's agree that Rust can't prevent all UB in all edge cases, but move on to evaluating what UB Rust can prevent in common situations, and with #!forbid(unsafe_code).

mathstuf · January 6, 2022, 12:56pm

Sure, I think silly things like the examples given are outside of Rust's control and guarantees (just like "power off" defeats any linear type guarantees that may exist), but the environment is, fundamentally, a static mut at heart and access is not through a Sync access, so I still see this as well within Rust's guarantees because telling other code "you can't use setenv" is very different than "you can't use a random value to a pointer".

Basically, I'd like stronger reasoning for this to be ignorable than "the environment is convenient" or "well, it's usually OK" at least. Gobs of documentation on env::var on why it's safe, but should be avoided in threaded contexts and please give proper APIs for code to use instead of mutable global state. I guess this discussion just reminds me of the hopes-and-prayers model to dealing with such situations in C and C++, that's all.

Basically, there's no // SAFETY comment that can be placed inside of env::var that appeases the *environ dereference that is going on in there.

bascule · January 6, 2022, 1:46pm

Sure, I think silly things like the examples given are outside of Rust's control and guarantees (just like "power off" defeats any linear type guarantees that may exist), but the environment is, fundamentally, a static mut at heart and access is not through a Sync access

Within the safe subset of Rust, access is synchronized through a StaticRWLock.

so I still see this as well within Rust's guarantees because telling other code "you can't use setenv " is very different than "you can't use a random value to a pointer".

The presumed fix here is to make set_var and remove_var unsafe, which would make them uncallable from the safe subset of Rust.

All the scenarios you provided are outside of the safe subset of Rust.

As I asked earlier, if you think there is unsoundness from the safe subset of Rust alone, can you give a concrete example? You're continuing to imply there is, but can't provide a PoC.

carbotaniuman · January 6, 2022, 3:39pm

I think the unsafe code thing is sort of a digression here. The fact that the environment is like a static mut is fundamentally part of the reason this issue is here in the first place.

Since multi-threaded applications are not allowed to use the environ variable to access or modify any environment variable while any other thread is concurrently modifying any environment variable, any function dependent on any environment variable is not thread-safe if another thread is modifying the environment; see XSH exec.

You're right to point out that this is a problem, and POSIX "fixes" this by making it UB to call setenv when there are multiple threads, except for some rare conditions. Yes, this means the contract of setenv is basically don't race with getenv, but it's whats we have.

As an aside, one of the reasons static mut is prone to UB is because it interacts with Rust references and it's aliasing model. environ does not have to deal with this.

RalfJung · January 7, 2022, 4:55pm

My answer to your (very valid!) concern would basically be: no, it is not. It is (or at least, it should be) a static at heart, and hence reading it is fine. setenv and friends are distractions from a past when people thought the environment was mutable. They are like taking the &Environment and then doing some crude casts to mutate it.

Sadly we still partially live in that past but this thread is about moving away from it.

Topic		Replies	Views
Function to hold lock on execution environment (for FFI) libs	26	2621	April 4, 2021
Thread-safe environment variable mutation	2	1146	February 20, 2023
Pre-RFC: deprecate env::set_var libs	21	4010	March 25, 2019
Standard library, synchronization primitives, and undefined behavior libs	24	9678	April 7, 2019
Pre-RFC: Mark all APIs that allow access to arbitrary files as unsafe language design	5	3804	March 25, 2019

Synchronized FFI access to POSIX environment variable functions

Related Topics