Why should std::env::var panic?

Why restrict to C functions and *NIX conventions? All of this depends on the platform. A system where environment variables are defined as HashMap<String, String> or something similar would allow both \0 and = to appear in the name, which is why I think rejecting strings containing \0 or = would be undesirable and sometimes incorrect.

I agree with both points, but I disagree that our current limited API ought to be that flexible; and I agree with the OP that a panic! is not great.

Thus, to me, it looks like two things ought to happen:

  • our current APIs ought to return NotPresent or something along those line for invalid keys and whatnot;

  • a more precise / fine-grained API could be added: try_{set_}var, with that outer layer of Result wrapping that @Nemo157 mentioned (or an outer layer of Error wrapping).

4 Likes

Is anyone here suggesting that it can't be a platform-specific check? Unix certainly doesn't support = or \0. Windows probably has other rules too.

2 Likes

Windows is almost the same except the first character of the key name can be =. However, this is very strongly discouraged (indeed the docs make no mention of it) and it's only used by cmd.exe as part of a legacy hack to implement per-drive current directories. E.g. =C:=C:\Users\Me is parsed as ["=C:", "C:\Users\Me"].

So it's technically an implementation detail to support the legacy command-line interpreter, albeit one that's unlikely to change due to Windows' strong userspace guarantees.

The other bit of Windows weirdness is that the key name HELLO is equal to hello. Or to be more specific, Windows environment keys are case insensitive but case preserving. However, I do not think this affects the topic at hand.

The original request in this thread was to make env::var(_os) not panic. Can we agree that no matter what the rules for valid environment variable names actually are, env::var(_os) should return NotPresent instead of panicking when its argument is an invalid name? I haven't seen anyone arguing against this idea here.

13 Likes

It seems like everyone here agrees? So is the next step to make a PR and ask the libs team to sign off on it?

2 Likes

I'm working on drawing up that PR, since I had already claimed the relevant issue.

Edit: Currently, I'm only dealing with var and var_os. It seems like there's the most consensus for those, and we haven't fully worked out what to do with the others yet. Please let me know if this is a problem.

I discovered a similar issue with std::process::Command env handling:

This might be fixed along with this in a similar way (that would be least surprising for the user)

A bit late now, in this thread and for the library design as well. But wouldn't the correct/rustic approach to have a distinct type for the key (and possibly for the value too) and then provide some conversion/coercion from/into OsString and String?

Not necessarily. There are a few reasons, but the main one is that there's no clear reason why a key that couldn't exist should be treated differently from a key that simply doesn't exist.

Another use of an EnvKey type would be for comparing keys for equality on Windows.

When you really consider doing a makeover to get this straight, then I'd suggest to copy over all 'char **environ;' block access/construction functionality from std::process:: to here, fix it/give it a sane API and let later let std::process use std::env for that. That's far better than having pretty much the same thing in 2 places.

Gosh, wouldn't that be nice? Only it would break FFI calls to libraries that use the C getenv (or access environ directly). std::env::* need to continue maintaining the environment block in the legacy format.

1 Like

I meant copy over the functionality (and code perhaps), not maintaining an isolated environ block in rust (except for process construction purposes, see std::proccess). Basically the same legacy stuff as usual/C just with a sane foolproof API. It should not break C code that uses getenv/setenv/putenv and whatever else is there. (EDIT: env vars in windows are stored in the registry? what limitation does that put on the value side, isn't binary including nul bytes allowed there?)

Given the thread unsafety of these calls, is Rust actually locked into using the C calls in case there's some mutex under the hood serializing access to environ for FFI purposes? Having Rust twiddle with environ while C is doing the same in another thread gives me the heebie jeebies :slight_smile: .

I think there's an open issues about set_var being thread unsafe on Linux (iirc, glibc in particular expects environment variables not to change while it's doing anything).

I found this issue which seems related (getaddrinfo can call getenv and the solution being mentioned on the glibc side of "once there are multiple threads, don't mutate the environment"). There's probably a deep rabbit hole there for anyone curious.

1 Like

I happen to have been down this particular rabbithole before :wink:

The fundamental problem here is the environ variable (and, equivalently, the third argument to C main), which are directly accessible and modifiable by any program, without any obligation to take a lock — in fact, there's no lock that they can take. Moreover, getenv doesn't copy the string it returns, so any locking that getenv/setenv/putenv/unsetenv do internally is useless to code that needs the value of an environment variable to not change while it's halfway through parsing it.

Because of this, glibc in particular throws up its hands, declares that it's not safe to modify the environment once there are multiple threads, and doesn't bother doing any locking in getenv. There is internal locking in setenv/putenv/unsetenv but it only protects the environ array itself from corruption.

I don't think it's feasible to fix this in the C library — we could expose the internal lock, but it would still be advisory and it would still protect only the actual lookup, not the value of the variable. std::env does take a copy, but in an FFI context that only narrows the race window, it doesn't eliminate it altogether.

1 Like

Previous discussion of setenv/getenv thread safety:

1 Like