Given some code which is being called as an extension from some other language runtime, what library functions are expected to work as documented? That is, main is in some other language, and the first call to Rust is well after the initial process startup.
Specifically, it looks like std::env::args returns nothing if Rust’s main hasn’t been called. Is that expected? Is there any documentation to that effect?
What other functions change behaviour when used this way?
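For anyone who wants to check this on their own platform, here is a minimal sketch of an entry point that reports how many arguments Rust can see. The function name rust_visible_arg_count is hypothetical, and to call it from another language you would also need to add the symbol-export attribute that matches your edition (#[no_mangle], or #[unsafe(no_mangle)] on newer editions) and build as a cdylib or staticlib.

```rust
use std::env;

/// Hypothetical entry point for a foreign host to call.
/// Returns the number of command-line arguments visible to Rust's std.
/// When `main` is not written in Rust, this may be 0 on some platforms.
pub extern "C" fn rust_visible_arg_count() -> usize {
    // `args_os` avoids the panic that `args` raises on non-Unicode arguments.
    env::args_os().count()
}
```

Calling this from the host after startup and comparing the result with the host's own argc should show whether std captured the arguments on that platform.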
I would look to Rust’s startup code to see what happens before main().
So the stack guard is another – though at least on Linux, you’ll still have the OS guard, but it will look like a raw SIGSEGV on overflow. You’ll also miss the Thread init, which mostly just means that std::thread::current().name() will be None. I think both of these are also true of threads created outside of std::thread, e.g. with direct pthread_create.
Would it be reasonable to add public functions to initialize and tear down the standard library, for use when main is not written in Rust? The initializer could take the C argv as an argument.
Rust’s standard library should not require “special” initialization in order to be happy with getting called as a library; down that path lies languages with runtimes.
The thread name doesn’t seem like a problem at all; even threads spawned by Rust don’t have names by default.
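A small sketch of that point, using only std::thread: a thread started with thread::spawn reports no name, and you only get one by going through thread::Builder. The two helper function names are made up for illustration.

```rust
use std::thread;

/// Name seen from inside a thread created with plain `thread::spawn`.
fn unnamed_thread_name() -> Option<String> {
    thread::spawn(|| thread::current().name().map(str::to_owned))
        .join()
        .expect("thread panicked")
}

/// Name seen from inside a thread created through `thread::Builder`.
fn named_thread_name(name: &str) -> Option<String> {
    thread::Builder::new()
        .name(name.to_owned())
        .spawn(|| thread::current().name().map(str::to_owned))
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```

So code that reads thread::current().name() already has to cope with None, regardless of who created the thread.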
On some platforms we can potentially get arguments after the program starts, and on others we can’t; as long as we don’t crash on any platform, that seems fine. Having a function to set the arguments won’t work on all platforms (if you want the concept of changed arguments to apply to code not written in Rust). This doesn’t seem like a problem to me; if you want to handle arguments, you need to get them from the code implementing main(), even if written in another language. Perhaps we should have a convenient way of constructing an Args that way.
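As a rough sketch of what such a convenience could look like today, the host's main() can pass its argc/argv across the FFI boundary and the Rust side can copy them into owned strings. This is not a std API: args_from_c is a hypothetical helper, it yields a Vec<OsString> rather than a real Args, and it assumes a Unix-like target (it relies on std::os::unix::ffi::OsStringExt).

```rust
use std::ffi::{CStr, OsString};
use std::os::raw::{c_char, c_int};
use std::os::unix::ffi::OsStringExt;

/// Hypothetical helper (not part of std): copy a C-style argv into owned strings.
///
/// # Safety
/// `argv` must be null or point to at least `argc` pointers, each of which is
/// either null or a valid NUL-terminated string that outlives this call.
pub unsafe fn args_from_c(argc: c_int, argv: *const *const c_char) -> Vec<OsString> {
    let mut out = Vec::new();
    if argv.is_null() || argc <= 0 {
        return out;
    }
    for i in 0..argc as usize {
        // SAFETY: the caller guarantees `argv` has at least `argc` entries.
        let ptr = unsafe { *argv.add(i) };
        if ptr.is_null() {
            break; // argv is conventionally terminated by a null pointer
        }
        // SAFETY: the caller guarantees each non-null entry is NUL-terminated.
        let bytes = unsafe { CStr::from_ptr(ptr) }.to_bytes().to_vec();
        out.push(OsString::from_vec(bytes));
    }
    out
}
```

Copying the bytes means the result stays valid even if the host later rewrites or frees its argument area, at the cost of not observing such changes.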
Regarding the stack guard initialization: as far as I can tell, that only exists to set up a signal handler to handle stack overflow and produce a friendlier error message. That’s a rather unexpected behavior, and programs that expect to handle SIGSEGV (or platform equivalent) themselves may find it surprising. I’d argue that we should document that better than we do, and ideally not do that by default. Could we have a compile-time option to select that behavior?
Finally, sys::init() itself does very little: on Unix-like platforms it sets SIGPIPE to be ignored (also not documented anywhere as far as I can tell), and on every other platform it does nothing.
So, overall, I don’t think there’s anything necessary in that initialization, and what is there we should either document or consider carefully removing or making optional.
That’s a glibc-private symbol that we shouldn’t use. And it can become inaccurate if your program changes where the argument area points to.
In an ideal world, I’d love to have a prctl to get the values settable by PR_SET_MM_ARG_START and PR_SET_MM_ARG_END. I don’t know of any means of doing so (other than reading /proc/self/cmdline, which depends on a mounted /proc, which we can’t count on either).
Where do you see that documented? I don’t see that anywhere in the documentation of __attribute__((constructor)). (Using that also seems likely to produce surprising behavior in some use cases and with some toolchains/linking.)
The argv crate relies on static constructors being passed argc/argv/envp on Linux/macOS. That behavior doesn’t seem to be defined by any ABI; glibc does pass these explicitly, but other environments may not (discussion: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223752).