Which std functions are available/work without calling main()

jsgf · June 24, 2019, 11:34pm

Given some code which is being called as an extension from some other language runtime, what library functions are expected to work as documented? That is, main is in some other language, and the first call to Rust is well after the initial process startup.

Specifically, it looks like std::env::args returns nothing if Rust’s main hasn’t been called. Is that expected? Is there any documentation to that effect?

What other functions change behaviour when used this way?

cuviper · June 24, 2019, 11:51pm

I would look to rust’s startup code to see things which happen before main().

Besides the arguments, the stack guard setup is another thing you miss. At least on Linux you still have the OS guard page, but an overflow will show up as a raw SIGSEGV. You’ll also miss the Thread init, which mostly just means that std::thread::current().name() will be None. I think both of these are also true of threads created outside of std::thread, e.g. directly with pthread_create.
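To make the thread-name point concrete, here is a minimal sketch. The helper names are made up for illustration, and it only exercises threads created through std::thread, not foreign threads:

```rust
use std::thread;

/// Name seen inside a thread started with plain `thread::spawn`
/// (no name was requested, so this is expected to be `None`).
fn unnamed_thread_name() -> Option<String> {
    thread::spawn(|| thread::current().name().map(str::to_owned))
        .join()
        .expect("thread panicked")
}

/// Name seen inside a thread started through `thread::Builder`
/// with an explicit name.
fn named_thread_name(name: &str) -> Option<String> {
    thread::Builder::new()
        .name(name.to_owned())
        .spawn(|| thread::current().name().map(str::to_owned))
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```

Code that treats name() as optional behaves the same whether or not Rust's own startup code ran.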

notriddle · July 8, 2019, 7:57pm

Windows doesn't actually use main() to gather its arguments, so std::env::args will work fine there.

Unfortunately, Linux literally provides your command-line parameters as arguments to your main function. I'm not sure if it's possible to get them any other way short of trawling /proc, which would be way slower than Windows's single syscall.

zackw · July 10, 2019, 1:10pm

Would it be reasonable to add public functions to initialize and tear down the standard library, for use when main is not written in Rust? The initializer could take the C argv as an argument.

bjorn3 · July 10, 2019, 1:46pm

If you read them through /proc, e.g.:

$ cat /proc/self/cmdline

the kernel reads the arguments from the position on the stack where they are normally placed.
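Doing the same from Rust could look roughly like this. It is only a sketch: the function names are hypothetical, it assumes Linux with /proc mounted, and it converts to String lossily for brevity, where real code would probably want OsString:

```rust
use std::fs;
use std::io;

/// Split the raw contents of /proc/<pid>/cmdline into arguments.
/// Arguments are separated (and normally terminated) by NUL bytes.
fn parse_cmdline(raw: &[u8]) -> Vec<String> {
    // Drop the trailing NUL terminator, if present, so it does not
    // produce a spurious empty argument at the end.
    let raw = raw.strip_suffix(b"\0").unwrap_or(raw);
    if raw.is_empty() {
        return Vec::new();
    }
    raw.split(|&b| b == 0)
        .map(|arg| String::from_utf8_lossy(arg).into_owned())
        .collect()
}

/// Read this process's arguments from procfs.
/// Fails with an I/O error if /proc is not available.
fn args_from_proc() -> io::Result<Vec<String>> {
    Ok(parse_cmdline(&fs::read("/proc/self/cmdline")?))
}
```

Keeping the parsing separate from the file read makes the fallible part (no /proc) easy to handle on its own.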

bill_myers · July 10, 2019, 5:20pm

It’s also possible to find argc/argv from the “_dl_argc”/“_dl_argv” symbols.

josh · July 10, 2019, 6:01pm

Rust’s standard library should not require “special” initialization in order to be happy with getting called as a library; down that path lie languages with runtimes.

The thread name doesn’t seem like a problem at all; even threads spawned by Rust don’t have names by default.

On some platforms we can potentially get arguments after the program starts, and on others we can’t; as long as we don’t crash on any platform, that seems fine. Having a function to set the arguments won’t work on all platforms (if you want the concept of changed arguments to apply to code not written in Rust). This doesn’t seem like a problem to me; if you want to handle arguments, you need to get them from the code implementing main(), even if written in another language. Perhaps we should have a convenient way of constructing an Args that way.
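As a rough illustration of taking the arguments from whatever implements main(): std has no such constructor for Args today, so the function below is a hypothetical stand-in that just copies a C-style argc/argv into a Vec<String> (lossily, for brevity):

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Copy a C-style `argc`/`argv` pair into owned Rust strings.
///
/// # Safety
/// `argv` must be null or point to at least `argc` pointers, each of which
/// is null or a valid NUL-terminated C string that outlives this call.
unsafe fn args_from_c(argc: c_int, argv: *const *const c_char) -> Vec<String> {
    let mut args = Vec::new();
    if argv.is_null() {
        return args;
    }
    // A negative argc is treated as zero arguments.
    for i in 0..argc.max(0) as usize {
        let ptr = unsafe { *argv.add(i) };
        if ptr.is_null() {
            break; // stop at an early NULL terminator
        }
        let arg = unsafe { CStr::from_ptr(ptr) };
        args.push(arg.to_string_lossy().into_owned());
    }
    args
}
```

The host's main() would call this once through an extern "C" wrapper and hand the result to whatever Rust code needs it.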

Regarding the stack guard initialization: as far as I can tell, that only exists to set up a signal handler to handle stack overflow and produce a friendlier error message. That’s a rather unexpected behavior, and programs that expect to handle SIGSEGV (or platform equivalent) themselves may find it surprising. I’d argue that we should document that better than we do, and ideally not do that by default. Could we have a compile-time option to select that behavior?

Finally, sys::init() itself does very little: on unix-like platforms it sets SIGPIPE to ignore (also not documented anywhere as far as I can tell), and on every other platform it does nothing.

So, overall, I don’t think there’s anything necessary in that initialization, and what is there we should either document or consider carefully removing or making optional.

josh · July 10, 2019, 6:04pm

_dl_argv is a glibc-private symbol that we shouldn’t use. And it can become inaccurate if your program changes where the argument area points to.

In an ideal world, I’d love to have a prctl to get the values settable by PR_SET_MM_ARG_START and PR_SET_MM_ARG_END. I don’t know of any means of doing so (other than reading /proc/self/cmdline, which depends on a mounted /proc, which we can’t count on either).

bill_myers · July 10, 2019, 6:28pm

I just found out that __attribute__((constructor)) functions get passed argc, argv and environ, so using that seems by far the best option.

josh · July 10, 2019, 7:26pm

Where do you see that documented? I don’t see that anywhere in the documentation of __attribute__((constructor)). (Using that also seems likely to produce surprising behavior in some use cases and with some toolchains/linking.)

jsgf · July 10, 2019, 7:48pm

The argv crate relies on static constructors being passed argc/argv/envp on Linux/macOS. This doesn’t seem to be specifically defined by an ABI; glibc does pass these explicitly, but other environments may not (discussion: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223752).

josh · July 10, 2019, 8:02pm

I just posted https://github.com/rust-lang/rust/issues/62569 about Rust ignoring SIGPIPE on startup.

system · October 8, 2019, 8:03pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
