Change `lang_start` function signature

Hello everyone, I would like to propose changing the function signature of the lang_start function from:

fn lang_start<T: crate::process::Termination + 'static>(main: fn() -> T, argc: isize, argv: *const *const u8) -> isize;

To

fn lang_start<T: crate::process::Termination + 'static>(main: fn() -> T, argc: isize, argv: *const *const core::ffi::c_void) -> isize;

Rationale: I am trying to port Rust std to UEFI targets. During this endeavor, I wanted to allow users to use the normal main function rather than the no_main attribute. However, this posed a few problems. Firstly, the EFI entry point has a function signature completely different from C main or Rust lang_start:

type SystemHandle = *mut c_void;
type Status = u32;
fn(_: SystemHandle, _: *mut SystemTable) -> Status;

I came up with a bit of a hack to make everything play nicely for now. The hack basically looks like this:

extern "C" {
    fn main(argc: isize, argv: *const *const i8) -> isize;
}

fn efi_main(sh: SystemHandle, st: *mut SystemTable) -> Status {
    // Do not run. The actual code is a bit different. This is just to give a general idea of what I am doing.
    // Keep the array in a local so the pointer passed to main stays valid for the call.
    let args = [sh as *const i8, st as *const i8];
    unsafe { main(2, args.as_ptr()) as Status }
}

As most people will be aware, rustc automatically generates the C main for us, so I am just hooking into it. It is a bit of a roundabout way, but I wasn't sure how I could avoid the trip through C (or if I even should), so I have left it for now.

As you can see, though, while this works (and I can access the SystemTable pointer from sys::init()), it would make much more sense to cast to c_void pointers instead of i8 pointers.

Since Rust targets a lot of embedded systems and environments where arguments aren't or can't be passed, wouldn't it make more sense to use a c_void pointer in the signature instead? Functionally, it shouldn't change a whole lot, since a manual cast will be needed in init or any other function where the platform does its initialization. However, having c_void will make it clearer that the expected arguments can be things other than CLI arguments.

There is an open issue that might allow avoiding the whole trip through C that I am currently taking, so having a void signature would make even more sense then.

Is there actually a meaningful difference here? Note that c_void is marked repr(u8), so I suspect it would end up being the same thing in the end.
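
To make the "same thing in the end" point concrete, here is a minimal sketch. The helper names `as_void_argv` and `same_layout` are made up for illustration and are not part of std. It only shows that the two pointer types share size and alignment and that going from one to the other is a plain cast:

```rust
use core::ffi::c_void;
use core::mem::{align_of, size_of};

/// Hypothetical helper: reinterpret a C-style argv as a list of opaque pointers.
/// Only the static type changes; the address is untouched.
pub fn as_void_argv(argv: *const *const u8) -> *const *const c_void {
    argv.cast::<*const c_void>()
}

/// Both types are thin pointers, so size and alignment are identical.
pub fn same_layout() -> bool {
    size_of::<*const *const u8>() == size_of::<*const *const c_void>()
        && align_of::<*const *const u8>() == align_of::<*const *const c_void>()
}
```

So the choice between the two signatures should not affect the ABI; it only changes what the type communicates to the reader.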

Not to mention that LLVM is moving to opaque pointers (see "Opaque Pointers" in the LLVM 15.0.0git documentation), which might mean there's no difference at all in the signatures.

The difference isn't in the representation or performance, but rather what that type expresses here. The current signature is the same as C main. It basically expresses that args will be a list of ASCII strings.

Using c_void would make it more explicit that the pointers can point to basically anything and it is up to the platform to decide how to handle those args.

It seems to me that the signature of lang_start is platform-dependent. There is nothing in principle stopping the platform from providing an entry point that is entirely different, rather than one that merely has differently typed pointers. Perhaps this platform dependence should be made explicit?

Actually, it is not really platform-dependent. It is defined in the standard library here.

It is possible to provide a custom entry point in many ways. However, it's not really possible to hook into the normal Rust runtime setup (which takes place at lang_start_internal) from there.

Also, lang_start is not the program entry point, but rather the Rust entry point. An autogenerated "C" function runs before lang_start on all platforms that do not use the no_main attribute, as far as I know.

While it might be possible to generate a custom entry function (see here), it does not change the signature of lang_start.

I think you should store the SystemHandle and SystemTable in globals rather than overloading the arguments to main. If anything is passed as argv, it should be LoadOptions converted to UTF-8. LoadOptions contains the commandline arguments.
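
A minimal sketch of the globals approach, assuming atomics are acceptable for this. The `SystemTable` struct here is an empty stand-in, and `set_globals`, `image_handle` and `system_table` are hypothetical names, not existing std or UEFI APIs:

```rust
use core::ffi::c_void;
use core::ptr;
use core::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical stand-in for the real UEFI system table type.
#[repr(C)]
pub struct SystemTable {
    _private: [u8; 0],
}

// Null until the entry point has recorded the real values.
static IMAGE_HANDLE: AtomicPtr<c_void> = AtomicPtr::new(ptr::null_mut());
static SYSTEM_TABLE: AtomicPtr<SystemTable> = AtomicPtr::new(ptr::null_mut());

/// Record the values received by the EFI entry point. Meant to be called once, early.
pub fn set_globals(handle: *mut c_void, table: *mut SystemTable) {
    IMAGE_HANDLE.store(handle, Ordering::Release);
    SYSTEM_TABLE.store(table, Ordering::Release);
}

/// Returns None until a non-null handle has been stored.
pub fn image_handle() -> Option<*mut c_void> {
    let p = IMAGE_HANDLE.load(Ordering::Acquire);
    if p.is_null() { None } else { Some(p) }
}

/// Returns None until a non-null system table pointer has been stored.
pub fn system_table() -> Option<*mut SystemTable> {
    let p = SYSTEM_TABLE.load(Ordering::Acquire);
    if p.is_null() { None } else { Some(p) }
}
```

With something like this in place, argv no longer has to carry the handle and the table, and is free to hold the real command-line arguments.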

I was thinking of storing them as globals in the sys::init function, which is called before the Rust main.

The rationale was simply that I should keep the efi_main function as simple as possible so that I can remove it at some point and just autogen the correct function from rustc_codegen_ssa for uefi targets.

(just a tangential note: IIRC the argv array is null terminated, so argv[argc] == 0 and isn't UB on POSIX. If smuggling data through argv, it's probably best to maintain this null terminator.)
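
A small sketch of what keeping the terminator could look like if pointers are smuggled through argv. Both `pack_argv` and `count_until_null` are hypothetical helpers written for this example:

```rust
use core::ffi::c_void;
use core::ptr;

/// Hypothetical helper: pack two opaque pointers into an argv-style array
/// whose last slot is null, so that argv[argc] is null as C callers expect.
pub fn pack_argv(a: *const c_void, b: *const c_void) -> (isize, [*const c_void; 3]) {
    (2, [a, b, ptr::null()])
}

/// Count entries by walking the array until the null terminator.
///
/// # Safety
/// `argv` must point to a readable array of pointers that ends with a null pointer.
pub unsafe fn count_until_null(argv: *const *const c_void) -> isize {
    let mut n: isize = 0;
    // Each read stays in bounds because the caller guarantees a terminator.
    while !unsafe { *argv.offset(n) }.is_null() {
        n += 1;
    }
    n
}
```

Code that walks argv until it hits null, instead of trusting argc, then still stops at the right place.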

OK, thanks for the heads-up. Will do that.

Nit: u8 / C strings are not (just) ASCII.

lang_start should receive the actual argc and argv, which means the code that invokes it needs to retrieve the command-line arguments from UEFI in order to pass them.
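
As a rough sketch of that retrieval step, assuming the load options hold a UTF-16 command line (which is how a shell would typically pass them, but the buffer is not guaranteed to be text): `parse_load_options` is a hypothetical helper, it needs an allocator, and it splits on whitespace only, without any quoting rules.

```rust
/// Hypothetical helper: turn a UTF-16 command line into UTF-8 arguments.
/// Splits on whitespace only; quoting is deliberately not handled.
pub fn parse_load_options(units: &[u16]) -> Vec<String> {
    // Stop at the first NUL, if there is one.
    let end = units.iter().position(|&u| u == 0).unwrap_or(units.len());
    // Invalid UTF-16 is replaced with U+FFFD rather than rejected.
    String::from_utf16_lossy(&units[..end])
        .split_whitespace()
        .map(str::to_owned)
        .collect()
}
```

The resulting strings could then be turned into NUL-terminated byte strings and handed to lang_start as a conventional argc/argv pair.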

I agree with the suggestion of obtaining the system table and handle and putting them into globals.

Well, as I said, while I do agree that the SystemTable and Handle should be put into globals, the problem is at which exact stage/function to do so?

All the functions in the startup chain, barring maybe crt0, already have a predefined function signature. Hence, the only function that can have a different signature has to run before basically any other function in this chain (including the generated C main).