Relevant issue regarding the ecosystem of no_std crates: https://github.com/rust-lang/rust/issues/38509
This recent seL4 blog post may be of interest:
One thing that I only saw mentioned casually in this thread is the need for a write/flash/debug cycle. It'd be good if `cargo run` were expanded to be able to use various programming tools to program the device. Ideally this could be expanded further so that it had a debug mode of some kind which didn't just flash the device but brought up a debugging interface to use while running on the remote hardware.

I'm not asking for `cargo` to learn how to interface with all of the different hardware out there, but it'd be great if you could effectively `cargo install` a toolchain which would plug into this `cargo run` and `cargo debug` architecture for various devices. I don't know exactly how this would look, and I'd need to think about it a little bit more, but the premise would be that there would be a way to declare platforms and then how `run` and `debug` would work for those platforms. Ideally these would be able to use the existing crates.io infrastructure and be fetched automatically.
Sorry if this is thread necromancy.

Something that I didn't see in this thread is making `#[test]` useful on embedded targets. The test harness generated by `rustc --test` requires `std` support, which makes it basically useless in this context.

What I'd love to see is `rustc` refactored so that it builds a library for each function marked with `#[test]`; by default, the standard test harness would then be generated to call each of those library functions in turn and return the result. We could then replace this second step to run things in some other way that is meaningful on an embedded platform (probably QEMU-related). I'm not sure how this interacts with the stability rules, though.

Regardless of how it's implemented, I think this is a problem that needs solving. First-class testing support is one of the best things about the Rust environment, and until it works, embedded targets will always be second-class citizens.
> making #[test] useful on embedded targets

What do you have in mind? Because I see two approaches to testing Cortex-M code (just to narrow down the problem):

- You use QEMU. With QEMU, you can basically emulate a Cortex-M processor that has access to the host's Linux kernel (QEMU user emulation). With this you can, for example, write a Hello World program that uses the WRITE syscall to print to stdout (see my cortest experiment). Following this road, you could port a minimal `std` and `test` to directly interface with the Linux kernel (this is exactly what the steed project is doing) such that they work on a QEMU-lated Cortex-M processor, and then `#[test]` would just work for Cortex-M-specific crates.

  The advantage of this approach is that `#[test]` and `cargo test` would work for Cortex-M targets as they do for other targets. The disadvantage is that you can't test any code that interacts with the hardware of your microcontroller (i.e. stuff like UART, SPI, I2C, etc.) because QEMU only implements a limited version of such peripherals for a few specific microcontrollers.

  So this approach is basically just useful for catching codegen, struct layout, and maybe endianness bugs. After all, if your code doesn't interact with the microcontroller hardware, you may as well just test it (its logic) on the host (`cargo test --target $HOST`).
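To make the "pure logic tests work anywhere" point concrete, here is an invented example: a bitwise CRC-8 routine (polynomial 0x07, init 0x00) touches no peripherals, so the same `#[test]` passes on the host and under QEMU user emulation alike:

```rust
/// Bitwise CRC-8 (polynomial 0x07, init 0x00) -- pure logic, no hardware
/// access, so it is testable on any target that can run a test harness.
fn crc8(data: &[u8]) -> u8 {
    let mut crc: u8 = 0;
    for &byte in data {
        crc ^= byte;
        for _ in 0..8 {
            crc = if crc & 0x80 != 0 { (crc << 1) ^ 0x07 } else { crc << 1 };
        }
    }
    crc
}

#[test]
fn crc8_check_value() {
    // "123456789" is the conventional check input for CRC algorithms.
    assert_eq!(crc8(b"123456789"), 0xF4);
}
```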
- You build a Hardware-In-the-Loop (HIL) test framework. Basically, you reimplement `#[test]` (as a compiler plugin) such that code like this:

  ```rust
  #[test]
  fn foo() {
      // test some hardware stuff
  }

  #[test]
  fn bar() {
      // test more hardware stuff
  }
  ```

  compiles to a device-specific binary that, in conjunction with maybe some Cargo subcommand (e.g. `cargo hil-test`), flashes the binary, runs it, communicates between the device and the host, and reports the failures / results back to the host.

  The advantage of this approach is that you can actually test hardware stuff. The downside is that I doubt this can reuse much of the `test` crate code / `rustc --test` infrastructure, and that this will probably require some sort of shim / glue code for every device that the framework wants to support (each device has a different register map, number of peripherals, etc.).
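For concreteness, here is a rough sketch of what the device-side half of such a framework could look like (all names are invented). The reimplemented `#[test]` would collect the marked functions into a static table, and the harness would walk it, pushing one result per test over whatever link the device has (semihosting, UART, ...). A `String` stands in for the link here so the sketch runs on a host:

```rust
// Hypothetical sketch of a HIL harness's device-side test table.
struct TestCase {
    name: &'static str,
    run: fn() -> Result<(), &'static str>,
}

// Stand-ins for hardware tests; a compiler plugin would generate this
// table from the #[test] functions in the crate.
fn uart_loopback() -> Result<(), &'static str> { Ok(()) }
fn spi_transfer() -> Result<(), &'static str> { Err("MISO stuck low") }

static TESTS: &[TestCase] = &[
    TestCase { name: "uart_loopback", run: uart_loopback },
    TestCase { name: "spi_transfer", run: spi_transfer },
];

/// Run every test, writing one line per result to `report` (on real
/// hardware this would go out over the debug link); returns the number
/// of failures so the host side can set its exit code.
fn run_all(report: &mut String) -> usize {
    let mut failures = 0;
    for t in TESTS {
        match (t.run)() {
            Ok(()) => report.push_str(&format!("ok   {}\n", t.name)),
            Err(why) => {
                failures += 1;
                report.push_str(&format!("FAIL {}: {}\n", t.name, why));
            }
        }
    }
    failures
}
```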
> So this approach is basically just useful to test for codegen, struct layout and maybe endianness bugs. After all, if your code doesn't interact with the microcontroller hardware, you may as well just test it (its logic) on the host (`cargo test --target $HOST`).
I see what you're saying here, but I don't think it's really accurate. To give one specific example: let's say that I'm writing an RTOS for a Cortex-M and I need to test and debug my context-switching code. This definitely requires target-specific code, but as it only interacts with registers and memory that are emulated by QEMU, it's totally testable. Another example of useful tests to run in QEMU might be benchmarking when doing performance optimization. And even if it were true that this was only good for catching codegen bugs, that's still a pretty big win for the meta-problem of making it easier to bring Rust up on new platforms. So even if we did restrict testing of embedded platforms to just QEMU, I think it would be worth it.
But I feel like doing that would be missing the point a bit.
> Basically you reimplement `#[test]` (as a compiler plugin)
The reason that I like the idea of compiling "test libraries" for each test function so much is that it doesn't require this. It makes the testing story much more flexible to be able to link test functions into arbitrary test harnesses. I'm no expert in software testing, but I'd imagine that this would be useful even outside the embedded realm: maybe you want to hook your test functions up to a fuzzer, for example (I'm not sure this would work in the current scheme, since test functions can't take arguments, but it's just a thought).
Just to make sure that I am communicating effectively here, this is a sketch of what I imagine. `src/main.rs`:

```rust
#[test]
fn foo() {
    // test some stuff
}

#[test]
fn bar() {
    // test more stuff
}
```

After running `rustc --test src/main.rs`:

```
$ ls target/debug
...
libmycrate_test_foo.rlib
libmycrate_test_bar.rlib
...
$ objdump -t target/debug/libmycrate_test_foo.rlib
__horrible_mangled___TEST_ENTRY_POINT_foo
__horrible_mangled___other_function_required_by_foo
$ cargo test
$ ls target/debug/
...
mycrate_test-somehash
...
```

We move the test harness generation into `cargo test`; the workflow for 99+% of users is unchanged, but now we can write our own `cargo test-hil` much more easily, without having to do crazy things with compiler-internal crates.
> The downside is that I doubt this can reuse much of the `test` crate code / `rustc --test` infrastructure and that this probably will require some sort of shim / glue code for every device that the framework wants to support (each device has different register maps, number of peripherals, etc.)
This is kind of inevitable, to my mind, at least in the short term. If traits for (e.g.) I2C peripherals are standardized, as I think is kicked around higher in this thread, maybe we can do better someday, but for now I'll take 'works but needs a shim' over 'doesn't work unless you do unsavory things with librustc'.
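To sketch what standardized peripheral traits could buy here (the trait and names below are invented for illustration; the real embedded-hal crate defines similar abstractions): once drivers are generic over a bus trait, a test harness can exercise them against a mock instead of needing a shim per device.

```rust
// Invented sketch of a standardized I2C write trait.
trait I2cWrite {
    type Error;
    fn write(&mut self, addr: u8, bytes: &[u8]) -> Result<(), Self::Error>;
}

// A mock bus that just records what was written.
#[derive(Default)]
struct MockI2c {
    writes: Vec<(u8, Vec<u8>)>,
}

impl I2cWrite for MockI2c {
    type Error = ();
    fn write(&mut self, addr: u8, bytes: &[u8]) -> Result<(), ()> {
        self.writes.push((addr, bytes.to_vec()));
        Ok(())
    }
}

/// Example driver code that is generic over the bus implementation, so the
/// same code runs against real hardware or against `MockI2c` in a test.
fn set_register<B: I2cWrite>(bus: &mut B, addr: u8, reg: u8, val: u8) -> Result<(), B::Error> {
    bus.write(addr, &[reg, val])
}
```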
> This definitely requires target-specific code, but as it's only interacting with registers and memory that is emulated by QEMU, it's totally testable.
That is the third option: QEMU system emulation. The reason I didn't mention it above is that, IME, it's buggy. You can read / write to memory regions that don't map to any device memory and QEMU doesn't crash or raise a hardware exception. Also, my understanding is that their emulation of core peripherals like SysTick is not accurate and that's why other projects use forks of QEMU.
If your emulation is not accurate then testing on QEMU will give you a false sense of security e.g. tests pass but your program crashes on real hardware.
> Another example of useful tests to run in QEMU might be benchmarking when doing performance optimization.
I don't know if QEMU emulates the CPU pipeline at all but I doubt it emulates Flash / RAM access latencies for specific devices. I certainly would not benchmark code in an emulator if I have hardware at hand; I doubt the emulator measurements would be accurate.
> Just to make sure that I am communicating effectively here, this is a sketch of what I imagine:
But `rustc --test` is much simpler than that; it basically just adds a `main` function to your crate that looks like this (I'm ignoring command line argument parsing, synchronization of stdout printing, etc.):

```rust
use std::{panic, thread};

fn main() {
    const TESTS: &'static [fn()] = &[
        // collection of paths to unit tests, provided by `rustc`
    ];

    for test in TESTS {
        let handle = thread::spawn(move || {
            let res = panic::catch_unwind(|| {
                test();
            });
            // printing code here: report `res` as pass / fail
            let _ = res;
        });
        let _ = handle.join();
    }
}
```
What you have sketched there requires a more involved integration with Cargo and non-trivial changes to how `rustc --test` works. Landing such changes is going to require an RFC and a prototype showing that this system integrates nicely with Cargo subcommands, etc. And the prototype is going to require either re-implementing `#[test]` as an out-of-tree compiler plugin or maintaining forks of `rustc` and Cargo (the second option is less accessible to users during the prototype phase).
> This is kind of inevitable, to my mind, at least in the short term.
It is inevitable. The amount of device-specific code can be minimized, though, if you stick to core peripherals to implement the functionality that the framework needs: e.g. use SysTick for timing / timeouts, use semihosting / ITM to report results to the host, etc. The problem comes when the code under test wants to use SysTick or ITM itself; that's where you have to provide device-specific alternatives to implement the framework functionality: e.g. use USART1 for reporting results, etc.
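One way to keep that reporting path pluggable, as a rough sketch (names invented): route all framework output through the `fmt::Write` trait, so a semihosting sink, an ITM sink, or a USART1 sink are just different implementations, and only the chosen sink is device-specific. A plain `String` stands in for the link here so the sketch runs on a host:

```rust
// On a no_std target this would be `core::fmt::Write`; `String`
// implements the same trait, so it works as a stand-in sink here.
use core::fmt::Write;

/// Report one test result through whatever sink the device provides
/// (semihosting, ITM, or a USART the code under test is not using).
fn report_result<W: Write>(sink: &mut W, name: &str, passed: bool) {
    // Ignore write errors in this sketch; a real framework would
    // decide how to surface a broken reporting link.
    let _ = writeln!(sink, "{} ... {}", name, if passed { "ok" } else { "FAILED" });
}
```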
QEMU doesn't do any of this; it's not a performance simulator. Also, QEMU is a Quick EMUlator: it converts guest code into an IR which it then tries to optimize, using both host-agnostic methods and host-specific backends. That should make the measurements even less reliable.
Okay, fair enough. I don't actually know that much about how QEMU works under the hood - sounds like it wouldn't actually be very useful.
> But `rustc --test` is much simpler than that, it basically just adds a main function to your crate that looks like this:
But that requires `thread::spawn` and `panic::catch_unwind` to work, which immediately makes it unusable on `#[no_std]` targets.
Regarding simplicity: on further reflection, there's no particularly good reason to make multiple test libs. If we drop that feature, then the changes simplify to this:

- `#[test]` implies `cfg(test)` and also marks the function as a test function somehow (two ideas that come to mind are special mangling or a `.test` section in the output library)
- `rustc --test` effectively becomes `rustc --lib --cfg test`
- The test harness code generation you discuss above moves to `cargo test` instead (this is the biggest issue)
In terms of the final implementation, that actually seems simpler to me, and what complexity there is gets pushed further out of the language core (where it's easier to modify). The user experience is unchanged for most use cases, while strictly more things are possible without modifying `rustc`. But yes, there is some complexity involved in getting it adopted and writing it in the first place.

The objection I have to how `rustc --test` works now is that it bakes a particular implementation of testing into the language, rather than just the concept of testing. That doesn't seem like what we want.
I have created a custom `test` crate that supports `no_std` targets, along with two test runner implementations: one supports emulated Cortex-M processors (QEMU) and the other supports real Cortex-M microcontrollers. See utest for details.