Unikernels in Rust

MirageOS is a library operating system that constructs unikernels for secure, high-performance network applications across a variety of cloud computing and mobile platforms. The MirageOS crew has been discussing the idea of using Rust as a language for a unikernel. The idea of this post is to bring together people interested in the idea and explore the real traction within both communities.


One reason I suggested this post for internals rather than users was that the “Write an OS in Rust” use-case still requires the use of nightly features.

Sorting out what we can do to help out people with these kinds of projects, and getting them to work on stable Rust, would be awesome.

I don’t see kernel dev being stable for quite a while. no_std still has a long way to go.

Otherwise, it wasn’t clear to me that kernels had any particularly novel requirements? inline asm and repr packed for dealing with hardware stuff, maybe? I was under the impression you kinda have to do everything from scratch, so there’s not a lot for us to provide other than making it as easy as possible to not have dependencies and to have precise control over certain operations.
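To make the repr packed point concrete, here is a minimal sketch. It assumes the 10-byte operand that the x86_64 lgdt instruction reads (a 16-bit limit directly followed by a 64-bit base); the type and function names are made up for illustration and do not come from any existing kernel.

```rust
use core::mem::size_of;

/// Hypothetical descriptor-table pointer: a 16-bit limit immediately
/// followed by a 64-bit base, with no padding in between.
#[repr(C, packed)]
pub struct TablePointer {
    pub limit: u16,
    pub base: u64,
}

/// The same fields with ordinary C layout, for comparison. The compiler
/// inserts padding after `limit` so that `base` is naturally aligned.
#[repr(C)]
pub struct PaddedPointer {
    pub limit: u16,
    pub base: u64,
}

/// Size of the packed layout in bytes (2 + 8).
pub fn packed_size() -> usize {
    size_of::<TablePointer>()
}

/// Size of the padded layout in bytes (target dependent, larger than packed).
pub fn padded_size() -> usize {
    size_of::<PaddedPointer>()
}
```

One thing to watch for: fields of a packed struct can be copied out by value, but taking a reference to them is rejected because the field may be misaligned.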

We were talking today about having the basic no_std case fully stable by 1.4; basically this just entails stabilizing libcore and the crate-level attribute, both of which are fairly straightforward.

libcore still needs some dependencies that may not make sense in an embedded or kernel context: libpthread, libc and libm. From what I see, libm can still be useful, unless num gets separated from libcore. Libpthread is less useful when you want to write your own scheduler. What is libc used for in core, importing malloc?

Basically, for very low level code, reimplementing basic blocks of the system is not a problem, that’s part of the deal. But what must be reimplemented is a bit fuzzy, at least for me.

In the code I’m writing with the MirageOS people, I just need the no_std, lang_items and core features.

@Geal, could you please elaborate on what you are doing with MirageOS? Maybe this will shed some light on the extent to which Rust can be used to develop low-level code not backed by OS libraries. Thanks.

Does it? The crate’s doc-comment says “It links to no upstream libraries, no system libraries, and no libc.”

I said “unikernels are cool, I want to do one in Rust” and went to the MirageOS people for advice. They proposed that I replace parts of Xen’s MiniOS (the bare minimum OS to run something above Xen) in Rust. That way, I don’t have to write everything from scratch to target Xen, and they could use the code right away.

MiniOS is quite small: it boots, starts some memory paging system, a scheduler, a console, and a bus to talk with the hypervisor. I am currently working on replacing the memory paging, so I’m mostly manipulating pointers.
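As a rough idea of what the paging work looks like, here is a sketch of splitting a virtual address into page-table indices. It assumes standard x86_64 4-level paging with 4 KiB pages (9 index bits per level, 12 offset bits). It is not MiniOS code, and all names are made up for illustration.

```rust
/// Indices into the four page-table levels plus the offset inside the page.
#[derive(Debug, PartialEq, Eq)]
pub struct PageIndices {
    pub l4: usize,
    pub l3: usize,
    pub l2: usize,
    pub l1: usize,
    pub offset: usize,
}

const PAGE_SHIFT: u32 = 12; // 4 KiB pages
const INDEX_BITS: u32 = 9; // 512 entries per table
const INDEX_MASK: u64 = (1 << INDEX_BITS) - 1;

/// Extracts the 9-bit index for `level`, where level 0 is the lowest table.
fn level_index(addr: u64, level: u32) -> usize {
    ((addr >> (PAGE_SHIFT + INDEX_BITS * level)) & INDEX_MASK) as usize
}

/// Splits a virtual address into its table indices and in-page offset.
pub fn split_virtual_address(addr: u64) -> PageIndices {
    PageIndices {
        l4: level_index(addr, 3),
        l3: level_index(addr, 2),
        l2: level_index(addr, 1),
        l1: level_index(addr, 0),
        offset: (addr & ((1 << PAGE_SHIFT) - 1)) as usize,
    }
}
```

This part is plain integer arithmetic and needs nothing beyond libcore; the pointer manipulation starts once those indices are used to walk the actual tables.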


The compiler indicates that those libs are still necessary. In fact, if I carelessly play with numbers, it will use core::num, which will directly require libm to link correctly.

On #rust-osdev, Tobba compiled a list of tips to work with low level Rust, which include using link time optimization to remove the code depending on those libraries.

https://github.com/rust-lang/rust/issues/27200 has a complete list of the symbols libcore depends on.

The expectation is that you’ll link against libgcc plus a small library defining a few other necessary functions like memcpy.
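For a sense of scale, one of those functions could be sketched in Rust itself. The name kernel_memcpy is hypothetical, chosen so the sketch does not clash with the libc symbol when tried on a hosted system; in a freestanding build it would be exported under the unmangled name memcpy.

```rust
/// Byte-wise copy with the same contract as C's `memcpy`.
///
/// # Safety
/// `src` must be valid for reads of `n` bytes, `dest` must be valid for
/// writes of `n` bytes, and the two ranges must not overlap.
pub unsafe extern "C" fn kernel_memcpy(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    let mut i = 0;
    while i < n {
        // SAFETY: the caller guarantees both ranges are valid for `n` bytes.
        unsafe { *dest.add(i) = *src.add(i) };
        i += 1;
    }
    // Like C's memcpy, return the destination pointer.
    dest
}
```

This is deliberately naive. A real implementation would copy word-sized chunks, and you have to make sure the optimizer does not turn the loop back into a call to memcpy itself.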

FYI, MirageOS already depends on openlibm. It is introduced in the mirage-xen-posix package, which fills in some gaps between Mini-OS and the OCaml runtime.

Note that apparently you can’t link against libgcc on x86_64, because it assumes a red zone exists, which isn’t the case for kernel code:

Sorry if I’m saying something extremely naive or uninformed, but I don’t understand much of this discussion about whether it is possible when there’s actually a bare-bones Rust kernel. I haven’t tried it yet, but apparently it boots on bare metal.

In addition to the points mentioned, I think there are some interesting unanswered questions about what lifetimes mean in the context of context-switching / scheduling code, and how they can or can’t be exploited to make such code safer. I expect similar issues as those we faced with scoped threads to pop up.

Also, there are a number of library considerations: e.g. custom allocators, more crates behind the facade (including moving hash{set,map} to libcollections).

There are also a couple proposed language features that are extra relevant with freestanding development:

  • &in, &out and &uninit are more essential when working with inline assembly, as passing things larger than a register by value is simply not an option.
  • GC integration might help MirageOS and HalVM in particular

@pablochacin The issue is that while such things can be done, the experience is less nice than it could be. I’d like to see freestanding Rust beat freestanding C as much as hosted Rust beats hosted C, but we just aren’t there yet.

@Ericson2314

"In addition to the points mentioned, I think there are some interesting unanswered questions about what lifetimes mean in the context of context-switching / scheduling code"

My understanding is that on a unikernel, by definition, there is neither context-switching nor scheduling as in traditional multi-application/process kernels.

"My understanding is that on a unikernel, by definition, there is neither context-switching nor scheduling as in traditional multi-application/process kernels."

Threads on Linux are implemented in terms of processes, so the same issues of context switching for the stack and scheduling arise as with processes. Are unikernels always single-threaded? It seems like it would be a tedious limitation.

No, but I was thinking more along the lines of this approach:

"In MirageOS, the OCaml compiler receives the source code for an entire kernel's worth of code and links it into a stand-alone native-code object file. It is linked against a minimal runtime that provides boot support and the garbage collector. There is no preemptive threading, and the kernel is event-driven via an I/O loop that polls Xen devices." - Unikernels: Rise of the Virtual Library Operating System

Does the context switching concern still apply here?

In a unikernel, you implement your own scheduler. You have to think in terms of cores instead of threads. On a single core, threads simulate several pieces of code running in parallel. This is something that can be implemented with green threading, coroutines, etc. This happens more or less the same way on a traditional OS, except that processes separate the memory.

Of course, it gets harder if you want to address multiple cores.
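To illustrate the single-core case, here is a minimal sketch of a cooperative round-robin scheduler. Each green thread is modelled as a closure that runs one step per turn and says whether it wants another turn. All names are made up, and the sketch uses std collections for brevity; a freestanding build would need its own allocator behind them.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

/// What a task reports after being given one turn on the core.
pub enum Step {
    Yield,
    Done,
}

/// A green thread modelled as a closure that is called once per turn.
pub type Task = Box<dyn FnMut() -> Step>;

/// Hypothetical single-core, round-robin cooperative scheduler.
pub struct Scheduler {
    ready: VecDeque<Task>,
}

impl Scheduler {
    pub fn new() -> Self {
        Scheduler { ready: VecDeque::new() }
    }

    /// Adds a task to the back of the ready queue.
    pub fn spawn(&mut self, task: Task) {
        self.ready.push_back(task);
    }

    /// Runs tasks in turn until all are done; returns the number of turns.
    pub fn run(&mut self) -> usize {
        let mut turns = 0;
        while let Some(mut task) = self.ready.pop_front() {
            turns += 1;
            // A task that yields goes to the back of the queue.
            if let Step::Yield = task() {
                self.ready.push_back(task);
            }
        }
        turns
    }
}

/// Builds a task that appends `name` to the shared log `count` times,
/// yielding to the scheduler after each entry.
pub fn counting_task(
    name: &'static str,
    count: usize,
    log: Rc<RefCell<Vec<&'static str>>>,
) -> Task {
    let mut remaining = count;
    Box::new(move || {
        if remaining == 0 {
            return Step::Done;
        }
        log.borrow_mut().push(name);
        remaining -= 1;
        if remaining == 0 { Step::Done } else { Step::Yield }
    })
}
```

Nothing here is preemptive: a task that never returns from its turn starves everything else, which is the trade-off of the event-driven model.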

Good point.

Yeah I meant context switching and scheduling with threads. Consider making scoped threads from coroutines. I have a library where threads pass their current continuation to the new thread on context switches along with other data.

Consider:

  1. Thread A switches to thread B, also passing a pointer to a stack variable
  2. Thread B switches back to thread A
  3. Thread A switches back to thread B
  4. Thread B uses the pointer

This is unsafe because the relevant stack frame in A could no longer be valid. The way to fix this is to ensure that the pointer may not outlive the continuation (which is consumed when switching back).

I think existential lifetimes (an idea I came up with elsewhere) would accomplish this. Intuitively, there is a lifetime bound between the continuation and the pointer. But since the lifetime depends on A’s stack, there is no good “name” for it from B’s perspective, nor is it related to B’s other lifetimes. The thing to do is introduce a fresh lifetime (the “existential lifetime”) and say that the context outlives it but the pointer doesn’t.
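Something close to this can be approximated with today's borrow checker. The following is a sketch under stated assumptions, not the library mentioned above: a higher-ranked closure bound stands in for the fresh lifetime, and the pointer is only reachable by borrowing from the continuation, so consuming the continuation ends the borrow. No real context switch happens here, and all names are hypothetical.

```rust
/// Hypothetical handle to thread A's suspended continuation, carrying the
/// pointer into A's stack that was passed along with the switch.
pub struct Continuation<'k, T> {
    ptr: &'k T,
}

impl<'k, T> Continuation<'k, T> {
    /// Borrows the passed pointer. The returned reference is tied to
    /// `&self`, so it cannot outlive the continuation.
    pub fn stack_var(&self) -> &T {
        self.ptr
    }

    /// Models "switch back to A": the continuation is consumed, which ends
    /// every borrow obtained through `stack_var`.
    pub fn resume(self) {}
}

/// Models "A switches to B and passes a pointer to a stack variable".
/// `for<'k>` means B must work for every possible `'k`, so from B's side
/// the lifetime is fresh and unrelated to any of B's own lifetimes.
pub fn switch_to_b<T, R>(
    stack_var: &T,
    thread_b: impl for<'k> FnOnce(Continuation<'k, T>) -> R,
) -> R {
    thread_b(Continuation { ptr: stack_var })
}

// Inside `thread_b`, the problematic sequence from the list above is
// rejected at compile time:
//
//     let p = cont.stack_var();
//     cont.resume();   // cannot move `cont` while `p` still borrows it
//     use_value(p);    // `use_value` is a placeholder for any later use
```

Usage would look like `switch_to_b(&x, |cont| { let v = *cont.stack_var(); cont.resume(); v })`: copying the value out before resuming is fine, holding the pointer across the resume is not.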