Precursor

Mobile, Open Hardware, RISC-V System-on-Chip (SoC) Development Kit

Jun 28, 2024

Project update 39 of 39

Xous 0.9.16: Harmonization and Encrypted Swap

by bunnie

It’s been a few months since our last release. A lot has been happening in Xous-land, so it’s been hard to pick a good time to draw a line in the sand to tag out a release. Now seemed as good a time as any. Read on for the details!

Xous Updates

Updates to precursorupdater

Please run python3 -m pip install --upgrade precursorupdater before updating to this latest release.

The kernel is now signed using ed25519ph, closing a TOCTOU (time-of-check to time-of-use) gap between when the loader verifies the filesystem and when the kernel is run. precursorupdater needs an update to recognize the new v2 kernel signatures. See issue #472 for a full discussion of the background; this fix also resolves an issue where portions of memory were mapped between two processes.

As a side effect, environment variables are now a feature of Xous. They are stashed just beyond the top of stack by the loader. They are mostly intended for test and debug tooling, where the loader is in the host environment, but this mechanism is also used to resolve the TOCTOU cited above by using an environment variable to pass the signature from the loader to the kernel.
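For readers who haven't run into a TOCTOU before, here is a toy illustration of the general pattern in Python. It is not the Xous loader: the `Storage` class, the `KERNEL_DIGEST` variable name, and the use of a bare SHA-512 digest in place of real ed25519ph verification are all assumptions made so the sketch runs on the standard library alone.

```python
import hashlib
import hmac


def digest(image: bytes) -> str:
    """Stand-in for signature verification: a SHA-512 digest of the image."""
    return hashlib.sha512(image).hexdigest()


class Storage:
    """Mutable storage that an attacker may rewrite between check and use."""

    def __init__(self, image: bytes):
        self.image = image


def racy_boot(storage, trusted_digest, tamper=None):
    # Time of check: verify what is in storage right now.
    if not hmac.compare_digest(digest(storage.image), trusted_digest):
        raise ValueError("verification failed")
    if tamper:
        tamper(storage)  # the window an attacker can exploit
    # Time of use: storage is read again, so the check no longer applies.
    return storage.image


def safe_boot(storage, trusted_digest, tamper=None):
    # The expected value is handed to the next stage, modeled here as an
    # environment-style dictionary (hypothetical variable name).
    env = {"KERNEL_DIGEST": trusted_digest}
    if tamper:
        tamper(storage)
    image = storage.image  # the exact bytes that will be run
    if not hmac.compare_digest(digest(image), env["KERNEL_DIGEST"]):
        raise ValueError("verification failed at time of use")
    return image
```

In the racy version a swapped image sails through because the check and the use look at storage at different times. In the second version the check is performed on the very bytes about to be run, so the same swap is caught.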

Encrypted Swap

This release introduces encrypted swap memory. Encrypted swap is a tidy solution for extending the physical security benefits of a single-chip, secure microcontroller to off-chip memory resources.

In this release of Xous, per-boot unique session keys are used to encrypt pages sent to off-chip memory using Authenticated Encryption, e.g., AES-GCM-SIV. The session keys are stored entirely within the physical perimeter of the secure microcontroller and thus we’re able to transfer that security to the off-chip storage (to the extent that the cryptography is sound, and our sidechannels are limited). Cryptography nerds can read more about the implementation details. I’d love to have more eyes on the scheme, even if you only aspire to be a cryptographer in your spare time.
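To make the idea concrete, here is a minimal sketch of swapping a page out and back in under a per-boot session key. Everything in it is an assumption for illustration, not the Xous scheme: Python's standard library has no AES-GCM-SIV, so the sketch substitutes a generic encrypt-then-MAC construction (an HMAC-SHA256 keystream plus an HMAC-SHA256 tag), and the choice to bind each page to a process ID, virtual address, and swap counter is hypothetical.

```python
import hashlib
import hmac
import os

PAGE_SIZE = 4096
# Per-boot session key: generated fresh each boot, never written off-chip.
SESSION_KEY = os.urandom(32)


def _subkey(label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from the session key."""
    return hmac.new(SESSION_KEY, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in cipher: HMAC-SHA256 in counter mode. NOT AES-GCM-SIV."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "little")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def _aad(pid: int, vaddr: int) -> bytes:
    """Fixed-width associated data (hypothetical): which page this is."""
    return pid.to_bytes(4, "little") + vaddr.to_bytes(8, "little")


def swap_out(page: bytes, pid: int, vaddr: int, count: int):
    """Encrypt and tag a page; `count` must never repeat within a boot."""
    nonce = count.to_bytes(12, "little")
    stream = _keystream(_subkey(b"enc"), nonce, len(page))
    ciphertext = bytes(a ^ b for a, b in zip(page, stream))
    tag = hmac.new(_subkey(b"mac"), _aad(pid, vaddr) + nonce + ciphertext,
                   hashlib.sha256).digest()
    return nonce, ciphertext, tag


def swap_in(nonce: bytes, ciphertext: bytes, tag: bytes, pid: int, vaddr: int):
    """Check the tag first; only decrypt pages that authenticate."""
    expected = hmac.new(_subkey(b"mac"), _aad(pid, vaddr) + nonce + ciphertext,
                        hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("swap page failed authentication")
    stream = _keystream(_subkey(b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

The property that matters is the last one: a page that was modified in off-chip memory, or replayed into a different process or address, fails the tag check and is never handed back to the process.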

We adopted virtual memory initially for reasons of process isolation, but now I’m doubly-convinced this was a good decision because we couldn’t have implemented encrypted swap without a paging mechanism. It’s a shame that MMUs (memory management units) are not a standard feature of modern microcontrollers. Forty years ago, an MMU was a whole extra chip in a computer, but today it’s a rounding error in the gate count of a modern SoC.

Unfortunately, the licensing model of ARM saddled the embedded world with stunted MPUs ("memory protection units") that are incapable of virtual memory. It turns out virtual memory is such a good idea, you can charge extra money for it! Fortunately, RISC-V allows us to break that mold and finally build a microcontroller capable of running Xous.

More specifically, encrypted swap targets a hypothetical future "Betrusted" device which has sufficient internal RAM and ROM to boot Xous and manage swap, allowing us to balance cost and security. Why even bother with off-chip RAM? Turns out on-chip RAM is a precious commodity: even a Zen 4 CPU in TSMC N5 has "only" 32 MiB of on-chip cache memory. Thus, any practical self-contained security chip could hold just a fraction of that amount of RAM, especially assuming it is in an older (e.g., 22 nm) process and that the die size is also significantly smaller (since we need to hit a street price that’s less than 1% of a Zen 4 CPU).

Xous with encrypted swap can boot and run comfortably in a couple MiB of on-chip memory, which opens the door to using low-cost, commodity memory chips for encrypted backing storage. This allows us to re-use our existing code base with minimal refactoring in a wider range of next-gen hardware device scenarios without compromising on security. This also hopefully strikes a balance between growing an inclusive ecosystem that accommodates developers who care less about optimizing code size, without alienating the high-spec enthusiast developers.

The journey of implementing encrypted swap memory was a long but fulfilling one. You really don’t know an OS until you’ve refactored its paging mechanism. There is also a certain personal satisfaction I get out of knowing exactly what happens when things go OOM (out-of-memory) and in finally understanding all of the fancy tricks Xobs did to implement multithreading and processes.

In the end, encrypted swap was implemented in as Xous-ey a way as I could muster, pushing most of the code into an isolated userspace process and using a set of feature-isolated hooks in the kernel to mediate. As a result, Precursor devices today can run the exact same code base as future devices that feature swap.

The patch also has the benefit of splitting applications into a separately encrypted region of disk, laying the groundwork for breaking up our monolithic kernel distribution into smaller, more easily updateable chunks.

You can read more about the implementation details of encrypted swap in the Xous Book chapter I’ve written.

Cryptography API

This release upgrades most of the Rust cryptography API versions to the latest stable releases. While Xous eschews release chasing, a balance must be struck with developer ergonomics. Contributor @kotval started digging into the Herculean task of porting Signal into Xous and noted that all of our API versions are incompatible with the base versions of the Signal library. So, a deliberate decision was made to upgrade all of our cryptography APIs to the latest stable versions.

The result is a mixed bag of good and bad. On the good side, we have adopted a new method for incorporating hardware encryption into the libraries by layering it into a fork of the cryptography crate itself (as opposed to vendoring in the cryptographic library). Thus, acceleration for SHA-2 and Curve25519 went through a major refactor that will hopefully make it easier to upgrade in the future. You will also notice that we no longer have explicit engine-25519 or engine-sha512 servers in the codebase or OS image.

On the bad side, some of the API changes introduced by the new Rust cryptography libraries make it harder to tweak knobs on the hardware acceleration layer. However, the hardware acceleration has been fairly stable for a while now, so I feel pretty comfortable casting the acceleration primitives into a set of safe defaults.

Automated Code Formatting

Another highly visible change to the repository is the introduction of automated code formatting. Formatting with rustfmt and trailing whitespace removal is now mandatory for all Xous contributions; see #477 for a discussion of how we got there and why.

The diff between v0.9.15 and v0.9.16 is fairly substantial, and a lot of it is due to whitespace and/or word-wrapping changes that were incurred as part of the move to automated code formatting.

Now, the main branch is protected and all changes must go through a pull request and pass formatting checks before merging (I am told by real software developers that this is a good practice). This will hopefully make it easier to diff out patches in the future, as well as make it easier for third parties to submit patches.

The main quirk is that we use a custom Rust format definition, which requires formatting using the nightly toolchain. Rust formatting is extremely opinionated. Unfortunately, I am also extremely opinionated. And, for reasons I don’t grasp, the Rust maintainers have refused to promote some of the most useful formatting knobs from nightly into stable. The upshot is that contributors to Xous will need to configure their IDEs to use +nightly as part of the formatting automation, otherwise their patches will be rejected.
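If you want to catch formatting problems before opening a pull request, a local check along these lines may help. It is a sketch, not the project's actual CI: the function names are made up, and it assumes rustup has a nightly toolchain installed so that `cargo +nightly fmt` resolves.

```python
import subprocess
from pathlib import Path


def trailing_whitespace(text: str) -> list:
    """Return the 1-based line numbers that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip()
    ]


def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing whitespace from every line, keeping a final newline."""
    stripped = "\n".join(line.rstrip() for line in text.splitlines())
    return stripped + ("\n" if text.endswith("\n") else "")


def rustfmt_check(repo: Path) -> bool:
    """Run rustfmt from the nightly toolchain in check-only mode."""
    result = subprocess.run(
        ["cargo", "+nightly", "fmt", "--all", "--", "--check"],
        cwd=repo,
    )
    return result.returncode == 0
```

Wiring `rustfmt_check` and `trailing_whitespace` into a git pre-commit hook is one way to use this; configuring the IDE to format on save with +nightly gets to the same place.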

Xous used to be a strictly no-nightly project. It feels a little sad to finally lose ground to the nightly toolchain over something as inconsequential as formatting definitions, but it also feels silly to die on the no-nightly hill given the benefits of harmonized code formatting.

Other Updates and Enhancements

Software Supply Chain Blues

A major goal for Xous is to stabilize the OS and its supply chain. While I feel that the Xous project itself is making good progress toward stability, the software supply chain is not.

Every release I have a ritual where I diff a concatenation of every dependency’s build.rs files against the previous version’s build.rs, and then go patch-by-patch and try to understand what has changed.

The reason I do this is that build.rs is a collection of commands that get run on the developer’s computer with the developer’s shell privileges. This means that every Rust crate is effectively a voluntary RCE-exploit-in-waiting.
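The ritual itself is easy to approximate. The sketch below is not my actual tooling; it assumes the dependencies for each release have been unpacked into a directory tree (for example with `cargo vendor`), and the function names are hypothetical.

```python
import difflib
from pathlib import Path


def concat_build_scripts(vendor_dir: Path) -> list:
    """Concatenate every build.rs under vendor_dir, in a stable order."""
    lines = []
    for script in sorted(vendor_dir.rglob("build.rs")):
        # A header per file keeps the diff readable across crates.
        lines.append(f"==== {script.relative_to(vendor_dir).as_posix()} ====")
        lines.extend(
            script.read_text(encoding="utf-8", errors="replace").splitlines()
        )
    return lines


def diff_build_scripts(old_dir: Path, new_dir: Path) -> list:
    """Unified diff of the concatenated build scripts of two releases."""
    return list(
        difflib.unified_diff(
            concat_build_scripts(old_dir),
            concat_build_scripts(new_dir),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )
```

Printing the result with `print("\n".join(diff_build_scripts(old, new)))` gives one patch to read through. The reading is still the hard part; the script only makes sure nothing gets skipped.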

crossbeam is now doing a weird trick where it uses an include! to pull in source for a build script from a file that’s not even a file; it’s a symlink. Assuming the symlink is not corruptible, the included file is harmless. Crossbeam is deploying a pattern I’ve seen of maintainers wrapping repositories into uber-repositories that enable better re-use of common infrastructure of dependent crates, hence the symlinks.

However, the practice illuminates the size of the attack surface for build.rs. It is not sufficient to review just those files; now I have to look for patterns of inclusion and interpretation, and we also need to beware of non-obvious or obfuscated OS-dependent code that may redirect symlinks!
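A first pass at looking for inclusion patterns can be automated. The sketch below is a rough heuristic under stated assumptions, not a complete audit: the regular expression and function name are my own, it only catches include!, include_str!, and include_bytes! with a literal string path, and anything that builds its path with concat! or env! will slip past it.

```python
import re
from pathlib import Path

# Matches include!("..."), include_str!("..."), and include_bytes!("...")
# when the argument is a plain string literal.
INCLUDE_RE = re.compile(r'include(?:_str|_bytes)?!\s*\(\s*"([^"]+)"')


def audit_build_script(script: Path) -> list:
    """List (included path, is_symlink) pairs for one build script."""
    findings = []
    text = script.read_text(encoding="utf-8", errors="replace")
    for match in INCLUDE_RE.finditer(text):
        # Rust resolves these paths relative to the file using the macro.
        target = script.parent / match.group(1)
        findings.append((match.group(1), target.is_symlink()))
    return findings
```

Any pair that comes back with `True` is a file that needs to be followed and read by hand, along with whatever the symlink actually points to on the platform doing the build.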

The Rust project itself has made great strides toward stabilizing the language, which calls into question the wisdom of chasing the latest Rust releases. I’m slightly unhappy that Rust 1.76 had breaking changes for us, and 1.78 is deprecating compatibility with older versions of the toolchain. It is currently a warning in 1.78, but .cargo/config will be required to be named .cargo/config.toml for future versions. The stated work-around for backward compatibility is to symlink config.toml to config, but notably, this is not a solution that works for Windows targets, as symlinks don’t work the same way on Windows as they do on POSIX.
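One possible alternative for platforms where symlinks are awkward is to copy the file instead of linking it. This is a sketch of that idea, not an officially recommended work-around: the function name and return strings are hypothetical, and the obvious cost is that two copies of the configuration can drift apart unless the copy is redone whenever the original changes.

```python
import shutil
from pathlib import Path


def migrate_cargo_config(project: Path) -> str:
    """Create .cargo/config.toml as a copy of the legacy .cargo/config."""
    legacy = project / ".cargo" / "config"
    modern = project / ".cargo" / "config.toml"
    if modern.exists():
        return "already-migrated"
    if not legacy.is_file():
        return "nothing-to-do"
    # A plain copy behaves the same on Windows and POSIX, unlike a symlink.
    shutil.copyfile(legacy, modern)
    return "copied"
```

Calling `migrate_cargo_config(Path("."))` from the repository root is the intended usage; the legacy file is left in place so that older toolchains keep working.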

The Rust project used to care about Windows as a target, so this work-around feels like a bit of a middle finger to Windows users. Lately, I have been feeling like Rust (and llvm) is giving the middle finger to everything that’s not POSIX x86_64 running in a FAANG-scale cloud environment; they don’t worry about software supply chain security because they are the software supply chain, and of course they trust their own tools. I suppose they are entitled to do that, given who funds their payroll, but it’s not a good omen for projects like ours.

For now, we’ll continue to surf the Rust release train, but I’m definitely eyeing the roadmap for a good point to hop off and lock things down into a fully self-contained, reproducible build system just in case things go further off the rails in the Rust project.

And with that, I’m back to hacking. If anyone has been following the Xous commit logs, you’ll notice a lot of activity around a "cramium" target. It’s still not mature enough to make any formal announcements, but it’s related to this blog post and to the encrypted swap effort. I’m hoping by early next year, I’ll be able to share more details about the work in progress, but for now it’s consuming the majority of my Xous development cycles.

Happy hacking!

