We're excited to announce Unikraft v0.8.0 (Enceladus) and to show off many of the things the community has been working on over the last two months.
In this blog post, we highlight some of the new features available in Unikraft. A full list of features can be found on the releases page for Enceladus.
Pointer Authentication (PAuth) allows signing and authenticating pointers to harden the system against classes of attacks that rely on manipulating pointers, such as return-oriented programming (ROP). Pointer Authentication Codes (PACs) are created by signing a pointer and a 64-bit modifier with a 128-bit key. The modifier is a value that is normally used to restrict the PAC to a specific context (e.g., the SP). Keys are stored in registers.
The PAC is stored in the unused upper bits of the memory address and can later be verified to ensure that the pointer has not been tampered with.
The reference algorithm is the QARMA block cipher, but it is possible for architectures to use an IMPLEMENTATION DEFINED algorithm instead.
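The sign/authenticate round trip can be illustrated in plain C. The following is only a toy model: the 48-bit address width, the `toy_` names, and the mixing function are assumptions made for illustration, and the mixer is neither QARMA nor secure. It merely shows how a code derived from pointer, modifier, and key can live in the upper bits and be checked later.

```c
#include <stdint.h>

#define TOY_VA_BITS 48 /* assumed virtual address width */
#define TOY_VA_MASK ((UINT64_C(1) << TOY_VA_BITS) - 1)

/* Stand-in for the real cipher: a simple 64-bit mixer, NOT QARMA, NOT secure */
static uint64_t toy_mix(uint64_t x)
{
	x ^= x >> 33;
	x *= UINT64_C(0xff51afd7ed558ccd);
	x ^= x >> 33;
	x *= UINT64_C(0xc4ceb9fe1a85ec53);
	x ^= x >> 33;
	return x;
}

/* Derive a 16-bit code from the pointer, the modifier, and a 128-bit key */
static uint64_t toy_pac(uint64_t ptr, uint64_t mod,
			uint64_t key_hi, uint64_t key_lo)
{
	uint64_t h = toy_mix((ptr & TOY_VA_MASK) ^ key_lo);

	h = toy_mix(h ^ mod ^ key_hi);
	return h >> TOY_VA_BITS; /* keep the top 16 bits */
}

/* Place the code in the unused upper bits of the pointer */
uint64_t toy_sign(uint64_t ptr, uint64_t mod,
		  uint64_t key_hi, uint64_t key_lo)
{
	return (ptr & TOY_VA_MASK) |
	       (toy_pac(ptr, mod, key_hi, key_lo) << TOY_VA_BITS);
}

/* Return 1 and store the stripped pointer in *out if the code matches */
int toy_auth(uint64_t signed_ptr, uint64_t mod,
	     uint64_t key_hi, uint64_t key_lo, uint64_t *out)
{
	uint64_t ptr = signed_ptr & TOY_VA_MASK;

	if (toy_sign(ptr, mod, key_hi, key_lo) != signed_ptr)
		return 0;
	*out = ptr;
	return 1;
}
```

A pointer signed with one modifier authenticates only with the same modifier and key; flipping any bit of the stored code makes `toy_auth()` fail.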
A platform can initialize PAuth as:
#ifdef CONFIG_ARM64_FEAT_PAUTH
	if (ukplat_pauth_enable())
		UK_CRASH("Pointer Authentication is not available");
#endif
To generate PAuth keys, platforms must provide an implementation of the key generation function that uses an adequate source of randomness.
void ukplat_pauth_gen_key(__u64 *key_hi, __u64 *key_lo);
Platforms that support Armv8.5 can use the random number generation instructions introduced into the architecture (RNDR/RNDRRS).
Others can initialize drivers of an HWRNG/TRNG, or request randomness from the TEE.
Since the above functions are used for the initialization of PAuth, GCC must exclude them when generating PACIASP/RETAA sequences. Otherwise, once PAuth is enabled, authentication will fail upon return. A macro is provided to help override the global GCC settings per function:
#define __no_pauth __attribute__((target("branch-protection=none")))
This macro is used by pauth_enable(), and should be used by all functions generated by GCC in the boot chain, up to and including the caller of pauth_init().
In addition to the above, we decided against enabling backwards compatibility mode. If -march=armv8.3-a is not set, it is not possible to initialize the keys.
GCC exits with the following error:
bash$ ~/toolchains/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gcc -mbranch-protection=standard /tmp/test.c
/tmp/ccx7K1nD.s: Assembler messages:
/tmp/ccx7K1nD.s:17: Error: selected processor does not support system register name 'apiakeyhi_el1'
Clearly, Unikraft cannot make use of compatibility mode, except when linking with prebuilt libraries.
Because of this, this change sets -march=armv8.3-a when enabling CONFIG_ARM64_FEAT_PAUTH.
We believe that Arm's intention behind compatibility mode was mostly to provide backwards compatibility on userspace libraries.
GCC 7:
Introduces -msign-return-address=[none|non-leaf|all].
The parameters passed to -msign-return-address are interpreted as:
none: Do not sign return addresses.
non-leaf: Sign/auth the return address of non-leaf functions.
all: Sign/auth the return address of non-leaf and leaf functions.
GCC implements the Basic Set of PAuth instructions using the HINT instruction, in the so-called NOP space. This allows executing protected binaries on older platforms that do not implement Armv8.3-a. On these platforms, PAuth instructions execute as NOPs.
When -msign-return-address is enabled without setting -march=armv8.3-a, GCC generates PACIASP / AUTIASP sequences that sign and authenticate the LR upon function entry and exit.
In the following snippet notice the opcode of the HINT instruction (0xd5032300) that GCC generates for PACIASP and AUTIASP.
0000000000400564 <main>:
  400564: d503233f  paciasp
  400568: a9bf7bfd  stp x29, x30, [sp, #-16]!
  40056c: 910003fd  mov x29, sp
  400570: 90000000  adrp x0, 400000 <_init-0x3e8>
  400574: 9118c000  add x0, x0, #0x630
  400578: 97ffffb6  bl 400450 <puts@plt>
  40057c: 52800000  mov w0, #0x0
  400580: a8c17bfd  ldp x29, x30, [sp], #16
  400584: d50323bf  autiasp
  400588: d65f03c0  ret
When -msign-return-address is used along with -march=armv8.3-a, GCC generates PACIASP / RETAA sequences instead.
In the snippet below notice the opcode of RETAA (0xd65f0bff) which is in the fused space (instructions in the Combined Set do not have HINT implementations).
Clearly, this code cannot be executed on architectures earlier than Armv8.3-a.
0000000000400564 <main>:
  400564: d503233f  paciasp
  400568: a9bf7bfd  stp x29, x30, [sp, #-16]!
  40056c: 910003fd  mov x29, sp
  400570: 90000000  adrp x0, 400000 <_init-0x3e8>
  400574: 9118a000  add x0, x0, #0x628
  400578: 97ffffb6  bl 400450 <puts@plt>
  40057c: 52800000  mov w0, #0x0
  400580: a8c17bfd  ldp x29, x30, [sp], #16
  400584: d65f0bff  retaa
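Whether an opcode falls into the NOP space can be checked from its encoding alone. The sketch below relies on the A64 HINT encoding (0xD503201F with the 7-bit immediate in bits [11:5]); the function names are made up for this example and are not part of Unikraft.

```c
#include <stdint.h>

/* A64 HINT #imm7: fixed bits 0xD503201F, immediate in bits [11:5] */
#define A64_HINT_BASE UINT32_C(0xD503201F)
#define A64_HINT_MASK UINT32_C(0xFFFFF01F)

/* Nonzero if the opcode is a HINT and thus a NOP on older cores */
int a64_is_hint(uint32_t insn)
{
	return (insn & A64_HINT_MASK) == A64_HINT_BASE;
}

/* Extract the 7-bit hint number from a HINT opcode */
unsigned int a64_hint_imm(uint32_t insn)
{
	return (insn >> 5) & 0x7F;
}
```

Applied to the listings above, 0xd503233f (PACIASP) and 0xd50323bf (AUTIASP) decode as HINT #25 and HINT #29, whereas 0xd65f0bff (RETAA) is not a HINT and therefore has no NOP fallback.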
GCC 9:
Deprecates -msign-return-address.
Introduces -mbranch-protection=[none|pac-ret{+leaf}|bti|standard].
Introduces --enable-standard-branch-protection as a shorthand for -mbranch-protection=standard.
The parameters passed to -mbranch-protection are interpreted as:
none: Disables all protections.
pac-ret: Enables PAuth for function returns on non-leaf functions. The +leaf modifier enables protection for leaf functions.
bti: Enables Branch Target Identification.
standard: Enables all protections.
GCC 10:
Extends -mbranch-protection. This allows using the APIB_Key instead of the APIA_Key when signing return pointers.
Currently, Unikraft establishes a virtual address space using a fixed boot page table that maps the first 1 GiB of memory.
This change introduces an architecture-independent API for dynamically (un-)mapping virtual to physical addresses, setting virtual memory protections, and dynamically managing physical memory.
This allows Unikraft to access all available RAM in the system and enables the implementation of system calls such as mmap.
The frame allocator is a multi-zone buddy-based allocator that is optimized for power-of-two allocations, but can also handle arbitrary allocation sizes without wasting memory the way a conventional buddy allocator does.
In addition, allocations and frees do not have to match, which allows physical memory to be managed dynamically at page (4 KiB) granularity.
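To make the waste argument concrete, the sketch below compares a conventional buddy allocator, which rounds every request up to a power of two, with a split of the request into power-of-two blocks. This is an illustration only: the function names are hypothetical, and the split shown here is not claimed to be how Unikraft's frame allocator works internally.

```c
#include <stddef.h>

/* Smallest power of two that is >= n (for n >= 1) */
size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Frames a round-up buddy allocator reserves beyond the request */
size_t conventional_waste(size_t n)
{
	return round_up_pow2(n) - n;
}

/* Split n frames into power-of-two blocks, largest first.
 * Stores up to max_blocks sizes and returns how many were stored.
 */
size_t split_pow2(size_t n, size_t *blocks, size_t max_blocks)
{
	size_t count = 0;

	while (n > 0 && count < max_blocks) {
		size_t p = 1;

		while ((p << 1) <= n)
			p <<= 1;
		blocks[count++] = p;
		n -= p;
	}
	return count;
}
```

For a request of 10 frames, rounding up reserves 16 frames and wastes 6, while the split covers the request exactly with blocks of 8 and 2 frames.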
Establish a read-write mapping of 16 MiB (= 4096 pages) at the virtual address 0x10000000000:
struct uk_pagetable *pt = ukplat_pt_get_active();
ukplat_page_map(pt, 0x10000000000, __PADDR_ANY, 4096, PAGE_PROT_RW, 0);
The call implicitly allocates the physical memory (due to __PADDR_ANY) and maps it to the given virtual address.
The mapping will automatically use the largest page sizes possible according to the alignment, mapping size, available contiguous physical memory, and page sizes supported by the architecture.
On x86_64, for example, the call will create a mapping of 8 large pages (2 MiB each).
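The page-size selection can be approximated with a short calculation. The sketch below is a simplified model that assumes only 4 KiB and 2 MiB pages and looks only at alignment and mapping size; it ignores the availability of contiguous physical memory and larger page sizes. `plan_mapping()` and `struct map_plan` are hypothetical names, not part of the Unikraft API.

```c
#include <stdint.h>

#define SMALL_PAGE_SIZE UINT64_C(0x1000)   /* 4 KiB */
#define LARGE_PAGE_SIZE UINT64_C(0x200000) /* 2 MiB */
#define SMALL_PER_LARGE (LARGE_PAGE_SIZE / SMALL_PAGE_SIZE) /* 512 */

struct map_plan {
	uint64_t small_pages;
	uint64_t large_pages;
};

/* Estimate how a range of 4 KiB pages starting at vaddr splits up */
struct map_plan plan_mapping(uint64_t vaddr, uint64_t pages)
{
	struct map_plan plan = { 0, 0 };

	/* Small pages until vaddr reaches a 2 MiB boundary */
	while (pages > 0 && (vaddr % LARGE_PAGE_SIZE) != 0) {
		plan.small_pages++;
		vaddr += SMALL_PAGE_SIZE;
		pages--;
	}

	/* As many large pages as fit, then small pages for the tail */
	plan.large_pages = pages / SMALL_PER_LARGE;
	plan.small_pages += pages % SMALL_PER_LARGE;
	return plan;
}
```

For the example above (4096 pages at 0x10000000000, which is 2 MiB aligned), the model yields 8 large pages and no small pages. Shifting the start by one small page changes the result to 7 large pages and 512 small pages.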
Force a mapping of a particular page size:
ukplat_page_map(pt, 0x10000000000, __PADDR_ANY, 8, PAGE_PROT_RW, PAGE_FLAG_SIZE(PAGE_LARGE_LEVEL) | PAGE_FLAG_FORCE_SIZE);
Unmap a range of memory (here, ten 4 KiB pages):
ukplat_page_unmap(pt, 0x10000000000, 10, 0);
This will split the first large page created in the previous calls and unmap the first 10 small pages.
This will also free the frames unless PAGE_FLAG_KEEP_FRAMES is specified.
If page tables can be released during the operation, this is done transparently (unless PAGE_FLAG_KEEP_PTES is specified).
Note that it is safe to unmap physical address ranges that were not previously allocated by the underlying frame allocator (for example, kernel code and data), even if PAGE_FLAG_KEEP_FRAMES is not specified.
Mapping a specific physical address can be done by using the physical address instead of __PADDR_ANY.
However, this requires that the physical memory has been allocated manually by calling the frame allocator first:
__paddr_t paddr = 0x40000000; /* Physical memory at 1 GiB, __PADDR_ANY for any physical address */
pt->fa->falloc(pt->fa, &paddr, 10); /* paddr receives allocated address if __PADDR_ANY */
ukplat_page_map(pt, 0x10000000000, paddr, 10, PAGE_PROT_RW, 0);
If the physical memory allocation should be restricted to a certain range (e.g., for DMA), this can be done like so:
__paddr_t paddr;
pt->fa->falloc_from_range(pt->fa, &paddr, 10, 0, MIN, MAX);
Sometimes direct access to the page table entries (PTEs) is needed, for example to store custom information in user-available bits or to set advanced caching flags.
unsigned int level = PAGE_LEVEL; /* perform complete walk down to page level */
__vaddr_t pt_vaddr; /* OPTIONAL, receives virtual address where the page table is mapped that contains the PTE */
__pte_t pte; /* OPTIONAL, receives the PTE */

ukplat_pt_walk(pt, 0x10000000000, &level, &pt_vaddr, &pte);

/* Set some extended PTE flag */
pte |= MY_EXTENDED_FLAG;

/* Update the PTE in the page table */
ukarch_pte_write(pt_vaddr, level, PT_Lx_IDX(0x10000000000, level), pte);
ukarch_flush_tlb_entry(0x10000000000);
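The index passed to the PTE write selects the entry within the page table at the given level. As a rough illustration of what such an index computation looks like, the sketch below assumes x86_64 4-level paging with 4 KiB pages, 512 entries per table, and level 0 denoting the lowest (page) level. `x86_pt_index()` is a hypothetical helper and is not the definition of PT_Lx_IDX.

```c
#include <stdint.h>

#define X86_PAGE_SHIFT    12 /* 4 KiB pages */
#define X86_PT_LEVEL_BITS 9  /* 512 entries per table */

/* Index of the entry for vaddr in the page table at the given level
 * (assumption: level 0 is the page level, level 3 the top level)
 */
unsigned int x86_pt_index(uint64_t vaddr, unsigned int level)
{
	unsigned int shift = X86_PAGE_SHIFT + X86_PT_LEVEL_BITS * level;

	return (unsigned int)((vaddr >> shift) & 0x1FF);
}
```

Under these assumptions, the address 0x10000000000 used throughout this section selects entry 2 in the top-level table and entry 0 at every lower level.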
For more information on this feature, see uk/paging.h for all exposed methods.
Checksum offloading for TCP and UDP flows typically requires a partial checksum to be computed in software. This checksum covers only the pseudo header of the protocol. Given a start pointer and a pointer to the checksum field, a hardware network device is then able to complete the computation of the checksum.
This change adds support for this feature to lib/uknetdev.
Two new flags and two fields for pbufs are introduced to specify information needed by the driver to offload the computation for a packet.
Additionally, a new uknetdev driver state is introduced: UK_NETDEV_UNPROBED.
This state indicates that the network device's features have not yet been negotiated with the device.
During device initialization, a call to uk_netdev_probe() is now mandatory to move the device into the UK_NETDEV_UNCONFIGURED state.
For some devices (like virtio-net), this is required to find out whether the backend supports partial checksumming.
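The partial checksum mentioned above is the one's-complement sum over the IPv4 pseudo header. A minimal sketch of that computation follows; it takes addresses and length in host byte order, and the function names are chosen for this example only. How the value is stored in the packet and handed to uknetdev is not covered here.

```c
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator into 16 bits */
static uint16_t csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's-complement sum over the IPv4 pseudo header: source address,
 * destination address, protocol number, and TCP/UDP length.
 * The result is deliberately NOT inverted: the device continues the
 * sum over the TCP/UDP header and payload and complements at the end.
 */
uint16_t pseudo_hdr_csum(uint32_t src, uint32_t dst,
			 uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum += src >> 16;
	sum += src & 0xFFFF;
	sum += dst >> 16;
	sum += dst & 0xFFFF;
	sum += proto;
	sum += l4_len;
	return csum_fold(sum);
}
```

For a 20-byte TCP segment (protocol 6) from 192.168.0.1 to 192.168.0.2, the partial sum is 0x816E.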
uk_event trap interface
uk_event decouples the definition of an event from the definition of the callback routines that should be called whenever the event occurs.
This way a library (A) can provide an event and a library (B) can register handlers for this event without creating a dependency (A)->(B).
Handler registration is done at link time by collecting all handler functions for a particular event and putting their addresses into an event-specific function table.
Handlers can indicate that they have handled the event and, if desired, stop any further handlers from being called.
Handlers can also be assigned a priority class to enforce a certain execution order.
In the following, we define an event myevent in library (A).
We encapsulate additional event-specific information in struct myevent_data.
Library (B) includes the declaration of struct myevent_data and registers three handler functions, where handler1 and handler2 allow other handlers to be called afterwards and handler3 terminates handler invocation.
Since there are no guarantees about the order in which handlers of the same priority class are called, it is only guaranteed that handler1 will be the first of libB's handlers to be invoked.
However, handler2 may or may not be called, depending on whether handler3 is executed before it.
libA/include/uk/myevent.h:
struct myevent_data {
	int dummy;
};
libA/myevent.c:
#include <uk/event.h>
#include <uk/myevent.h>

UK_EVENT(myevent);

void some_function(void)
{
	struct myevent_data data = { .dummy = 42 };
	int rc;

	rc = uk_raise_event(myevent, &data);
	if (!rc)
		uk_pr_err("myevent not handled!\n");
}
libB/myhandler.c:
#include <uk/event.h>
#include <uk/myevent.h>

int handler1(void *arg)
{
	struct myevent_data *data = (struct myevent_data *)arg;
	...
	return UK_EVENT_HANDLED_CONT;
}

int handler2(void *arg)
{
	return UK_EVENT_HANDLED_CONT;
}

int handler3(void *arg)
{
	return UK_EVENT_HANDLED;
}

UK_EVENT_HANDLER_PRIO(handler1, UK_PRIO_EARLIEST);
UK_EVENT_HANDLER_PRIO(handler2, UK_PRIO_LATEST);
UK_EVENT_HANDLER_PRIO(handler3, UK_PRIO_LATEST);
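The dispatch semantics described in this section can be modelled without any linker support. The sketch below is a simplified stand-in: it walks a plain array of function pointers instead of a link-time generated table, and all `toy_`/`TOY_` names are hypothetical rather than part of uk_event.

```c
#include <stddef.h>

#define TOY_NOT_HANDLED  0
#define TOY_HANDLED_CONT 1 /* handled, keep calling further handlers */
#define TOY_HANDLED      2 /* handled, stop here */

typedef int (*toy_handler_t)(void *arg);

/* Call handlers in table order.
 * Returns nonzero if at least one handler handled the event.
 */
int toy_raise_event(toy_handler_t const *table, size_t n, void *arg)
{
	int handled = 0;

	for (size_t i = 0; i < n; i++) {
		int rc = table[i](arg);

		if (rc != TOY_NOT_HANDLED)
			handled = 1;
		if (rc == TOY_HANDLED)
			break; /* no further handlers are called */
	}
	return handled;
}

/* Example handlers that count how often they were invoked */
int toy_h1(void *arg) { (*(int *)arg)++; return TOY_HANDLED_CONT; }
int toy_h2(void *arg) { (*(int *)arg)++; return TOY_HANDLED; }
int toy_h3(void *arg) { (*(int *)arg)++; return TOY_HANDLED_CONT; }
```

With the table ordered as `toy_h1`, `toy_h2`, `toy_h3`, raising the event invokes only the first two handlers, because `toy_h2` terminates the invocation. This mirrors why handler2 above may or may not run, depending on where handler3 ends up in the table.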
Additional support for ISR-safe functions in the uklock and ukmpi libraries is provided by aliasing to inlined functions that can be called in an interrupt context, in uk/isr/mutex.h and uk/isr/semaphore.h.
For mbox, the change moves the functions that can be called in an interrupt context to a separate source file, mbox_isr.c, and moves some static inlined functions needed by both mbox and mbox_isr into the mbox.h header, along with the definition of struct uk_mbox.
Unikraft is part of Google Summer of Code 2022 (GSoC’22), a global online program focused on bringing new contributors into open source software development. We are welcoming fresh members into the Unikraft community to work on cool open source projects during paid summer internships.
The GSoC’22 application is open between April 4th and April 21st, 2022. Please see our guideline for application and join our Discord Server for detailed discussions (we have a channel dedicated to GSoC).
Feel free to ask questions, report issues, and meet new people.