Session

kernel offload with complete host kernel functionalities

Speakers

Ryo Nakamura
Hajime Tazaki

Label

Moonshot

Session Type

Talk

Contents

Description

Offloading to external devices is attractive not only because it accelerates a particular function, but also because it frees host computing resources, which can then be allocated to additional computation by programs. However, there is a serious concern that offloaded functions or programs do not provide a feature set identical to the host's, resulting in incompatibility with host programs. One example of this concern arose with implementations of the TCP offload engine (ToE) on NICs, which was extensively discussed in the community*1.

The kernel offload, which we propose in this talk, addresses this concern by forwarding syscalls to an external device that runs a Linux kernel instance, preserving compatibility with the host kernel. The kernel offload uses a system call hook to intercept syscalls in the original userspace programs and forwards them to the remote Linux kernel, which serves as a syscall proxy. Thus all the work that the original kernel would perform after the syscall handler is outsourced to the external device. We implemented this as a lightweight userspace program so that it can run on a recent SmartNIC device (we used Bluefield2 for the prototype). The userspace program embeds the Linux kernel in the form of the Linux Kernel Library (LKL). A program running on the host system can remotely invoke syscalls inside the LKL instance of the userspace program running on a remote host (the SmartNIC). Furthermore, we have developed optimized packet-handling code paths in LKL to improve offloaded performance.
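The interception-and-forwarding path described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the authors' protocol or code: the wire format, the names `encode_request`, `decode_request`, `SyscallProxy`, and `hooked_write`, and the in-process call that stands in for the host-to-SmartNIC channel. The proxy class only mimics the role of the remote LKL instance with an in-memory table of handlers.

```python
import struct

# Hypothetical wire format (an assumption, not the authors' protocol):
# syscall number (u16), argument count (u8), pad byte, payload length (u32),
# followed by the arguments as signed 64-bit integers, then the payload bytes.
HEADER = struct.Struct("!HBxI")
ARG = struct.Struct("!q")

SYS_WRITE = 1   # write(2) on x86-64 Linux
ENOSYS = 38     # "function not implemented" on Linux


def encode_request(nr, args, payload=b""):
    """Marshal one intercepted syscall into a request message."""
    if len(args) > 6:
        raise ValueError("Linux syscalls take at most six arguments")
    body = b"".join(ARG.pack(a) for a in args)
    return HEADER.pack(nr, len(args), len(payload)) + body + payload


def decode_request(buf):
    """Unmarshal a request message on the proxy side."""
    nr, nargs, payload_len = HEADER.unpack_from(buf, 0)
    offset = HEADER.size
    args = [ARG.unpack_from(buf, offset + i * ARG.size)[0] for i in range(nargs)]
    offset += nargs * ARG.size
    return nr, args, buf[offset:offset + payload_len]


class SyscallProxy:
    """Stand-in for the remote kernel instance (hypothetical)."""

    def __init__(self):
        self.files = {}  # fd -> bytes written so far
        self.handlers = {SYS_WRITE: self.sys_write}

    def sys_write(self, args, payload):
        fd, count = args
        data = payload[:count]
        self.files.setdefault(fd, bytearray()).extend(data)
        return len(data)

    def handle(self, request):
        """Dispatch one request and return the syscall result as a reply."""
        nr, args, payload = decode_request(request)
        handler = self.handlers.get(nr)
        result = handler(args, payload) if handler else -ENOSYS
        return ARG.pack(result)


def hooked_write(proxy, fd, data):
    """Host-side hook: forward write() to the proxy instead of the host kernel.

    A real transport (for example a channel to the SmartNIC) is replaced
    here by a direct method call.
    """
    reply = proxy.handle(encode_request(SYS_WRITE, [fd, len(data)], data))
    return ARG.unpack(reply)[0]
```

In this sketch, a call such as `hooked_write(SyscallProxy(), 3, b"hello")` takes the place of a local `write(3, "hello", 5)`: the host side only marshals the call, and all handling happens in the proxy. Unknown syscall numbers are answered with `-ENOSYS`, mirroring the usual kernel convention of returning negative error codes.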

In this talk, we will present the design of our initial implementation of the kernel offload and the results of benchmarks with several workloads: 1) a typical TCP transmission flow with existing offloading features (TSO/LRO, checksum), 2) avoiding data copying between host and NIC with the sendfile syscall, and 3) kTLS offload with the sendfile syscall (not yet optimized, as there is no hardware crypto offloading) over existing implementations (nginx and openssl). We will also briefly discuss room for future improvements, including further offload (crypto, larger segmentation size for TCP, our RDMA channel implementation, the choice of passthrough method, etc.), which we cannot cover at this moment.

*1: https://wiki.linuxfoundation.org/networking/toe