Session

SO_TIMESTAMPING: powering fleetwide RPC monitoring

Speakers

Willem de Bruijn

Label

Nuts and Bolts

Session Type

Talk

Contents

Description


Timestamping is key to debugging network stack latency.

With SO_TIMESTAMPING, bugs that are otherwise incorrectly assumed to be network issues can be attributed to the kernel. The interface can isolate transmission, reception and even scheduling as sources of latency. Capturing connection state along with timestamps further enables discovery of root causes such as the TCP receive window size. Capturing timestamps at more points, such as traffic shaping and NIC hardware, expands visibility to tough issues like incast.

SO_TIMESTAMPING has seen iterative development to enable fleetwide RPC monitoring. The Fathom monitoring system was recently presented at SIGCOMM. This talk complements that paper and begins where it ends, with a deep dive into the Linux kernel infrastructure that makes fleetwide continuous latency analysis and attribution possible.

API extensions include covering TCP bytestreams, capturing transport protocol state along with events (OPT_STATS), and supporting selective sampling (OPT_CMSG). This talk reviews the core SO_TIMESTAMPING API, discusses non-obvious extensions (MSG_EOR, SO_RCVLOWAT), summarizes gotchas from the field (OPT_ID_TCP), and explains how all this combines to enable robust continuous RPC monitoring. It touches on clock synchronization and precision. Finally, it compares this UAPI to dynamic tracing with uprobes, kprobes, tracepoints and BPF.
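
As a rough illustration of the core API reviewed in the talk, the sketch below requests software transmit timestamps at three points (packet scheduler entry, handoff to the driver, and TCP acknowledgment) on an existing socket. The helper names rpc_tstamp_flags() and enable_tstamps() are hypothetical and not taken from the talk or from Fathom; the flag selection is one plausible combination, not the configuration used in production. Only the SOF_TIMESTAMPING_* constants and the setsockopt() call are part of the Linux UAPI.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/* Hypothetical helper: one possible flag set for transmit-side monitoring. */
unsigned int rpc_tstamp_flags(void)
{
    return SOF_TIMESTAMPING_TX_SCHED      /* stamp before the packet scheduler */
         | SOF_TIMESTAMPING_TX_SOFTWARE   /* stamp when handed to the driver */
         | SOF_TIMESTAMPING_TX_ACK        /* stamp when the data is acked (TCP) */
         | SOF_TIMESTAMPING_SOFTWARE      /* report software timestamps */
         | SOF_TIMESTAMPING_OPT_ID        /* tag each timestamp with an identifier */
         | SOF_TIMESTAMPING_OPT_TSONLY;   /* return timestamps without the payload */
}

/*
 * Hypothetical helper: apply the flags to a socket.
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 * Assumption: for TCP the socket should already be connected, since the
 * kernel rejects OPT_ID on sockets in the closed or listening state.
 */
int enable_tstamps(int fd)
{
    unsigned int flags = rpc_tstamp_flags();

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

After enabling, the timestamps themselves would be read from the socket error queue with recvmsg() and MSG_ERRQUEUE, which this sketch leaves out.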