Session

mq-cake: Scaling software rate limiting across CPU cores

Speakers

Jonas Köppeler
Toke Høiland-Jørgensen
Stefan Schmid

Label

Nuts and Bolts

Session Type

Talk

Description

Software rate limiting (such as that implemented in sch_tbf, sch_htb and sch_cake) relies on the global qdisc lock to synchronise state, and thus does not scale across CPU cores. This makes it challenging to rate limit at higher rates, since single-core performance has not kept up with network speeds. While there are workarounds for enforcing rate limits on individual traffic classes (such as splitting an HTB tree across TXQs), the kernel’s qdiscs currently offer no way to take advantage of multiple hardware queues while still enforcing a global rate limit on the interface.

In this work, we implement a multi-queue variant of sch_cake that can scale its rate limiting across hardware queues (and thus CPU cores). We do this by adding a small amount of shared state across multiple sch_cake instances installed under the mq qdisc. This allows most of the qdisc logic to run under separate per-TXQ qdisc locks, while still supporting a global rate limit for the whole interface. We perform an extensive performance evaluation and show that the implementation achieves close to perfect scaling across cores, while deviating from the configured rate by less than 0.2%.
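
To make the idea of per-queue shapers coordinating through a small piece of shared state more concrete, here is a minimal Python simulation. It is only an illustrative sketch, not the sch_cake code: the abstract does not say what the shared state contains, so the sketch assumes it is a count of currently active queues and that the global rate is split equally among them. All names (SharedRateState, QueueShaper, local_rate_bps) are hypothetical.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class SharedRateState:
    """Hypothetical shared state: the global rate and a count of active queues."""
    global_rate_bps: int
    active_queues: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock)


class QueueShaper:
    """Stand-in for one per-TXQ shaper instance (assumption, not kernel code)."""

    def __init__(self, shared: SharedRateState) -> None:
        self.shared = shared
        self.active = False

    def set_active(self, active: bool) -> None:
        # Only the small shared counter is updated under the shared lock;
        # everything else would stay under the per-queue lock.
        if active == self.active:
            return
        with self.shared.lock:
            self.shared.active_queues += 1 if active else -1
        self.active = active

    def local_rate_bps(self) -> int:
        # Assumed policy: equal share of the global rate per active queue.
        n = max(self.shared.active_queues, 1)
        return self.shared.global_rate_bps // n


# Usage: a 10 Gbit/s global limit shared by four queues, two of them busy.
shared = SharedRateState(global_rate_bps=10_000_000_000)
shapers = [QueueShaper(shared) for _ in range(4)]
shapers[0].set_active(True)
shapers[1].set_active(True)
total = sum(s.local_rate_bps() for s in shapers if s.active)
```

Under this assumed policy, the per-queue rates of the active shapers never add up to more than the global rate, while each shaper only touches the shared counter when it becomes active or idle.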

In this talk, we will present the implementation and our performance evaluation, and discuss our proposal for an API that will make this work upstreamable and applicable to other qdiscs as well.