Netdev 0x17 venue

P4TC[1][2][3][4] is a kernel-native P4 implementation built on top of Linux Traffic Control (TC) that supports both a kernel-based software datapath and hardware datapath offload.

By virtue of using TC, an operator could execute one or more P4 programs:

1) entirely in the kernel datapath (baremetal, containers, VMs)

2) entirely in hardware via offload, or in a hybrid mode where some programs run in software and others in hardware

3) split per program, with part of the pipeline or table in hardware and the rest in a software exception datapath.

Starting from a P4 program definition, one uses the P4C[5] tc backend compiler to generate the artifacts needed both to instantiate the P4 program in the kernel and to supply the control details that iproute2::tc requires.

Context For The Talk

Our original approach required no compilation of iproute2 or kernel code. The entire kernel datapath definition was “scripted” (in the same way that the u32 classifier and the pedit packet editor are), and iproute2::tc control used a generated “introspection” JSON file to translate human-friendly names into netlink wire format.
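The name translation described above can be pictured with a minimal Python sketch. The JSON layout, the table names, and the attribute type HYP_ATTR_TABLE_ID are hypothetical and chosen purely for illustration; they are not the actual P4TC introspection schema or its netlink attribute numbering. Only the generic netlink attribute layout (a 16-bit length, a 16-bit type, then the payload) is standard.

```python
import json
import struct

# Hypothetical introspection content; the real P4TC JSON schema may differ.
INTROSPECTION = json.loads("""
{
  "pipeline": "myprog",
  "tables": [
    {"name": "cb/fwd_table", "id": 1},
    {"name": "cb/acl_table", "id": 2}
  ]
}
""")

HYP_ATTR_TABLE_ID = 1  # hypothetical attribute type, for illustration only


def table_id(introspection, name):
    """Resolve a human-friendly table name to its numeric ID."""
    for table in introspection["tables"]:
        if table["name"] == name:
            return table["id"]
    raise KeyError(f"unknown table: {name}")


def nlattr_u32(attr_type, value):
    """Encode a u32 as a netlink attribute: u16 length, u16 type, payload.

    The length covers the 4-byte header plus the 4-byte payload, so no
    padding is needed here. Netlink uses host byte order ("=").
    """
    return struct.pack("=HHI", 8, attr_type, value)


wire = nlattr_u32(HYP_ATTR_TABLE_ID, table_id(INTROSPECTION, "cb/acl_table"))
print(wire.hex())
```

The point of the sketch is only that the control tool needs no program-specific code: everything specific to a given P4 program lives in the generated file, and the tool performs a lookup followed by generic encoding.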

When we first posted our patches on the mailing list, we received comments suggesting that, for the s/w datapath, we should look at using eBPF instead of our original scripting approach. It should be noted that we had looked at eBPF in the early stages of the project and abandoned it due to the challenges at that time, the most prominent being:

1) integrating all possible P4 programs required engaging in verifier-combat or losing functionality merely for the sake of using eBPF.

2) the general operational complexity: a) eBPF is still evolving, meaning compiler and kernel changes are ongoing; b) one has to become knowledgeable in two domains, eBPF and tc; c) one has to compile, and deal with permutations of the compiler, libbpf and kernel, instead of simply running a shell script.

The main argument made for eBPF in the list discussion was that it would provide better performance for the s/w datapath than the scripting approach we took.

So what changed? The presence of kfuncs[7], which allow us to call functions in the kernel proper, resolves challenge #1. Kfuncs did not exist at the beginning of our project. We were motivated to proceed with integrating kfuncs in order to validate the “performance” claim. We spent a few months coding and experimenting with various approaches, documented in [10], using [8][9] as our basis. Our conclusion was: a) eBPF did give us a performance improvement in generic tests, but b) this was not always the case for larger P4 programs that combine LPM and ternary lookups, where the datapath workload was more compute intensive. We settled on a hybrid approach which uses P4 objects residing in the P4TC kernel domain, made accessible to the eBPF s/w datapath via kfuncs.
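For readers unfamiliar with the two match kinds mentioned above, the following Python sketch shows their semantics. It is a deliberately naive illustration, not how P4TC or any eBPF datapath implements them: both functions scan every entry linearly, and the convention that a higher priority number wins is an assumption made for this example.

```python
import ipaddress


def lpm_lookup(routes, addr):
    """Return the action of the longest prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, action in routes:
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, action)
    return best[1] if best else None


def ternary_lookup(entries, key):
    """Return the action of the highest-priority entry matching key.

    Each entry is (value, mask, priority, action); a key matches when
    it agrees with value on every bit set in mask.
    """
    best = None
    for value, mask, priority, action in entries:
        if (key & mask) == (value & mask) and (best is None or priority > best[0]):
            best = (priority, action)
    return best[1] if best else None


routes = [("10.0.0.0/8", "port1"), ("10.1.0.0/16", "port2")]
acl = [(0x0050, 0xFFFF, 10, "allow"), (0x0000, 0x0000, 1, "drop")]

print(lpm_lookup(routes, "10.1.2.3"))  # both prefixes match; the /16 is longer
print(ternary_lookup(acl, 0x0050))     # exact-mask entry outranks the wildcard
```

Unlike an exact match, neither lookup reduces to a single hash probe: the key has to be tested against several candidate prefixes or masks, which gives a sense of why programs that chain such lookups do more work per packet.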

The challenge of operational complexity (#2) is still, unfortunately, lurking. Although the effort was a diversion of many months from a project that was already functional (adding to the project’s investment of time and resources), we justified the value of integrating eBPF using kfuncs with the following reasoning:

A) it is a non-technical compromise in response to the community feedback

B) it reduces the amount of initial code we need to upstream

C) the operational usability of eBPF will improve over time, given the resource investment eBPF is getting

D) we are putting guard rails around eBPF by using the compiler as a shield, removing the need for knowledge of eBPF and its evolution.

This Talk

In this talk we describe in more detail the new architecture of the P4TC software datapath and how we integrate eBPF. We will discuss some of the challenges we faced in the conversion and integration, their respective resolutions, and the lessons learned. We hope this talk will be a good reference for P4TC, which we hope to be upstreaming by conference time. The current kernel tree can be found at [6].

References

[1] https://netdevconf.info/0x16/session.html?Your-Network-Datapath-Will-Be-P4-Scripted
[2] https://netdevconf.info/0x16/session.html?P4TC-Workshop
[3] https://opennetworking.org/wp-content/uploads/2022/05/Jamal-Salim-Final-Slide-Deck.pdf
[4] https://github.com/p4tc-dev/docs/blob/main/why-p4tc.md
[5] https://github.com/p4lang/p4c/tree/main/backends/tc
[6] https://github.com/p4tc-dev/linux-p4tc-pub
[7] https://docs.kernel.org/bpf/kfuncs.html
[8] https://github.com/p4lang/p4c/tree/main/backends/ebpf/psa
[9] https://github.com/NIKSS-vSwitch/nikss