Day 1 / Track 1 / Talk 4. Workshop: Hardware Offload. Workshop Chairs: Roopa Prabhu and Or Gerlitz. Report by: Anjali Singhai
a. Many devlink updates: health reporters (device health monitoring) to pass error and recovery information from the device to the upper layers.
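The devlink health interface mentioned above is driven from iproute2; a plausible session (device and reporter names are illustrative, and which reporters exist depends on the driver) might look like:

```shell
# List the health reporters registered by devices on the system
devlink health show

# Ask a reporter to run its diagnostics (device/reporter names illustrative)
devlink health diagnose pci/0000:01:00.0 reporter tx

# Show and then clear the last captured dump for the fw reporter
devlink health dump show pci/0000:01:00.0 reporter fw
devlink health dump clear pci/0000:01:00.0 reporter fw
```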
b. More HW counter visibility is needed for the upper layers of the stack; the HW has lots of stat counters, including programmable ones, but they are not tied well into the different layers of the stack.
c. Packet drop visibility in the control plane is very important:
1. Add a mechanism to allow ASICs to report dropped packets to user space
2. Metadata can be attached to the packet
3. Drop reason, ingress port, egress port, timestamp, etc.
4. Drop reasons should be standardized and correspond to kernel drop points (e.g. ingress VLAN filter)
5. Mechanism should allow the user to filter noisy drops, sample, and truncate.
6. Filtering can be based on stages in the pipeline.
7. devlink trap set DEV (all | group …) enable/disable
8. Show status and supported metadata
9. Monitor dropped packets
10. An eBPF filter can be attached to the netlink socket
(Jiri: iptables enables tracing… this infrastructure looks to be missing in the route subsystem, tc subsystem, etc., and could then be mapped to HW.)
(Tom: If I am receiving all the packets I am dropping, how will this scale? How do you weed out one particular drop?
Policers are configured between the ASIC and the CPU to limit the number of packets.
The point is not to eliminate stats but to augment them.)
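The trap interface sketched in the points above is exposed through devlink; a plausible CLI session (device name illustrative; trap names follow the generic trap set documented in the kernel) might look like:

```shell
# List the traps the device supports, their groups, and current actions
devlink trap show pci/0000:01:00.0

# Start reporting packets dropped by the ingress VLAN filter
devlink trap set pci/0000:01:00.0 trap ingress_vlan_filter action trap

# Or enable an entire group of L2 drop reasons at once
devlink trap group set pci/0000:01:00.0 group l2_drops action trap

# Trapped packets are injected into the kernel drop monitor, where they
# can be observed with its metadata and filtered (e.g. via dropwatch)
```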
ii. Doorbell overflow discovery and recovery for RDMA queues was a topic of discussion. (Ariel from Broadcom)
iii. QoS offload for NIC eSwitch model: Focus on Ingress rate limiting and Policing.
1. Add a matchall classifier (cls_matchall) with a police action
2. Introduce reserved priorities.
a. OVS should install tc filters with a priority offset, reserving higher priorities for rate limiting
3. Software → hardware flow:
a. Enable TC offload
b. Add bridge and interfaces
c. Configure rate limit… translates to a matchall filter with a police action
i. Drop/continue action
d. Configure OVS tc filters
(Rony: Why did you choose priorities vs. chains? Recirculation is a good use case for chains.)
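Step c above can be sketched with iproute2; interface name and rates are illustrative:

```shell
# Attach an ingress qdisc so filters can be added (interface illustrative)
tc qdisc add dev eth0 ingress

# Rate-limit all ingress traffic: matchall classifier + police action.
# prio 1 (reserved high precedence) keeps it ahead of OVS-installed filters;
# conform-exceed drop/continue drops exceeding packets and lets conforming
# ones continue to the next filter.
tc filter add dev eth0 ingress prio 1 matchall \
    action police rate 100mbit burst 16k conform-exceed drop/continue
```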
iv. Scalable NIC HW offload (Or Gerlitz, Parav Pandit)
a. Large number of HW functions
i. Scale without using SR-IOV
ii. Deploy multiple dynamic instances faster than with VFs
iii. NIC HW has a very well-defined vport-based virtualization model
iv. One PCI device is split into multiple smaller sub-devices
v. Each sub-device comes with its own device, vport, and namespace resources
vi. Leverage the mature switchdev mode and OVS eco-system
vii. Applicable to the SmartNIC use case.
viii. Use the rich, vendor-agnostic devlink iproute2 tool
ix. Mdev software model view
1. mlx5 mdev devices
2. Add a control-plane knob to add/query/remove mdev devices
a. devlink is used
3. Mentioned vDPA from Intel
4. Create 3 devices: a netdev, an RDMA device, and a representor netdev.
5. In HW the mdev is attached to a vport
6. Map it to a container; it cannot be mapped to a VM since there is a single instance of the driver.
i. Not connected to VFIO (it’s not necessary); there is no buffer copy involved
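The control-plane knob discussed here (add/query/remove via devlink) eventually took a shape along these lines upstream in the mlx5 subfunction support; device name, port index, and sfnum are illustrative:

```shell
# Add a subfunction port on the PF; this also creates its eswitch representor
devlink port add pci/0000:03:00.0 flavour pcisf pfnum 0 sfnum 88

# Query the new port (the port index is assigned by the kernel)
devlink port show pci/0000:03:00.0/32768

# Activate the subfunction; its auxiliary device then exposes the
# netdev and RDMA device
devlink port function set pci/0000:03:00.0/32768 state active

# Remove it when no longer needed
devlink port del pci/0000:03:00.0/32768
```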
Site: https://www.netdevconf.org/0x13/session.html?workshop-hardware-offload
Slides:
Videos: