0x13:reports:d2t3t05-v-switch-live-migration-support-for-virtio-with-sriov-vf-datapath
Differences
This shows you the differences between two versions of the page.
0x13:reports:d2t3t05-v-switch-live-migration-support-for-virtio-with-sriov-vf-datapath [2019/04/03 22:19] – ehalep | 0x13:reports:d2t3t05-v-switch-live-migration-support-for-virtio-with-sriov-vf-datapath [2019/09/28 17:04] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | Day 2 / Track 3 / Talk 5 | + | Day 2 / Common |
Talk – Nuts-n-Bolts: | Talk – Nuts-n-Bolts: | ||
Speakers: Or Gerlitz and Parav Pandit | Speakers: Or Gerlitz and Parav Pandit | ||
Report by: Mitu Aggarwal | Report by: Mitu Aggarwal | ||
- | This talk started with a quick reminder to give background – SRIOV has drawbacks especially with live migration. If support for live migration is required, some solutions suggested integration with VirtIO; summarizing this in terms of the vswitch looks and how the guest looks for this setup; | + | This talk started with a quick reminder to give background – SRIOV has drawbacks especially with live migration. If support for live migration is required, some solutions suggested integration with VirtIO; summarizing this in terms of how the vswitch looks and how the guest looks for this setup; |
- | Three-netdev model was proposed as a solution for live migration where the VirtIO-net is used as a failover device. Mentioned | + | Three-netdev model was proposed as a solution for live migration where the VirtIO-net is used as a failover device. Mentioned are other models that use 2 netdev or 1 netdev models; Michal from Broadcom has a blog post that discusses this; Sridhar from Intel has also worked on this in the past. |
During live migration, the VF device is hot-unplugged; | During live migration, the VF device is hot-unplugged; | ||
- | Because we are dealing with SRIOV, we need to take care of both the SW v-switch as well as the HW e-switch; | + | Because we are dealing with SRIOV, we need to take care of both the SW v-switch as well as the HW e-switch. Two HW e-switch modes were discussed: legacy mode and switchdev mode. This talk was about the switchdev mode. As a summary of the switchdev mode: there is a software representation in the hypervisor for the NIC eswitch VPorts in the VM. The representors are used for the slow path when the traffic is not offloaded, and this is the offloading device knob. | ||
- | + | ||
- | Two HW e-switch modes were discussed: legacy mode and switchdev mode. This talk was about the switchdev mode. As a summary of the switchdev mode: there is a software representation in the hypervisor for the NIC eswitch VPorts in the VM. The representors are used for the slow path when the traffic is not offloaded, and this is the offloading device knob. | + | ||
tc flower is the mechanism used to offload vswitch flows to the NIC eswitch. The presenter asked how live migration can be done with this vswitch. The answer was that we don’t want the SW switch to know there are two paths to the VF in the VM, so we want to do something that will support the switchdev model | tc flower is the mechanism used to offload vswitch flows to the NIC eswitch. The presenter asked how live migration can be done with this vswitch. The answer was that we don’t want the SW switch to know there are two paths to the VF in the VM, so we want to do something that will support the switchdev model | ||
Line 18: | Line 16: | ||
Two things need to be done. The first is that flow-based forwarding is applied to the port. The second is to tie the representor in the host to the paravirtualized device in the VM, so we need a mechanism to stitch it to the emulated interfaces. | Two things need to be done. The first is that flow-based forwarding is applied to the port. The second is to tie the representor in the host to the paravirtualized device in the VM, so we need a mechanism to stitch it to the emulated interfaces. | ||
- | The design | + | The proposed |
- | On ingress the packets are received | + | On ingress the packets are received |
- | So when the hypervisor changes the channel, the hardware eswitch changes the datapath. | + | When the hypervisor changes the channel, the hardware eswitch changes the datapath |
A question was raised: when tc rules are added to the bond, will the lower devices also get the rules? So instead of adding the rule to the device, you add the rule to the tc block. Or Gerlitz said that the FW could bind to this block and do it smartly so that we don’t have to duplicate the flows in the block | A question was raised: when tc rules are added to the bond, will the lower devices also get the rules? So instead of adding the rule to the device, you add the rule to the tc block. Or Gerlitz said that the FW could bind to this block and do it smartly so that we don’t have to duplicate the flows in the block | ||
- | In the Microsoft model, only the guest knows which channel is being used in steady state; so that model doesn’t work with switchdev. Will this solution work with RDMA traffic? | + | In the Microsoft model, only the guest knows which channel is being used in steady state; so that the model doesn’t work with switchdev. Will this solution work with RDMA traffic? |
The answer was that the solution currently doesn’t support RDMA, since the failover is only for TCP/IP traffic. The RDMA connections will break if we fail over to the emulated device; if the RDMA driver somehow knows how to do the mount on the other side and restart the connections, | The answer was that the solution currently doesn’t support RDMA, since the failover is only for TCP/IP traffic. The RDMA connections will break if we fail over to the emulated device; if the RDMA driver somehow knows how to do the mount on the other side and restart the connections, | ||
Line 32: | Line 30: | ||
There is no need for dedicated interrupts or dedicated resources in the HW for this; it can just work with a dummy VPort that isn’t backed by resources. | There is no need for dedicated interrupts or dedicated resources in the HW for this; it can just work with a dummy VPort that isn’t backed by resources. | ||
- | Site: https:// | + | Site: https:// |
- | Slides: | + | |
- | Videos: | + | |
- | + |
0x13/reports/d2t3t05-v-switch-live-migration-support-for-virtio-with-sriov-vf-datapath.1554329984.txt.gz · Last modified: 2019/09/28 17:04 (external edit)