0x13:reports:d3t1t05-industry-perspectives
Differences
This shows you the differences between two versions of the page.
0x13:reports:d3t1t05-industry-perspectives [2019/04/04 09:06] – ehalep | 0x13:reports:d3t1t05-industry-perspectives [2019/09/28 17:04] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | Day 3 / Track 1 / Talk 5 | + | Day 3 / Common |
Panel: Industry Perspectives | Panel: Industry Perspectives | ||
Panelists: | Panelists: | ||
Line 12: | Line 12: | ||
Report by: Anjali Singhai | Report by: Anjali Singhai | ||
- | This session started with the three panelists presenting their view on how Linux networking is being used in the industry followed by questions and discussion. | + | This session started with the three panelists presenting their view on how Linux networking is being used in the industry, followed by questions and discussion. |
The first presenter was Vikram Siwach, a product manager from MobiledgeX. | The first presenter was Vikram Siwach, a product manager from MobiledgeX. | ||
- | Vikram started his talk by discussing that latency is becoming important as well as AI and machine learning | + | Vikram started his talk by noting that latency is not the only thing becoming important: so are AI and machine learning |
- | + | ||
- | The major issue is that a client is mobile while the cloud is static and has no notion of location of the use. Vikram proposed a new better way: CloudNet, an architecture to bring the devices closer to the cloud which features Device Native and Zero touch. | + | |
+ | The major issue is that the client is mobile while the cloud is static and as such has no notion of the user's location. Vikram proposed a new, better way: CloudNet, an architecture that brings devices closer to the cloud and features Device Native and Zero touch. The goal is to build a better cloud for these Applications: | ||
+ | Vikram then introduced the cloudlet architecture, in which the Distributed Matching Engine finds the best cloudlet. The cloudlet software stack will spawn a cluster for a particular client and validate User/Client Identity and location. | ||
- | Build a better cloud for these Applications: | + | Vikram made the case for investing in Linux. A multitenancy model is emerging at the edge: workloads served at the real edge can span multiple providers to provide consistent service and APIs to developers. |
+ | Infrastructure needs to have built-in programmability, acceleration (they don’t see smart NICs or sophisticated HW, only commodity HW), mobility, slicing (e2e traffic SLAs) and embedded security. | ||
- | + | The Linux Kernel is a mature common denominator to offer packet programming at container, VM, host, Switch with any OS, ASIC combination based on Application needs. | |
- | See picture for Architecture: | + | The next speaker was Artur Makutunowicz from LinkedIn, who presented SONiC and Self-Healing Networks. |
+ | Artur began by providing a quick overview of the LinkedIn infrastructure and its growth: approximately 250k bare metal servers in roughly 20 locations globally, peering with 4000 networks. They experience 34% growth every year, with high bandwidth and compute demands due to organic growth. | ||
- | | + | Their main problems were not the design, but keeping the site up, planning for 10x growth, scale on demand, active-active datacenters and innovating for Hyperscale (Unlimited bandwidth, Compute on demand, Programmable datacenter, Scale cost effectively). |
- | DME finds best Cloudlet | + | Artur continued by noting that they own the code, which enables them to control their own destiny with higher velocity and more granular rollouts, while keeping flexibility and simplicity. |
- | Cloudlets Software Stack: | + | Their data center design is a single SKU data center, single chip architecture, |
- | Spawn a cluster | + | Their hardware is a custom-designed switch based on merchant silicon, no big chassis |
+ | For software they ended up using SONiC, a NOS based on Linux with a minimal IPv4/IPv6 feature set, and a Kafka pipeline for self-healing telemetry | ||
- | Validate User/Client Identity | + | The last speaker was Marek Majkowski from Cloudflare talking about Linux at Cloudflare. |
+ | Cloudflare has two types of data centers, Edge and Core. The edge network is in 100 locations around the world and uses an anycast network. Software management of the edge network is through Uniform Config, No Virtualization, | ||
- | + | The edge network has a uniform stack and uniform software. |
+ | Marek explained that they use XDP for classification and load balancing, protocols and workers (NGINX). XDP doesn' | ||
- | Bring cloud closer | + | In regards |
- | Why invest in Linux? Multitenancy model emerging at Edge, workload served at Real Edge | + | Marek concluded that they use BPF everywhere, with XDP for DDoS mitigation and for load balancing. |
- | Span multiple providers to provide consistent service and APIs to developers. | + | After the three presenters finished, there was time for questions and discussions. |
- | + | ||
- | + | ||
- | + | ||
- | Infrastructure Needs: | + | |
- | + | ||
- | Programmability in built | + | |
- | + | ||
- | Acceleration ( don’t see smart NICs, no sophisticated HW, commodity HW) | + | |
- | + | ||
- | Mobility: Anchorless | + | |
- | + | ||
- | Slicing: e2e traffic SLAs | + | |
- | + | ||
- | Security: embedded | + | |
- | + | ||
- | Often attack on Cloud, need security | + | |
- | + | ||
- | Linux Kernel is mature common denominator to offer packet programming at container, VM, host, Switch with any OS, ASIC combination based on Application needs. | + | |
- | + | ||
- | Programmable Pipeline: | + | |
- | + | ||
- | + | ||
- | + | ||
- | SoNIC and Self-Healing Networks | + | |
- | + | ||
- | Artur Makutunowicz LinkedIn | + | |
- | + | ||
- | + | ||
- | + | ||
- | ~250k bare metal servers | + | |
- | + | ||
- | ~20 locations globally | + | |
- | + | ||
- | 4000 networks | + | |
- | + | ||
- | + | ||
- | + | ||
- | Infrastructure Growth | + | |
- | + | ||
- | 34% every year | + | |
- | + | ||
- | High bandwidth and compute demands due to organic growth.. | + | |
- | + | ||
- | East-west explosion, 1:1000 | + | |
- | + | ||
- | Call graph | + | |
- | + | ||
- | Kafka | + | |
- | + | ||
- | Hadoop | + | |
- | + | ||
- | ML | + | |
- | + | ||
- | Datacenter Journey | + | |
- | + | ||
- | 2010+ Operation Crisis, keep the site up. | + | |
- | + | ||
- | Capacity Crisis: plan for 10x growth, scale on demand, active-active datac | + | |
- | + | ||
- | Innovation | + | |
- | + | ||
- | Unlimited bandwidth, Compute on demand, Programmable datacenter, Scale cost effectively. | + | |
- | + | ||
- | Own the code: | + | |
- | + | ||
- | Enables us to control our own destiny | + | |
- | + | ||
- | Higher velocity : more granular rollouts | + | |
- | + | ||
- | Data Center Design | + | |
- | + | ||
- | Single SKU data ceter, single chip architecture, | + | |
- | + | ||
- | Design Principles: | + | |
- | + | ||
- | Simplicity works at scale | + | |
- | + | ||
- | Open: use community tools when possible | + | |
- | + | ||
- | Independent: | + | |
- | + | ||
- | Programmable | + | |
- | + | ||
- | Building Block: HW | + | |
- | + | ||
- | Merchant Silicon | + | |
- | + | ||
- | Custom Design Switch | + | |
- | + | ||
- | No big chassis | + | |
- | + | ||
- | SoNIC, Nos based on Linux, Minimum feature Ipv4/IPv6 | + | |
- | + | ||
- | Self Healing Telemetry, kafka pipeline See picture | + | |
- | + | ||
- | + | ||
- | + | ||
- | Linux at Cloudflare ( Marek Majkowski) | + | |
- | + | ||
- | Global network, speed and security | + | |
- | + | ||
- | Edge & Core ( two types of Data center) | + | |
- | + | ||
- | Edge Network in 100 locations around the world , uses anycast network | + | |
- | + | ||
- | Uniform Config, No Virtualization, | + | |
- | + | ||
- | Thousands of IP | + | |
- | + | ||
- | Multiple Applications | + | |
- | + | ||
- | HHTP, DNS and Other | + | |
- | + | ||
- | HW server: moving to ARM server ( less power consumption) | + | |
- | + | ||
- | Edge Network - uniform stack | + | |
- | + | ||
- | See image | + | |
- | + | ||
- | XDP for classification and load balancing, protocols and workers ( engineX) | + | |
- | + | ||
- | XDP doesn' | + | |
- | + | ||
- | Socket Dispatch - DoS consideration | + | |
- | + | ||
- | · Case of 30k UDP sockets | + | |
- | + | ||
- | · Solution: ebpf token | + | |
- | + | ||
- | · Socket dispatch - zero downtime restart for QUIC | + | |
- | + | ||
- | · Socket dispatch: AnyIP | + | |
- | + | ||
- | · See picture | + | |
- | + | ||
- | Ebpf_exporter: | + | |
- | + | ||
- | BPF is everywhere | + | |
- | + | ||
- | XDP for Ddos, XDP for load balance etc | + | |
- | + | ||
- | + | ||
Sowmini: Kernel Bypass or Not? | Sowmini: Kernel Bypass or Not? | ||
- | + | MobileEdge: XDP sounds interesting | |
- | MobileEdge: XDP sounds interesting…Vikram. Certain | + | LinkedIn: Looking for TCP Analytics, more application focused |
- | + | Cloudflare: |
- | Linkedin: Looking for TCP Analytics… more application focused | + | |
- | + | ||
- | Cloudfare: | + | |
Tom H: Is kernel upgrade an issue? | Tom H: Is kernel upgrade an issue? | ||
- | + | Cloudflare: | |
- | Cloudflare: | + | LinkedIn: If application and network boundaries are clear, applications can be migrated for upgrade. There are still challenges. |
- | + | ||
- | linkedin: If application and network boundaries are clear, applications can be migrated for upgrade. There are still challenges. | + | |
MobileEdge: It’s a huge problem, isolate the machine right now…but Linux has to be the end device kernel. People are slow to move to the latest kernel. It’s a real problem. | MobileEdge: It’s a huge problem, isolate the machine right now…but Linux has to be the end device kernel. People are slow to move to the latest kernel. It’s a real problem. | ||
- | Rely on Distro vendor they will take care of it. Roopa | + | Roopa: |
- | + | ||
- | Need for programmability: | + | |
MobileEdge: No, not for programmable HW…cost is high, not jumping on it. We still have to scale…no custom chips | MobileEdge: No, not for programmable HW…cost is high, not jumping on it. We still have to scale…no custom chips | ||
- | |||
LinkedIn: SW programmability is most important. Biggest value is in manageability plane, not in dataplane. P4 is mostly dataplane. | LinkedIn: SW programmability is most important. Biggest value is in manageability plane, not in dataplane. P4 is mostly dataplane. | ||
- | |||
Cloudflare: Before XDP, there was need for bypass or HW offload. Don’t see the need after XDP. Good old TCP offload can work for many flows. | Cloudflare: Before XDP, there was need for bypass or HW offload. Don’t see the need after XDP. Good old TCP offload can work for many flows. | ||
- | + | Roopa asked a question on network analytics: do you have any specific requirement or use today from the kernel stack? |
- | + | ||
- | Simpler Dataplanes: Switch ASICs to offload data plane in HW. Roopa? | + | |
- | + | ||
- | Much more of a usecase in edge side. No use case. | + | |
- | + | ||
- | Program the ToR switches : Mobile, need the packet to go as fast as possible. | + | |
- | + | ||
- | + | ||
- | + | ||
- | Most of SW stack written by Clodflare. | + | |
- | + | ||
- | + | ||
- | + | ||
- | Roopa question on Network Analytics: | + | |
- | + | ||
- | Do you have any specific requirement or use today from kernel stack | + | |
LinkedIn: Challenge is collecting the data at scale in real time…to do the analytics. Where does the network start? Collect close to the application or socket. | LinkedIn: Challenge is collecting the data at scale in real time…to do the analytics. Where does the network start? Collect close to the application or socket. | ||
- | + | Site: https:// | |
- | Site: https:// | + | |
- | Slides: | + | |
- | Videos: | + | |
- | + |
0x13/reports/d3t1t05-industry-perspectives.1554368814.txt.gz · Last modified: 2019/09/28 17:04 (external edit)