0x13:reports:d3t1t05-industry-perspectives
Differences
This shows you the differences between two versions of the page.
0x13:reports:d3t1t05-industry-perspectives [2019/04/02 12:54] – ehalep | 0x13:reports:d3t1t05-industry-perspectives [2019/09/28 17:04] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | Day 3 / Track 1 / Talk 5 | + | Day 3 / Common |
- | Panel: Industry | + | Panel: Industry |
Panelists: | Panelists: | ||
Shawn Zandi (LinkedIn) | Shawn Zandi (LinkedIn) | ||
Line 12: | Line 12: | ||
Report by: Anjali Singhai | Report by: Anjali Singhai | ||
- | Vikram Siwach, Product manager Mobiledge | + | This session started with the three panelists presenting their view on how Linux networking is being used in the industry, followed by questions and discussion. |
- | + | The first presenter was Vikram Siwach, a product manager from MobiledgeX. | |
- | + | Vikram opened by noting that latency is becoming increasingly important, as are AI and machine learning, and then reviewed the current state of networks. There are currently more than 3.7 billion devices. Network CAPEX over the last ten years was around 1.7 trillion USD (mainly used as a bit pipe), and cloud CAPEX over the same period was 300 billion USD. |
- | Panels | + | |
- | + | ||
- | People who are not using Linux or using Linux, 2 Panel chairs, Audience to ask questions | + | |
+ | The major issue is that the client is mobile while the cloud is static, so the cloud has no notion of the location of the user. Vikram proposed a new, better way: CloudNet, an architecture to bring the devices closer to the cloud, which features Device Native and Zero touch. The goal is to build a better cloud for these Applications: | ||
+ | Vikram then introduced the cloudlet architecture, in which the Distributed Matching Engine (DME) finds the best cloudlet. The cloudlet software stack spawns a cluster for a particular client and validates user/client identity and location. | ||
- | Vikram: Telecom | + | Vikram made the case to invest |
+ | Infrastructure needs to have programmability built-in, acceleration (don’t see smart NICs, no sophisticated HW, commodity HW), mobility, slicing: e2e traffic SLAs and embedded security. | ||
- | + | The Linux Kernel is a mature common denominator to offer packet programming at container, VM, host, Switch with any OS, ASIC combination based on Application needs. | |
- | Latency is becoming important | + | The next speaker was Artur Makutunowicz from LinkedIn, who presented SoNIC and Self-Healing Networks. |
+ | Artur began by providing a quick overview of the LinkedIn infrastructure | ||
- | + | Their main problems were not the design, but keeping the site up, planning for 10x growth, scale on demand, active-active datacenters and innovating for Hyperscale (Unlimited bandwidth, Compute on demand, Programmable datacenter, Scale cost effectively). | |
- | Current State Devices: 3.7 billion devices | + | Artur continued with the fact that they own the code and that enabled them to control their own destiny with higher velocity, more granular rollouts, while having flexibility and simplicity. |
- | Network: 1.7 Trillion USD CAPEX last 10 yrs ( mainly used as bit pipe) | + | Their data center design is a single-SKU data center, single-chip architecture, |
- | Cloud: 300 billion capex last 10 yrs | + | Their hardware is custom-designed switches built on merchant silicon, with no big chassis |
+ | For software they ended up using SONiC, a NOS based on Linux with a minimal IPv4/IPv6 feature set, and a Kafka pipeline for self-healing telemetry | ||
- | Client is Mobile- Cloud is static | + | The last speaker was Marek Majkowski from Cloudflare talking about Linux at Cloudflare. |
+ | Cloudflare has two types of data centers, Edge and Core. The edge network is in 100 locations around the world and uses anycast. The edge network is managed through uniform config, no virtualization, | ||
- | New better way: CloudNet | + | The edge Network has a uniform stack and a uniform software. |
+ | Marek explained that they use XDP for classification and load balancing, protocols and workers | ||
- | Features: Device Native, Zero touch… | + | With regard to DoS considerations in socket dispatch, the case of 30k UDP sockets is solved by using an eBPF token bucket in SO_FILTER |
- | + | Marek concluded that they use BPF everywhere, with XDP for DDoS and XDP for load balancing. | |
- | Build a better cloud for these Applications: | + | After the three presenters finished, there was time for questions |
- | + | ||
- | + | ||
- | + | ||
- | See picture | + | |
- | + | ||
- | | + | |
- | + | ||
- | DME finds best Cloudlet | + | |
- | + | ||
- | Cloudlets Software Stack: | + | |
- | + | ||
- | Spawn a cluster for a particular client. | + | |
- | + | ||
- | Validate User/Client Identity | + | |
- | + | ||
- | + | ||
- | + | ||
- | Bring cloud closer to device: Make backend Mobile | + | |
- | + | ||
- | Why invest in Linux? Multitenancy model emerging at Edge, workload served at Real Edge | + | |
- | + | ||
- | Span multiple providers to provide consistent service and APIs to developers. | + | |
- | + | ||
- | + | ||
- | + | ||
- | Infrastructure Needs: | + | |
- | + | ||
- | Programmability in built | + | |
- | + | ||
- | Acceleration ( don’t see smart NICs, no sophisticated HW, commodity HW) | + | |
- | + | ||
- | Mobility: Anchorless | + | |
- | + | ||
- | Slicing: e2e traffic SLAs | + | |
- | + | ||
- | Security: embedded | + | |
- | + | ||
- | Often attack on Cloud, need security | + | |
- | + | ||
- | Linux Kernel is mature common denominator to offer packet programming at container, VM, host, Switch with any OS, ASIC combination based on Application needs. | + | |
- | + | ||
- | Programmable Pipeline: | + | |
- | + | ||
- | + | ||
- | + | ||
- | SoNIC and Self-Healing Networks | + | |
- | + | ||
- | Artur Makutunowicz LinkedIn | + | |
- | + | ||
- | + | ||
- | + | ||
- | ~250k bare metal servers | + | |
- | + | ||
- | ~20 locations globally | + | |
- | + | ||
- | 4000 networks | + | |
- | + | ||
- | + | ||
- | + | ||
- | Infrastructure Growth | + | |
- | + | ||
- | 34% every year | + | |
- | + | ||
- | High bandwidth and compute demands due to organic growth.. | + | |
- | + | ||
- | East-west explosion, 1:1000 | + | |
- | + | ||
- | Call graph | + | |
- | + | ||
- | Kafka | + | |
- | + | ||
- | Hadoop | + | |
- | + | ||
- | ML | + | |
- | + | ||
- | Datacenter Journey | + | |
- | + | ||
- | 2010+ Operation Crisis, keep the site up. | + | |
- | + | ||
- | Capacity Crisis: plan for 10x growth, scale on demand, active-active datac | + | |
- | + | ||
- | Innovation for Hyperscale | + | |
- | + | ||
- | Unlimited bandwidth, Compute on demand, Programmable datacenter, Scale cost effectively. | + | |
- | + | ||
- | Own the code: | + | |
- | + | ||
- | Enables us to control our own destiny | + | |
- | + | ||
- | Higher velocity : more granular rollouts | + | |
- | + | ||
- | Data Center Design | + | |
- | + | ||
- | Single SKU data center, single chip architecture, | + |
- | + | ||
- | Design Principles: | + | |
- | + | ||
- | Simplicity works at scale | + | |
- | + | ||
- | Open: use community tools when possible | + | |
- | + | ||
- | Independent: | + | |
- | + | ||
- | Programmable | + | |
- | + | ||
- | Building Block: HW | + | |
- | + | ||
- | Merchant Silicon | + | |
- | + | ||
- | Custom Design Switch | + | |
- | + | ||
- | No big chassis | + | |
- | + | ||
- | SoNIC, Nos based on Linux, Minimum feature Ipv4/IPv6 | + | |
- | + | ||
- | Self Healing Telemetry, kafka pipeline See picture | + | |
- | + | ||
- | + | ||
- | + | ||
- | Linux at Cloudflare ( Marek Majkowski) | + | |
- | + | ||
- | Global network, speed and security | + | |
- | + | ||
- | Edge & Core ( two types of Data center) | + | |
- | + | ||
- | Edge Network in 100 locations around the world , uses anycast network | + | |
- | + | ||
- | Uniform Config, No Virtualization, | + | |
- | + | ||
- | Thousands of IP | + | |
- | + | ||
- | Multiple Applications | + | |
- | + | ||
- | HTTP, DNS and Other | + |
- | + | ||
- | HW server: moving to ARM server ( less power consumption) | + | |
- | + | ||
- | Edge Network - uniform stack | + | |
- | + | ||
- | See image | + | |
- | + | ||
- | XDP for classification and load balancing, protocols and workers ( engineX) | + | |
- | + | ||
- | XDP doesn' | + | |
- | + | ||
- | Socket Dispatch - DoS consideration | + | |
- | + | ||
- | · Case of 30k UDP sockets | + | |
- | + | ||
- | · Solution: ebpf token | + | |
- | + | ||
- | · Socket dispatch - zero downtime restart for QUIC | + | |
- | + | ||
- | · Socket dispatch: AnyIP | + | |
- | + | ||
- | · See picture | + | |
- | + | ||
- | Ebpf_exporter: | + | |
- | + | ||
- | BPF is everywhere | + | |
- | + | ||
- | XDP for Ddos, XDP for load balance etc | + | |
- | + | ||
- | + | ||
Sowmini: Kernel Bypass or Not? | Sowmini: Kernel Bypass or Not? | ||
- | + | MobileEdge: XDP sounds interesting | |
- | MobileEdge: XDP sounds interesting…Vikram. Certain | + | LinkedIn: Looking for TCP Analytics, more application focused |
- | + | Cloudflare: |
- | LinkedIn: Looking for TCP Analytics… more application focused | + |
- | + |
- | Cloudflare: | + |
Tom H: Kernel upgrade is it an issue? | Tom H: Kernel upgrade is it an issue? | ||
- | + | Cloudflare: | |
- | Cloudflare: | + | LinkedIn: If application and network boundaries are clear, applications can be migrated for upgrade. There are still challenges. |
- | + |
- | LinkedIn: If application and network boundaries are clear, applications can be migrated for upgrade. There are still challenges. | + |
MobileEdge: It’s a huge problem, isolate the machine right now…but linux has to be end device kernel. People slow to move to latest kernel. It’s a real problem. | MobileEdge: It’s a huge problem, isolate the machine right now…but linux has to be end device kernel. People slow to move to latest kernel. It’s a real problem. | ||
- | Rely on Distro vendor they will take care of it. Roopa | + | Roopa: |
- | + | ||
- | Need for programmability: | + | |
MobileEdge: No not for programmable HW…cost is high, not jumping on it. We still have to scale…no custom chips | MobileEdge: No not for programmable HW…cost is high, not jumping on it. We still have to scale…no custom chips | ||
- | |||
LinkedIn: SW programmability is most important. The biggest value is in the manageability plane, not in the dataplane. P4 is mostly dataplane. | LinkedIn: SW programmability is most important. The biggest value is in the manageability plane, not in the dataplane. P4 is mostly dataplane. | ||
- | |||
Cloudflare: Before XDP, there was need for bypass or HW offload. Don’t see the need after XDP. Good old TCP offload can work for many flows. | Cloudflare: Before XDP, there was need for bypass or HW offload. Don’t see the need after XDP. Good old TCP offload can work for many flows. | ||
- | + | Roopa asked a question on Network Analytics. Do you have any specific requirement or use today from kernel stack? | |
- | + | ||
- | Simpler Dataplanes: Switch ASICs to offload data plane in HW. Roopa? | + | |
- | + | ||
- | Much more of a usecase in edge side. No use case. | + | |
- | + | ||
- | Program the ToR switches : Mobile, need the packet to go as fast as possible. | + | |
- | + | ||
- | + | ||
- | + | ||
- | Most of SW stack written by Cloudflare. | + |
- | + | ||
- | + | ||
- | + | ||
- | Roopa question on Network Analytics: | + | |
- | + | ||
- | Do you have any specific requirement or use today from kernel stack | + | |
LinkedIn: The challenge is collecting the data at scale in real time…to do the analytics. Where does the network start? Collect close to the application or socket. | LinkedIn: The challenge is collecting the data at scale in real time…to do the analytics. Where does the network start? Collect close to the application or socket. | ||
- | + | Site: https:// | |
- | Site: https:// | + | |
- | Slides: | + | |
- | Videos: | + | |
- | + |
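
The report mentions an eBPF token bucket as the answer to the DoS concern with 30k UDP sockets. The talk's actual filter is not reproduced here, so the following is only a user-space sketch of the general token bucket idea in Python, not Cloudflare's code. The class name `TokenBucket` and the `rate`, `burst` and `cost` parameters are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Illustrative token bucket: allow short bursts, cap the sustained rate."""

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)      # tokens refilled per second (hypothetical knob)
        self.burst = float(burst)    # maximum number of tokens the bucket holds
        self.tokens = float(burst)   # the bucket starts full
        self.last = time.monotonic() if now is None else float(now)

    def allow(self, now=None, cost=1.0):
        """Return True if a packet costing `cost` tokens may pass at time `now`."""
        if now is None:
            now = time.monotonic()
        elapsed = now - self.last
        if elapsed > 0:
            # Refill in proportion to elapsed time, never above the burst size.
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
            self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # over the limit: the caller would drop the packet
```

In a per-socket filter the same bookkeeping (token count and last-refill timestamp) would be kept per socket or per source, so that one flooded socket exhausts only its own bucket. Passing `now` explicitly keeps the sketch deterministic for testing.
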
0x13/reports/d3t1t05-industry-perspectives.1554209683.txt.gz · Last modified: 2019/09/28 17:04 (external edit)