
Day 3 / Common Track / Talk 5

Panel: Industry Perspectives

Panelists: Shawn Zandi (LinkedIn), Marek Majkowski (Cloudflare), Vikram Siwach (MobiledgeX)

Panel co-chairs: Sowmini Varadhan, Roopa Prabhu

Report by: Anjali Singhai

This session started with the three panelists presenting their view on how Linux networking is being used in the industry, followed by questions and discussion.

The first presenter was Vikram Siwach, a product manager from MobiledgeX. Vikram started by observing that, given the current state of networks, not only latency but also AI and machine learning are becoming important. There are currently more than 3.7 billion devices. Network CAPEX over the last ten years was around 1.7 trillion USD (mainly used as a bit pipe), while cloud CAPEX over the last ten years was 300 billion USD.

The major issue is that the client is mobile while the cloud is static and therefore has no notion of the user's location. Vikram proposed a better way: CloudNet, an architecture to bring the devices closer to the cloud, whose key features are being device native and zero touch. The goal is to build a better cloud for these applications: mobile data thinning, IoT, mobile gaming, new pervasive and immersive experiences, drone swarming, and compliance and privacy.

Vikram then introduced the cloudlet architecture, in which a Distributed Matching Engine finds the best cloudlet for a client. The cloudlet software stack spawns a cluster for a particular client and validates the user/client identity and location.
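The report does not say how the Distributed Matching Engine ranks cloudlets. As a minimal sketch, assuming that "best" simply means geographically nearest to the client, the selection could look like the following. The `Cloudlet` type, `find_best_cloudlet`, and the example coordinates are hypothetical names and values chosen for illustration, not part of the MobiledgeX API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cloudlet:
    """Hypothetical cloudlet record: a name plus its coordinates in degrees."""
    name: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def find_best_cloudlet(client_lat, client_lon, cloudlets):
    """Return the cloudlet closest to the client (assumed selection policy)."""
    if not cloudlets:
        raise ValueError("no cloudlets available")
    return min(
        cloudlets,
        key=lambda c: haversine_km(client_lat, client_lon, c.lat, c.lon),
    )


if __name__ == "__main__":
    candidates = [
        Cloudlet("berlin", 52.52, 13.405),
        Cloudlet("madrid", 40.4168, -3.7038),
    ]
    # A client near Potsdam should be matched to the Berlin cloudlet.
    print(find_best_cloudlet(52.39, 13.06, candidates).name)
```

A real matching engine would presumably also weigh measured latency, load, and operator policy; distance is only the simplest stand-in for "has a notion of the user's location".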

Vikram made the case for investing in Linux. A multitenancy model is emerging at the edge: workloads served at the real edge can span multiple providers, which must offer consistent services and APIs to developers. Infrastructure needs built-in programmability, acceleration (he does not see smart NICs or sophisticated hardware there, only commodity hardware), mobility, slicing (end-to-end traffic SLAs), and embedded security.

The Linux kernel is a mature common denominator that offers packet programming at the container, VM, host, and switch level, with any combination of OS and ASIC, based on application needs.

The next speaker was Artur Makutunowicz from LinkedIn, who presented SONiC and self-healing networks. Artur began by providing a quick overview of the LinkedIn infrastructure and its growth: approximately 250k bare-metal servers in roughly 20 locations globally, peering with 4000 networks. They experience 34% growth every year, with high bandwidth and compute demands due to organic growth.

Their main problems were not the design but keeping the site up, planning for 10x growth, scaling on demand, running active-active datacenters, and innovating for hyperscale (unlimited bandwidth, compute on demand, a programmable datacenter, and cost-effective scaling).
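To put the two figures from the talk side by side, here is a small worked calculation. It assumes that the 34% yearly growth compounds steadily and that it applies to the same quantity as the 10x planning target; the report states neither, so treat this as an illustration only. The function name `years_to_multiply` is made up for this sketch.

```python
import math


def years_to_multiply(annual_growth, factor):
    """Years of compound growth at `annual_growth` needed to grow by `factor`.

    Solves (1 + annual_growth) ** years == factor for years.
    """
    if annual_growth <= 0 or factor <= 0:
        raise ValueError("growth rate and factor must be positive")
    return math.log(factor) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    years = years_to_multiply(0.34, 10.0)
    print(f"{years:.1f} years to reach 10x at 34% per year")
```

Under those assumptions, 10x is reached in a little under eight years, which gives a rough sense of the planning horizon such a target implies.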

Artur continued by noting that they own the code, which lets them control their own destiny with higher velocity and more granular rollouts while retaining flexibility and simplicity.

Their data center design is a single-SKU data center with a single-chip architecture and a 5-stage BGP Clos. Their design principles are simplicity (simple works at scale), openness (use community tools when possible), independence (maintain a vendor-agnostic profile), and programmability.

Their hardware is custom designed around merchant silicon, with no big chassis. For software they ended up using SONiC, a NOS based on Linux, with a minimal IPv4/IPv6 feature set, and Kafka for self-healing telemetry.

The last speaker was Marek Majkowski from Cloudflare, talking about Linux at Cloudflare. Cloudflare has two types of data centers, edge and core. The edge network is present in 100 locations around the world and uses anycast. Edge software is managed through a uniform configuration, with no virtualization and no containers, on bare metal, serving thousands of IP addresses and multiple applications such as HTTP, DNS and others. They are moving their servers to ARM for lower power consumption at the same performance.

The edge network has a uniform stack and uniform software. Marek explained that they use XDP for classification and load balancing, then protocols and workers (nginx). XDP doesn't do rate limiting right now.
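The talk summary does not describe Cloudflare's load-balancing algorithm. As a rough illustration of the general idea behind flow-based load balancing, the sketch below hashes a packet's 5-tuple so that every packet of one flow lands on the same backend. Real XDP programs are written in restricted C and run in the kernel; this Python version only shows the selection logic, and `pick_backend` and the addresses are hypothetical.

```python
import zlib


def pick_backend(src_ip, src_port, dst_ip, dst_port, proto, backends):
    """Map a flow's 5-tuple onto one backend with a stable hash.

    The same 5-tuple always yields the same backend, so a connection is
    not split across servers as long as the backend list is unchanged.
    """
    if not backends:
        raise ValueError("no backends configured")
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    return backends[zlib.crc32(key) % len(backends)]


if __name__ == "__main__":
    servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    flow = ("198.51.100.7", 40000, "203.0.113.1", 443, "tcp")
    # Repeated lookups for the same flow return the same server.
    print(pick_backend(*flow, servers))
    print(pick_backend(*flow, servers))
```

A plain modulo hash reshuffles most flows whenever the backend list changes, so production load balancers usually prefer consistent hashing or a connection table.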

Regarding DoS considerations in socket dispatch, the case of 30k UDP sockets is solved by using an eBPF token bucket attached as a socket filter.

Marek concluded that they use BPF everywhere, with XDP for DDoS mitigation and XDP for load balancing.
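For readers unfamiliar with the technique, a token bucket allows short bursts while capping the long-term packet rate. The sketch below shows the algorithm in plain Python; it is not Cloudflare's eBPF program, and the class name, parameters, and explicit `now` timestamps are assumptions made so that the example is self-contained and deterministic.

```python
class TokenBucket:
    """Token-bucket rate limiter: `rate` tokens per second, up to `burst` stored."""

    def __init__(self, rate, burst):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum number of stored tokens
        self.tokens = float(burst)   # start with a full bucket
        self.last = None             # timestamp of the previous packet

    def allow(self, now):
        """Return True if a packet arriving at time `now` (seconds) may pass."""
        if self.last is not None:
            elapsed = max(0.0, now - self.last)
            # Refill for the elapsed time, never exceeding the burst size.
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0       # spend one token on this packet
            return True
        return False                 # bucket empty: drop the packet


if __name__ == "__main__":
    bucket = TokenBucket(rate=2, burst=3)
    # Four packets at t=0: the burst of three passes, the fourth is dropped.
    print([bucket.allow(0.0) for _ in range(4)])
    # Half a second later one token has been refilled.
    print(bucket.allow(0.5))
```

In a per-socket filter, state like `tokens` and `last` would typically live in a BPF map keyed by socket or destination, which is what makes the approach workable across many thousands of sockets.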

After the three presenters finished, there was time for questions and discussions.

Sowmini: Kernel bypass or not?

MobiledgeX: XDP sounds interesting, but there are certain issues: you have to develop a team of kernel developers. They are getting ready for the next generation of the Internet, for low latency.

LinkedIn: Looking for TCP analytics; more application focused.

Cloudflare: They were using kernel bypass, but their applications are CPU bound. Manageability is the cost you pay for the latency gained with kernel bypass.

Tom H: Is kernel upgrade an issue?

Cloudflare: Since we use XDP, the kernel doesn't have to be updated as often.

LinkedIn: If application and network boundaries are clear, applications can be migrated for an upgrade. There are still challenges.

MobiledgeX: It's a huge problem; we isolate the machine right now… but Linux has to be the end-device kernel. People are slow to move to the latest kernel. It's a real problem.

Roopa: Rely on the distro vendor; they will take care of it. On the need for programmability: are you looking for programmable hardware or just software?

MobiledgeX: No, not programmable hardware… the cost is high, so we are not jumping on it. We still have to scale… no custom chips.

LinkedIn: Software programmability is most important. The biggest value is in the manageability plane, not in the dataplane. P4 is mostly dataplane.

Cloudflare: Before XDP, there was a need for bypass or hardware offload. We don't see the need after XDP. Good old TCP offload can work for many flows.

Roopa asked a question on network analytics: do you have any specific requirements of, or uses today for, the kernel stack?

LinkedIn: The challenge is collecting the data at scale and in real time in order to do the analytics. Where does the network start? Collect close to the application or the socket.

Site: https://www.netdevconf.info/0x13/session.html?panel-industry-perspectives

0x13/reports/d3t1t05-industry-perspectives.txt · Last modified: 2019/09/28 17:04 by 127.0.0.1
