0x13:reports:d1t1t01-tcp-analytics

During the meeting there was an introduction to and walkthrough of TCP_INFO/TCP_SO_TIMESTAMPING. This part covered the details of TCP_INFO and its fields (tcp_state, options, data_seg*, delivered, retransmissions) and how this information is useful for determining the state of TCP flows and gaining insight into them (an aid to debugging).
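As a minimal sketch (not code from the talk), the whole TCP_INFO blob for a connected socket can be fetched with a single getsockopt() call; the field names below are the ones exported by <linux/tcp.h>:

<code c>
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tcp.h>

/* Dump a few of the TCP_INFO fields mentioned above for a connected
 * TCP socket fd (tcpi_delivered needs Linux >= 4.18). */
static int dump_tcp_info(int fd)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    /* One call copies out the whole struct; the kernel holds the
     * socket lock while filling it, which is the overhead noted below. */
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0) {
        perror("getsockopt(TCP_INFO)");
        return -1;
    }
    printf("state=%u delivered=%u data_segs_out=%u retrans=%u rtt=%u us\n",
           (unsigned)info.tcpi_state, info.tcpi_delivered,
           info.tcpi_data_segs_out, info.tcpi_total_retrans, info.tcpi_rtt);
    return 0;
}
</code>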
  
One of the issues described was that TCP_INFO is a large blob, and obtaining it has measurable overhead due to the socket lock; even so, it doesn't include everything (the congestion control algorithm in use, other SOL_TCP state, etc.).
  
This was followed by coverage of the use of OPT_STATS with TCP_SO_TIMESTAMPING for performance analysis. The key takeaway was that TCP_INFO and TCP_SO_TIMESTAMPING are both powerful instrumentation, but they must be used wisely.
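A rough sketch of the mechanism (based on the Linux timestamping documentation, not on code shown in the talk): OPT_STATS piggybacks on software TX timestamps, so enabling SOF_TIMESTAMPING_OPT_STATS makes the kernel attach TCP stats to each timestamp delivered on the socket's error queue:

<code c>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Enable TX ACK timestamps with attached OPT_STATS on socket fd.
 * OPT_STATS must be combined with OPT_TSONLY. */
static int enable_opt_stats(int fd)
{
    int flags = SOF_TIMESTAMPING_SOFTWARE |
                SOF_TIMESTAMPING_TX_ACK |
                SOF_TIMESTAMPING_OPT_ID |
                SOF_TIMESTAMPING_OPT_TSONLY |
                SOF_TIMESTAMPING_OPT_STATS;

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                      &flags, sizeof(flags));
}
</code>

The stats are then read back with recvmsg(fd, ..., MSG_ERRQUEUE) as an SCM_TIMESTAMPING_OPT_STATS control message containing netlink-formatted TCP_NLA_* attributes.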
3. tcpstat.
  
For Web10G, in TCP instrumentation, metrics are stored in a hash (in-memory structs). This feature can be enabled through a kernel parameter (net.ipv4.tcp_estats). These stats are accessed via a netlink kernel module (TCP_ESTATS and TCP_INFO), and there is a user-land API through the libmnl library for user space to query the stats.
Web10G provides real-world, detailed flow metrics and is used in multiple research efforts such as TEACUP. It is also used in various papers exploring bufferbloat, cloud performance, wireless latency, and network modeling reproducibility.
  
  
Next was a talk about monitoring TCP, covering challenges such as what stats to collect (TCP_CHRONO) and how frequently to sample TCP_INFO state. It also covered interesting TCP state events and how TCP-BPF opens up new possibilities.
This talk also covered TCP-BPF and how it can be used to provide per-connection optimization of TCP parameters. It covered tunable parameters for intra-DC traffic such as the use of small buffers, a small SYN_RTO, and cwnd clamping.

TCP-BPF is a new BPF program type that provides access to TCP_SOCK_FIELDS, meaning visibility into the internal state of TCP flows. It also opens up a mechanism of new callbacks for analytics and better decision making (w.r.t. provisioning dynamic resources); examples of the new callbacks are notifications when packets are sent or received. This feature has to be used with caution: a user shouldn't enable it on all flows, but only as needed (randomly on a small percentage of flows) or while debugging an atypical flow. Additionally, this talk covered external triggers (e.g. TCP_INFO) like "ss". TCP-BPF per connection is not there yet, but there is TCP-BPF per cgroup.
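A minimal sketch of such a TCP-BPF (sockops) program, loosely modeled on the tcp_*_kern.c samples in the kernel tree; the specific values (10-jiffy SYN RTO, cwnd clamp of 100, 150 kB buffers) are illustrative assumptions, not numbers from the talk:

<code c>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Not pulled in automatically when compiling with -target bpf */
#define SOL_SOCKET 1   /* from asm-generic/socket.h */
#define SO_SNDBUF  7
#define SOL_TCP    6   /* from linux/socket.h */

SEC("sockops")
int tune_tcp(struct bpf_sock_ops *skops)
{
    int rv = -1;  /* -1 keeps the kernel default */

    switch (skops->op) {
    case BPF_SOCK_OPS_TIMEOUT_INIT:
        /* Small SYN RTO for intra-DC traffic; the returned value is
         * in jiffies (~10 ms with HZ=1000). */
        rv = 10;
        break;
    case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
    case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB: {
        /* Once established, clamp cwnd and use small buffers. */
        int clamp = 100, buf = 150000;

        rv  = bpf_setsockopt(skops, SOL_TCP, TCP_BPF_SNDCWND_CLAMP,
                             &clamp, sizeof(clamp));
        rv += bpf_setsockopt(skops, SOL_SOCKET, SO_SNDBUF,
                             &buf, sizeof(buf));
        break;
    }
    }
    skops->reply = rv;
    return 1;
}

char _license[] SEC("license") = "GPL";
</code>

Such a program is attached per cgroup (e.g. with bpftool cgroup attach ... sock_ops ...), which is how the per-cgroup granularity mentioned above is realized.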
  
The next talk covered large-scale TCP analytics collection. It described issues with inet_diag (referring to a Telco use case) such as events getting dropped, polling taking a long time, and no events during connection setup and termination. It also covered the difficulty of getting information about connections/flows (such as which congestion algorithm is used) out of the kernel and how this information is propagated to user space.
  
The next talk discussed TCP analytics at Microsoft, covering real-life problems dealt with there. It described several classes of problems, such as connectivity and performance. There are various reasons for connectivity problems such as "app failed to connect": this could be due to network/infrastructure issues, no listener, a full listen backlog, firewall rules, port exhaustion, routing misconfiguration, or NIC driver issues. Likewise, it covered performance problems such as "why is TCP throughput so low" and their possible causes: application issues (not posting enough buffers, not draining fast enough), TCP Rx window, network congestion, and CPU usage.

Following that, the talk described the typical analysis process for connectivity and performance problems: for connectivity issues, tracing and packet capture plus detailed tracing of connection setup; for performance issues, micro-benchmarking to rule out application issues, along with time-sequence plots, TCP stats, and network trace analysis.
    
The next talk discussed TCP stats with regard to mapping users to servers based on TCP stats (delivery metrics). It covered stats collection methods such as random sampling (callbacks in the TCP layer, additional per-socket stats) and the use of mmap/poll to retrieve tcpsockstat from /dev/tcpsockstat. It also covered TCP_INFO and how this information could be used to derive delivery metrics, plus a proposal for TCP stats collection using BPF/tracepoints to trace per socket. There was a suggestion from Google to trace "sendmsg" using cmsg, TCP_INFO, and timestamping.
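A sketch of what such per-socket sampling could look like as a kprobe BPF program on tcp_sendmsg(); the program name, sampling rate, and event layout are hypothetical, not from the proposal:

<code c>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct sock;    /* opaque; only the pointer value is used as a flow id */
struct msghdr;

struct event {
    __u64 sk;     /* socket pointer, used as a flow identifier */
    __u64 bytes;  /* size argument of tcp_sendmsg() */
};

struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(int));
    __uint(value_size, sizeof(int));
} events SEC(".maps");

SEC("kprobe/tcp_sendmsg")
int BPF_KPROBE(sample_sendmsg, struct sock *sk, struct msghdr *msg,
               unsigned long size)
{
    /* Randomly sample ~1% of calls to keep the overhead low. */
    if (bpf_get_prandom_u32() % 100 != 0)
        return 0;

    struct event e = { .sk = (__u64)sk, .bytes = size };

    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &e, sizeof(e));
    return 0;
}

char _license[] SEC("license") = "GPL";
</code>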
  
The final talk was related to TCP analytics for satellite broadband. It covered TCP performance challenges given a minimum RTT of 500 ms and how none of the congestion control algorithms deal with that. The recommendation was to use PEPs (Performance Enhancing Proxies) to avoid congestion; PEPs and AQM (Active Queue Management) avoid packet drops. It also covered the need to monitor TCP performance issues (needed to meet Service Level Objectives) and the monitoring challenges:
1. active measurement is intrusive and not scalable
2. use of passive measurement (L2 stats monitoring)
The talk also discussed QoE assurance and the need to troubleshoot abnormalities by correlating PEP TCP flow stats with RF stats.
  
Site: https://www.netdevconf.info/0x13/session.html?tcp-analytics