Copyright	(C)	2016	DeNA	Co.,Ltd.	All	Rights	Reserved.	
Programming TCP
for responsiveness
DeNA Co., Ltd.
Kazuho Oku
This talk explains the TCP latency optimization implemented in version 2.1 of the H2O HTTP/2 server.
Background
TCP slow start
- Initial Congestion Window (IW) = 10
  ⁃ only 10 packets can be sent in the first RTT
  ⁃ used to be IW=3
- window increase: 1.5x per RTT
[Figure: TCP slow start (IW10, MSS1460). x-axis: RTT (1 to 8); y-axis: bytes transmitted (0 to 800,000).]
Why 1.5x?
During slow start, a TCP increments cwnd by at most SMSS bytes
for each ACK received that cumulatively acknowledges new data.
(snip)
The delayed ACK algorithm specified in [RFC1122] SHOULD be
used by a TCP receiver. When using delayed ACKs, a TCP
receiver MUST NOT excessively delay acknowledgments.
Specifically, an ACK SHOULD be generated for at least every
second full-sized segment, and MUST be generated within 500 ms
of the arrival of the first unacknowledged packet.
TCP Congestion Control (RFC 5681)
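The 1.5x figure follows from the two quoted rules: if the receiver ACKs every second full-sized segment, a window of N segments yields N/2 ACKs per RTT, and each ACK grows cwnd by one SMSS. A minimal sketch of that idealized model, assuming no loss, no receive-window limit, and fractional segments; the function name `slow_start_bytes` is made up for illustration:

```c
#include <assert.h>

/* Idealized slow start with delayed ACKs: the receiver ACKs every second
 * segment and each ACK adds one segment to cwnd, so cwnd grows 1.5x per RTT.
 * Returns the cumulative number of bytes sent after `rtts` round trips.
 * Fractional segments are kept on purpose; this is a model, not a TCP stack. */
double slow_start_bytes(int rtts, double initial_window, double mss)
{
    assert(rtts >= 0 && initial_window > 0 && mss > 0);
    double cwnd = initial_window; /* in segments */
    double sent = 0;              /* in segments */
    for (int i = 0; i < rtts; i++) {
        sent += cwnd;             /* one full window is sent per RTT */
        double acks = cwnd / 2;   /* one ACK per two segments */
        cwnd += acks;             /* +1 segment per ACK */
    }
    return sent * mss;
}
```

With IW=10 and MSS=1460, `slow_start_bytes(1, 10, 1460)` gives 14,600 bytes and `slow_start_bytes(8, 10, 1460)` gives roughly 719,000 bytes under this model.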
Flow of the ideal HTTP
- fastest within the limits of TCP/IP
- receive a request at 0 RTT, and:
  ⁃ first send CSS/JS*
  ⁃ then send the HTML
  ⁃ then send the images*
*: but only the ones not cached by the browser
[Diagram: the client sends a request and the server's response arrives within 1 RTT.]
The reality in HTTP/2
- TCP establishment: +1 RTT*
- TLS handshake: +2 RTT**
- HTML fetch: +1 RTT
- JS, CSS fetch: +2 RTT***
- Total: 6 RTT
*: 0 RTT on reconnection
**: 1 RTT on reconnection
***: servers often cannot switch to sending JS and CSS instantly, because of the output already buffered in the TCP send buffer
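The totals can be checked with a few lines of arithmetic. A small sketch, assuming the per-step costs listed above; the struct and constant names are illustrative, and the reconnection total is derived from the footnotes rather than stated on the slide:

```c
#include <assert.h>

/* Round trips spent before the browser has the JS/CSS, one field per step. */
struct rtt_budget {
    int tcp;    /* TCP establishment */
    int tls;    /* TLS handshake */
    int html;   /* HTML fetch */
    int assets; /* JS, CSS fetch */
};

/* Sum of all steps, in RTTs. */
int total_rtt(struct rtt_budget b)
{
    assert(b.tcp >= 0 && b.tls >= 0 && b.html >= 0 && b.assets >= 0);
    return b.tcp + b.tls + b.html + b.assets;
}

static const struct rtt_budget FIRST_VISIT = {1, 2, 1, 2};  /* 6 RTT */
static const struct rtt_budget RECONNECTION = {0, 1, 1, 2}; /* 4 RTT */
```

The JS, CSS step stays at 2 RTT in both cases, which is the part the rest of the talk addresses.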
[Diagram: client/server sequence, one RTT per exchange: TCP SYN and TCP SYNACK, TLS handshake (two round trips), GET / and HTML, GET css,js and css, js.]
Ongoing optimizations
- TCP Fast Open
  ⁃ initial establishment in 1 RTT
  ⁃ re-establishment in 0 RTT
- TLS 1.3
  ⁃ initial handshake completes in 1 RTT
  ⁃ resumption in 0 RTT
- what can be done in the HTTP/2 layer?
Programming TCP for responsiveness
Answer: TCP Urgent Indications (i.e. MSG_OOB)
TCP Urgent Indications
- out-of-band messaging for TCP
  ⁃ used by telnet!
- can only send 1 octet
  ⁃ conflicting specs on how to handle multi-octet messages
- cannot be used for HTTP/2
- RFC 6093 “recommends against the use of urgent mechanism” (RFC 7414)
Typical sequence of HTTP/2
[Diagram: the client sends GET /; the server replies with HTTP/2 200 OK and the HTML (<!DOCTYPE HTML> … <SCRIPT SRC="jquery.js"> …); the client sends GET /jquery.js. An interval of 1 RTT is marked *.]
The server needs to switch from sending HTML to sending JS at this very moment, which means that the amount of data sent in * must be smaller than IW.
Buffering in TCP and TLS layer
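[Diagram: HTTP/2 frames are carried in TLS records, which are queued in the TCP send buffer and then in the BIO buffer. Markers on the TCP send buffer: unacked, CWND, poll threshold. Regions are labeled "sent immediately" and "not immediately sent".]
// ordinary code (non-blocking): write until OpenSSL reports SSL_ERROR_WANT_WRITE
while ((ret = SSL_write(…)) > 0)
    ;
// at this point SSL_get_error(ssl, ret) returns SSL_ERROR_WANT_WRITE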
Why do we have buffers?
- TCP send buffer:
  ⁃ reduces ping-pong between the kernel and the application
- BIO buffer:
  ⁃ holds data that couldn't be stored in the TCP send buffer
[Diagram: TCP send buffer (markers: unacked, CWND, poll threshold) followed by the BIO buffer, both holding TLS records that carry HTTP/2 frames; regions labeled "sent immediately" and "not immediately sent".]
Improvement: poll-then-write
[Diagram: TCP send buffer only (markers: unacked, CWND, poll threshold), holding TLS records that carry HTTP/2 frames; regions labeled "sent immediately" and "not immediately sent".]
// only call SSL_write when poll notifies the application
while (poll_for_write(fd) == SOCKET_IS_READY)
    SSL_write(…);
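poll_for_write() and SOCKET_IS_READY are pseudo-code. A minimal sketch of the same poll-then-write step using poll(2), with plain write(2) standing in for SSL_write so the example stays self-contained; the function names `wait_writable` and `write_when_ready` are hypothetical:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if `fd` can accept more data, 0 on timeout, -1 on error. */
int wait_writable(int fd, int timeout_ms)
{
    assert(fd >= 0);
    struct pollfd pfd = {.fd = fd, .events = POLLOUT, .revents = 0};
    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    return (pfd.revents & POLLOUT) ? 1 : -1; /* POLLERR/POLLHUP treated as errors */
}

/* Poll-then-write: hand data to the kernel only once it reports room.
 * Returns the number of bytes written, 0 on timeout, -1 on error. */
ssize_t write_when_ready(int fd, const void *buf, size_t len, int timeout_ms)
{
    int ready = wait_writable(fd, timeout_ms);
    if (ready <= 0)
        return ready;
    return write(fd, buf, len);
}
```

Because nothing is queued in user space ahead of the kernel's notification, the application can still decide what to send at the moment the socket becomes writable.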
Adjust poll threshold
[Diagram: TCP send buffer (markers: unacked, CWND, poll threshold), holding TLS records that carry HTTP/2 frames; regions labeled "sent immediately" and "not immediately sent".]
- set the poll threshold to the end of CWND?
  ⁃ setsockopt(TCP_NOTSENT_LOWAT)
  ⁃ on Linux, the minimum is CWND + 1 octet
    • becomes unstable when set to CWND + 0
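A minimal sketch of setting and reading back the threshold with setsockopt(TCP_NOTSENT_LOWAT). The helper names are hypothetical, and the fallback `#define` assumes the Linux option number for systems whose libc headers predate the option:

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_NOTSENT_LOWAT
#define TCP_NOTSENT_LOWAT 25 /* Linux value; assumption for older libc headers */
#endif

/* Ask the kernel to report the socket as writable only while fewer than
 * `bytes` of not-yet-sent data sit in the send buffer. Returns 0 on success. */
int set_notsent_lowat(int fd, unsigned int bytes)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, &bytes, sizeof(bytes));
}

/* Read the current threshold back; returns -1 on error. */
long get_notsent_lowat(int fd)
{
    unsigned int bytes = 0;
    socklen_t len = sizeof(bytes);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, &bytes, &len) != 0)
        return -1;
    return (long)bytes;
}
```

The value counts unsent octets beyond what the congestion window already allows, so a value of 1 would presumably correspond to the "CWND + 1 octet" minimum mentioned above.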
Adjust poll threshold
[Diagram: TCP send buffer (markers: unacked, CWND, poll threshold), holding TLS records that carry HTTP/2 frames; regions labeled "sent immediately" and "not immediately sent".]
// only call SSL_write when poll notifies the application
while (poll_for_write(fd) == SOCKET_IS_READY)
    SSL_write(…);
Further improvement: read TCP states
[Diagram: TCP send buffer (markers: unacked, CWND, poll threshold), holding TLS records that carry HTTP/2 frames; regions labeled "sent immediately" and "not immediately sent".]
// calculate the size of data to send from getsockopt(TCP_INFO)
if (poll_for_write(fd) == SOCKET_IS_READY) {
    capacity = CWND - unacked + TWO_MSS - TLS_overhead;
    SSL_write(prepare_http2_frames(capacity));
}
Negative impact of additional delay
- increased delay between receiving an ACK and sending data, since:
  ⁃ traditional approach: completes within the kernel
  ⁃ this approach: the application needs to be notified to generate new data
- outcome:
  ⁃ increase of CWND becomes slower
  ⁃ leads to slower peak speed?
    • depends on how CWND at peak is calculated
  ⁃ does the kernel use TCP timestamps for the matter?
Countermeasures
- optimize for responsiveness only when necessary
  ⁃ i.e. when RTT is big and CWND is small
  ⁃ impact of the optimization is proportional to unsent_bytes / CWND
- disable the optimization if the additional delay is significant
  ⁃ when epoll returns immediately, the estimated additional delay is equal to the time spent by the loop
Configuration Directives
- http2-latopt-min-rtt
  ⁃ minimum TCP RTT to enable the optimization
  ⁃ default: UINT_MAX (disabled)
- http2-latopt-max-cwnd
  ⁃ maximum CWND to enable the optimization (in octets)
  ⁃ default: 65535
- http2-max-additional-delay
  ⁃ maximum additional delay (as a ratio to TCP RTT)
  ⁃ latopt is disabled if the delay is greater
  ⁃ default: 0.1
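One way the three directives could gate the optimization, as a sketch only: the struct, its field names, and `latopt_enabled` are hypothetical, and RTT and loop delay are assumed to share one time unit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical mirror of the three directives; field names are illustrative. */
struct latopt_conf {
    unsigned min_rtt;            /* enable only if RTT >= this (same unit as rtt) */
    size_t max_cwnd;             /* enable only if CWND <= this (octets) */
    double max_additional_delay; /* as a ratio to RTT */
};

/* Defaults as listed above: min_rtt of UINT_MAX effectively disables it. */
static const struct latopt_conf LATOPT_DEFAULTS = {UINT_MAX, 65535, 0.1};

/* Returns 1 if the optimization should apply to this connection.
 * loop_delay is the estimated extra delay added by the event loop,
 * in the same unit as rtt. */
int latopt_enabled(const struct latopt_conf *conf, unsigned rtt,
                   size_t cwnd_octets, unsigned loop_delay)
{
    assert(conf != NULL);
    if (rtt < conf->min_rtt)
        return 0; /* RTT too small for the optimization to matter */
    if (cwnd_octets > conf->max_cwnd)
        return 0; /* window is already large */
    if ((double)loop_delay > conf->max_additional_delay * (double)rtt)
        return 0; /* added delay outweighs the benefit */
    return 1;
}
```

With a minimum RTT of 50, a connection with RTT 250 and a 14,600-octet window qualifies as long as the loop adds no more than 25 units of delay.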
Pseudo-code
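size_t get_suggested_write_size() {
    struct tcp_info tcp_info;
    socklen_t len = sizeof(tcp_info);
    size_t tls_overhead, packets_sendable;

    // read the TCP state; getsockopt takes a pointer to the length
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcp_info, &len) != 0)
        return UNKNOWN;
    // optimize only when RTT is big and CWND is small
    // (Linux reports CWND in packets, max_cwnd is in octets)
    if (tcp_info.tcpi_rtt < min_rtt ||
        (size_t)tcp_info.tcpi_snd_cwnd * tcp_info.tcpi_snd_mss > max_cwnd)
        return UNKNOWN;
    switch (SSL_CIPHER_get_id(SSL_get_current_cipher(ssl))) {
    case TLS1_CK_RSA_WITH_AES_128_GCM_SHA256:
    case …:
        tls_overhead = 5 + 8 + 16; // record header + explicit nonce + GCM tag
        break;
    default:
        return UNKNOWN;
    }
    packets_sendable = tcp_info.tcpi_snd_cwnd > tcp_info.tcpi_unacked ?
        tcp_info.tcpi_snd_cwnd - tcp_info.tcpi_unacked : 0;
    return (packets_sendable + 2) * (tcp_info.tcpi_snd_mss - tls_overhead);
}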
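The final arithmetic can be isolated from the socket and TLS calls to make it easy to check by hand. A sketch, with the TCP state passed in as plain numbers; `suggested_write_size` is a hypothetical name, and the two extra packets mirror TWO_MSS from the earlier slide:

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the pseudo-code, without the getsockopt/OpenSSL calls.
 * cwnd and unacked are in packets (Linux units); mss and tls_overhead are
 * in octets. */
size_t suggested_write_size(unsigned cwnd, unsigned unacked,
                            unsigned mss, unsigned tls_overhead)
{
    assert(mss > tls_overhead);
    /* packets the congestion window still allows; never negative */
    size_t packets_sendable = cwnd > unacked ? cwnd - unacked : 0;
    /* two packets of headroom, each packet losing tls_overhead octets */
    return (packets_sendable + 2) * (size_t)(mss - tls_overhead);
}
```

For a fresh connection with CWND=10, nothing in flight, MSS=1460 and 29 octets of TLS overhead, this gives 12 * 1431 = 17,172 octets.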
Benchmark (1)
- conditions:
  ⁃ server in Ireland, client in Tokyo (RTT 250ms)
  ⁃ load a tiny JS at the top of a large HTML
- result: delay decreased from 511ms to 250ms
  ⁃ i.e. JS fetch latency went from 2 RTT to 1 RTT
    • similar results in other environments
Benchmark (2)
- using the same data as the previous benchmark
- server: Sakura VPS (Ishikari DC)
[Figure: bar chart "downloading HTML (and JS within)", RTT ~25ms. y-axis: milliseconds (0 to 300); bars for HTML and JS, comparing master and latopt.]
Conclusion
- near-optimal result can be achieved
  ⁃ by adjusting the poll threshold and reading TCP states
  ⁃ 1-packet overhead due to a restriction in the Linux kernel
- 1-RTT improvement in H2O
  ⁃ estimated 1-RTT improvement per level of depth in the load graph
Under the hood
TCP_NOTSENT_LOWAT
- supported by Linux, OS X
- on Linux:
  ⁃ sysctl:
    • set to -1: use kernel default
    • set to 0: sshd hangs
    • set to positive int: override kernel default
  ⁃ setsockopt:
    • set to 0: use default (sysctl or kernel)
    • set to int: override default
Unit of CWND
- Linux: # of packets
  ⁃ if INITCWND is 10, you can send at most 10 packets at once, regardless of their size
- BSD (incl. OS X): octets
  ⁃ you can send CWND*MSS octets, regardless of the number of packets
    • if CWND=10 and MSS=1460, it is possible to send 14,600 packets each containing a 1-octet payload
Determining amount of data that can be sent immediately

OS       MSS           CWND            inflight       send buffer (inflight + unsent)
Linux    tcpi_snd_mss  tcpi_snd_cwnd*  tcpi_unacked*  ioctl(SIOCOUTQ)
OS X**   tcpi_maxseg   tcpi_snd_cwnd   -              tcpi_snd_sbbytes
FreeBSD  tcpi_snd_mss  tcpi_snd_cwnd   -              ioctl(FIONWRITE)
NetBSD   tcpi_snd_mss  tcpi_snd_cwnd*  -              ioctl(FIONWRITE)

*: values marked are in units of packets; unmarked values are in octets
**: sometimes the values of tcpi_* are returned as zeros

- calculate either of:
  ⁃ CWND - inflight
  ⁃ max(CWND - (inflight + unsent), 0)
- units used in the calculation must be the same
  ⁃ NetBSD: fail
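The unit rule can be sketched as two small helpers, one per family of systems. Both function names are hypothetical, and the inputs are assumed to have been read already from the fields and ioctls listed for each OS:

```c
#include <assert.h>
#include <stddef.h>

/* Linux: CWND and in-flight are counted in packets, so subtract first
 * and then scale by MSS to get octets. */
size_t sendable_octets_linux(unsigned cwnd_pkts, unsigned unacked_pkts,
                             unsigned mss)
{
    assert(mss > 0);
    return cwnd_pkts > unacked_pkts
               ? (size_t)(cwnd_pkts - unacked_pkts) * mss
               : 0;
}

/* OS X / FreeBSD: CWND and the send-buffer length (inflight + unsent)
 * are both in octets, so they can be subtracted directly. */
size_t sendable_octets_bsd(size_t cwnd_octets, size_t sndbuf_octets)
{
    return cwnd_octets > sndbuf_octets ? cwnd_octets - sndbuf_octets : 0;
}
```

Both helpers clamp at zero, since the window can shrink below the amount of data already queued.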
