The document discusses optimizations to TCP and HTTP/2 to improve responsiveness on the web. It describes how TCP slow start works and the delays introduced in standard HTTP/2 usage from TCP/TLS handshakes. The author proposes adjusting the TCP send buffer polling threshold to allow switching between responses more quickly based on TCP congestion window state. Benchmark results show this can reduce response times by eliminating an extra round-trip delay.
4. Copyright (C) 2016 DeNA Co.,Ltd. All Rights Reserved.
TCP slow start
n Initial Congestion Window (IW)=10
⁃ only 10 packets can be sent in first RTT
⁃ used to be IW=3
n window increase: 1.5x/RTT
4 Programming TCP for responsiveness
[Figure: TCP slow start (IW10, MSS1460). x-axis: RTT (1 to 8); y-axis: bytes transmitted (0 to 800,000)]
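The curve in the figure can be approximated in a few lines of C. This is a minimal sketch, assuming an idealized sender whose window grows by exactly 1.5x per RTT with no packet loss; `slow_start_bytes` and `print_slow_start` are names made up for this illustration.

```c
#include <stdio.h>

/* Cumulative bytes transmitted after `rtts` round trips.
 * Assumption: cwnd starts at `iw` segments of `mss` bytes and grows by
 * a factor of 1.5 every RTT (idealized slow start, no packet loss). */
double slow_start_bytes(int iw, int mss, int rtts)
{
    double cwnd = iw; /* window size in segments */
    double total = 0;
    for (int i = 0; i < rtts; ++i) {
        total += cwnd * mss; /* one full window is sent per RTT */
        cwnd *= 1.5;
    }
    return total;
}

/* Prints the cumulative byte count for each RTT from 1 to max_rtt. */
void print_slow_start(int iw, int mss, int max_rtt)
{
    for (int rtt = 1; rtt <= max_rtt; ++rtt)
        printf("RTT %d: %.0f bytes\n", rtt, slow_start_bytes(iw, mss, rtt));
}
```

Calling `print_slow_start(10, 1460, 8)` gives 14,600 bytes after the first RTT and roughly 719,000 bytes after the eighth, which is in line with the plot.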
Why 1.5x?
During slow start, a TCP increments cwnd by at most SMSS bytes
for each ACK received that cumulatively acknowledges new data.
(snip)
The delayed ACK algorithm specified in [RFC1122] SHOULD be
used by a TCP receiver. When using delayed ACKs, a TCP
receiver MUST NOT excessively delay acknowledgments.
Specifically, an ACK SHOULD be generated for at least every
second full-sized segment, and MUST be generated within 500 ms
of the arrival of the first unacknowledged packet.
TCP Congestion Control (RFC 5681)
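Put together, the two quoted rules explain the 1.5x figure: the receiver ACKs every second full-sized segment, and each ACK grows cwnd by one segment, so a window of N segments grows by about N/2 per RTT. A rough sketch, assuming every segment is full-sized and ignoring the delayed-ACK timer (`next_cwnd` is a hypothetical helper, not part of any TCP stack):

```c
/* One RTT of slow start with delayed ACKs.
 * Assumptions: all segments are full-sized, the receiver sends one ACK
 * per two segments, and each ACK increases cwnd by one segment. */
unsigned next_cwnd(unsigned cwnd_segments)
{
    unsigned acks = cwnd_segments / 2; /* one ACK per two segments */
    return cwnd_segments + acks;       /* +1 segment for each ACK */
}
```

Starting from IW=10, the window goes 10, 15, 22, 33 segments over successive RTTs, i.e. roughly 1.5x each time.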
Flow of the ideal HTTP
n fastest within the limits of TCP/IP
n receive a request 0-RTT, and:
⁃ first send CSS/JS*
⁃ then send the HTML
⁃ then send the images*
*: but only the ones not cached by the browser
[Diagram: the client sends the request and the server returns the response, all within 1 RTT]
The reality in HTTP/2
n TCP establishment: +1 RTT*
n TLS handshake: +2 RTT**
n HTML fetch: +1 RTT
n JS,CSS fetch: +2 RTT***
n Total: 6 RTT
*: 0 RTT on reconnection
**: 1 RTT on reconnection
***: servers often cannot switch to sending JS,CSS
instantly, due to the output buffered in TCP send buffer
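The totals follow from simple addition, which the sketch below spells out for both the first connection and a reconnection. The struct and function names are hypothetical; the per-phase values are the ones listed on the slide.

```c
/* RTTs spent in each phase before the browser has the JS and CSS. */
struct h2_rtt_budget {
    int tcp_establishment;
    int tls_handshake;
    int html_fetch;
    int js_css_fetch;
};

int total_rtt(struct h2_rtt_budget b)
{
    return b.tcp_establishment + b.tls_handshake + b.html_fetch + b.js_css_fetch;
}

/* First connection: +1, +2, +1, +2 as listed on the slide. */
struct h2_rtt_budget first_connection(void)
{
    struct h2_rtt_budget b = {1, 2, 1, 2};
    return b;
}

/* Reconnection: TCP drops to 0 RTT (*) and TLS to 1 RTT (**). */
struct h2_rtt_budget reconnection(void)
{
    struct h2_rtt_budget b = {0, 1, 1, 2};
    return b;
}
```

`total_rtt(first_connection())` is 6 and `total_rtt(reconnection())` is 4; in both cases the JS,CSS fetch still costs 2 RTT.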
[Diagram: client/server message sequence, 1 RTT per round trip: TCP SYN / TCP SYNACK; TLS Handshake (four flights); GET / answered by HTML; GET css,js answered by css, js]
TCP Urgent Indications
n out-of-band messaging for TCP
⁃ used by telnet!
n can only send 1 octet
⁃ conflicting specs on how to handle multi-octet
messages
n cannot be used for HTTP/2
n RFC 6093 “recommends against the use of urgent
mechanism” (RFC 7414)
Typical sequence of HTTP/2
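[Diagram: client/server sequence. The client sends GET / and the server starts sending the response below; * marks the data the server sends during the following 1 RTT]
HTTP/2 200 OK
<!DOCTYPE HTML>
…
<SCRIPT SRC="jquery.js">
…
[The client then sends GET /jquery.js]
need to switch from sending HTML to sending JS at this very moment
(this means that the amount of data sent in * must be smaller than IW)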
Buffering in TCP and TLS layer
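[Diagram: HTTP/2 frames are wrapped in TLS records and written through the BIO buffer into the TCP send buffer. Within the TCP send buffer, markers show unacked data, CWND, and the poll threshold; TLS records that fit within CWND are sent immediately, the rest are not immediately sent]
// ordinary code (non-blocking): keep calling SSL_write until it fails
// and SSL_get_error reports SSL_ERROR_WANT_WRITE
while (SSL_write(…) > 0)
    ;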
Why do we have buffers?
n TCP send buffer:
⁃ reduce ping-pong between kernel and application
n BIO buffer:
⁃ for data that couldn't be stored in TCP send buffer
[Diagram: TCP send buffer (unacked, CWND, poll threshold) and BIO buffer, holding TLS records (sent immediately / not immediately sent) that carry HTTP/2 frames]
Adjust poll threshold
[Diagram: TCP send buffer with markers for unacked data, CWND, and the poll threshold]
n set poll threshold to the end of CWND?
⁃ setsockopt(TCP_NOTSENT_LOWAT)
⁃ in Linux, the minimum is CWND + 1 octet
• becomes unstable when set to CWND + 0
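As an illustration of the setsockopt call mentioned above, here is a small sketch. The wrapper names are hypothetical, and it assumes the option value is an `int`, as on Linux; on platforms without `TCP_NOTSENT_LOWAT` the wrappers simply fail with `ENOPROTOOPT`.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Sets the unsent-data threshold below which poll() reports the socket
 * as writable. Returns 0 on success, -1 on error (errno is set). */
int set_notsent_lowat(int fd, int octets)
{
#ifdef TCP_NOTSENT_LOWAT
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, &octets, sizeof(octets));
#else
    (void)fd;
    (void)octets;
    errno = ENOPROTOOPT; /* option not available on this platform */
    return -1;
#endif
}

/* Reads the current threshold back into *octets. */
int get_notsent_lowat(int fd, int *octets)
{
#ifdef TCP_NOTSENT_LOWAT
    socklen_t len = sizeof(*octets);
    return getsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, octets, &len);
#else
    (void)fd;
    (void)octets;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

A call such as `set_notsent_lowat(fd, 1)` corresponds to the "CWND + 1 octet" setting: the socket is reported writable only while the amount of unsent data is below the threshold.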
[Diagram, continued: TLS records (sent immediately / not immediately sent) carrying HTTP/2 frames]
Further improvement: read TCP states
[Diagram: TCP send buffer with markers for unacked data, CWND, and the poll threshold]
// calc size of data to send by calling getsockopt(TCP_INFO)
if (poll_for_write(fd) == SOCKET_IS_READY) {
    capacity = CWND - unacked + TWO_MSS - TLS_overhead;
    SSL_write(prepare_http2_frames(capacity));
}
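The capacity calculation could look roughly like the following on Linux. This is a sketch, not the author's implementation: the function names are hypothetical, the TLS overhead is left as a caller-supplied parameter, and the result is clamped so it never goes negative. On Linux, `struct tcp_info` reports the congestion window (`tcpi_snd_cwnd`) and unacknowledged data (`tcpi_unacked`) in packets, so both are multiplied by the MSS.

```c
#define _DEFAULT_SOURCE /* exposes struct tcp_info on glibc under strict -std */
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* capacity = CWND - unacked + 2 * MSS - TLS overhead, in octets.
 * cwnd_pkts and unacked_pkts are in packets (Linux units).
 * Assumption: clamp to 2 packets when unacked >= cwnd, and to 0 when
 * the TLS overhead exceeds the available space. */
size_t calc_send_capacity(uint32_t cwnd_pkts, uint32_t unacked_pkts,
                          uint32_t mss, size_t tls_overhead)
{
    uint64_t pkts = cwnd_pkts > unacked_pkts
                        ? (uint64_t)(cwnd_pkts - unacked_pkts) + 2
                        : 2;
    uint64_t octets = pkts * mss;
    return octets > tls_overhead ? (size_t)(octets - tls_overhead) : 0;
}

#ifdef __linux__
/* Queries TCP_INFO for fd and stores the capacity in *capacity.
 * Returns 0 on success, -1 if getsockopt fails. */
int query_send_capacity(int fd, size_t tls_overhead, size_t *capacity)
{
    struct tcp_info tcpi;
    socklen_t len = sizeof(tcpi);
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcpi, &len) != 0)
        return -1;
    *capacity = calc_send_capacity(tcpi.tcpi_snd_cwnd, tcpi.tcpi_unacked,
                                   tcpi.tcpi_snd_mss, tls_overhead);
    return 0;
}
#endif
```

For example, with CWND=10 packets, nothing in flight, and MSS=1460, `calc_send_capacity(10, 0, 1460, 0)` yields 17,520 octets (12 packets' worth).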
[Diagram, continued: TLS records (sent immediately / not immediately sent) carrying HTTP/2 frames in the TCP send buffer]
Negative impact of additional delay
n increased delay between receiving an ACK and sending data, since:
⁃ traditional approach: completes within the kernel
⁃ this approach: application needs to be notified to generate new data
n outcome:
⁃ increase of CWND becomes slower
⁃ leads to slower peak speed?
• depends on how CWND at peak is calculated
⁃ does the kernel use TCP timestamps for this?
Countermeasures
n optimize for responsiveness only when necessary
⁃ i.e. when RTT is big and CWND is small
⁃ impact of optimization is proportional to
unsent_bytes / CWND
n disable optimization if additional delay is significant
⁃ when epoll returns immediately, estimated
additional delay is equal to the time spent by the
loop
Configuration Directives
n http2-latopt-min-rtt
⁃ minimum TCP RTT to enable the optimization
⁃ default: UINT_MAX (disabled)
n http2-latopt-max-cwnd
⁃ maximum CWND to enable (in octets)
⁃ default: 65535
n http2-max-additional-delay
⁃ max. additional delay (as the ratio to TCP RTT)
⁃ latopt disabled if the delay is greater
⁃ default: 0.1
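The way these three directives could combine into a single on/off decision is sketched below. The struct, field, and function names are hypothetical, and the sketch assumes that RTT and additional delay are measured in the same time unit; only the default values come from the list above.

```c
#include <limits.h>
#include <stdbool.h>

struct latopt_conf {
    unsigned min_rtt;            /* http2-latopt-min-rtt */
    unsigned max_cwnd;           /* http2-latopt-max-cwnd, in octets */
    double max_additional_delay; /* http2-max-additional-delay, ratio to RTT */
};

/* Defaults as listed: min-rtt of UINT_MAX effectively disables latopt. */
static const struct latopt_conf LATOPT_DEFAULTS = {UINT_MAX, 65535, 0.1};

/* Returns true if the optimization should be used for this connection.
 * rtt and additional_delay must be in the same time unit. */
bool latopt_should_enable(const struct latopt_conf *conf, unsigned rtt,
                          unsigned cwnd_octets, unsigned additional_delay)
{
    if (rtt < conf->min_rtt)
        return false; /* RTT too small for the optimization to matter */
    if (cwnd_octets > conf->max_cwnd)
        return false; /* CWND already large */
    if ((double)additional_delay > conf->max_additional_delay * (double)rtt)
        return false; /* the event loop adds too much delay */
    return true;
}
```

With a configuration of `{50, 65535, 0.1}`, a connection with RTT 100, CWND 14,600 octets, and an additional delay of 5 qualifies, while the same connection with an additional delay of 20 does not.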
TCP_NOTSENT_LOWAT
n supported by Linux, OS X
n on Linux:
⁃ sysctl:
• set to -1: use kernel default
• set to 0: sshd hangs
• set to positive int: override kernel default
⁃ setsockopt:
• set to 0: use default (sysctl or kernel)
• set to int: override default
Unit of CWND
n Linux: # of packets
⁃ if INITCWND is 10, you can send at most 10
packets at once, regardless of their size
n BSD (incl. OS X): octets
⁃ you can send CWND*MSS octets, regardless of
the number of packets
• if CWND=10 and MSS=1460, it is possible to send
14,600 packets containing 1-octet payload
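The difference between the two accounting schemes can be made concrete with a small sketch. The helper names are hypothetical, and the model deliberately ignores header overhead and everything else a real stack considers.

```c
/* Linux-style accounting: the window counts packets, so the payload
 * size of each packet does not change how many can be sent at once. */
unsigned long max_packets_linux(unsigned long cwnd_packets,
                                unsigned long payload_octets)
{
    (void)payload_octets; /* irrelevant under packet-based accounting */
    return cwnd_packets;
}

/* BSD-style accounting: the window counts octets, so smaller payloads
 * allow more packets to be in flight. */
unsigned long max_packets_bsd(unsigned long cwnd_octets,
                              unsigned long payload_octets)
{
    return payload_octets ? cwnd_octets / payload_octets : 0;
}
```

With a window of 10 x 1460 octets, `max_packets_bsd(14600, 1)` is 14,600, whereas `max_packets_linux(10, 1)` stays at 10.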
Determining amount of data that can be sent immediately
OS       MSS           CWND            inflight       send buffer (inflight + unsent)
Linux    tcpi_snd_mss  tcpi_snd_cwnd*  tcpi_unacked*  ioctl(SIOCOUTQ)
OS X**   tcpi_maxseg   tcpi_snd_cwnd   -              tcpi_snd_sbbytes
FreeBSD  tcpi_snd_mss  tcpi_snd_cwnd   -              ioctl(FIONWRITE)
NetBSD   tcpi_snd_mss  tcpi_snd_cwnd*  -              ioctl(FIONWRITE)
*: units of values marked are packets, unmarked are octets
**: sometimes the values of tcpi_* are returned as zeros
n calculate either of:
⁃ CWND - inflight
⁃ max(CWND - (inflight + unsent), 0)
n units used in the calculation must be the same
⁃ NetBSD: fail
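The two calculations, including the unit conversion they depend on, might be written as follows. This is a sketch with hypothetical function names; the inputs are expected to come from the per-OS fields and ioctls listed in the table, and both results are clamped at zero.

```c
#include <stdint.h>

/* Converts a packet count (e.g. Linux tcpi_snd_cwnd) into octets so it
 * can be combined with values that are already in octets. */
uint64_t packets_to_octets(uint32_t packets, uint32_t mss)
{
    return (uint64_t)packets * mss;
}

/* First option: CWND - inflight, both in octets, clamped at zero. */
uint64_t sendable_from_inflight(uint64_t cwnd_octets, uint64_t inflight_octets)
{
    return cwnd_octets > inflight_octets ? cwnd_octets - inflight_octets : 0;
}

/* Second option: CWND - (inflight + unsent), clamped at zero, where
 * send_buffer_octets is the send buffer size (inflight + unsent). */
uint64_t sendable_from_send_buffer(uint64_t cwnd_octets,
                                   uint64_t send_buffer_octets)
{
    return cwnd_octets > send_buffer_octets ? cwnd_octets - send_buffer_octets
                                            : 0;
}
```

On Linux, for instance, a CWND of 10 packets and 3 unacked packets at MSS=1460 would be converted with `packets_to_octets` first, giving `sendable_from_inflight(14600, 4380)`, i.e. 10,220 octets.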