Securing VoIP - A Framework To Mitigate or Manage Risks
Research Online
12-4-2007
Andrew Woodward
Edith Cowan University
DOI: 10.4225/75/57b54a2fb875b
5th Australian Information Security Management Conference, Edith Cowan University, Perth Western Australia,
December 4th 2007
This Conference Proceeding is posted at Research Online.
https://ro.ecu.edu.au/ism/33
Proceedings of The 5th Australian Information Security Management Conference
Abstract
In Australia, the past few years have seen Voice over IP (VoIP) move from a niche communications medium
used by organisations with the appropriate infrastructure and capabilities to a technology that is available to
anyone with a good broadband connection. Driven by low-cost and no-cost phone calls, easy-to-use VoIP
clients and increasingly reliable connections, VoIP is replacing the Public Switched Telephone Network (PSTN) in
a growing number of households. VoIP adoption appears to be following a similar path to early Internet
adoption, namely little awareness by users of the security implications. Lack of concern about security by VoIP
users is probably due to the relatively risk free service provided by the PSTN. However, VoIP applications use
the Internet as their communications medium and therefore the risk profile is significantly different to the PSTN.
This paper reviews the risks for two VoIP implementation models now being increasingly used in Australian
homes: the PC softphone and the Analogue Telephony Adaptor (ATA). An overview of each of the VoIP
implementation models is given together with a description of the respective technologies and protocols utilised.
The VoIP security threats, applicable to the two VoIP implementation models considered, are enumerated and
vulnerabilities that could be exploited are considered. Available security mechanisms that address the
identified vulnerabilities are discussed. A practical and pragmatic VoIP security framework is proposed that
will enable a user to mitigate or manage the risks associated with using the VoIP implementation models
considered. By applying the VoIP security framework a user will be able to deploy a secure VoIP solution
appropriate for residential use.
Keywords
VoIP Security, softphone, analogue telephony adapter (ATA), VoIP threats and vulnerabilities, Risk
management
INTRODUCTION
Voice over Internet Protocol (VoIP) is the packaging and routing of voice conversations over an IP-based
network, e.g. the Internet. VoIP products have been commercially available for many years but were initially
confined to distributed organisations with internal high speed IP networks looking to reduce the cost of
intra-company telephone calls. VoIP needs fast IP networks with high throughput, without which it can suffer
from high latency (delay in the time taken for VoIP packets to travel from source to destination), jitter
(variation in packet arrival times, which can leave packets arriving unevenly or out of sequence) and packet
loss. While residential access to the Internet was via dial-up modems, VoIP was not feasible. However, the
rapid growth of residential broadband connectivity has made good quality, reliable VoIP services available
to households.
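To make latency, jitter and packet loss concrete, the following minimal Python sketch computes all three from a list of packet records. The `Packet` layout and the `call_quality` function are hypothetical names introduced for illustration. The jitter value uses the running interarrival-jitter estimate defined in the RTP specification (RFC 3550), applied here to times in seconds and assuming the records are listed in arrival order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    seq: int                    # sequence number assigned by the sender
    sent: float                 # send time in seconds
    received: Optional[float]   # arrival time in seconds, None if lost


def call_quality(packets: List[Packet]) -> dict:
    """Summarise mean latency, interarrival jitter and loss for one stream."""
    if not packets:
        raise ValueError("no packets supplied")
    arrived = [p for p in packets if p.received is not None]
    loss = 1 - len(arrived) / len(packets)
    transit = [p.received - p.sent for p in arrived]
    latency = sum(transit) / len(transit) if transit else None

    # RFC 3550 running estimate: J += (|D| - J) / 16, where D is the
    # change in transit time between consecutive received packets.
    jitter = 0.0
    for prev, cur in zip(transit, transit[1:]):
        jitter += (abs(cur - prev) - jitter) / 16
    return {"latency": latency, "jitter": jitter, "loss": loss}
```

A stream whose packets all take the same time to arrive has zero jitter regardless of how large that time is, which is why latency and jitter are reported separately.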
Most VoIP implementations, like many Internet services, were designed using available standard protocols, with
security not being a key design criterion. Adding security to a solution, rather than including it at the design
stage, typically leads to an inefficient and sometimes ineffective solution: the added security functionality can
degrade the solution’s performance and, because it is not integral to the design, it may be possible to bypass
it. Any impact on performance can be critical to the correct operation of a VoIP solution, because increased
latency, jitter and packet loss can result. Therefore, to be adopted, VoIP security mechanisms must not
degrade the performance of VoIP sessions.
A Softphone is application software that runs on a PC and allows telephone calls to be made over an IP network.
A Softphone requires no dedicated hardware; it uses the PC soundcard for voice input and output. Whilst a
PC’s speakers and a microphone can be used, the best results are achieved with a headset with an integrated
microphone or a USB-attachable handset. There are numerous Softphone applications available; the most
popular include Skype, Gizmo and KPhone (distributed with KDE – K Desktop Environment). In Australia,
Softphone use started to gain momentum from 2003.
An Analogue Telephony Adaptor (ATA), also known as a VoIP adaptor, is a hardware unit that is positioned
in-line between an analogue telephone and a PSTN line, and connected to an IP network with a broadband modem.
Residential use of ATAs in Australia started to gain momentum in late 2004 with rapid growth occurring from
mid 2006. An ATA can be packaged in a number of ways including:
• A small box with only ATA functionality: This box, in addition to being connected in-line between a
phone and telephone line, must also be connected to an IP network with a broadband modem/router.
• Broadband Modem/Router: ATA functionality is packaged together with a broadband modem/router;
both wired and wireless ATA broadband modem/routers are available.
• VoIP Phone: The ATA functionality is packaged within a telephone. The VoIP telephone is connected
to both the telephone line and an IP network with a broadband modem/router.
Softphones and ATAs face both common and distinct security threats, which can include
eavesdropping, denial of service (DoS) and toll fraud. VoIP threats arise because of vulnerabilities in the VoIP
technology and/or due to how the technology is configured and deployed. The security framework proposed in
this paper identifies the security mechanisms and techniques to prevent the vulnerabilities from being exploited
and enable a VoIP solution to be deployed that is appropriate for residential use.
The term User Agent (UA) is used throughout this paper to refer to both a Softphone and ATA.
[Figure: ATA connections. Surviving labels: PSTN Phone, ATA, IP, LAN.]
The following set of scenarios show how calls made from and to a residence via either the PSTN or the Internet
are handled by an ATA:
1. Call from residence to a phone connected to the PSTN or a UA on the same VoIP network:
• If the (calling) residence broadband connection is enabled the ATA routes the call onto
the LAN and out on to the Internet via the broadband router.
• If the (calling) residence broadband connection is not enabled the ATA routes the call
onto the PSTN phone line connected to the ATA, i.e. VoIP is not used to send the call.
2. Call received by residence from a phone connected to the PSTN using the residence PSTN number:
• The ATA routes the call from the PSTN line straight to the phone; nothing goes
through the residence LAN. Calls are received irrespective of whether the residence
LAN is working.
3. Call received by residence from a phone connected to the PSTN using the residence VoIP number:
• The call is received over the Internet (as it has been routed via a PSTN Gateway to a
proxy server at the VoIP service provider, see Figure 2 below) and therefore the ATA
routes the call from the IP connection to the phone. If the residence broadband is not
enabled the call will not be received.
4. Call received by residence from another UA on the same VoIP network:
• If the call is made using the PSTN number the call will be routed on to the Internet and
then routed on to the PSTN network (via a PSTN Gateway). The call will be received
as per scenario 2.
• If the call is made using the VoIP number the call will be routed over the Internet and
through the ATA to the phone.
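The outbound and inbound scenarios above reduce to a small decision table. The sketch below is an illustration only; the `Route` values, function names and the `"pstn"`/`"voip"` labels are hypothetical and do not describe the firmware of any particular ATA product.

```python
from enum import Enum


class Route(Enum):
    VOIP = "IP connection via LAN and broadband router"
    PSTN = "PSTN line connected to the ATA"
    NOT_DELIVERED = "call cannot be delivered"


def route_outbound(broadband_up: bool) -> Route:
    """Outbound call: use VoIP when broadband is enabled, else the PSTN line."""
    return Route.VOIP if broadband_up else Route.PSTN


def route_inbound(dialled_number: str, broadband_up: bool) -> Route:
    """Inbound call: the path depends on which residence number was dialled.

    dialled_number is "pstn" or "voip" (hypothetical labels).
    """
    if dialled_number == "pstn":
        # Arrives on the PSTN line and goes straight to the phone,
        # irrespective of the state of the residence LAN.
        return Route.PSTN
    if dialled_number == "voip":
        # Arrives over the Internet, so broadband must be enabled.
        return Route.VOIP if broadband_up else Route.NOT_DELIVERED
    raise ValueError(f"unknown number type: {dialled_number!r}")
```

The only case in which a call is lost is an inbound call to the VoIP number while the broadband connection is down.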
In this paper, where an ATA implementation (product) and VoIP service provider are required as a point of
reference, the Sipura/Linksys ATA product (Sipura Technology Inc. 2004) and the Engin¹ VoIP service (Engin
Website 2007) are used.
To enable a concise and consistent analysis to be performed, the two VoIP implementation models considered in
this paper utilise the same protocol/technology set and therefore have the same architecture and concept of
operation. The protocol/technology set selected is the set most commonly used in VoIP implementations. The
Softphone and ATA products selected as reference points (with the exception of Skype) generally conform to
the nominated architecture and protocol/technology set.
Concept of Operation – VoIP Network
VoIP operation is shown using two conceptual models. Figure 2 presents a network model of the different types
of ‘node’ that can exist in a VoIP network while Figure 3 models the sequence of events involved in establishing
a VoIP telephone call.
The network model (Figure 2) shows a number of residences, each with different telephony capabilities, and the
core elements provided by a VoIP service provider. A description of each of the nodes in the network model is
also given.
VoIP Service Provider: The core elements to enable a quality VoIP service to be provided include:
• Provisioning Server: This server provides configuration details to an ATA when the ATA is first
connected to a VoIP network.
• Proxy Server: The proxy provides two services: registration acceptance, and proxying of call requests and
responses between UAs. When a UA registers it is informing the proxy of its IP address and the port
number it can be reached on. Registration is for a finite period and therefore the UA periodically
renews registration. When a UA initiates a call it sends a request to the proxy server with details of the
intended recipient. The proxy requests the recipient’s details from the Location Server.
• Billing Server: A billing server generates billing data for chargeable calls.
• Location Server: A location server manages a database of VoIP IP addresses and also contains rules
on the most appropriate PSTN Gateway to use to route calls destined for the PSTN.
¹ Engin is currently the largest VoIP broadband service provider in Australia.
• PSTN Gateway: A PSTN Gateway provides an interface to the PSTN to allow a UA to send and
receive phone calls from the PSTN. The PSTN Gateway essentially behaves like a UA.
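The registration behaviour described for the proxy server above (a UA reports its IP address and port, and the registration lapses unless renewed) can be sketched as a small lookup table with expiry times. This is a simplified illustration: the `Registrar` class is hypothetical, it merges the proxy and location server roles into one object, and the default lifetime of 3600 seconds is an assumption.

```python
import time
from typing import Dict, Optional, Tuple


class Registrar:
    """Toy stand-in for the registration service of a proxy server."""

    def __init__(self) -> None:
        # user -> (ip, port, expiry time in seconds)
        self._bindings: Dict[str, Tuple[str, int, float]] = {}

    def register(self, user: str, ip: str, port: int,
                 expires: float = 3600, now: Optional[float] = None) -> None:
        """Record, or renew, where a UA can currently be reached."""
        now = time.time() if now is None else now
        self._bindings[user] = (ip, port, now + expires)

    def lookup(self, user: str,
               now: Optional[float] = None) -> Optional[Tuple[str, int]]:
        """Return (ip, port) for a registered UA, or None if unknown or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(user)
        if binding is None or binding[2] <= now:
            self._bindings.pop(user, None)  # drop stale registrations
            return None
        return binding[0], binding[1]
```

Because a lapsed registration makes the UA unreachable for inbound VoIP calls, a UA renews well before the expiry time.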
5. The conversation is performed in a ‘peer-to-peer’ like format, i.e. transmission appears to be
direct from UA to UA; in practice transmission is via the VoIP service provider (and/or Internet
service provider, if the VoIP and Internet service providers are different).
[Figure 3: sequence of events in establishing a VoIP call. Surviving labels: 1. X initiates call to Y;
2. Obtain Y address (Location server); 3. Call from Y.]
An overview of each of the application and transport layer protocols is given below:
Session Initiation Protocol: There are a number of protocols available for VoIP call initialisation but SIP is
the most popular. SIP can be used with a range of transport protocols but provides a good VoIP solution when
coupled with RTP and UDP. SIP is used essentially to introduce the calling parties (UAs) and to inform them which
IP addresses they are using. SIP requires proxy servers and location servers to perform address resolution. Once
a connection is established the UAs perform the actual transmission, as shown in Figure 3, i.e. the voice traffic
can flow without passing through the proxy servers. A good overview of the application of SIP to VoIP can be
found in the NIST guidelines (Kuhn et al 2005). By default SIP uses port 5060.
RTP Control Protocol: RTCP is an application layer protocol used by a UA to assist Quality of Service
(QoS) delivery. RTCP does not transport voice data; its purpose is to supply statistical information to a UA to
allow the UA to change parameters and settings to improve QoS.
Real-time Transport Protocol: RTP provides a packet format for delivering VoIP traffic. RTP packets contain
the information required to reassemble the packets, upon delivery, into a voice signal. To enable RTP packets to
be transmitted across ordinary network nodes on the Internet they are carried as UDP datagrams.
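The information RTP carries for reassembly is easiest to see in its 12-byte fixed header, which holds a sequence number, a timestamp and a stream identifier (SSRC). The sketch below packs and parses that header with Python's `struct` module. It assumes no CSRC entries, padding or header extension; the helper names are hypothetical, and `reorder` ignores sequence-number wrap-around.

```python
import struct
from typing import List, NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def pack_rtp(payload_type: int, sequence: int, timestamp: int,
             ssrc: int, payload: bytes) -> bytes:
    """Build an RTP packet with the 12-byte fixed header (no CSRCs)."""
    byte0 = 2 << 6                # version 2, no padding, no extension, CC=0
    byte1 = payload_type & 0x7F   # marker bit clear
    header = struct.pack("!BBHII", byte0, byte1, sequence & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp(packet: bytes) -> Tuple[RtpHeader, bytes]:
    """Split a packet into its fixed header fields and its payload."""
    byte0, byte1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(byte0 >> 6, byte1 & 0x7F, seq, ts, ssrc), packet[12:]


def reorder(packets: List[bytes]) -> List[bytes]:
    """Return payloads in sequence-number order (ignores wrap-around)."""
    parsed = [parse_rtp(p) for p in packets]
    parsed.sort(key=lambda item: item[0].sequence)
    return [payload for _, payload in parsed]
```

Note that the payload follows the header unmodified: nothing in this packet format hides the voice samples from anyone who can capture the UDP datagram.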
User Datagram Protocol: UDP is one of the key protocols in the IP suite. UDP tends to be used for time
sensitive applications and therefore is suitable for VoIP applications; UDP can deliver data faster than
connection-oriented transport layer protocols such as TCP as it does not acknowledge, retransmit or reorder
packets.
Resource Reservation Protocol: RSVP is a transport layer control protocol. RSVP does not transport data; it
provides QoS information to hosts and routers to enable high quality VoIP sessions to be delivered.
It should be noted that whilst Skype is occasionally referenced as an example product in this paper it does not
conform to the protocols defined in the application and transport layers in the above architectural model. Skype
uses a proprietary and unpublished protocol set.
• Cyber Attacks: Like malicious software, cyber attacks can exploit PCs (running Softphones) and ATAs that
are exposed by the tunnels or open ports configured on broadband routers and/or firewalls to
achieve QoS.
• Toll Fraud: Breaking into (‘hacking into’) a VoIP network can allow an attacker to initiate outbound
calls. Whilst this threat is more likely to be exploited in corporate networks than residential, such
attacks can occur.
The identified VoIP threats arise due to opportunities that exist to exploit vulnerabilities (Tucker 2004) in the
deployed VoIP technology and/or the poor configuration of the VoIP technology and its underlying network
infrastructure. Some of the key vulnerabilities in the VoIP implementation model considered in this paper
include:
Lack of Signal Confidentiality: As shown in Figure 3, when a call is initiated SIP communicates with proxy
and location servers (hosted by the VoIP service provider) to establish the call. SIP sends and receives the call
parameters in plain text. It is therefore possible to perform network traffic analysis to monitor calls and enable
eavesdropping to be performed.
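Because the call parameters travel as plain text, anyone who can capture the signalling traffic can read who is calling whom without breaking any cryptography. The following sketch parses a hand-written SIP INVITE; the addresses, tag and Call-ID are invented sample values and `sip_headers` is a hypothetical helper, not part of any SIP library.

```python
def sip_headers(raw: bytes) -> dict:
    """Return the start line and headers of an unencrypted SIP message."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("utf-8", errors="replace")
    start_line, *lines = head.split("\r\n")
    headers = {"start-line": start_line}
    for line in lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


# Invented sample of what a capture of UDP port 5060 might contain.
CAPTURED = (
    b"INVITE sip:bob@example.com SIP/2.0\r\n"
    b"Via: SIP/2.0/UDP 192.0.2.10:5060\r\n"
    b"From: <sip:alice@example.com>;tag=1928\r\n"
    b"To: <sip:bob@example.com>\r\n"
    b"Call-ID: a84b4c76e66710\r\n"
    b"CSeq: 1 INVITE\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)

if __name__ == "__main__":
    seen = sip_headers(CAPTURED)
    print(seen["from"], "is calling", seen["to"])
```

A real INVITE also carries a session description in its body naming the IP address and port to be used for the voice stream, which is what allows signalling analysis to lead on to eavesdropping on the conversation itself.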
Lack of Dialogue Confidentiality: Once a call is established, RTP carries the conversation as a peer to peer like
transmission (see Figure 3, event 5). RTP does not encrypt its payload therefore it is possible to eavesdrop on
the conversation.
End Point to End Point Security: While neither SIP nor RTP provides encryption in its standard mode of
operation, both protocols can be configured, using certain Internet encryption standards, to preserve
the confidentiality of both VoIP signal and dialogue. However, an encrypted VoIP stream requires both UAs
to have their encryption capabilities enabled and the VoIP service provider to support
encryption. In practice, the use of encryption is often only possible between identical Softphones.
Softphone to PSTN phone encryption is not possible and, as can be seen below, Softphone/ATA to
ATA encryption via a commercial VoIP service provider is unlikely to be available (Enquiry about Engin VoIP
Security 2007).
Lack of an Encryption Service from VoIP Service Providers: Information provided by Engin (Enquiry about
Engin VoIP Security 2007), currently Australia’s largest VoIP service provider, indicates that commercial
VoIP service providers are not encrypting data from UAs. Engin provided advice on how encryption could be
performed for both a Softphone and the Sipura/Linksys ATA (Sipura Technology Inc. 2004), but neither encryption
capability was supported by Engin.
End Point Security – Lack of Boundary Security: Many residences connect to the Internet using a basic
broadband modem. Lack of any boundary security, (i.e. no firewall or router with firewall capabilities) results
in the residence UA being vulnerable to network attacks.
End Point Security – Network Address Translation: As shown in Figures 2 & 3, SIP requires a Proxy Server
to enable a call to be established. The SIP proxy server also requires the actual IP address of the UA. If the UA
is positioned behind a router or firewall that performs Network Address Translation (NAT) then problems
occur when initiating a call. NAT is the technique of allowing a number of devices behind a router/firewall to
share, and masquerade as, a single public IP address; each device behind the firewall has its own private address
that is not visible to the Internet, i.e. the UA’s IP address cannot be seen by the proxy server. For the proxy
server to communicate with the UA, port forwarding rules (also known as tunnelling) need to be established on
the router/firewall if the router/firewall does not support SIP traversal functionality. Port forwarding creates a
tunnel that allows network traffic to pass straight through a router/firewall to a specific IP address, i.e. a hole is
created in the firewall, making the residence network vulnerable to attack. Attackers may be able to exploit the
hole/tunnel in the router/firewall to access the residence network, particularly as some VoIP service providers
will not specify the source IP address of their proxy servers (Engin Website 2007), resulting in the firewall port
forwarding rule having to be set to allow forwarding from “ALL” external IP addresses.
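The difference between a forwarding rule open to all sources and one restricted to the provider's proxy servers can be illustrated with Python's standard `ipaddress` module. The `ForwardRule` structure, the function name and every address below are hypothetical; the addresses come from ranges reserved for documentation, and real routers express such rules in their own configuration syntax.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ForwardRule:
    allowed_sources: str   # CIDR block; "0.0.0.0/0" corresponds to "ALL"
    external_port: int     # port opened on the router/firewall
    internal_host: str     # private address of the UA behind NAT


def is_forwarded(rule: ForwardRule, src_ip: str, dst_port: int) -> bool:
    """True if an inbound packet would be passed through to the UA."""
    network = ipaddress.ip_network(rule.allowed_sources)
    return (dst_port == rule.external_port
            and ipaddress.ip_address(src_ip) in network)


# Rule forced on the user when the provider does not publish its proxy addresses.
open_rule = ForwardRule("0.0.0.0/0", 5060, "192.168.1.20")
# Rule that would be possible if the proxy address range were known (assumed range).
restricted_rule = ForwardRule("198.51.100.0/24", 5060, "192.168.1.20")
```

With `open_rule`, a packet from any Internet host addressed to port 5060 reaches the UA; with `restricted_rule`, only hosts inside the assumed proxy range do.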
Lack of PC Protection: A Softphone should be used on a PC with both an up-to-date operating system (i.e. the
latest vendor patches and updates have been installed) and up-to-date anti-virus and anti-spyware software;
otherwise it will be vulnerable to malicious software.
Possible Flaws in Security Enhanced Protocols: Ongoing work to improve the capabilities of protocols like
SIP and RTP has resulted in security enhanced versions becoming available. When security is added as an
afterthought (i.e. security was not an initial key design criterion and therefore is unlikely to be pervasive
throughout the implementation), exploitable vulnerabilities can be introduced.
It becomes a trade-off between the application of security mechanisms and the quality of VoIP service that can
be delivered. A number of security mechanisms are available to improve the security of VoIP. The
mechanisms considered below are those most suitable for the two VoIP implementation
models presented in this paper. An overview of each mechanism is given to provide a reference
point for the security framework, should the mechanism be used as a risk countermeasure.
avoids the use of port forwarding. Either of two mechanisms can be used to avoid port forwarding:
• Siproxd is a router/firewall capability that allows a UA to work behind a router/firewall that is
implementing NAT. Siproxd allows UAs behind a firewall to register with the VoIP service provider
proxy. When initiating a call Siproxd rewrites SIP message bodies to function within a NAT
environment; or
• Simple Traversal of UDP through NATs (STUN) is a client-server protocol that allows a UA to find out
its public address and the Internet-side port that NAT has associated with a particular local port. To enable
SIP messages to traverse firewalls running NAT, both the UA and the VoIP service provider’s proxy
server need to support STUN.
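The STUN exchange described above can be sketched at the message level without any network traffic. The code below builds a binding request and extracts the public address from a binding response. It follows the 20-byte header layout of RFC 5389, a later revision of STUN than the one available when this paper was written, and it reads only the plain MAPPED-ADDRESS attribute for IPv4; many current servers return XOR-MAPPED-ADDRESS instead, which this sketch does not decode. The function names are hypothetical.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442     # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_RESPONSE = 0x0101
MAPPED_ADDRESS = 0x0001       # attribute type


def binding_request() -> bytes:
    """20-byte STUN header with no attributes and a random transaction ID."""
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE,
                       os.urandom(12))


def mapped_address(response: bytes) -> Optional[Tuple[str, int]]:
    """Extract (ip, port) from an IPv4 MAPPED-ADDRESS attribute, if present."""
    if len(response) < 20:
        return None
    msg_type, length = struct.unpack("!HH", response[:4])
    if msg_type != BINDING_RESPONSE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0]
            return socket.inet_ntoa(value[4:8]), port
        offset += 4 + attr_len + (-attr_len % 4)   # attributes are 32-bit aligned
    return None
```

In use, the UA would send the request over UDP from the same local port it intends to use for SIP, and advertise the returned address and port in its SIP messages in place of its private address.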
to perform encryption. Unless a user can identify a VoIP service provider who supports encryption for ATAs,
the only option to keep a VoIP conversation private is through a Softphone to Softphone connection. Selecting
one of the activities in Table 4 will enable an encrypted and private conversation to be performed.
Table 3: Suggestions for securing infrastructure for VoIP activities and their associated threats
Activity 1: Install Strong Firewall
Description: A good quality firewall should be installed that supports the following features:
• Stateful packet filtering
• Network address translation
• Traffic shaping to prioritise VoIP traffic
• Siproxd.
Threats Addressed: DoS, Cyber Attacks, Toll Fraud.
Activity 2: Position UA on Separate Network Segment
Description: If the UA is an ATA it should be positioned on a separate network segment, i.e. configure a
separate network port on the firewall with its own subnet addressing range and attach the ATA. Where possible
a Softphone should be placed on a separate network segment; however, it is recognised that in practice a
Softphone will reside on a multi-purpose PC and therefore separating voice and data traffic may not be possible.
Threats Addressed: Cyber Attacks, Toll Fraud.
Activity 3: Harden Softphone PC
Description: Where a Softphone is used as a UA the PC on which the Softphone resides should be hardened by:
• Application of all patches and updates (as soon as they become available) for the PC operating system,
Internet applications and Softphone.
• Use of up-to-date anti-virus and anti-spyware software.
Threats Addressed: Malicious Software.
Activity 4: Accommodate SIP with NAT
Description: NAT provides an important security feature, i.e. the ability to masquerade PCs and devices behind
a firewall. However, because SIP has problems traversing a firewall running NAT, either of the following
mechanisms will allow SIP to be accommodated within a NAT environment:
• Use a firewall with Siproxd; or
• Use UAs that support STUN.
Threats Addressed: Cyber Attacks, Toll Fraud.
Table 4: Suggestions for preserving confidentiality and integrity for VoIP activities and their associated threats
Activity 1: Use Encrypting Softphone
Description: Select a Softphone (probably a Softphone recommended by the VoIP service provider) that uses SRTP
to secure a VoIP conversation, e.g. KPhone utilises SRTP. The VoIP service provider will need to support SRTP
and therefore have the infrastructure to issue public/private keys and certificates. Both parties will need to
have been issued with keys/certificates before an encrypted dialogue can occur.
Threats Addressed: Lack of Signal Confidentiality, Lack of Dialogue Confidentiality, End Point to End Point
Security.
Activity 2: Use ZRTP with Softphone
Description: ZRTP allows a Softphone to perform encryption without the need for a PKI (Muncan 2006). Both
parties need to use an SRTP Softphone with ZRTP integrated. ZRTP will work with different types of Softphone,
i.e. a caller can be using a different type of Softphone to the receiver.
Threats Addressed: Lack of Encryption Service from VoIP Service Provider.
Activity 3: Use Skype
Description: Skype is the largest VoIP service provider in the world. A Skype to Skype conversation is encrypted.
Threats Addressed: Lack of Encryption Service from VoIP Service Provider.
CONCLUSION
The increasing use of VoIP technology in Australian homes, as well as elsewhere in the world, creates a
potential new category of attack vectors. Decreased call costs and the ability to make calls from different places
mean that VoIP is unlikely to go away any time soon. Additionally, most ISPs in Australia are now bundling
VoIP services with home broadband plans. The security implications and risks of using such technology must be
considered by all users, not just corporations, as the risks apply equally to all. This paper has proposed a
framework that can be implemented by an IT literate user to mitigate and manage the VoIP security risks. It has
been shown that some relatively simple measures can be applied that will reduce the security risks when using
VoIP in a residential environment.
Future work would look at expanding this framework and using it to create guidelines that could be followed by
users who lack the technological background to implement adequate security. It would also be useful to examine
how well current users understand the risks of using VoIP, and ways in which general SOHO users can
be made more aware of those risks.
REFERENCES
Engin Website, URL http://www.engin.com.au, Accessed 13 September 2007.
Enquiry about Engin VoIP Security (2007), Question & Answer email to Engin Support Desk, October 2007.
Kuhn, D.R., Walsh, T.J. & Fries, S. (January 2005). Security Considerations for Voice Over IP Systems:
Recommendations of the National Institute of Standards and Technology. Department of Commerce,
National Institute of Standards and Technology.
Garfinkel, S. L. (2005) VoIP and Skype Security, Computer Science & Artificial Intelligence Laboratory, MIT.
Hung, P. C. K. and M. V. Martin (2006) Through the Looking Glass: Security Issues in VOIP Applications,
University of Ontario Institute of Technology.
Internet Security Systems Inc. (2004) VoIP: The Evolving Solution and the Evolving Threat.
Kphone Website, URL http://sourceforge.net/projects/kphone, Accessed 12 September 2007.
McLaughlin, L. (2006). Philip Zimmermann on What's Next after PGP. IEEE Security & Privacy: 10 - 13.
Muncan, M. (2006) Secure telephony: SIP/SRTP (PKI) vs. Zfone vs. Skype, Universität Konstanz.
Orrblad, J. (2005). Alternatives to MIKEY/SRTP to secure VoIP. Telecommunication Systems Laboratory
(TSLab), Department of Microelectronics and Information Technology (IMIT). Stockholm/Kista, Royal
Institute of Technology (KTH)
Simon, M. and J. Slay (2006) Voice over IP: Forensic Computing Implications, Enterprise Security
Management Lab, University of South Australia.
Singhai, R. and A. Sahoo (2006) VoIP Security, School of Information Technology, Indian Institute of
Technology, Bombay.
Sipura Technology Inc. (2004) Implementing Residential Voice over Broadband Services with the Sipura Phone
Adaptor (SPA).
Skype (2007) About Skype URL http://www.skype.com, Accessed on 23 September 2007.
Tucker, G. S. (2004) Voice Over Internet Protocol (VoIP) and Security, SANS Institute.
Walsh, T. J. and D. R. Kuhn (2005). "Challenges in Securing Voice over IP." IEEE Security & Privacy: 44-49.
Wikipedia (2007) Comparison of VoIP software, URL
http://en.wikipedia.org/wiki/Comparison_of_VoIP_software, Accessed 1 October 2007.
COPYRIGHT
Peter James and Andrew Woodward ©2007. The author/s assign SCISSEC & Edith Cowan University a non-
exclusive license to use this document for personal use provided that the article is used in full and this copyright
statement is reproduced. The authors also grant a non-exclusive license to SCISSEC & ECU to publish this
document in full in the Conference Proceedings. Such documents may be published on the World Wide Web,
CD-ROM, in printed form, and on mirror sites on the World Wide Web. Any other usage is prohibited without
the express permission of the authors.