VOIP Introduction and Challenges by Shawky Menisy


VOIP

Shawky M. Menisy
Department of Wireless and Networks
Nile University
Cairo, Egypt
Abstract: VoIP is an emerging technology for voice
transmission that is expected to replace legacy voice
transmission techniques. This document gives an overview of the
VoIP concept, the protocol stack, quality of service (QoS) and
the newer concept of quality of experience (QoE). It starts with
the principal building blocks of a VoIP application, then details
the VoIP protocol stack and the protocol alternatives that can be
used. The sources of impairment over IP data networks are then
identified and distinguished from the signal-oriented sources of
quality degradation observed over telecom networks. Finally, the
difference between QoS and QoE is explained, and the available
voice codecs are summarized.
Index Terms: VoIP, Protocol stack, Quality of Service (QoS),
Quality of Experience (QoE), RTP, H.323, SIP, Voice codec.

I. INTRODUCTION
Voice over Internet Protocol (VoIP) is a form of
communication that allows you to make phone calls over a
broadband packet connection instead of typical analog
telephone lines.
VoIP is becoming an attractive communications option for
consumers, given the trend towards lower fees for basic
broadband service and the availability of faster Internet
connections. From the service provider's point of view, VoIP
uses bandwidth more efficiently than traditional voice calls,
which means more profit.
The voice over IP (VoIP) protocol suite is generically
broken into two categories, control plane protocols and data
plane protocols. The control plane portion of the VoIP protocol
is the traffic required to connect and maintain the actual user
traffic. It is also responsible for maintaining overall network
operation (router to router communications). The data plane
(voice) portion of the VoIP protocol stack is the actual traffic
that needs to get from one end to another.
The QoS of a VoIP network depends on many factors, such as
delay, jitter, throughput, packet loss and the voice codec used.
However, interpreting a QoS metric as good or bad can be
misleading, because the metric ignores its concrete effect on
clients. Moreover, the quality of experience (QoE) can be deemed
fine in spite of a slow response time, high jitter, or packet loss.
ITU-T Study Group 12 (SG12) focuses on performance, QoS and
QoE, and defines a large number of high-priority questions that
are being addressed in collaboration with several ICT
(Information and Communication Technologies) partners, both
internal and external, e.g., SG16, SG13, ETSI, 3GPP, and IETF.
II. VOIP CONCEPT
Voice over IP (VoIP) involves digitizing voice streams and
transmitting the digital voice as packets over conventional
IP-based packet networks such as the Internet, a Local Area
Network (LAN), a wireless LAN (WLAN) or a mobile network
(LTE). Although the quality of VoIP does not yet match the
quality of a circuit-switched telephone network, there is an
abundance of activity in developing protocols and speech
encoders for implementing a high-quality voice service.
The digitization process is composed of sampling,
quantization, compression and encoding; the encoded speech is
then split into packets of equal size. Quantization is the
procedure of constraining a value from a continuous set (such as
the real numbers) to a relatively small discrete set. An encoder
usually performs all four steps: sampling, quantization,
compression and encoding.
Many encoding techniques have been developed and
standardized by the ITU, such as G.711, G.723.1, and G.729.
These codecs differ in their coding rate (bps), frame rate
(frames/s) and algorithmic latency, all of which influence the
speech quality, or Mean Opinion Score (MOS), in a VoIP
network.
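The sampling and quantization steps described above can be illustrated with a minimal Python sketch. The function names, the 8 kHz sampling rate and the 8-bit uniform (linear) quantizer are assumptions chosen for illustration; G.711 itself uses logarithmic companding rather than a uniform step, so this is not a codec implementation.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed narrowband telephony sampling rate
BITS_PER_SAMPLE = 8     # assumed resolution for this illustration


def sample_tone(freq_hz, n_samples, rate_hz=SAMPLE_RATE_HZ):
    """Sampling: evaluate a sine tone at the discrete instants n / rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


def quantize_uniform(sample, bits=BITS_PER_SAMPLE):
    """Quantization: map a sample in [-1.0, 1.0] to one of 2**bits levels."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, sample))
    # Scale [-1, 1] onto [0, levels - 1] and round to the nearest level.
    return round((clipped + 1.0) / 2.0 * (levels - 1))


def raw_bit_rate(rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate before compression: samples per second times bits per sample."""
    return rate_hz * bits


if __name__ == "__main__":
    codes = [quantize_uniform(s) for s in sample_tone(440, 8)]
    print(codes, raw_bit_rate())
```

With these assumed parameters the uncompressed stream is 8000 samples/s times 8 bits, i.e. 64 kbps; the compression step is what lets other codecs go below that rate.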

Fig.1 VOIP Concept

III. VOIP PROTOCOL STACK


The VoIP protocol stack is shown below. VoIP uses IP as its
basic transport method.
IP is responsible for the delivery of packets (or datagrams)
between host computers. IP is a connectionless protocol: it
does not establish a virtual connection through a network prior
to commencing transmission, because this is the task of
higher-level protocols. IP makes no guarantees concerning
reliability, flow control, error detection or error correction.
As a result, datagrams can arrive at the destination computer
out of sequence, with errors, or not at all.
Both TCP (Transmission Control Protocol) and UDP (User
Datagram Protocol) are used in VoIP; they enable the
transmission of information between the correct processes (or
applications) on host computers. Normally TCP is used for
control and UDP for data transfer, as voice data cannot tolerate
the delay introduced by TCP.
Fig.2 VOIP Protocol Stack

The VoIP protocol stack is divided into control plane protocols
and data plane protocols.
A. Control Plane Protocols:
The control plane portion of the VoIP protocol is the
signaling traffic required to connect and maintain the actual
user traffic. It is also responsible for maintaining overall
network operation (router to router communications). Several
different types of VoIP signaling are available today,
including H.323, SIP, SCCP, MGCP, MEGACO and
SIGTRAN. The most common are H.323 and SIP, and this
document discusses these two signaling protocols.
1) H.323:
H.323 is a standard that specifies the components, protocols
and procedures that provide multimedia communication
services such as real-time audio, video, and data
communications over packet networks, including Internet
Protocol (IP) based networks. This standard was developed by
the International Telecommunication Union Telecommunication
Standardization Sector (ITU-T) for transmitting audio and
video over the Internet.
The overall protocol stack for H.323 (see below) is made
up of several parts. Each part is responsible for specific tasks
such as call setup and phone registration.

Fig.3 H.323 Protocol Stack

H.245 is the media control portion of the H.323 protocol
stack that establishes a logical channel for each call (endpoint
to endpoint). During H.245 negotiation, each endpoint
exchanges its capabilities and preferences. The choice of
CODEC for the call is part of this exchange.
H.225 represents the basic signaling messages that are used
when dialing the other number. These messages include setup,
alerting, connect, call proceeding, release complete, and
facility messages, which are based on the Q.931 signaling
scheme, as defined below.
Setup: Message that attempts to connect a call. The calling
party sends this message to the called party.
Alerting: Message that is sent from the called party back to
the calling party to let the caller know that the far end is
being alerted (ringing).
Connect: Message that informs the calling party that the
called party has accepted the call. The conversation can begin
at this point.
Call Proceeding: Message that is sent back to the calling
party to indicate that the Setup request has been received and
that call establishment is in progress.
Release Complete: Message that is sent by the party (called
or calling) who disconnects the call first.
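The basic message exchange listed above can be sketched as a small state machine. This is an illustrative model only: the state names (`idle`, `call_initiated`, `ringing`, `active`) are hypothetical labels rather than terms from the H.225 or Q.931 specifications, and the Call Proceeding and Facility messages are left out for brevity.

```python
# Allowed transitions, keyed by (current state, message received).
# State names are hypothetical labels used only in this sketch.
TRANSITIONS = {
    ("idle", "Setup"): "call_initiated",
    ("call_initiated", "Alerting"): "ringing",
    ("call_initiated", "Connect"): "active",
    ("ringing", "Connect"): "active",
}


def next_state(state, message):
    """Return the call state after a signaling message is processed."""
    if message == "Release Complete":
        # Either party may clear the call from any state.
        return "idle"
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(f"unexpected {message!r} in state {state!r}")


def run_call(messages, state="idle"):
    """Feed a sequence of messages through the state machine."""
    for message in messages:
        state = next_state(state, message)
    return state


if __name__ == "__main__":
    print(run_call(["Setup", "Alerting", "Connect"]))
    print(run_call(["Setup", "Alerting", "Connect", "Release Complete"]))
```

Modelling the exchange as a table of transitions makes it easy to see which message is valid at each stage of a call and to reject out-of-order messages.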
2) SIP:
Session Initiation Protocol (SIP) is designed to manage and
establish multimedia sessions, such as video conferencing,
voice calls, and data sharing. SIP is still in its early stages of
deployment and is a growing and evolving protocol standard.
This is the standard that many element manufacturers are using
to develop products.
There are several key features of SIP that make it so
attractive:
a) Multimedia: SIP can have multiple media sessions during
one call, which means that users can share a game, exchange
instant messages (IM), and talk at the same time.
b) Lightweight: It is a light protocol that scales easily and
uses only a few request messages to connect calls.
c) URL addressing scheme: Allows for number portability that
is independent of physical location. An address can be a phone
number, an IP address, or an e-mail address. The messages are
very similar to those used by HTTP on the Internet.
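To illustrate how closely SIP messages resemble HTTP, the sketch below assembles a minimal INVITE request as text. The helper name `build_invite`, the example addresses and the identifier values are all hypothetical; the header set follows the general layout of a SIP request but omits the session description body and is not a complete implementation.

```python
def build_invite(caller, callee, call_id, branch, tag,
                 host="client.example.com"):
    """Return a minimal SIP INVITE request as text (no SDP body)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",              # request line
        f"Via: SIP/2.0/UDP {host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={tag}",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # As in HTTP, header lines end with CRLF and a blank line ends the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_invite("alice@example.com", "bob@example.com",
                       call_id="a84b4c76e66710",
                       branch="z9hG4bK776asdhds",
                       tag="1928301774"))
```

The request line and `Name: value` headers mirror the structure of an HTTP request, which is one reason SIP is considered easy to implement and debug.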
A SIP system is made up of two components: user agents and
network servers.
User Agents: User agents represent the phone (user agent
client) and the server (user agent server). The user agent
client (UAC) initiates media calls. The user agent server (UAS)
responds to those setup requests on behalf of the UAC. The UAS
is also responsible for finding the destination UAC or an
intermediate UAS.
Network Servers: Network servers include redirection, proxy,
and registrar servers. Redirection servers do not process calls;
they only respond with the address of the next server. Proxy
servers combine features of both a client and a server. A proxy
server can receive request and response messages, and it can
adjust the header information before forwarding a request to the
next proxy server or back to the user client. The registrar
server registers new clients in the database and updates other
databases.
B. Data Plane Protocols:
The data plane (voice) portion of the VoIP protocol stack is
the actual traffic that needs to get from one end to another.
The data plane protocols are RTP, RTCP, cRTP and RTCP XR.
The Real-time Transport Protocol (RTP) provides end-to-end
network transport functions suitable for applications
transmitting real-time data, such as audio, video or simulation
data, over multicast or unicast networks. Each RTP packet
contains a small sample of the voice conversation; the size of
the voice sample inside the packet depends on the CODEC
used.

Fig.4 RTP position in the VOIP protocol stack

RTP information is encapsulated in a UDP packet. If an
RTP packet is lost or dropped by the network, it is not
retransmitted, because UDP is used. A user would not want a
long pause or delay in the conversation; the VoIP network must
therefore be designed so that only a few packets are lost in
transmission.
The RTP header contains several pieces of information to
identify and manage each individual call from endpoint to
endpoint, including a timestamp, a sequence number, and
synchronization source information for the conversation.
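The header fields mentioned above can be made concrete with a short sketch that packs and unpacks the 12-byte fixed RTP header using Python's `struct` module. The function names are hypothetical, and the sketch assumes no padding, no header extension and no contributing-source list.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_FORMAT = "!BBHII"   # network byte order, 12 bytes in total


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the fixed RTP header (padding, extension and CSRC count all zero)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(RTP_HEADER_FORMAT, byte0, byte1,
                       sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Decode the first 12 bytes of an RTP packet into a dictionary."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack(
        RTP_HEADER_FORMAT, data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    header = pack_rtp_header(payload_type=0, sequence=1,
                             timestamp=160, ssrc=0x1234ABCD)
    print(len(header), unpack_rtp_header(header))
```

The receiver uses the sequence number to detect loss and reordering, the timestamp to schedule playout, and the synchronization source identifier to tell streams apart.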
Compressed RTP (cRTP) is a variant of RTP that eliminates
much of the packet header overhead of a standard RTP packet.
A smaller header means more efficient bandwidth usage: with a
system running cRTP, a user can place approximately twice as
many calls as with a system running standard RTP.
However, because the IP header is compressed together with the
UDP and RTP headers down to a maximum of 4 bytes, there is no
room left for an IP address, so the packet cannot be routed. It
can only be placed on a point-to-point link that requires no
addressing.
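The bandwidth saving can be estimated with a little arithmetic. The sketch below assumes uncompressed IPv4, UDP and RTP headers of 20, 8 and 12 bytes, the 4-byte cRTP upper bound quoted above, and a G.729 stream carried as 20 payload bytes in each of 50 packets per second; these example parameters and the function name are assumptions, and link-layer framing is ignored.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12   # uncompressed IPv4 + UDP + RTP
CRTP_HEADER_BYTES = 4                    # upper bound quoted in the text


def call_bandwidth_bps(payload_bytes, header_bytes, packets_per_second):
    """Bandwidth of one voice stream in bits per second (no link-layer overhead)."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second


if __name__ == "__main__":
    # Assumed example: G.729 at 8 kbps, 20 ms packets, 20 payload bytes each.
    plain = call_bandwidth_bps(20, IP_UDP_RTP_HEADER_BYTES, 50)
    compressed = call_bandwidth_bps(20, CRTP_HEADER_BYTES, 50)
    print(plain, compressed, plain / compressed)
```

Under these assumptions the stream drops from 24 kbps to 9.6 kbps. Link-layer framing adds the same number of bytes to both cases, which narrows the ratio towards the "approximately twice" figure given above.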

Fig.5 RTP packet vs. Compressed RTP

The RTP Control Protocol (RTCP) is a data plane protocol
used to monitor the quality of real-time services and to
convey information about participants in an ongoing session.
Components called monitors receive the RTCP packets sent by
participants in a session. These packets contain reception
reports, from which monitors estimate the current quality of
service for distribution monitoring, fault diagnosis and
long-term statistics. RTCP also aids significantly in the
troubleshooting of a voice stream.
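One of the values carried in a reception report is an interarrival jitter estimate. The sketch below follows the running-average form defined for RTP in RFC 3550, in which each new transit-time difference moves the estimate by one sixteenth of the gap. The function name and the use of milliseconds instead of RTP timestamp units are simplifying assumptions.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate over a sequence of packets.

    send_times and recv_times are per-packet times in the same unit
    (milliseconds here, an assumption made for readability).
    """
    jitter = 0.0
    for i in range(1, len(send_times)):
        # Change in transit time between consecutive packets.
        d = ((recv_times[i] - recv_times[i - 1])
             - (send_times[i] - send_times[i - 1]))
        # Exponential smoothing with gain 1/16.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    sent = [0, 20, 40, 60]
    received = [5, 25, 61, 66]
    print(interarrival_jitter(sent, received))
```

A constant network delay yields zero jitter, since only the variation in transit time between consecutive packets contributes to the estimate.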
RTCP XR (RTP Control Protocol Extended Reports) is a
newer extension of the RTCP concept. It defines a set of
metrics that can be inexpensively added to call managers, call
gateways and IP phones for call quality analysis. RTCP XR
provides information on the following call quality metrics:
Packet Loss, Delay, SNR, Echo, Overall call quality,
Configuration Information (endpoint jitter, buffer size).
CODECs: The primary functions of a voice codec are
analog/digital voice signal conversion and digital compression.
A wide range of voice CODECs is available for VoIP
implementation. The most common voice CODECs include G.711,
G.723, G.726, G.728, and G.729. A brief description of each
CODEC follows.
G.711: Converts voice into a 64 kbps voice stream. This is
the same CODEC used in traditional TDM T1 voice. It is
considered the highest quality.
G.723: Two different types of G.723 compression exist.
One type uses a Code Excited Linear Prediction (CELP)
compression algorithm and has a bit rate of 5.3 kbps. The other
type uses a Multi-Pulse Maximum Likelihood Quantizer (MP-MLQ)
algorithm and provides better sound quality. This type has a
bit rate of 6.3 kbps.
G.726: Allows for several different bit rates, including 16,
24, 32 and 40 kbps.
G.728: Provides good voice quality and is specially
designed for low-latency applications. It compresses voice into
a 16 kbps stream.
G.729: One of the best voice quality CODECs. It converts
voice into an 8 kbps stream. There are two versions of this
CODEC, G.729 and G.729a. G.729a has a simpler algorithm than
G.729, allowing the end phones to use less processing power for
the same level of quality.
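The bit rates listed above translate directly into the amount of voice data carried per packet. The following sketch computes the payload size for an assumed 20 ms packetization interval; the interval, the dictionary name and the function name are illustrative choices, and G.723 is left out because its frames do not align with a 20 ms interval.

```python
# Bit rates in bits per second, taken from the descriptions above.
CODEC_BIT_RATE_BPS = {
    "G.711": 64000,
    "G.726 (32 kbps mode)": 32000,
    "G.728": 16000,
    "G.729": 8000,
}


def payload_bytes(bit_rate_bps, packet_ms=20):
    """Voice payload per packet: bit rate x packet duration, in bytes."""
    return bit_rate_bps * packet_ms // 8000


if __name__ == "__main__":
    for codec, rate in CODEC_BIT_RATE_BPS.items():
        print(f"{codec}: {payload_bytes(rate)} bytes per 20 ms packet")
```

A lower bit rate shrinks the payload but not the packet headers, so header overhead accounts for a larger share of the bandwidth with the more aggressive codecs.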

IV. VOIP QUALITY OF SERVICE


A. Weakness of VOIP QoS
Voice communication has grown up around reliability and
quality, whereas the Internet has grown fast around
scalability and efficiency. VoIP draws on the advantages of
both technologies. Unlike traditional voice communication,
however, VoIP suffers from problems with reliability, voice
quality, QoS, etc. The problems with VoIP QoS include
end-to-end delay and voice clarity (digitization, loss,
compression), and they stem from the complexity of its
functions and operation. The main contributors to VoIP QoS
degradation are described below.
1) Packet loss:
One of the serious problems in sending voice over the Internet
is the loss of QoS through packet loss. Although the ultimate
goal of VoIP is to provide the same voice quality as the
telephone, such quality is difficult to achieve because packets
can be lost anywhere along their end-to-end journey.
Voice is carried over UDP, and UDP does not request
retransmission of lost packets. A lost packet therefore causes
a gap in the voice, and noise.
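Because lost packets are never retransmitted, a receiver can only detect loss after the fact, typically from gaps in the packet sequence numbers. The sketch below shows one simple way to do this; the function names are hypothetical, and the sketch ignores sequence-number wraparound and losses before the first or after the last packet received.

```python
def count_lost(received_seq):
    """Number of sequence numbers missing between the lowest and highest seen."""
    if not received_seq:
        return 0
    unique = set(received_seq)          # duplicates do not count twice
    expected = max(unique) - min(unique) + 1
    return expected - len(unique)


def loss_ratio(received_seq):
    """Fraction of expected packets that never arrived."""
    if not received_seq:
        return 0.0
    unique = set(received_seq)
    expected = max(unique) - min(unique) + 1
    return (expected - len(unique)) / expected


if __name__ == "__main__":
    arrivals = [1, 2, 4, 5, 8]          # packets 3, 6 and 7 never arrived
    print(count_lost(arrivals), loss_ratio(arrivals))
```

Reordered packets are not counted as lost here, since the calculation depends only on which sequence numbers eventually arrive.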
2) Variance of throughput:
The Internet was designed as a data network and, unlike the
PSTN, does not reserve a fixed transmission rate for voice
packets, so the bandwidth available to transmitted packets may
not be uniform. The telephone network transmits voice at
64 kbps, whereas VoIP transmits compressed voice at 6 to
16 kbps. Because reliable bandwidth is hard to obtain, uniform
voice transmission quality cannot be guaranteed. Voice is
carried over UDP, which, unlike TCP, does not request
retransmission of lost packets. If a packet is lost because of
a bandwidth problem, the result is a gap in the voice, and
noise.
3) Delay and Jitter:
One of the biggest obstacles to using an IP network for voice
transmission is delay and jitter. In an IP network we
distinguish several types of delay, which differ in their
source, mechanism of creation and other features. Each
component contributes differently to the resulting delay of a
voice packet.
Below are some of the main delay components:
Codec, Compression and Decompression Delay: Time needed for
coding, compression or decompression of a voice block.
Packetization and Depacketization Delay: Delay incurred when
encapsulating data blocks into packets and extracting them
again.
Processing Delay in Routers: Time needed for a router to move
a packet from the input interface to the buffer.
Serialization Delay: Time needed to place a packet on the
link, which depends on the transmission speed and the packet
size.
Propagation Delay: Delay due to the physical transfer of the
signal through the medium; it depends on the transmission
technology used and the distance the signal travels.
De-Jitter Delay: A de-jitter buffer can be used to eliminate
the impact of jitter, but this buffer is itself a source of
end-to-end delay.
Routing Delay: Because an IP network is used, routing packets
along different paths causes packet delay; queuing and
prioritizing voice packets (over other data packets) adds
further delay.
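Two of the components above follow directly from simple formulas, and the rest can be added up into a one-way delay budget. In the sketch below the function names, the signal speed of 200,000 km/s (a common approximation for optical fibre) and every number in the example budget are assumptions for illustration, not measurements.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock a packet onto a link: packet size over link rate."""
    return packet_bytes * 8 * 1000 / link_bps


def propagation_delay_ms(distance_km, speed_km_per_s=200_000):
    """Time for the signal to travel the distance (assumed fibre speed)."""
    return distance_km * 1000 / speed_km_per_s


def one_way_delay_ms(components):
    """Total one-way delay as the sum of the individual components."""
    return sum(components.values())


if __name__ == "__main__":
    # Hypothetical budget for a single call; all values are illustrative.
    budget = {
        "codec": 15.0,
        "packetization": 20.0,
        "serialization": serialization_delay_ms(60, 64_000),
        "propagation": propagation_delay_ms(1000),
        "de_jitter_buffer": 40.0,
    }
    print(budget, one_way_delay_ms(budget))
```

Laying the components out this way shows which ones are fixed by physics or by the codec and which ones, such as the de-jitter buffer, are design choices that can be tuned.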
4) Voice codec and voice compression:
Voice digitization and compression reduce the clarity of the
voice, and each codec (coder-decoder or
compressor-decompressor) used to increase bandwidth efficiency
adds delay. The PSTN uses G.711, while most VoIP deployments
use G.729 or G.723.1.
B. Quality of Experience (QoE)
There is an increasing recognition that the quality of
network services (QoS) should be evaluated according to their
Quality of Experience (QoE) rather than the classical
network-oriented metrics, such as delay, availability,
response time, jitter, echo, and packet loss. Indeed,
interpreting a QoS metric as good or bad can be misleading,
because the metric ignores its concrete effect on clients.
Moreover, the QoE can be deemed fine in spite of a slow
response time, high jitter, or packet loss.
The ITU defines QoE as "a measure of the overall
acceptability of an application or service, as perceived
subjectively by the end-user". More precisely, the QoE measure
considers both service-related and user-related factors that
influence quality. The service factors include availability,
reliability, setup and response times, type of terminals, etc.
The user factors include emotions, experience, motivation, and
goals. The QoE measure has a distinct meaning according to the
specificity of each application. For example, a positive QoE
measure in the context of a voice conversation signifies that
the call is characterized by excellent voice transmission
quality and ease of communication. A positive QoE measure in
the context of Web surfing, however, signifies that good
quality graphics and pictures are downloaded within an
acceptable timeframe.
There are multiple remedies that have been
developed to overcome the negative perceived
effects of such sources of quality degradation.
The existing mechanisms can be classified into
two main approaches:
a) Network-centric strategies:
These enhance the QoE through the integration of suitable QoS
mechanisms within the network to satisfy the specific needs of
delay sensitive services. In such a case, intermediate nodes can
perform call admission, preferential treatment of received
packets, or policy enforcement. This requires an upgrade or
replacement of all existing nodes, which is a significant
challenge, particularly when VoIP calls are routed across
heterogeneous, large scale, and geographically distributed
transport systems.
b) Application-centric strategies:
These seek to improve the QoE through the deployment of
advanced control mechanisms at sender and receiver sides to
smartly deal with sources of quality degradation observed over
data networks. For example, the sender can adapt its
transmission rate, packet loss protection strategy, packetizing
approach, and packet transfer scheduling according to the
prevailing features of delivery pathway. On the other hand, the
receiver can absorb network delay jitter through optimal jitter
buffer strategies and apply certain mechanisms for packet loss
concealment.
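The receiver-side jitter buffer mentioned above can be illustrated with a minimal fixed-delay playout model. Each packet is scheduled for playout a fixed time after it was sent, and packets that arrive after their deadline are discarded. The function name, the fixed (non-adaptive) delay and the example timings are assumptions; practical jitter buffers usually adapt the delay to network conditions.

```python
def playout(packets, playout_delay_ms):
    """Count packets played and packets discarded as late.

    packets is a list of (send_time_ms, arrival_time_ms) pairs. A packet
    is played if it arrives no later than send time + playout delay.
    """
    played = 0
    late = 0
    for send_ms, arrival_ms in packets:
        deadline_ms = send_ms + playout_delay_ms
        if arrival_ms <= deadline_ms:
            played += 1
        else:
            late += 1
    return played, late


if __name__ == "__main__":
    # Hypothetical trace: the second packet is delayed by 75 ms in the network.
    trace = [(0, 30), (20, 95), (40, 75), (60, 100)]
    print(playout(trace, playout_delay_ms=60))
    print(playout(trace, playout_delay_ms=80))
```

Raising the playout delay rescues the late packet at the cost of more end-to-end delay, which is the trade-off a jitter buffer strategy has to balance.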
The prevailing research trend consists of
integrating network and application QoS
enhancing strategies for the improvement of the
QoE of VoIP under arbitrary network conditions.
V. CONCLUSION
VoIP is an emerging technology for voice transmission that
is expected to replace legacy voice transmission techniques.
Much research is therefore under way on this topic, dealing
with the enhancement of VoIP quality of service through network
and application strategies. New, more efficient ways of testing
VoIP QoS and QoE remain an open area for research.
Enhancing VoIP QoS is a wide research area in which no single
approach has yet won out; research on VoIP QoS enhancement
tends to mix network and application strategies to improve the
QoE of VoIP under arbitrary network conditions.
VoIP protocol development is also still in progress.
This document has given an overview of the VoIP concept, the
protocol stack, quality of service (QoS) and the newer concept
of quality of experience (QoE). It started with the principal
building blocks of a VoIP application, then detailed the VoIP
protocol stack and the protocol alternatives that can be used.
The sources of impairment over IP data networks were then
identified and distinguished from the signal-oriented sources
of quality degradation observed over telecom networks. Finally,
the difference between QoS and QoE was explained, and the
available voice codecs were summarized.

