Avvasi Confidential: References
REFERENCES
1. Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2010–2015 (February 2011). Source: http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html.
Avvasi Confidential
Measuring Quality of Experience for Over-the-Top Video Services
The source, and therefore the quality of the content, can range from user-generated videos shot using a smartphone to major studio productions shot by a professional camera crew using commercial-grade equipment. With social media sites, video tends to be user-generated, which means the content is typically shot using lower-grade equipment, and the resulting source quality tends to be lower.

Content is authored, either by a user who does their own editing, or automatically. The processes that automatically author the content are often hidden

The video file is stored on a server or CDN where users can access it. The video file is either downloaded (as is the case with iTunes) or streamed (as is the case with Netflix). In order to watch downloaded video, the entire file must be received before playback can begin, which can take a long time. The file is stored ‘more permanently’ and is available for future consumption. For streaming video, playback begins almost immediately after the user requests the content. The file is typically stored in a ‘more temporary’ location, and is generally not available for future consumption.
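The difference in start-up delay between downloaded and streamed video can be illustrated with some simple arithmetic. The sketch below is only an illustration: the function names, the 10-minute clip, the 1 Mbps encoding rate, the 2 Mbps throughput and the 5-second initial buffer are all assumed values, not figures from this paper.

```python
def download_start_delay(file_size_mb: float, throughput_mbps: float) -> float:
    """Seconds until playback can begin when the whole file must arrive first.

    file_size_mb is in megabytes, throughput_mbps in megabits per second.
    """
    return file_size_mb * 8 / throughput_mbps


def streaming_start_delay(initial_buffer_s: float, bitrate_mbps: float,
                          throughput_mbps: float) -> float:
    """Seconds needed to fetch the first initial_buffer_s seconds of media."""
    return initial_buffer_s * bitrate_mbps / throughput_mbps


# Hypothetical clip: 10 minutes encoded at 1 Mbps, delivered at 2 Mbps.
duration_s, bitrate_mbps, throughput_mbps = 600, 1.0, 2.0
file_size_mb = duration_s * bitrate_mbps / 8  # 75 megabytes

print(download_start_delay(file_size_mb, throughput_mbps))    # 300.0 seconds
print(streaming_start_delay(5, bitrate_mbps, throughput_mbps))  # 2.5 seconds
```

Under these assumptions the downloaded clip cannot start for five minutes, while the streamed clip can start after a few seconds of buffering.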
It is important to note that video files tend to be streamed and not downloaded so that playback can begin before the entire file has been received. When a subscriber requests the video content, it is delivered from the content source by streaming across a packet network. The client buffers sufficient incoming data to enable continuous real-time decoding and playback.

This process may sound simple enough, but it is complex and full of opportunities for issues to arise; issues that affect the quality of the delivery and/or the quality of the presentation. Even a small percentage of packet loss can cause quality degradation. Some streaming technologies include mechanisms for the device to switch to a lower bitrate profile (such as adaptive streaming) in order to use less bandwidth on the network and increase the probability of successful delivery.

compensate for changes in network throughput. More recently, adaptive streaming technologies have been introduced which enable clients to respond to changes in network throughput and switch to lower bitrate streams when the network is congested.

With sufficient network throughput, a client receives the video data at a faster rate than playback. Therefore, brief outages or reductions in throughput can be tolerated without impacting QoE, as long as the buffer does not run empty. However, during times of congestion or poor connectivity, the video buffer may become empty, which will result in stalling (hourglass) and therefore poor QoE. If an adaptive streaming protocol is in use, the client can try switching to a lower-bandwidth stream, which may reduce stalling, but will degrade visual quality through a reduction in resolution and bitrate.
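The buffering behaviour described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not a description of any particular player: the function name, the one-second time step, the two-second start-up threshold and the throughput traces are all hypothetical.

```python
def simulate_playback(throughput_kbps, bitrate_kbps, startup_s=2.0):
    """Simulate a client buffer in one-second steps.

    throughput_kbps: per-second network throughput samples.
    bitrate_kbps:    constant media bitrate of the stream.
    startup_s:       media-seconds required before playback starts or resumes.
    Returns (buffer level after each step, number of stalled seconds).
    """
    buffer_s = 0.0
    playing = False
    started = False
    stalled_seconds = 0
    levels = []
    for kbps in throughput_kbps:
        buffer_s += kbps / bitrate_kbps      # media-seconds received this second
        if not playing and buffer_s >= startup_s:
            playing = started = True         # enough data to start or resume
        if playing and buffer_s < 1.0:
            playing = False                  # buffer ran dry
        if playing:
            buffer_s -= 1.0                  # one second of media played out
        elif started:
            stalled_seconds += 1             # mid-session stall (hourglass)
        levels.append(round(buffer_s, 2))
    return levels, stalled_seconds


short_outage = [2000] * 3 + [0] * 2 + [2000] * 2
long_outage = [2000] * 3 + [0] * 5 + [2000] * 2

print(simulate_playback(short_outage, bitrate_kbps=1000))
print(simulate_playback(long_outage, bitrate_kbps=1000))
print(simulate_playback(long_outage, bitrate_kbps=500))
```

In this model the short outage is absorbed by the buffer, the long outage produces two stalled seconds at 1000 kbps, and the same long outage produces no stall at 500 kbps because each received second of data carries more playback time.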
Figure 4: A comparison of typical buffering strategies for conventional RTSP vs Progressive Download
REAL-TIME STREAMING PROTOCOL
Real-time Streaming Protocol (RTSP), along with Real-time Transport Protocol (RTP) and Real-time Transport Control Protocol (RTCP), is commonly used to deliver live and on-deck content as well as Video-On-Demand (VOD) services. RTSP is used to establish and control the media session and to issue commands during the session, and is delivered over TCP. Most RTSP servers use RTP to deliver the media streams, typically over unreliable UDP. The media is therefore prone to significant quality degradation due to packet loss. Because of this, another approach called RTSP interleaved (which interleaves the RTP and RTCP data with the RTSP data) can be used. Instead of having one flow for RTSP and separate flows for audio and video tracks, a single RTSP/TCP flow is used. The RTSP data is sent as is, while RTP and RTCP are multiplexed through virtual channels.

REAL-TIME MESSAGING PROTOCOL
Real-time Messaging Protocol (RTMP) is a protocol developed by Macromedia (now Adobe) for streaming audio, video and data to a Flash player. Common variants include RTMPE (encrypted) and RTMPS, which works over an SSL connection. RTMP is supported by Flash Media Server and Flash clients.

ADAPTIVE OR DYNAMIC STREAMING
With adaptive streaming, the client detects network bandwidth availability and dynamically switches across multiple streams of differing bitrates to seamlessly deliver the content. These protocols are founded on the premise that smooth delivery is the biggest contributor to overall high video QoE. How the client decides which stream to select is specific to the client. Some clients are more aggressive and will select the best-quality stream first, whereas others are more conservative and will select lower-quality streams and monitor performance before switching to improve quality.

There are many examples of this technology including HTTP Live Streaming (HLS), HTTP Dynamic Streaming, Microsoft Silverlight Smooth Streaming, and Netflix Streaming Service.

The impact of dynamic streaming protocols is to offer a real-time tradeoff between the visual fidelity of the video and the throughput. However, because clients only become aware of network congestion after the fact, dynamic streaming tends to be reactive and causes a high degree of visual quality variation, which in itself can lead to an overall worse QoE for the subscriber.

HTML5 VIDEO
HTML5 augments and expands the HTML standard to include a method to natively embed video on a website. This approach eliminates the dependence upon third-party browser plug-ins. HTML5 is supported by newer browsers such as Internet Explorer 9, Firefox 3.5, Safari 3.0, Chrome and Opera. While the standard is open, there are competing interests, the standard is in flux, and browser
vendors are free to support any video format they feel appropriate. YouTube uses HTML5 to deliver content to Apple iOS devices such as the iPhone and iPad. HTML5 video can use various formats including WebM, MP4 and HLS.

PEER-2-PEER TV
Peer-2-Peer TV (P2PTV) delivers media over multiple peer connections. In P2PTV, each client downloads a video stream while concurrently uploading that stream to other P2P users. This approach is akin to a real-time BitTorrent. Streams are typically time-delayed by several minutes compared to the original source content. Video quality depends on the number of subscribers in the peer network, with quality improving as the number of subscribers increases. There are many P2PTV networks including PPLive, SopCast, StreamTorrent, Veetle, and SwarmPlayer.

VIDEO CONFERENCING AND VIDEO CHAT
Video chat applications such as Skype or Apple FaceTime are introducing a whole new set of OTT video use cases. The key difference between video chat and streaming is that video chat needs to be delivered at a very low latency in order to satisfy real-time two-way communication, and it must be streamed bi-directionally. Popular video chat services include Skype (proprietary, RC4-encrypted signaling protocols) and Apple’s FaceTime (SIP and RTP base

handoff, interference or resource contention can
CAPTURE
Poor video capture can be the result of a poor capture environment, e.g. lighting, a low-quality lens, poor focus, low resolution, camera motion, etc. With the exception of sophisticated pre- and post-processing of the captured video, it is very difficult for any future step in the lifecycle to improve the quality of poorly captured content.

TRANSMISSION
There are two major network factors that affect video quality: congestion and connectivity. The volume of data required to deliver media (with acceptable QoE) is significantly more than for voice or other forms of data such as email or static web content. The maximum amount of traffic that can be simultaneously delivered to subscribers represents the total capacity of the network. Depending on the number of concurrent subscribers in a given cell sector, backhaul link or otherwise limited aggregation point, and the amount of network traffic that they generate, demand can exceed capacity, which results in congestion. Congestion can lead to dropped packets, delayed delivery of data or even service interruption. For TCP-based (i.e. reliable) non-adaptive streaming sessions, this can result in long delays in initial playback as well as stalling. For TCP-based adaptive streaming sessions, this can also result in long delays in initial playback, but the stalling is mitigated (though not necessarily eliminated) by switching to clips authored with lower fidelity and therefore lower bandwidth requirements.
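
The mitigation described above, switching to a clip authored with lower bandwidth requirements, can be sketched as a simple selection rule. This is a minimal sketch and not the algorithm of any specific client: the bitrate ladder, the function name and the 0.8 safety factor are all assumptions chosen for illustration.

```python
# Hypothetical bitrate ladder (kbps) for one piece of content.
RENDITIONS_KBPS = [250, 500, 1000, 2000]


def select_rendition(measured_kbps, ladder=RENDITIONS_KBPS, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of measured throughput.

    The safety factor leaves headroom so the buffer can refill; if nothing
    fits, fall back to the lowest rendition rather than stopping playback.
    """
    budget = measured_kbps * safety
    candidates = [rate for rate in ladder if rate <= budget]
    return max(candidates) if candidates else min(ladder)


print(select_rendition(3000))  # uncongested: 2000
print(select_rendition(900))   # congested: 500
print(select_rendition(200))   # severely congested: 250 (lowest available)
```

In the last case even the lowest rendition exceeds the available throughput, which is why stalling is mitigated but not necessarily eliminated.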