Multimedia Compression

B90901134 陳威尹
Why Compress
 Raw data are huge.
 Audio:
CD quality music
44.1 kHz * 16 bit * 2 channels ≈ 1.4 Mbps
 Video:
near-DVD quality true color animation
640 px * 480 px * 30 fps * 24 bit ≈ 221 Mbps
 Impractical for storage and bandwidth
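The two raw bitrates above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic; the function names are illustrative and not part of the original slides.

```python
def audio_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM audio bitrate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def video_bitrate(width_px, height_px, frames_per_second, bits_per_pixel):
    """Raw (uncompressed) video bitrate in bits per second."""
    return width_px * height_px * frames_per_second * bits_per_pixel


cd_audio = audio_bitrate(44_100, 16, 2)        # CD quality music
near_dvd = video_bitrate(640, 480, 30, 24)     # true color animation

print(f"audio: {cd_audio / 1e6:.2f} Mbps")
print(f"video: {near_dvd / 1e6:.2f} Mbps")
```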
Outline
 Generic Compression Overview
 Content specific Compression
 Lossy Compression
Introduction to Generic Compression Algorithms
Lossless Compression
Generic Compression
 Also called Entropy Encoding
 Lossless Compression Algorithms
 Entropy can be defined as: $H = -\sum_i p_i \log_2 p_i$, where $p_i$ is the probability of symbol $i$
 Need statistical knowledge of data
 Well-known Algorithms:
 Rice coding
 Huffman coding
 Arithmetic coding
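As a rough illustration of the "statistical knowledge" an entropy coder needs, the sketch below estimates the order-0 entropy of a string from its symbol counts. The input is the 39-symbol string used in the Huffman example of these slides; the function name is an assumption made for this example.

```python
import math
from collections import Counter


def order0_entropy(text):
    """Shannon entropy in bits per symbol, using symbol frequencies only."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


text = "ABACDEAACCAABEAABACBDDABCADDBCEAEAAADBE"
h = order0_entropy(text)
print(f"{h:.3f} bits/symbol, lower bound of about {h * len(text):.0f} bits in total")
```

Under this order-0 model the bound comes out at roughly 85 bits for the whole string, which is the figure any symbol-by-symbol lossless coder is measured against.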
Huffman encoding
Input: ABACDEAACCAABEAABACBDDABCADDBCEAEAAADBE
Order-0 model
Symbol A B C D E
Count 15 7 6 6 5
Fixed-length code (3 bits per symbol): 39*3 = 117 bits
Huffman output (A gets a 1-bit code; B, C, D and E get 3-bit codes):
15*1+(7+6+6+5)*3 = 87 bits
Compression ratio:
117/87 = 1.34
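The 87-bit figure can be reproduced with a small Huffman construction. The following is a minimal sketch that only tracks code lengths (not the actual bit patterns); the heap layout and the tie-breaking rule are choices made for this example, and other valid Huffman trees may assign the 3-bit codes differently.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    counts = Counter(text)
    # Heap items: (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    # Note: a single-symbol input yields length 0 and needs special handling.
    return heap[0][2]


text = "ABACDEAACCAABEAABACBDDABCADDBCEAEAAADBE"
lengths = huffman_code_lengths(text)
total_bits = sum(lengths[s] for s in text)
print(lengths, total_bits)
```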
Property of Huffman encoding
 Easy to implement, high encoding speed
 Unique Prefix Property: no code is a prefix of any other code
 Adaptive Huffman encoding:
 used when statistical knowledge is not available in advance
 update the Huffman tree when needed
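The prefix property is what makes a Huffman bit stream decodable without separators. The sketch below checks the property and decodes greedily; the code table is hypothetical (one possible assignment with a 1-bit code for A and 3-bit codes for B to E), since the slides do not list the actual codewords.

```python
# Hypothetical code table, chosen only for illustration.
CODES = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}


def is_prefix_free(codes):
    """True if no codeword is a prefix of another codeword."""
    words = sorted(codes.values())
    # After sorting, any word that has a prefix in the set sits right after it.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def encode(text, codes=CODES):
    return "".join(codes[s] for s in text)


def decode(bits, codes=CODES):
    """Greedy decoding; unambiguous because the code is prefix-free."""
    inverse = {v: k for k, v in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```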
Arithmetic Encoding
 Symbol X, Y
prob(X) = 2/3
prob(Y) = 1/3
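With these probabilities, arithmetic coding narrows the interval [0, 1) once per symbol. The sketch below uses exact fractions and assumes that X takes the lower two thirds of the current interval and Y the upper third; that ordering, and the function name, are assumptions made for this example rather than something stated on the slide.

```python
from fractions import Fraction

# Symbol probabilities from the slide; dict order defines the sub-interval order.
PROBS = {"X": Fraction(2, 3), "Y": Fraction(1, 3)}


def encode_interval(message, probs=PROBS):
    """Return the final [low, high) interval that identifies the message."""
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        offset = Fraction(0)
        for s, p in probs.items():
            if s == sym:
                break
            offset += p          # skip the sub-intervals of earlier symbols
        else:
            raise ValueError(f"unknown symbol: {sym!r}")
        low += width * offset
        width *= probs[sym]
    return low, low + width


low, high = encode_interval("XXY")
print(low, high)
```

Any number inside the final interval identifies the message, so a narrower interval (a less probable message) needs more bits to pin down.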
Property of Arithmetic Encoding
 Avoids the entropy wasted by Huffman coding,
because the number of bits used to represent a symbol
can be non-integer
 Output is about 5~10% smaller than Huffman coding
 Computationally intensive
 Patented in the US!
 Both Huffman and Arithmetic are used in the
entropy encoding stage in JPEG
Application of General Compression
 Generic file compression such as Zip, Rar, gzip, bzip, etc.
 Final stage of content-specific compression
 JPEG uses Huffman or Arithmetic
 Monkey’s Audio (ape) uses Rice
 Lossless Audio (La) uses Arithmetic
Content-specific Compression
Further De-correlation
De-correlation
 Correlation means redundancy
 However, a general algorithm may not find content-specific correlation
 A general algorithm of higher order may not be efficient enough
 Whether lossy or lossless, multimedia file formats use a content-specific pre-filter as the first step to reduce data redundancy.
Correlation in Multimedia
 Audio:
 Temporal, Channel
 Still Image:
 Color space, Spatial, Stereo
 Video:
 Temporal
Audio Channel Correlation
 Correlation between L/R channels
 L/R to mid/side conversion
 More complex decorrelation with more channels
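One common way to exploit L/R correlation losslessly is an integer mid/side transform. The sketch below assumes mid = floor((L + R) / 2) and side = L - R on integer samples; it is an illustration of the idea, not the exact filter of any particular codec.

```python
def to_mid_side(left, right):
    """Integer mid/side transform of one stereo sample pair."""
    mid = (left + right) >> 1    # floor of the average
    side = left - right          # difference between channels
    return mid, side


def from_mid_side(mid, side):
    """Exact inverse: the parity lost in mid is recovered from side."""
    left = mid + ((side + (side & 1)) >> 1)
    right = left - side
    return left, right


# Strongly correlated channels give small side values, which compress well.
samples = [(1000, 998), (1003, 1001), (-500, -497)]
print([to_mid_side(l, r) for l, r in samples])
```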
Color Space Correlation
 Correlation between color channels
 Map RGB to YUV color space
Y = 0.299*R + 0.587*G + 0.114*B
U = -0.169*R - 0.331*G + 0.500*B + 128.0
V = 0.500*R - 0.419*G - 0.081*B + 128.0

 Example in PNG
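The three formulas translate directly into code. This sketch keeps the results as floats; rounding and clamping to the 0..255 range, which an 8-bit implementation would need, are left out on purpose.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (0..255 per channel) with the slide's coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128.0
    return y, u, v


# A gray pixel carries no color information: U and V sit at the 128 midpoint.
print(rgb_to_yuv(128, 128, 128))
```

For natural images most of the detail ends up in Y, while U and V vary slowly, which is why they tend to compress better than the original color channels.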
Color Space Correlation
-- RGB to YUV Conversion
Channel sizes in the PNG example:
R 95 KB, G 96 KB, B 98 KB (289 KB in total)
Y 97 KB, U 32 KB, V 37 KB (166 KB in total)


Video Channel Correlation
 Multi-view channel in
3D video
 convert to Image and
Depth channel
 Disparity Estimation (like Motion Estimation)
Video Temporal Correlation
 Similarity between adjacent frames
 Motion estimation and motion compensation (mostly lossy)
Figure: search range and motion vector between the reference frame and the current frame
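Motion estimation is often done by block matching: for a block in the current frame, search a window in the reference frame for the best match. The sketch below is an exhaustive search using the sum of absolute differences (SAD) on frames stored as lists of rows; the function names, block size and search range are assumptions for this example, and real encoders use much faster search strategies.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(n) for i in range(n))


def best_motion_vector(ref, cur, cx, cy, n, search):
    """Exhaustive search; returns (dx, dy, cost) of the best reference block."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= width - n and 0 <= ry <= height - n:
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best


def frame_with_patch(x, y, size=8):
    """Black frame with a small 2x2 patch whose top-left corner is at (x, y)."""
    frame = [[0] * size for _ in range(size)]
    for j, row in enumerate([[10, 20], [30, 40]]):
        for i, value in enumerate(row):
            frame[y + j][x + i] = value
    return frame


reference = frame_with_patch(2, 2)
current = frame_with_patch(3, 4)      # the patch moved by (+1, +2)
print(best_motion_vector(reference, current, cx=3, cy=4, n=2, search=2))
```

The encoder then stores the motion vector plus the (ideally small) difference between the current block and the matched reference block.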
Lossless is not enough!
 The best lossless audio and image compression normally reduces data to about half its size
 Lossy audio compression like mp3 or ogg achieves a 1/20 ratio while retaining acceptable quality, and a 1/5 ratio for impeccable quality
 Lossy video compression can reduce a film to 1/300 of its size
Lossy Compression

Loss of data leads to a higher compression ratio
Lossy Compression
 Massively reduce information we don’t
notice
 Highly content specific
 Psychology
Lossy Audio Compression
 Frequency domain
 Quantization
 Importance varies between bands
 Higher frequency, larger quantization step
 Psychoacoustics
 Pitch resolution of the ear is only 2 Hz without beating
 Threshold of hearing varies in bands
 Simultaneous and temporal masking effect
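The "higher frequency, larger quantum" idea can be sketched as a quantizer whose step size grows with the band index. The doubling rule and the parameter names below are assumptions for illustration; real codecs derive their step sizes from a psychoacoustic model.

```python
def quantize_bands(coeffs, base_step=1.0, growth=2.0):
    """Quantize one coefficient per band; higher bands get coarser steps."""
    steps = [base_step * growth ** band for band in range(len(coeffs))]
    levels = [round(c / s) for c, s in zip(coeffs, steps)]
    return levels, steps


def dequantize_bands(levels, steps):
    """Reconstruct approximate coefficients from the integer levels."""
    return [level * step for level, step in zip(levels, steps)]


coeffs = [10.3, 10.3, 10.3, 10.3]          # same value in four bands
levels, steps = quantize_bands(coeffs)
restored = dequantize_bands(levels, steps)
print(levels, restored)
```

The reconstruction error in each band stays within half a step, so the error budget is deliberately spent where the ear is assumed to be less sensitive.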
Lossy Image Compression
 Frequency domain
 Discrete Cosine Transform (in JPEG)
 Discrete Wavelet Transform (in J2K)
 Quantization
 Reduce less important data

Image data -> Transform -> Quantization -> Entropy Coding -> Output data
JPEG 2000 vs. JPEG
Stage: Transform -> Quantization -> Entropy Coding
JPEG: Discrete Cosine Transform (DCT, 8x8) -> Quantization Table -> Huffman Coding
J2K: Discrete Wavelet Transform (DWT) -> Quantization for each sub-band -> Arithmetic Coding
Lossy Image Compression in Practice (1)
 Original
Lossy Image Compression in Practice (2)
 Transform domain coefficients.
 Only a few components are visible for each 8x8 block.
 The DC component is in the upper left of each block.
Lossy Image Compression in Practice (3)
 After quantization and IDCT.
 Note the clearly visible blocking artifacts.
 Compression ratio = 17.8:1 with an SNR of 20.1 dB, not including entropy encoding
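The three steps shown above (block transform, quantization, inverse transform) can be sketched for a single 8x8 block. This pure-Python example uses an orthonormal DCT-II and one uniform quantization step for all coefficients; the uniform step is a simplification made here, whereas JPEG uses a table with a different step per frequency.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a sequence."""
    n = len(values)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, v in enumerate(values))
            for k in range(n)]


def idct_1d(coeffs):
    """Inverse of dct_1d."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k, c in enumerate(coeffs))
            for i in range(n)]


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def dct_2d(block):
    """Separable 2D DCT: rows first, then columns."""
    rows = [dct_1d(row) for row in block]
    return transpose([dct_1d(col) for col in transpose(rows)])


def idct_2d(coeffs):
    rows = [idct_1d(row) for row in coeffs]
    return transpose([idct_1d(col) for col in transpose(rows)])


def quantize(coeffs, step):
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]


flat = [[100] * 8 for _ in range(8)]          # a uniform 8x8 block
levels = quantize(dct_2d(flat), step=16)
restored = idct_2d(dequantize(levels, step=16))
print(levels[0][0], round(restored[0][0]))
```

For a uniform block only the DC coefficient in the upper left is non-zero, which matches the observation that few components are visible per block.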
Lossy Video Compression
Motion Estimation
Motion Compensation

Figure: without motion compensation vs. with motion compensation


Frame Type
 Intra Frame (I)
 Predictive Frame (P)
 Bidirectional predictive Frame (B)
Video Compression Demo
 Motion vectors and bandwidth overlaid on MPEG-4 video using ffdshow-20041012
Reference
 Lossless Compression Algorithms
http://www.cs.cf.ac.uk/Dave/Multimedia/node207.html
 Monkey’s Audio
http://www.monkeysaudio.com/theory.html
 Lossless Audio (La)
http://www.lossless-audio.com/theory.htm
 Compression and speed of lossless audio formats
http://web.inter.nl.net/users/hvdh/lossless/main.htm
http://members.home.nl/w.speek/comparison.htm
 http://www.wordiq.com/definition/Wavelet_compression
 http://www.wordiq.com/definition/Psychoacoustics
 http://www.wordiq.com/definition/MP3
 H.264
http://www.komatsu-trilink.jp/device/pdf11/UBV2003.pdf
