DAS9T02 - Data Reduction

The document discusses audio data reduction techniques, including how redundancy and irrelevancy in audio signals can be exploited to compress files. It describes dividing audio into critical bands, Fourier analysis of time intervals, comparing signals to psychoacoustic models to determine relevant information, allocating bit budgets to sub-bands, and quantizing sub-band samples with assigned bits.

Uploaded by

chaminkv

Audio Data Reduction

- Why we use it

- How it works

- The current standards

- The pros and cons

│ DW-AKADEMIE │ Page 1
Why we use it

Data rate of a stereo bit stream:

16 bit x 48 k samples/s x 2 channels = 1536 kb/s (linear PCM)

• Produces high costs
  - storage
  - transmission

• Too much for
  - digital broadcasting
  - larger computer networks
  - the internet
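The data rate calculation above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `pcm_bit_rate` is chosen here for illustration and does not come from the slides.

```python
def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int, channels: int) -> int:
    """Return the bit rate of a linear PCM stream in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


# 16 bit, 48 kHz, stereo, as on the slide
stereo_rate = pcm_bit_rate(16, 48_000, 2)
print(stereo_rate / 1000, "kb/s")  # 1536.0 kb/s
```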

Comparison of data rates

storage time       Linear PCM   MPEG       MPEG       MPEG
(stereo)           1.5 Mb/s     384 kb/s   256 kb/s   128 kb/s
reduction factor   1            4          6          12
1 hour             0.7 GB       170 MB     120 MB     60 MB
1 day              16 GB        4 GB       2.8 GB     1.4 GB
1 week             120 GB       30 GB      20 GB      10 GB
1 month            500 GB       120 GB     80 GB      40 GB
1 year             6 TB         1.5 TB     1 TB       500 GB
10 years           60 TB        15 TB      10 TB      5 TB
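The table entries follow from multiplying bit rate by duration. The sketch below assumes decimal units (1 MB = 1 000 000 bytes) and uses the hypothetical helper `storage_bytes`; the slide values are rounded, so small differences are expected.

```python
def storage_bytes(bit_rate_bps: float, seconds: float) -> float:
    """Storage needed for a stream of the given bit rate and duration."""
    return bit_rate_bps * seconds / 8  # 8 bits per byte


HOUR = 3600
rates = {
    "Linear PCM": 1_536_000,
    "MPEG 384 kb/s": 384_000,
    "MPEG 256 kb/s": 256_000,
    "MPEG 128 kb/s": 128_000,
}
for label, rate in rates.items():
    megabytes = storage_bytes(rate, HOUR) / 1e6
    print(f"{label}: {megabytes:.1f} MB per hour")
```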

Capacity requirements

Audio                          linear PCM          MPEG
                               (48 kHz, 24 bit)    (256 kb/s)
• 1 hour stereo signal         1 GB                120 MB
• HD of 500 GB                 500 h               4 000 h
• low cost server 2000 GB      2 000 h             16 000 h
• large server 5 TB            5 000 h             40 000 h
• very large server 10 TB      10 000 h            80 000 h
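The hours of audio that fit on a given disk can be estimated the other way round, from capacity and bit rate. This is a rough sketch assuming decimal units and a hypothetical helper `hours_of_audio`; it lands close to the rounded figures in the table above.

```python
def hours_of_audio(capacity_bytes: float, bit_rate_bps: float) -> float:
    """Playing time in hours that fits into the given capacity."""
    return capacity_bytes * 8 / bit_rate_bps / 3600


pcm_24bit_stereo = 24 * 48_000 * 2   # 2 304 000 b/s
mpeg_256 = 256_000                   # 256 kb/s

print(round(hours_of_audio(500e9, pcm_24bit_stereo)))  # about 482 h (slide: 500 h)
print(round(hours_of_audio(500e9, mpeg_256)))          # about 4340 h (slide: 4 000 h)
```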

Capacity requirements

Video                          CCIR 601        MPEG 2         MPEG 2
                               SDI 270 Mb/s    ML 24 Mb/s     DVD 4 Mb/s
• 1 hour                       120 GB          11 GB          2 GB
• HD of 500 GB                 4 h             45 h           250 h
• low cost server 2000 GB      16 h            180 h          1 000 h
• large server 5 TB            40 h            450 h          2 500 h
• very large server 10 TB      80 h            900 h          5 000 h
• mass storage system 1 PB     8 000 h         90 000 h       500 000 h

The basic ideas of data reduction
- Digital audio contains REDUNDANCY
  (meaningless data)
  - e.g. 00000000000000000000000
    could be expressed as "23 x 0"
  - 010111
  - redundancy is removed through mathematical procedures
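The "23 x 0" idea is run-length encoding. The sketch below is one simple illustration of removing redundancy, not the procedure used by any particular codec; the function names are chosen here for the example.

```python
from itertools import groupby


def rle_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse each run of identical symbols into (symbol, run length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


print(rle_encode("0" * 23))   # [('0', 23)]
print(rle_encode("010111"))   # [('0', 1), ('1', 1), ('0', 1), ('1', 3)]
```

The second call shows the limit of the approach: a sequence with little repetition, such as 010111, does not get shorter.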

[Figure: digital audio data (linear PCM) divided into information and redundancy; label: 20 - 80%]

Compressing Audio Files

- Example of eliminating redundancy in an audio file
  - The original linear PCM file (e.g. in .WAV format):
    sound.wav (48 kHz, 24 bit), 100 MB
  - "Zipped" file using lossless data compression:
    sound.zip, 70 MB
  - "Un-zipped" file, identical to the original:
    sound.wav (48 kHz, 24 bit), 100 MB
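The lossless round trip can be tried with Python's standard `zlib` module. This sketch assumes a synthetic 16 bit test tone instead of a real .WAV file, so the compression ratio it reaches says nothing about real recordings; the point is only that the restored data is bit-identical.

```python
import math
import struct
import zlib


def sine_pcm16(freq_hz: float = 440.0, sample_rate: int = 48_000,
               seconds: float = 0.5) -> bytes:
    """Generate a mono 16 bit linear PCM test tone (illustrative input only)."""
    n = int(sample_rate * seconds)
    samples = (int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
               for i in range(n))
    return struct.pack(f"<{n}h", *samples)


original = sine_pcm16()
packed = zlib.compress(original, 9)      # lossless compression
restored = zlib.decompress(packed)

print(len(original), "bytes ->", len(packed), "bytes")
print("identical after unpacking:", restored == original)
```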

The basic ideas of data reduction

- Digital audio contains IRRELEVANCY (inaudible information)


- Irrelevancy is subjective and strongly depends on the signal itself

[Figure: digital audio data divided into relevant information (5 - 25%), irrelevant information and redundant data]

Irrelevancy in audio signals

- Any inaudible sound event is irrelevant
  - signals below the quiescent threshold (the threshold of hearing in quiet)
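A rough way to test whether a tone lies below the quiescent threshold is sketched below. It assumes a commonly cited closed-form approximation of the threshold in quiet (attributed to Terhardt), which is not part of the slides, and it ignores masking by other signals.

```python
import math


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate threshold of hearing in quiet, in dB SPL (assumed formula)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_irrelevant(level_db_spl: float, freq_hz: float) -> bool:
    """True if a tone of this level would fall below the threshold in quiet."""
    return level_db_spl < threshold_in_quiet_db(freq_hz)


print(is_irrelevant(10.0, 100.0))    # a quiet 100 Hz tone
print(is_irrelevant(10.0, 3000.0))   # the same level near 3 kHz
```

The ear is most sensitive around 3 to 4 kHz, so the same level can be inaudible at low frequencies and audible in that region.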

[Figure: level over frequency f, showing PCM noise, an inaudible signal and an audible signal]

Frequency masking

- Any inaudible sound event is irrelevant
  - signals below the quiescent threshold
  - signals masked by a neighbouring signal

[Figure: amplitude A over frequency f, showing the masking threshold]

Frequency masking

- listen to this example


[Audio example: a 750 Hz tone, with narrow-band noise (NBN) at 900 Hz and at 750 Hz; axes: time t, frequency f]

Temporal masking

- low level signals after a loud sound event will not be audible
  - post-masking

[Figure: masking threshold over time t; the masker is followed by the masked signal, time scale 100 ms]

Temporal masking

- Listen to this example
- Two short bursts of noise
- The first burst is masked by the preceding music

Temporal masking

- low level signals shortly before a loud sound event will remain inaudible
  - pre-masking

[Figure: masking threshold over time t; the masked signal precedes the masker, time scale 20 ms]

Temporal masking

- Listen to this example
- The second burst is pre-masked by the music that follows it

Data reduction encoding

- Dividing the audio band into critical bands
  - 32 up to 576 sub-bands
  - using filter banks and/or the
    modified discrete cosine transform (MDCT)

[Block diagram: linear PCM -> filter bank or MDCT -> sub-band signals]
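The MDCT step can be illustrated with a direct, unoptimised implementation. This is a sketch under simplifying assumptions: the block size N = 8 is far smaller than in a real codec, no window function is applied, and the function names are chosen for this example. It shows that overlapping blocks of 2N samples yield N coefficients each and that overlap-adding the inverse transforms restores the signal.

```python
import math


def mdct(block: list[float]) -> list[float]:
    """Map a block of 2N time samples to N frequency coefficients."""
    n_half = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_half)
    ]


def imdct(coeffs: list[float]) -> list[float]:
    """Map N coefficients back to 2N time samples (still containing aliasing)."""
    n_half = len(coeffs)
    return [
        sum(c * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n_half
        for n in range(2 * n_half)
    ]


N = 8
signal = [math.sin(0.3 * i) for i in range(3 * N)]

# Two blocks of 2N samples that overlap by N samples
first = imdct(mdct(signal[0:2 * N]))
second = imdct(mdct(signal[N:3 * N]))

# Overlap-add cancels the aliasing in the shared region
middle = [first[N + i] + second[i] for i in range(N)]
max_error = max(abs(m - s) for m, s in zip(middle, signal[N:2 * N]))
print("largest reconstruction error:", max_error)
```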

Data reduction encoding

- Fourier analysis of a time interval of the signal
  - FFT of 256 up to 1024 points
  - causing a delay of 8 to 24 ms

[Block diagram: linear PCM -> filter bank or MDCT; linear PCM -> Fast Fourier Transform -> description of a time block in the frequency domain]
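What "description of a time block in the frequency domain" means can be shown with a plain discrete Fourier transform, which gives the same result as an FFT but more slowly. The sketch assumes a 48 kHz sample rate, a 256-sample block and a 750 Hz test tone; all names are illustrative.

```python
import cmath
import math

SAMPLE_RATE = 48_000   # assumed sample rate in Hz
BLOCK = 256            # block length in samples


def dft_magnitudes(block: list[float]) -> list[float]:
    """Magnitude spectrum of one block (bins 0 .. N/2 - 1), computed naively."""
    n = len(block)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(block)))
        for k in range(n // 2)
    ]


tone = [math.sin(2 * math.pi * 750 * i / SAMPLE_RATE) for i in range(BLOCK)]
mags = dft_magnitudes(tone)

peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / BLOCK
print("strongest component at", peak_hz, "Hz")  # 750.0 Hz
```

With these settings the bins are 187.5 Hz apart, so the 750 Hz tone falls exactly on bin 4.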

Data reduction encoding

- Comparison of the signal with the psycho-acoustic model
  - the psycho-acoustic model is not standardised;
    it can be improved in future developments

[Block diagram: linear PCM -> filter bank or MDCT; linear PCM -> Fast Fourier Transform -> psycho-acoustic model -> information about the relevant critical bands and their coding requirements]

Data reduction encoding

- Distribution of the bit budget to the sub-bands
  - considers the available bit rate
  - assigns an optimum number of bits to each sub-band

[Block diagram: linear PCM -> filter bank or MDCT; linear PCM -> Fast Fourier Transform -> psycho-acoustic model -> bit allocation -> bit budget per sub-band; the bit allocation also receives information about the total bit rate]
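One simple way to distribute a bit budget is a greedy loop that always gives the next bit to the sub-band where the quantisation noise is most audible. The sketch below is a simplified illustration, not the allocation procedure of any MPEG layer. It assumes that the psycho-acoustic model delivers one signal-to-mask ratio (SMR) in dB per sub-band and that each extra bit lowers the noise by about 6 dB; the example values are made up.

```python
DB_PER_BIT = 6.02  # rule of thumb: one bit lowers quantisation noise by about 6 dB


def allocate_bits(smr_db: list[float], bit_budget: int, max_bits: int = 15) -> list[int]:
    """Greedily hand out bits to the sub-band with the highest noise-to-mask ratio."""
    bits = [0] * len(smr_db)
    for _ in range(bit_budget):
        # Noise-to-mask ratio per band after the bits assigned so far
        nmr = [smr - DB_PER_BIT * b if b < max_bits else float("-inf")
               for smr, b in zip(smr_db, bits)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # noise is below the masking threshold in every band
        bits[worst] += 1
    return bits


smr = [20.0, 5.0, -3.0, 12.0]    # hypothetical SMR values for four sub-bands
print(allocate_bits(smr, 6))     # [3, 1, 0, 2]
print(allocate_bits(smr, 100))   # [4, 1, 0, 2], stops once all noise is masked
```

The third band gets no bits at all because its signal already lies below the masking threshold.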

Data reduction encoding

- Quantisation of the sub-band samples
  - every sub-band is quantised with the assigned bits
  - a scale factor is extracted from the largest sample value

[Block diagram: linear PCM -> filter bank or MDCT -> sub-band coding and scale factor extraction; linear PCM -> Fast Fourier Transform -> psycho-acoustic model -> bit allocation -> sub-band coding]
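Scale factor extraction and quantisation can be sketched as follows. This is a minimal illustration that assumes a uniform, symmetric quantiser and at least 2 bits per sample; real codecs use tabulated scale factors and their own quantiser designs. The function names are hypothetical.

```python
def quantise_block(samples: list[float], bits: int) -> tuple[float, list[int]]:
    """Normalise a block to its largest value and quantise it with `bits` bits."""
    scale = max(abs(s) for s in samples) or 1.0   # scale factor of the block
    levels = (1 << (bits - 1)) - 1                # codes range from -levels to +levels
    codes = [round(s / scale * levels) for s in samples]
    return scale, codes


def dequantise_block(scale: float, codes: list[int], bits: int) -> list[float]:
    """Rebuild approximate sample values from the codes and the scale factor."""
    levels = (1 << (bits - 1)) - 1
    return [c / levels * scale for c in codes]


block = [0.02, -0.05, 0.01, 0.04]
scale, codes = quantise_block(block, bits=4)
restored = dequantise_block(scale, codes, bits=4)

print(scale, codes)   # 0.05 [3, -7, 1, 6]
print(max(abs(a - b) for a, b in zip(block, restored)))  # error below half a step
```

Because the block is normalised first, a quiet passage uses the full range of the few available bits.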

Data reduction encoding

- Coding and bit packing
  - serialisation of the sub-band information
  - entropy coding
  - inserting scale factor and side information

[Block diagram: linear PCM -> filter bank or MDCT -> sub-band coding and scale factor extraction -> coding and bit packing; linear PCM -> Fast Fourier Transform -> psycho-acoustic model -> bit allocation -> sub-band coding]
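Entropy coding gives short code words to frequent values and longer ones to rare values. The sketch below builds Huffman code lengths for a list of quantised values as one example of the idea; the codes actually used by a given standard come from fixed tables and are not reproduced here. The input values are made up.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols: list[int]) -> dict[int, int]:
    """Return the Huffman code length in bits for every distinct symbol."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth in the tree})
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


values = [0] * 8 + [1] * 4 + [-1] * 2 + [2] * 2   # hypothetical quantised samples
lengths = huffman_code_lengths(values)
total_bits = sum(lengths[v] for v in values)

print(lengths)      # {0: 1, 1: 2, -1: 3, 2: 3}
print(total_bits)   # 28 bits instead of 32 with a fixed 2 bit code
```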

Data reduction standards

- MPEG: international standard by ISO
- AC2, AC3: by Dolby Inc.
- ATRAC: by Sony, used for Minidisc only
- Ogg Vorbis: open standard, patent-free data reduction by Xiph.org
- WMA: Windows proprietary audio file format
- RealAudio: audio/video format especially for streaming
- G.7xx: for low delay audio over ISDN lines (7 kHz bandwidth)

The MPEG standards

- Moving Picture Experts Group
  - set up by the ISO
  - experts and interested companies
- MPEG 1, IS-11172, October 1992
  - part 3: audio coding
- MPEG 2, IS-13818, November 1994
  - part 3: low sample rate audio, multi-channel audio
  - part 7: Advanced Audio Coding (AAC), not backward compatible with MPEG 1
  - the originally intended MPEG 3 was included in MPEG 2
- MPEG 4, IS-14496, July 1999 (Version 2)
  - adaptive coding for very low bit rates and multimedia applications
  - introduces technologies like CELP and HVXC
  - deals with text to speech (TTS) applications

The MPEG standards

- MPEG 7, IS-15938, July 2001
  - multimedia objects description
- MPEG 21, IS-18034, December 2001
  - multimedia framework
  - strategies for content retrieval and management

MPEG 1

- Coding of moving pictures and associated audio for digital storage media
  at up to about 1.5 Mb/s
- Part 3 standardised the audio compression formats
- Three layers were standardised
  - Layer 1
  - Layer 2
  - Layer 3
- The three layers are downward compatible with each other

MPEG 1

- Layer 1
  - low complexity of encoder and decoder
  - low compression rate (4)
  - relatively high bit rates (192 kb/s/ch)
  - developed for Philips DCC
  - outdated today

[Figure: Layer 1]

MPEG 1

- Layer 2
  - medium complexity of encoder and decoder
  - medium compression rate (6)
  - moderate bit rates (128 kb/s/ch)
  - developed for DAB
  - most commonly used in the studio environment

[Figure: Layer 2, Layer 1]

MPEG 1

- Layer 3
  - high complexity of encoder and decoder
  - high compression rate (12)
  - low bit rates (64 kb/s/ch)
  - designed for signal transmission (ISDN)
  - all future MPEG standards are based on Layer 3

[Figure: Layer 3, Layer 2, Layer 1]

MPEG 1

Target bit rates of Layer 1, 2 and 3

[Figure: target bit rate ranges of Layer 3, Layer 2 and Layer 1 along the bit rate axis]

bit rate (kb/s/ch): 32, 64, 96, 128, 160, 192, 224, 256
data reduction factor (related to 16 bit/48 kHz): 24, 12, 8, 6, 5, 4, 3

MPEG Stereo Modes
- Mono
  - Only one channel is recorded and transmitted
  - If the input signal is stereo, the encoder will form the mono sum
- Stereo (dual mono)
  - This is the true stereo mode
  - Two fully independent audio channels (left and right)
    will be encoded and transmitted
- Joint Stereo (intensity stereo, mid-side stereo)
  - The encoder will eliminate additional redundancy of stereo signals
    by coding similar signals in the left and right channel only once
  - Joint stereo makes more effective use of the bit budget
    and will therefore reduce artefacts in the signal
  - Joint stereo produces a less clear stereo image
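The mid-side variant of joint stereo can be sketched as a simple sum and difference of the two channels. This shows only the matrixing step, under the assumption of plain floating point samples; how an encoder then quantises the mid and side signals is not shown.

```python
def ms_encode(left: list[float], right: list[float]) -> tuple[list[float], list[float]]:
    """Convert left/right into mid (common part) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid: list[float], side: list[float]) -> tuple[list[float], list[float]]:
    """Recover left/right from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [0.5, 0.25, -0.75]
right = [0.5, 0.25, -0.25]
mid, side = ms_encode(left, right)

print(mid)    # [0.5, 0.25, -0.5]
print(side)   # [0.0, 0.0, -0.25]
print(ms_decode(mid, side) == (left, right))  # True
```

Where both channels carry the same signal the side channel is zero, so it needs almost no bits. The mid signal is also the mono sum (scaled by one half) that the mono mode would transmit.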

Data Reduction Sound Demonstration

- MPEG 1 Layer 2 encoding with different bit rates
  - 384 kb/s   dual mono      compression rate 1:4
  - 256 kb/s   dual mono      compression rate 1:6
  - 192 kb/s   dual mono      compression rate 1:8
  - 128 kb/s   dual mono      compression rate 1:12
  - 128 kb/s   joint stereo   compression rate 1:12
  - 96 kb/s    dual mono      compression rate 1:16
  - 96 kb/s    joint stereo   compression rate 1:16
  - 64 kb/s    dual mono      compression rate 1:24
  - 64 kb/s    joint stereo   compression rate 1:24
  - 64 kb/s    mono           compression rate 1:12
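The compression rates in this list follow from dividing the linear PCM rate by the coded bit rate. The sketch assumes 16 bit/48 kHz PCM as the reference, which is the reference the deck uses elsewhere; the function name is illustrative.

```python
def reduction_factor(coded_kbps: float, channels: int = 2,
                     bits: int = 16, sample_rate_hz: int = 48_000) -> float:
    """Ratio of the linear PCM bit rate to the coded bit rate."""
    pcm_kbps = bits * sample_rate_hz * channels / 1000
    return pcm_kbps / coded_kbps


for kbps in (384, 256, 192, 128, 96, 64):
    print(f"{kbps} kb/s stereo -> 1:{reduction_factor(kbps):.0f}")
print(f"64 kb/s mono -> 1:{reduction_factor(64, channels=1):.0f}")
```

The mono entry reaches only 1:12 at 64 kb/s because the reference PCM stream has one channel instead of two.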
MPEG 1

- Comparison of Layer 2 and Layer 3 features

                             Layer 2    Layer 3
sub-bands                    32         576
entropy coding               no         yes
bit reservoir technology     no         yes
time delay                   24 ms      approx. 100 ms

MPEG 2
- Low sample rate audio
  - reduced sample rates, reduced audio bandwidth
  - reduction of audio bandwidth is less annoying than encoding artefacts
  - the compression format for Worldspace satellite radio
- Multi-channel applications
  - 5+1 audio channels
  - used for film, video and DVD applications (Europe)
- Advanced Audio Coding (AAC)
  - not backward compatible with MPEG 1
  - allows very low bit rates at improved quality
  - is widely used for audio files on the internet
  - the compression format for DRM (Digital Radio Mondiale)

Problems of data reduction

- data reduced audio is not identical to the original
  (it only sounds like the original)
  - (inaudible) loss in sound quality
- decay of quality with “generations” (repeated decoding and re-encoding)
  - the quality decay is not transparent
- block structure of the MPEG data
  - 24 ms to 100 ms
  - editing is not possible within the block
- delay of signal
  - encoding/decoding requires 100 ms to 200 ms
  - problems in real time applications
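Because editing is not possible within a block, an edit point in coded audio has to be moved to a block boundary. The sketch below assumes the 24 ms block length quoted above and integer millisecond times; the helper name `snap_edit_point` is made up for this example.

```python
FRAME_MS = 24  # block length from the list above (assumed fixed for this sketch)


def snap_edit_point(edit_ms: int, frame_ms: int = FRAME_MS) -> int:
    """Move a requested edit time to the nearest block boundary (ties go up)."""
    frame_index = (edit_ms + frame_ms // 2) // frame_ms
    return frame_index * frame_ms


print(snap_edit_point(1000))       # 1008
print(snap_edit_point(50))         # 48
print(snap_edit_point(250, 100))   # 300, with a 100 ms block
```

With 100 ms blocks the cut can land up to 50 ms away from the requested position, which is why fine editing is normally done on linear PCM.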

More Problems

- Costs for data reduction
  - specialised hardware or software produces extra cost
  - on the receiving end a special decoder is required

Conclusions

- Data reduction produces high quality audio but it has its limitations
- Data reduction can be used
  - to store signals more economically
  - to transmit signals more economically
  - to employ new transmission channels (e.g. ISDN)
  - in the broadcasting environment for simple radio productions
- Data reduction should not be used
  - if the signal is intended for later sound processing
  - during the production of music, drama
    or any other complex audio production
  - for archiving of important sound material
  - if it gives no particular advantages
