Chapt 06
1 Li & Drew
Sound Facts
Wave Characteristics
Frequency: the number of periods (cycles) in one second, measured in hertz (Hz), i.e. cycles per second.
Human hearing frequency range: 20 Hz to 20 kHz (audio).
[Figure: waveform of air pressure against time, with the amplitude marked]
Analog Audio
Most natural phenomena around us are continuous;
they are continuous transitions between two
different states.
Sound is no exception to this rule: it also varies continuously.
Continuously varying signals are represented by analog signals.
A signal is a continuous function f in the time domain.
For a value y = f(t), the argument t of the function f represents time. The graph of f is called a wave.
A wave has three characteristics:
Amplitude
Frequency, and
Phase
Amplitude
Amplitude is the intensity of the signal.
It can be determined by looking at the height of the signal.
If the amplitude increases, the sound becomes louder.
Amplitude measures how high or low the voltage of the signal is at a given point in time.
Frequency and Phase
Frequency:
It is the number of times the wave cycle repeats per unit of time. It can be determined by counting the number of cycles in a given time interval.
Frequency determines the pitch of the sound: a higher frequency gives a higher pitch.
Phase: the position of the wave within its cycle at a given time, i.e. how far the wave is shifted along the time axis.
When sound is recorded with a microphone, the microphone converts the sound into an analog representation.
A computer cannot work with analog signals directly.
This makes it necessary to convert analog audio into digital audio. How? See the next topic.
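The three wave characteristics can be illustrated with a single sine wave. The sketch below is only an illustration: it assumes the common model y = A * sin(2*pi*f*t + phase), and the function name `sine_sample` and its parameters are hypothetical, not taken from the slides.

```python
import math

def sine_sample(t, amplitude=1.0, frequency=440.0, phase=0.0):
    """Value at time t (seconds) of the wave y = A * sin(2*pi*f*t + phase)."""
    return amplitude * math.sin(2 * math.pi * frequency * t + phase)

# One cycle of a 1 Hz wave, looked at in quarter-cycle steps.
quarter_steps = [0.0, 0.25, 0.5, 0.75]

# Reference wave: amplitude 1, frequency 1 Hz, phase 0.
base = [sine_sample(t, amplitude=1.0, frequency=1.0) for t in quarter_steps]

# Doubling the amplitude doubles the height of every point (a louder sound).
louder = [sine_sample(t, amplitude=2.0, frequency=1.0) for t in quarter_steps]

# A phase of pi/2 shifts the wave a quarter cycle along the time axis.
shifted = [sine_sample(t, frequency=1.0, phase=math.pi / 2) for t in quarter_steps]

print(base)
print(louder)
print(shifted)
```

Changing `frequency` instead squeezes more cycles into the same time interval, which is heard as a higher pitch.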
Principles of Digitization
Sampling: Divide the horizontal axis (time) into discrete pieces.
Quantization: Divide the vertical axis (signal strength, i.e. voltage) into discrete levels. For example, 8-bit quantization divides the vertical axis into 256 levels; 16-bit quantization gives 65,536 levels. The fewer the quantization levels, the lower the quality of the sound.
Linear vs. Non-Linear quantization:
• If the scale used for the vertical axis is linear, we call it linear quantization.
• If it is logarithmic, we call it non-linear quantization (μ-law, or A-law in Europe). The non-linear scale is used because small-amplitude signals are more likely to occur than large-amplitude signals, and they are less likely to mask any noise.
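The difference between the two scales can be sketched in a few lines. This is a rough illustration only: it assumes input values normalized to the range -1.0 to 1.0 and the standard μ-law compression curve with μ = 255, and the function names are hypothetical.

```python
import math

def quantize_linear(x, bits):
    """Map x in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))          # clip out-of-range input
    return int((x + 1.0) / 2.0 * (levels - 1) + 0.5)

def mu_law_compress(x, mu=255.0):
    """Logarithmic (mu-law) compression of x in [-1.0, 1.0]."""
    sign = 1.0 if x >= 0 else -1.0
    return sign * math.log(1.0 + mu * abs(x)) / math.log(1.0 + mu)

def quantize_mu_law(x, bits, mu=255.0):
    """Non-linear quantization: compress first, then quantize linearly."""
    return quantize_linear(mu_law_compress(x, mu), bits)

# Two quiet samples that are close together in amplitude.
quiet_a, quiet_b = 0.01, 0.02
print(quantize_linear(quiet_a, 8), quantize_linear(quiet_b, 8))
print(quantize_mu_law(quiet_a, 8), quantize_mu_law(quiet_b, 8))
```

With 8 bits, the two quiet samples land on neighbouring levels on the linear scale but are spread further apart on the logarithmic scale, which is why the non-linear scale gives small amplitudes finer resolution.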
Sampling and Quantization
[Figure: two plots of sample value against time, illustrating sampling and quantization]
Digitization Process (Sampling, Quantization, and Coding)
[Figure: waveform with the sample points marked]
Example:
The sampling points in the above diagram are A, B, C, D, E, F, G, H, and I.
The value of the sample at point A falls between 2 and 3, say 2.6.
This value must be represented by the nearest available level, so we round it to 3. The 3 is then converted to binary and stored in the computer.
Similarly, the values of other sampling points are:
B=1
C=3
D=1
E=3
F=1
G=2
H=3
I=1
Most sample values do not fall exactly on a level, so they are quantized (rounded) in this way. After quantization, we convert the sample values into binary digits.
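The rounding and coding steps of this example can be sketched as follows. The 2-bit code width is an assumption (it is the smallest width that holds the levels 0 to 3 used here), only point A has a known unrounded value (2.6), and the helper names are hypothetical.

```python
import math

def quantize_to_nearest(value):
    """Round a non-negative sample value to the nearest integer level."""
    return math.floor(value + 0.5)

def to_binary(level, bits):
    """Fixed-width binary code for a quantized level."""
    return format(level, "0{}b".format(bits))

# Quantized levels from the example; A is rounded from its measured value 2.6.
levels = {
    "A": quantize_to_nearest(2.6),
    "B": 1, "C": 3, "D": 1, "E": 3, "F": 1, "G": 2, "H": 3, "I": 1,
}

# Assumed 2-bit code: enough for levels 0..3.
codes = {name: to_binary(level, 2) for name, level in levels.items()}
bitstream = "".join(codes[name] for name in "ABCDEFGHI")

print(codes)
print(bitstream)
```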
Sample Rate
A sample is a single measurement of amplitude.
The sample rate is the number of these measurements taken every
second.
To accurately represent all of the frequencies in a recording that fall within the range of human perception, generally accepted as 20 Hz to 20 kHz, we must choose a sample rate high enough to represent all of these frequencies.
At first, one might choose a sample rate of 20 kHz, since this matches the highest frequency.
This will not work, however, because every cycle of a waveform has both a positive and a negative amplitude, and it is the rate of alternation between positive and negative amplitudes that determines frequency.
Therefore, we need at least two samples for every cycle, resulting in a sample rate of at least 40 kHz.
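A small experiment shows why one sample per cycle is not enough. The sketch below samples a 20 kHz cosine (chosen so that the samples fall on the peaks of the wave) at 20 kHz and at 40 kHz; the function name is hypothetical.

```python
import math

def sample_cosine(frequency_hz, sample_rate_hz, count):
    """First `count` samples of a unit-amplitude cosine at the given rate."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(count)
    ]

# One sample per cycle: every sample hits the same point of the wave,
# so the result looks like a constant value, not a 20 kHz tone.
at_20k = sample_cosine(20_000, 20_000, 6)

# Two samples per cycle: the samples alternate between the positive
# and negative peaks, so the alternation (the frequency) is captured.
at_40k = sample_cosine(20_000, 40_000, 6)

print([round(s, 3) for s in at_20k])
print([round(s, 3) for s in at_40k])
```

Two samples per cycle is the bare minimum; with a differently shifted wave the samples could land near the zero crossings, which is one reason practical rates such as 44.1 kHz sit somewhat above 40 kHz.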
Sampling Theorem
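The sampling frequency (rate) is very important for producing an accurate digital version of an analog waveform.
Nyquist Theorem
To capture a signal without aliasing, the sampling rate must be at least twice the highest frequency in the signal. Half the sampling rate is called the Nyquist frequency.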
Aliasing
What exactly happens to frequencies that lie above the Nyquist frequency? First, we'll look at a frequency that was sampled accurately:
In this case, there are more than two samples for every cycle, and the measurement is a good approximation of the original wave.
We will get back the same signal we put in when it is later converted back to analog.
Aliasing
Remember: speakers can play only analog sound. You have to convert digital audio back to analog when you play it.
If we undersample the signal, though, we will get a very different result:
Aliasing
In this diagram, the blue wave (the one with short cycles) is the original
frequency.
The red wave (the one with lower frequency) is the aliased frequency
produced from an insufficient number of samples.
This frequency, which was in all likelihood a high partial in a complex
timbre, has folded over and is now below the Nyquist frequency.
For example, an 11 kHz frequency sampled at 18 kHz would produce an alias frequency of 7 kHz.
This will alter the timbre of the recording in an unacceptable way.
Undersampling causes frequency components that are higher than half of the sampling frequency to overlap with the lower frequency components.
As a result, the higher frequency components fold into the reconstructed signal and distort it.
This type of signal distortion is called aliasing.
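The fold-over in the 11 kHz example can be checked numerically. The sketch below assumes a pure tone and ideal sampling; `alias_frequency` is a hypothetical helper that folds a frequency into the range from 0 to half the sampling rate.

```python
import math

def alias_frequency(frequency_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling (0 .. sample_rate/2)."""
    remainder = frequency_hz % sample_rate_hz
    return min(remainder, sample_rate_hz - remainder)

def sample_cosine(frequency_hz, sample_rate_hz, count):
    """First `count` samples of a unit-amplitude cosine at the given rate."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(count)
    ]

sample_rate = 18_000
print(alias_frequency(11_000, sample_rate))  # above the 9 kHz Nyquist frequency
print(alias_frequency(5_000, sample_rate))   # below it, so unchanged

# The samples of the 11 kHz tone are indistinguishable from those of a
# 7 kHz tone, which is why the converter reconstructs 7 kHz.
original = sample_cosine(11_000, sample_rate, 20)
aliased = sample_cosine(7_000, sample_rate, 20)
print(max(abs(a - b) for a, b in zip(original, aliased)))
```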
Common Sampling Rates
8 kHz: Telephone
11.025 kHz: Speech audio
22.05 kHz: Low-grade audio (WWW audio, AM radio)
44.1 kHz: CD-quality audio
Audio Quality vs. Data Rate
• The uncompressed data rate increases as more bits are used for quantization. Stereo doubles the bandwidth needed to transmit a digital audio signal.
Sample Resolution/Sample Size
Each sample can only be measured to a certain degree of accuracy.
The accuracy is dependent on the number of bits used to represent the amplitude,
which is also known as the sample resolution.
How do we store each sample value (quantized value)?
8 Bit Value (0-255)
16 Bit Value (Integer) (0-65535)
The amount of memory (in bits) required to store a recording t seconds long is as follows:
If we use 8-bit resolution and mono recording
memory = f*t*8*1
If we use 8-bit resolution and stereo recording
memory = f*t*8*2
If we use 16-bit resolution and mono recording
memory = f*t*16*1
If we use 16-bit resolution and stereo recording
memory = f*t*16*2
where f is the sampling frequency in Hz, and
t is the duration in seconds
Examples:
Abebe sampled audio for 10 seconds. How much storage space is required if
a) a 22.05 kHz sampling rate and 8-bit resolution are used with mono recording?
b) a 44.1 kHz sampling rate and 8-bit resolution are used with mono recording?
c) a 44.1 kHz sampling rate and 16-bit resolution are used with stereo recording?
d) an 11.025 kHz sampling rate and 16-bit resolution are used with stereo recording?
Solution:
a) m = 22050*8*10*1
m = 1764000 bits = 220500 bytes = 220.5 KB
b) m = 44100*8*10*1
m = 3528000 bits = 441000 bytes = 441 KB
c) m = 44100*16*10*2
m = 14112000 bits = 1764000 bytes = 1764 KB
d) m = 11025*16*10*2
m = 3528000 bits = 441000 bytes = 441 KB
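The four answers can be reproduced with a short calculation. The sketch below follows the same formula (sampling rate * bits per sample * duration * channels) and, like the solution, takes 1 KB as 1000 bytes; the function names are hypothetical.

```python
def storage_bits(sample_rate_hz, bits_per_sample, seconds, channels):
    """Uncompressed size in bits: rate * resolution * duration * channels."""
    return sample_rate_hz * bits_per_sample * seconds * channels

def storage_kilobytes(sample_rate_hz, bits_per_sample, seconds, channels):
    """Same size in KB, using 8 bits per byte and 1 KB = 1000 bytes."""
    bits = storage_bits(sample_rate_hz, bits_per_sample, seconds, channels)
    return bits / 8 / 1000

# (sampling rate in Hz, bits per sample, seconds, channels) for cases a) to d)
cases = {
    "a": (22050, 8, 10, 1),
    "b": (44100, 8, 10, 1),
    "c": (44100, 16, 10, 2),
    "d": (11025, 16, 10, 2),
}

for label, args in cases.items():
    print(label, storage_bits(*args), "bits =", storage_kilobytes(*args), "KB")
```

Cases b) and d) come out the same size: quartering the sampling rate exactly offsets doubling both the resolution and the number of channels.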
Implications of Sample Rate and Bit Size
Affects Quality of Audio
Affects Size of Data
An Ideal Recording
We should all strive for an ideal recording.
First, don't ignore the analog stage of the process.
Use a good microphone, careful microphone placement, high-quality cables, and a reliable analog-to-digital converter.
Strive for a hot (high-level), clean signal.
Second, when you sample, try to get the maximum signal level as close to zero (0 dB, full scale) as possible without clipping.
That way you maximize the inherent signal-to-noise ratio of the medium.
Third, avoid conversions to analog and back if possible. You may need to convert the signal to run it through an analog mixer or through the analog inputs of a digital effects processor.
Each time you do this, though, you add the noise in the analog signal to the subsequent digital reconversion.
Quantization (Quality ->SNR)
In any analog system, some of the voltage is what you want to measure (signal), and some of it is random fluctuations (noise).
SNR: the signal-to-noise ratio captures the quality of a signal, measured in decibels (dB):
$$\mathrm{SNR} = 10 \log_{10} \frac{V_{signal}^2}{V_{noise}^2} = 20 \log_{10} \frac{V_{signal}}{V_{noise}}$$
Signal to Quantization Noise Ratio (SQNR)
The quantization error (or quantization noise) is the difference between the actual value of the analog signal at the sampling time and the nearest quantization interval value.
The largest (worst) quantization error is half of the quantization interval.
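The SNR definition can be tried out numerically. In the sketch below, `snr_db` follows the formula on this slide directly, while `sqnr_db` is an added worst-case estimate that is not stated on the slide: it assumes a peak signal of 2^(N-1) quantization intervals and a largest error of half an interval. Both function names are hypothetical.

```python
import math

def snr_db(v_signal, v_noise):
    """Signal-to-noise ratio in dB from signal and noise voltages."""
    return 20 * math.log10(v_signal / v_noise)

def sqnr_db(bits):
    """Assumed worst-case SQNR: peak of 2**(bits-1) steps over an error of 1/2 step."""
    return 20 * math.log10(2 ** (bits - 1) / 0.5)

# A signal voltage ten times the noise voltage is 20 dB.
print(snr_db(10.0, 1.0))

# The power form (10 * log10 of the squared ratio) gives the same value.
print(10 * math.log10(10.0 ** 2 / 1.0 ** 2))

# Under the stated assumptions, each extra bit adds about 6.02 dB.
print(round(sqnr_db(8), 2), round(sqnr_db(16), 2))
```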
Miscellaneous Audio Facts
A simple and widely used audio compression method is Adaptive Differential Pulse Code Modulation (ADPCM). Based on past samples, it predicts the next sample and encodes the difference between the actual value and the predicted value.
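The predict-and-encode-the-difference idea can be sketched in a simplified form. The code below is not ADPCM itself: it is a plain, non-adaptive, lossless variant that assumes the prediction is simply the previous sample, whereas real ADPCM also quantizes the difference and adapts its step size. The function names are hypothetical.

```python
def dpcm_encode(samples):
    """Encode integer samples as differences from the predicted value.

    The prediction here is just the previous sample (0 before the first).
    """
    differences = []
    predicted = 0
    for sample in samples:
        differences.append(sample - predicted)
        predicted = sample
    return differences

def dpcm_decode(differences):
    """Rebuild the samples by adding each difference to the prediction."""
    samples = []
    predicted = 0
    for difference in differences:
        sample = predicted + difference
        samples.append(sample)
        predicted = sample
    return samples

samples = [100, 102, 105, 103, 103]
encoded = dpcm_encode(samples)
print(encoded)
print(dpcm_decode(encoded) == samples)
```

Because neighbouring audio samples tend to be close in value, the differences are usually much smaller than the samples and can be stored with fewer bits.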