Sub-Band Coding
In signal processing, sub-band coding (SBC) is any form of transform coding that breaks a signal into a
number of different frequency bands, typically by using a filter bank or a fast Fourier transform, and encodes each one
independently. This decomposition is often the first step in data compression for audio and video signals.
SBC is the core technique used in many popular lossy audio compression algorithms including MP3.
The more bits used to represent each sample, the finer the granularity in the digital representation, and thus
the smaller the quantization error. Such quantization errors may be thought of as a type of noise, because
they are effectively the difference between the original source and its binary representation. With pulse-code
modulation (PCM), the audible effects of these errors can be mitigated with dither and by using enough bits to
ensure that the noise is low enough to be masked either by the signal itself or by other sources of noise. A
high-quality signal is possible, but at the cost of a high bitrate (e.g., over 700 kbit/s for one channel of CD
audio). In effect, many
bits are wasted in encoding masked portions of the signal because PCM makes no assumptions about how
the human ear hears.
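The relationship between bit depth, bitrate, and quantization error can be checked numerically. The following is a minimal sketch, assuming a uniform quantizer over the range [-1, 1) and a sine-wave test signal; the function names are illustrative and not part of any standard.

```python
import math

def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate in bit/s of uncompressed linear PCM."""
    return sample_rate_hz * bits_per_sample * channels

def quantize(x, bits):
    """Round x to the nearest level of a uniform quantizer with step 2 / 2**bits."""
    step = 2.0 / (2 ** bits)  # full scale is assumed to span [-1.0, 1.0)
    return round(x / step) * step

def rms_quantization_error(bits, n=1000):
    """RMS difference between a test sine wave and its quantized version."""
    signal = [0.9 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
    errors = [s - quantize(s, bits) for s in signal]
    return math.sqrt(sum(e * e for e in errors) / n)

# One channel of CD audio: 44,100 samples/s at 16 bits per sample.
print(pcm_bitrate(44100, 16))  # 705600 bit/s

# The error shrinks as more bits are spent on each sample.
for bits in (8, 12, 16):
    print(bits, rms_quantization_error(bits))
```

Each additional bit halves the quantizer step, so the error falls steadily with bit depth while the bitrate grows in direct proportion to it.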
Coding techniques reduce bitrate by exploiting known characteristics of the auditory system. A classic
method is nonlinear PCM, such as the μ-law algorithm. Small signals are digitized with finer granularity
than are large ones; the effect is to add noise that is proportional to the signal strength. Sun's Au file format
for sound is a popular example of μ-law encoding. Using 8-bit μ-law encoding would cut the per-channel
bitrate of CD audio down to about 350 kbit/s, half the standard rate. Because this simple method
only minimally exploits masking effects, it produces results that are often audibly inferior to the
original.
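The finer granularity for small signals can be seen in a short sketch of μ-law companding. It assumes the continuous companding curve with μ = 255 followed by a plain uniform 8-bit quantizer, which is a simplification and not the segmented encoding used by real telephony codecs; all function names are illustrative.

```python
import math

MU = 255  # companding parameter; 255 is the value commonly used with 8-bit codes

def mu_compress(x):
    """Map x in [-1, 1] through the mu-law curve (logarithmic compression)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode_8bit(x):
    """Compress, then quantize uniformly to a signed code in [-127, 127]."""
    return round(mu_compress(x) * 127)

def decode_8bit(code):
    """Map a signed code back to an amplitude in [-1, 1]."""
    return mu_expand(code / 127)

# Spacing between adjacent reconstruction levels near zero and near full scale.
step_small = decode_8bit(1) - decode_8bit(0)
step_large = decode_8bit(127) - decode_8bit(126)
print(step_small, step_large)

# Round-trip error for signals of different strength.
for x in (0.001, 0.01, 0.1, 0.9):
    print(x, abs(x - decode_8bit(encode_8bit(x))))
```

The reconstruction levels are packed closely around zero and spread out toward full scale, so the quantization noise grows roughly in proportion to the signal level.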
Basic principles
The utility of SBC is perhaps best illustrated with a specific example. When used for audio compression,
SBC exploits masking in the auditory system. Human ears are normally sensitive to a wide range
of frequencies, but when a sufficiently loud signal is present at one frequency, the ear will not hear weaker
signals at nearby frequencies. We say that the louder signal masks the softer ones.
The basic idea of SBC is to enable a data reduction by discarding information about frequencies which are
masked. The result differs from the original signal, but if the discarded information is chosen carefully, the
difference will not be noticeable or, more importantly, objectionable.
First, a digital filter bank divides the input signal spectrum into some number (e.g., 32) of subbands. The
psychoacoustic model looks at the energy in each of these subbands, as well as in the original signal, and
computes masking thresholds using psychoacoustic information. Each of the subband samples is quantized
and encoded so as to keep the quantization noise below the dynamically computed masking threshold. The
final step is to format all these quantized samples into groups of data called frames, to facilitate eventual
playback by a decoder.
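The quantization step described above can be illustrated with a toy bit allocator. This is a sketch under stated assumptions, not the procedure of any particular codec: it uses the rule of thumb that each bit of a uniform quantizer lowers the noise floor by about 6.02 dB, and the per-band levels and masking thresholds are invented numbers, not the output of a real psychoacoustic model. The names `allocate_bits`, `signal_db`, and `mask_db` are hypothetical.

```python
import math

DB_PER_BIT = 6.02  # approximate noise reduction per extra quantizer bit

def allocate_bits(signal_db, mask_db, max_bits=15):
    """Pick the fewest bits per sub-band that push the noise below the mask.

    signal_db and mask_db hold one level per sub-band, in dB.
    """
    bits = []
    for s, m in zip(signal_db, mask_db):
        smr = s - m  # signal-to-mask ratio for this band
        if smr <= 0:
            bits.append(0)  # band is fully masked: spend no bits on it
        else:
            bits.append(min(max_bits, math.ceil(smr / DB_PER_BIT)))
    return bits

# Four hypothetical sub-bands (illustrative values only).
signal_db = [60.0, 42.0, 30.0, 12.0]
mask_db = [20.0, 35.0, 33.0, 15.0]
print(allocate_bits(signal_db, mask_db))  # [7, 2, 0, 0]
```

Bands whose signal already lies below the masking threshold receive no bits at all, which is where most of the data reduction comes from.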
Decoding is much easier than encoding, since no psychoacoustic model is involved. The frames are
unpacked, subband samples are decoded, and a frequency-time mapping reconstructs an output audio
signal.
Applications
Beginning in the late 1980s, a standardization body, the Moving Picture Experts Group (MPEG),
developed standards for coding of both audio and video. Sub-band coding lies at the heart of, for example,
the popular MP3 format (more properly known as MPEG-1 Audio Layer III).
Sub-band coding is also used in the G.722 codec, which uses sub-band adaptive differential pulse code
modulation (SB-ADPCM) within a bit rate of 64 kbit/s. In the SB-ADPCM technique, the frequency band
is split into two sub-bands (higher and lower) and the signals in each sub-band are encoded using ADPCM.
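A two-band split and its reconstruction can be sketched with the simplest possible filter pair. This example swaps in the two-tap Haar filters for brevity; G.722 itself uses longer quadrature mirror filters and then applies ADPCM to each band, neither of which is shown here. The function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def analyze(x):
    """Split an even-length signal into low and high sub-bands at half the rate."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def synthesize(low, high):
    """Recombine the two sub-bands into a full-rate signal."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)
        out.append((l - h) / SQRT2)
    return out

# A slowly varying test signal: most of its energy ends up in the low band.
x = [math.sin(2 * math.pi * i / 32) for i in range(32)]
low, high = analyze(x)
y = synthesize(low, high)

print(sum(v * v for v in low), sum(v * v for v in high))
print(max(abs(a - b) for a, b in zip(x, y)))  # reconstruction error
```

Each band carries half as many samples as the input, so the split by itself adds no data; the saving comes from encoding the band that carries less energy with fewer bits.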
External links
Sub-Band Coding Tutorial (https://web.archive.org/web/20070613152917/http://www.otolith.com/otolith/olt/sbc.html)