Report 1
All content following this page was uploaded by Bagus Tris Atmaja on 23 January 2019.
1 The Physiology of Hearing
1.1 Outer ear
1.2 Middle ear
1.3 Inner ear
2 Mechanism of Hearing
2.1 From sound into vibration
2.2 From mechanical vibration into fluid vibration
2.3 From fluid vibration into nerve impulse
1 The Physiology of Hearing
The function of each auditory peripheral can be studied through physiology, while its
structure and parts can be studied through anatomy. Based on location and on the form
of the signal inside it (acoustic, vibration or electric), the human ear can be divided
into three parts: the outer ear, the middle ear and the inner ear. Figure 1 shows the
anatomy of the human ear.
dB. This enhancement is most pronounced for sounds in the frequency range of
roughly 2 to 7 kHz and so, in part, determines the frequencies to which the ear is
most sensitive [4]. Here, the function of the concha is described together with the other
peripherals, as it lies between the pinna and the ear canal.
• Auditory ossicles
The auditory ossicles consist of the three smallest bones in the human body, which
transfer the vibration of the tympanic membrane to the cochlea:
– malleus (hammer): forms a rigid connection with the incus
– incus (anvil): forms a flexible connection with the stapes
– stapes (stirrup): connects to oval window
The inward-outward movement of the tympanum displaces the malleus and incus
and the action of these two bones alternately drives the stapes deeper into the oval
window and retracts it, resulting in a cyclical movement of fluid within the inner ear.
• Eustachian tube
The Eustachian tube, or auditory tube, helps ventilate the middle ear and maintain
equal air pressure on both sides of the tympanic membrane, inside the middle ear and
outside the body, via the nasopharynx (the nasal part of the pharynx, lying behind the
nose and above the level of the soft palate).
The overall function of the middle ear is to amplify the vibrations of the tympanic
membrane and pass them to the oval window. It also matches the low acoustic impedance
of air to the high acoustic impedance of the inner-ear fluid (impedance matching). The
anatomy of the middle ear is shown in more detail in Figure 2.
Figure 2: Anatomy of middle ear
degrees to each other for maximum ability to detect angular rotation of the
head. These organs mediate interactions between the vestibular system and the eye
muscles via cranial nerves. Hence, they support smooth movement of the eyes
toward the left and right, keeping the visual field stable as the head turns.
These fluid-filled tubes are the main organs for keeping the body balanced.
• Cochlea
The cochlea is the main peripheral of the auditory system. It consists of the following:
– Two membranes:
∗ Reissner’s membrane
Together with the basilar membrane, it creates a compartment in the
cochlea filled with endolymph, which is important for the function of
the spiral organ of Corti. Based on experimental evidence, Reissner’s
membrane is believed to play an important role in otoacoustic emissions, as a
wave on Reissner’s membrane can propagate along the whole extent of the cochlea.
∗ Basilar membrane
It forms the division between the scala media and the scala tympani. The physical
characteristics of the basilar membrane (BM) cause different frequencies to reach
maximum amplitudes at different positions, so the BM performs frequency
selectivity in the manner of a filter bank. Hence, the BM is effectively a
continuous array of filters that decomposes a complex sound waveform into its
constituent frequency components.
– Three compartments:
∗ Scala vestibuli (vestibular duct): conducts sound vibrations to the cochlear duct.
∗ Scala tympani (tympanic duct): works together with the vestibular duct to
transduce the movement of air, which causes the tympanic membrane and the
ossicles to vibrate, into movement of the liquid and the basilar membrane.
∗ Scala media (cochlear duct): houses the organ of Corti, which transforms fluid
vibration into nerve impulses.
– Oval window
It receives vibration from the stapes and transmits it to the base of the basilar membrane.
– Round window
It vibrates with opposite phase to vibrations entering the inner ear through
the oval window. It allows fluid in the cochlea to move, which in turn ensures
that hair cells of the basilar membrane will be stimulated and that audition
will occur.
– Organ of Corti
The organ of Corti transduces auditory signals and maximizes the hair cells’
extraction of sound energy. It consists of the following two types of hair cells
and the tectorial membrane:
∗ Inner hair cells (IHC): detect the sound and transmit it to the brain via
the auditory nerve.
∗ Outer hair cells (OHC): perform an amplifying role.
∗ Tectorial membrane (TM): its function in humans is not yet clear, but the TM
may be involved in the longitudinal propagation of energy in the intact
cochlea [6]. The stereocilia on the tips of the hair cells respond to fluid
motion when the basilar membrane is displaced to the right or left.
• Endolymph and perilymph
Endolymph is the fluid contained in the membranous labyrinth of the inner ear, while
perilymph is the extracellular fluid inside the perilymphatic space, which is located
between the outer wall of the membranous labyrinth and the wall of the bony labyrinth.
Their function is to regulate the electrochemical impulses of hair cells. Perilymph
resembles extracellular fluid in composition (sodium salts are the predominant
positive electrolyte), while endolymph resembles intracellular fluid in composition
(potassium is the main cation).
Endolymph also has two further functions:
– Hearing: fluid waves in the endolymph of the cochlear duct stimulate the
receptor cells, which in turn translate their movement into nerve impulses
that the brain perceives as sound.
– Balance: angular acceleration of the endolymph in the semicircular canals
stimulates the vestibular receptors of the endolymph. The semicircular canals
of both inner ears act in concert to coordinate balance.
The detailed anatomy of the inner ear explained above is shown in Figure 3.
2 Mechanism of Hearing
Hearing converts a sound wave into a perceived sound by way of nerve impulses
(electrical activity). This process is known as auditory transduction. On its way to
the nervous system, the energy of the sound wave passes through three
transformations: vibration, hydraulic motion (fluid vibration) and nerve impulse. The
mechanism of hearing is therefore divided into those three parts.
Figure 3: Detail anatomy of inner ear
The minimum detectable level of sound reaching the pinna corresponds to an energy flow
(or intensity) of $10^{-12}$ W/m$^2$ in a sound pressure wave (the threshold of hearing).
Owing to the shape of the outer ear, the ear is most sensitive to frequencies from 1000
to 4000 Hz. In the outer ear, sound frequencies from 1500 to 7000 Hz are amplified by
about 10-15 dB. The last step in this stage is the sound wave hitting the eardrum,
causing mechanical vibration through the ossicles.
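As a minimal sketch of the decibel arithmetic behind these figures, the Python snippet below converts an intensity into a level relative to the threshold of hearing and applies a gain given in dB. The function names are illustrative, and the standard relation $L = 10 \log_{10}(I/I_0)$ is assumed, since the text does not state it explicitly.

```python
import math

I0 = 1e-12  # W/m^2, threshold of hearing quoted in the text


def intensity_level_db(intensity, reference=I0):
    """Sound intensity level in dB relative to the reference intensity."""
    return 10.0 * math.log10(intensity / reference)


def apply_gain_db(intensity, gain_db):
    """Intensity (W/m^2) after applying a gain expressed in dB."""
    return intensity * 10.0 ** (gain_db / 10.0)


# Outer-ear gain of 10 to 15 dB applied to a sound at the threshold of hearing.
boosted_low = apply_gain_db(I0, 10.0)   # 10 dB corresponds to 10x the intensity
boosted_high = apply_gain_db(I0, 15.0)  # 15 dB corresponds to about 31.6x

print(intensity_level_db(I0))            # level of the threshold itself
print(boosted_low / I0, boosted_high / I0)
```

A gain of 10-15 dB therefore corresponds to roughly a 10-fold to 32-fold increase in intensity.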
Figure 5: Vibrations in the middle ear [7]
The illustration in Figure 5 shows that the area of the stapes footplate is very
small compared to the area of the tympanic membrane. The actual ratio is 17. Hence, the
pressure amplification due to the area difference is about 17.
= 17 × 1.2 = 20
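The arithmetic above can be sketched in a few lines of Python. The area ratio of 17 comes from the text; the factor 1.2 is assumed here to be the lever ratio of the ossicles, which the surviving text does not name, and converting the product to decibels with $20 \log_{10}$ assumes that it acts as a pressure gain.

```python
import math

AREA_RATIO = 17.0   # tympanic membrane area / stapes footplate area (from the text)
LEVER_RATIO = 1.2   # assumption: ossicular lever ratio (not named in the text)


def middle_ear_gain(area_ratio=AREA_RATIO, lever_ratio=LEVER_RATIO):
    """Combined gain of the middle ear as the product of the two ratios."""
    return area_ratio * lever_ratio


def pressure_gain_db(gain):
    """Express a pressure gain factor in decibels."""
    return 20.0 * math.log10(gain)


gain = middle_ear_gain()
gain_db = pressure_gain_db(gain)
print(gain, gain_db)
```

Under these assumptions the product is 20.4, which the text rounds to 20, or roughly 26 dB.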
Now the signal takes the form of fluid vibration in the cochlear duct, as the stapes
pushes the oval window back and forth.
Figure 6: Sound transduction in coiled (top) and uncoiled cochlea (bottom)
the tectorial membrane, causing the stereocilia on the hair cells to be bent to the right.
Displacement of the BM toward the scala tympani produces the opposite effect, causing
the stereocilia to be bent to the left. When the stereocilia are bent to the right, the tip
links are stretched and ion channels are opened. Positively charged potassium ions (K+)
enter the cell, causing the interior of the cell to become more positive (depolarization).
This stimulates the hair cells to send nerve impulses to the cochlear nerve and on to the
brain. When the stereocilia are bent in the opposite direction, the tip links slacken and
the channels close. In this way the cochlea encodes the loudness and pitch of the sounds
we perceive. Loudness corresponds to the height of the fluid wave, and the perceived
frequencies correspond to the frequency selectivity of the basilar membrane.
Figure 6 shows the coiled cochlea and its uncoiled section, where sound transduction is
performed to convert fluid vibration into nerve impulses.
If a system does not satisfy the two principles above, then it is a nonlinear system.
The following subsection describes the evidence that our hearing system is nonlinear.
$$BW = f_{ch} - f_{cl}$$
As the auditory filter is a bank of filters, the bandwidth above is not a single value but
many (overlapping) bandwidths. Psychoacoustic studies reveal that human perception
of the frequency content of sound does not follow a linear scale. This can be observed by
listening to linearly spaced tones versus nonlinearly spaced tones (e.g. logarithmically
spaced tones). Therefore, auditory researchers have proposed several scales for this
filter bank: the Bark, ERB and mel scales.
The Bark scale ranges from 1 to 24 Barks, corresponding to the first 24 critical bands
of hearing. The published Bark cut-off frequencies are given in Hertz as [0, 100, 200,
300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400,
5300, 6400, 7700, 9500, 12000, 15500]. The published band centers in Hertz are [50, 150,
250, 350, 450, 570, 700, 840, 1000, 1170, 1370, 1600, 1850, 2150, 2500, 2900, 3400, 4000,
4800, 5800, 7000, 8500, 10500, 13500] [9]. The following formula can be used to convert
frequency in Hz to Bark scale.
$$F_{Bark} = 6 \sinh^{-1}\left(\frac{f}{600}\right).$$
This Bark scale is derived from the critical band concept (Zwicker’s model). Although the
critical band terminology is considered incorrect in psychoacoustics, it is widely used
in engineering.
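The two descriptions of the Bark scale given above, the published band edges and the conversion formula, can be compared with a short Python sketch. The band edges are copied from the list in the text. The conversion assumes the commonly cited form $6 \sinh^{-1}(f/600)$, and the function names are illustrative rather than taken from any library.

```python
import math
from bisect import bisect_right

# Published Bark band edges in Hz, as listed in the text (25 edges, 24 bands).
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
                 9500, 12000, 15500]


def hz_to_bark(f_hz):
    """Approximate Bark value, assuming the form 6 * asinh(f / 600)."""
    return 6.0 * math.asinh(f_hz / 600.0)


def bark_band_index(f_hz):
    """1-based index of the critical band whose edges contain f_hz."""
    if not 0 <= f_hz < BARK_EDGES_HZ[-1]:
        raise ValueError("frequency lies outside the tabulated Bark bands")
    return bisect_right(BARK_EDGES_HZ, f_hz)


for f in (100, 1000, 4000):
    print(f, bark_band_index(f), round(hz_to_bark(f), 2))
```

For 1000 Hz the table lookup returns band 9 (920 to 1080 Hz), while the formula gives about 7.7, which illustrates that the closed-form expression is only an approximation to the tabulated bands.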
Moore and Glasberg proposed the ERB scale by modifying Zwicker’s loudness model. The
ERB scale gives an approximation to the bandwidths of the filters in human
hearing using rectangular band-pass filters. The formula to compute the ERB-rate
from a given frequency f in Hz is:
Figure 7: Saturation of OHC active force production [12]
The last well-known scale that is also used to model the auditory filter is the mel scale.
It does not come from experiments to model the auditory filter, but from measurements of
pitches judged by listeners to be equal in distance from one another. The proposed unit,
the "mel", defines 1000 mels as the pitch at 1000 Hz and one mel as 1/1000 of that pitch on
the subjective scale. The results of the 1940 experiment are displayed in the following
$$F_{mel} = 1127 \ln\left(1 + \frac{f}{700}\right)$$
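A minimal Python sketch of this conversion is given below. It assumes the widely used form $1127 \ln(1 + f/700)$; the inverse function and both function names are added here for illustration only.

```python
import math


def hz_to_mel(f_hz):
    """Frequency in Hz to mels, assuming 1127 * ln(1 + f / 700)."""
    return 1127.0 * math.log(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel (added for illustration)."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


for f in (100, 1000, 4000):
    print(f, round(hz_to_mel(f), 1))
```

With these constants, 1000 Hz maps to approximately 1000 mels, which agrees with the definition of the unit given above.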
The auditory-filter scales above show that the human ear processes fundamental
frequency on a nonlinear, roughly logarithmic scale rather than a linear one. Moreover, the
response of the auditory filter is nonlinear and level dependent [11]: at low levels it is
almost linear, but at high levels it is highly nonlinear. The use of a nonlinear model such
as the dual-resonance nonlinear (DRNL) filter also gives better results than the linear
gammatone filter bank [10].
Figure 8: Response of basilar membrane velocity as function of sound pressure level [12]
narrow range of stereocilia deflection. Let us assume that various tectorial membrane
portions behave as a secondary system of oscillators affected by appreciable damping, with
resonance frequencies close to the characteristic frequency all over the basilar membrane
Figure 9: Two-tone suppression in auditory nerve [13]
Figure 9 shows the phenomenon of two-tone suppression. The threshold tuning curve
(dark red) plots the responses of one auditory nerve (AN) fiber with a characteristic
frequency of 8000 Hz. Whenever a second tone is played at the frequencies and levels
within the light-red areas to each side, the response of this AN fiber to an 8000-Hz tone
is reduced (suppressed) by about 20% [13].
is too low to be perceived as an audible tone or pitch. So, the first part will be perceived
by our ears as amplitude modulation (AM). The frequency of the fine-structure carrier wave
is $\frac{f_1 + f_2}{2}$ (the average), while the frequency of the modulating signal (the
slowly varying function, or envelope) is $\frac{f_1 - f_2}{2}$. Because there are two beats
in every period of the envelope, the beat frequency is
$$f_{beat} = f_1 - f_2$$
Figure 10 shows the beats, or amplitude modulation, produced by combining a 200 Hz tone with a 210 Hz tone.
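The decomposition into carrier and envelope can be checked numerically. The sketch below, written with illustrative function names and an arbitrarily chosen sample rate of 8000 Hz, compares the sum of two cosines at 200 Hz and 210 Hz with the product of a carrier at the average frequency and an envelope at half the difference frequency. The beat frequency is taken as the absolute difference so that the order of the two tones does not matter.

```python
import math


def two_tone(t, f1, f2):
    """Sum of two unit-amplitude cosines at f1 and f2 (Hz)."""
    return math.cos(2 * math.pi * f1 * t) + math.cos(2 * math.pi * f2 * t)


def carrier_times_envelope(t, f1, f2):
    """Same signal written as a slowly varying envelope times a carrier."""
    carrier = math.cos(2 * math.pi * ((f1 + f2) / 2) * t)
    envelope = 2 * math.cos(2 * math.pi * ((f1 - f2) / 2) * t)
    return envelope * carrier


def beat_frequency(f1, f2):
    """Beat frequency in Hz: two envelope maxima per envelope period."""
    return abs(f1 - f2)


f1, f2 = 210.0, 200.0
sample_rate = 8000  # Hz, arbitrary choice for this check
max_error = max(
    abs(two_tone(n / sample_rate, f1, f2)
        - carrier_times_envelope(n / sample_rate, f1, f2))
    for n in range(sample_rate)  # one second of samples
)
print(beat_frequency(f1, f2), max_error)
```

The two forms agree to within floating-point rounding, and the beat frequency for this pair of tones is 10 Hz.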
If the frequency difference between the two tones is higher than 50 Hz, combination
tones occur. Instead of beats, some new tones appear, i.e.:
Figure 10: Amplitude modulation of two frequencies, 200 Hz (top) and 210 Hz (middle),
resulting in the beating waveform (bottom)
Two prominent difference tones are the quadratic difference tone $f_2 - f_1$ (sometimes
referred to simply as "the difference tone") and the cubic difference tone $2f_1 - f_2$.
If $f_1$ remains constant (for example at 1000 Hz) while $f_2$ increases, the quadratic
difference tone moves upward with $f_2$ while the cubic difference tone moves in the
opposite direction. At low levels (approximately 50 dB), they can be heard from about
$f_2/f_1 = 1.2$ to 1.4, but at 80 dB they are audible over nearly an entire octave,
$f_2/f_1 = 1$ to 2. In this case, the quadratic and cubic difference tones cross over at
$f_2/f_1 = 1.5$. Over part of the frequency range, a further difference tone,
$3f_1 - f_2$, may also be audible.
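The crossover ratio quoted above follows directly from setting the two difference tones equal:

$$f_2 - f_1 = 2f_1 - f_2 \;\Longrightarrow\; 2f_2 = 3f_1 \;\Longrightarrow\; \frac{f_2}{f_1} = \frac{3}{2} = 1.5$$

For example, with $f_1 = 1000$ Hz the crossover occurs at $f_2 = 1500$ Hz, where both difference tones fall at 500 Hz.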
This combination tone can be demonstrated simply by listening to two different tones
via stereo speakers, with each channel delivering a different frequency
(to eliminate distortion/nonlinearity in the audio system). For a headphone experiment,
the sum of the two tones can be played on both channels. In this experiment, the
distortion products $f_2 - f_1$, $2f_1 - f_2$, etc. can be heard.
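How a nonlinearity produces such distortion products can be illustrated with a toy numerical experiment. The sketch below passes two tones through a memoryless polynomial with small quadratic and cubic terms and measures the amplitude at $f_2 - f_1$ and $2f_1 - f_2$. The polynomial, its coefficients, the tone frequencies and the sample rate are all assumptions chosen for illustration; this is not a model of the cochlea.

```python
import math


def tone_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of the component at freq_hz, from a single-bin DFT."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * k / sample_rate)
             for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * k / sample_rate)
             for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


def toy_nonlinearity(x, a2=0.1, a3=0.1):
    """Hypothetical memoryless nonlinearity: x + a2*x^2 + a3*x^3."""
    return x + a2 * x ** 2 + a3 * x ** 3


sample_rate = 8000          # Hz; one second of samples gives 1 Hz resolution
f1, f2 = 1000.0, 1200.0     # primaries with f2/f1 = 1.2
primaries = [math.cos(2 * math.pi * f1 * k / sample_rate)
             + math.cos(2 * math.pi * f2 * k / sample_rate)
             for k in range(sample_rate)]
distorted = [toy_nonlinearity(x) for x in primaries]

quadratic_dt = tone_amplitude(distorted, f2 - f1, sample_rate)   # 200 Hz
cubic_dt = tone_amplitude(distorted, 2 * f1 - f2, sample_rate)   # 800 Hz
linear_ref = tone_amplitude(primaries, f2 - f1, sample_rate)     # no distortion
print(quadratic_dt, cubic_dt, linear_ref)
```

In this toy system the quadratic term alone creates the component at $f_2 - f_1$ and the cubic term alone creates the one at $2f_1 - f_2$, while the undistorted sum of the primaries contains neither.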
Owing to the nonlinearity of the inner ear, distortion occurs whenever sum or difference
tones are heard. However, experiments report that listeners also hear beats when
the two tones are delivered separately, one to each ear, through headphones. This
may be a neural phenomenon called the binaural beat rather than an effect of inner-ear physics.
A binaural beat is an auditory illusion perceived when two different pure-tone sine
waves, both with frequencies lower than 1500 Hz, with less than a 40 Hz difference
between them, are presented to a listener dichotically (one through each ear).
Binaural-beat perception originates in the inferior colliculus of the midbrain and the
superior olivary complex of the brainstem, where auditory signals from each ear are
integrated and precipitate electrical impulses along neural pathways through the reticular
formation up the midbrain to the thalamus, auditory cortex, and other cortical regions.
[3] Neuroscience. 2nd edition. Purves D, Augustine GJ, Fitzpatrick D, et al., editors.
Sunderland (MA): Sinauer Associates; 2001.
[5] Reichenbach, T., Stefanovic, A., Nin, F., & Hudspeth, A. J. (2012). Waves on Reissner’s
membrane: a mechanism for the propagation of otoacoustic emissions from the
cochlea. Cell Reports, 1(4), 374–84.
[6] Meaud, Julien; Grosh, Karl (2010). "The effect of tectorial membrane and basilar
membrane longitudinal coupling in cochlear mechanics". The Journal of the Acoustical
Society of America. 127 (3): 1411. doi:10.1121/1.3290995. ISSN 0001–4966. PMC
[8] Bo Wen (2006). "Modeling The Nonlinear Active Cochlea: Mathematics And Analog
VLSI", PhD dissertation, University of Pennsylvania.
[9] Julius O. Smith III and Jonathan S. Abel (1999). "Bark and ERB Bilinear
Transforms", IEEE Transactions on Speech and Audio Processing.
[11] Lyon, R. F., Katsiamis, A. G., and Drakakis, E. M. (2010, May). History and future
of auditory filter models. In Circuits and Systems (ISCAS), Proceedings of 2010 IEEE
International Symposium, pp. 3809-3812.
[13] Jeremy M. Wolfe et al. (2017). Sensation and Perception, Oxford University Press.