RESound: Interactive Sound Rendering for Dynamic Virtual Environments∗
ABSTRACT
We present an interactive algorithm and system (RESound) for sound propagation and rendering in virtual environments and media applications. RESound uses geometric propagation techniques for fast computation of propagation paths from a source to a listener and takes into account specular reflections, diffuse reflections, and edge diffraction. In order to perform fast path computation, we use a unified ray-based representation to efficiently trace discrete rays as well as volumetric ray-frusta. RESound further improves sound quality by using statistical reverberation estimation techniques. We also present an interactive audio rendering algorithm to generate spatialized audio signals. The overall approach can handle dynamic scenes with no restrictions on source, listener, or obstacle motion. Moreover, our algorithm is relatively easy to parallelize on multi-core systems. We demonstrate its performance on complex game-like and architectural environments.

Categories and Subject Descriptors
H.5.5 [Information Interfaces and Presentation]: Sound and Music Computing—modeling, systems; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—ray tracing

General Terms
Performance

Keywords
Acoustics, sound, ray tracing

∗Project webpage: http://gamma.cs.unc.edu/Sound/RESound/

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MM'09, October 19–24, 2009, Beijing, China.
Copyright 2009 ACM 978-1-60558-608-3/09/10 ...$10.00.

1. INTRODUCTION
Extending the frontier of visual computing, an auditory display uses sound to communicate information to a user and offers an alternative means of visualization or media. By harnessing the sense of hearing, sound rendering can further enhance a user's experience in multimodal virtual worlds [10, 27]. In addition to immersive environments, auditory display can provide a natural and intuitive human-computer interface for many desktop or handheld applications (see Figure 1). Realistic sound rendering can directly impact the perceived realism of users of interactive media applications. An accurate acoustic response for a virtual environment is attuned to the geometric representation of the environment. This response can convey important details about the environment, such as the location and motion of objects. The most common approach to sound rendering is a two-stage process:

• Sound propagation: the computation of impulse responses (IRs) that represent an acoustic space.

• Audio rendering: the generation of spatialized audio signals from the impulse responses and dry (anechoically recorded or synthetically generated) source signals.

Sound propagation from a source to a listener conveys information about the size of the space surrounding the sound source and identifies the source to the listener even when the source is not directly visible. This considerably improves the immersion in virtual environments. For instance, in a first-person shooter game scenario (see Figure 1(b)), the distant cries of a monster coming around a corner or the soft steps of an opponent approaching from behind can alert the player and save them from a fatal attack. Sound propagation is also used for acoustic prototyping (see Figure 1(d)) for computer games, complex architectural buildings, and urban scenes. Audio rendering also provides sound cues which give directional information about the position of the sound source relative to a listener. The cues are generated for headphones or a 3D surround-sound speaker system. Thus, the listener can identify the sound source even when it is out of the listener's field of view. For example, in a VR combat simulation (see Figure 1(a)), it is critical to simulate the 3D sounds of machine guns, bombs, and missiles. Another application of 3D audio is user interface design, where sound cues are used to search for data on a multi-window screen (see Figure 1(c)).
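To make the second stage of the pipeline concrete, the core of audio rendering is a convolution of the dry source signal with the computed IR. The sketch below is a generic direct-form convolution in pure Python, not RESound's actual renderer; the function name and the toy IR (a direct pulse plus one attenuated reflection three samples later) are illustrative.

```python
def render_audio(dry, impulse_response):
    """Convolve a dry source signal with an impulse response (direct form)."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, s in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Toy IR: direct path at sample 0, one reflection at half amplitude 3 samples later.
ir = [1.0, 0.0, 0.0, 0.5]
print(render_audio([1.0, -1.0], ir))  # → [1.0, -1.0, 0.0, 0.5, -0.5]
```

In practice the convolution is performed per frequency band with FFT-based methods, but the structure — every IR tap contributing a delayed, scaled copy of the input — is the same.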
Figure 1: Multimedia applications that need interactive sound rendering. (a) Virtual reality training: Virtual Iraq simulation to treat soldiers suffering from post-traumatic stress disorder (top) and emergency training for medical personnel using Second Life (bottom). (b) Games: Half-Life 2 (top) and Crackdown, winner of the best use of audio at the British Academy of Film and Television Arts awards (bottom). (c) Interfaces and visualization: multimodal interfaces (top) and a data exploration & visualization system (bottom). (d) Computer-aided design: game level design (top) and architectural acoustic modeling (bottom).
The main computational cost in sound rendering is the real-time computation of the IRs based on the propagation paths from each source to the listener. The IR computation relies on the physical modeling of the sound field based on an accurate description of the scene and material properties. The actual sensation of sound is due to small variations in the air pressure. These variations are governed by the three-dimensional wave equation, a second-order linear partial differential equation, which relates the temporal and spatial derivatives of the pressure field [40]. Current numerical methods used to solve the wave equation are limited to static scenes and can take minutes or hours to compute the IRs. Moreover, computing a numerically accurate solution, especially for high frequencies, is considered a very challenging problem.

Main Results: We present a system (RESound) for interactive sound rendering in complex and dynamic virtual environments. Our approach is based on geometric acoustics, which represents acoustic waves as rays. The geometric propagation algorithms model sound propagation based on the rectilinear propagation of waves and can accurately model the early reflections (up to 4–6 orders). Many algorithms have been proposed for interactive geometric sound propagation using beam tracing, ray tracing, or ray-frustum tracing [5, 14, 37, 40]. However, they are either limited to static virtual environments or can only handle propagation paths corresponding to specular reflections.

In order to perform interactive sound rendering, we use fast techniques for sound propagation and audio rendering. Our propagation algorithms use a hybrid ray-based representation that traces discrete rays [18] and ray-frusta [24]. Discrete ray tracing is used for diffuse reflections, and frustum tracing is used to compute the propagation paths for specular reflections and edge diffraction. We fill in the late reverberation using statistical methods. We also describe an audio rendering pipeline combining specular reflections, diffuse reflections, diffraction, 3D sound, and late reverberation.

Our interactive sound rendering system can handle models consisting of tens of thousands of scene primitives (e.g. triangles) as well as dynamic scenes with moving sound sources, listener, and scene objects. We can perform interactive sound propagation including specular reflections, diffuse reflections, and diffraction of up to 3 orders on a multi-core PC. To the best of our knowledge, RESound is the first interactive sound rendering system that can perform plausible sound propagation and rendering in dynamic virtual environments.

Organization: The rest of the paper is organized as follows. We review related methods on acoustic simulation in Section 2. Section 3 provides an overview of RESound and highlights its various components. We present the underlying representations and fast propagation algorithms in Section 4. The reverberation estimation is described in Section 5, and the audio rendering algorithm is presented in Section 6. The performance of our system is described in Section 7. In Section 8, we discuss the quality and limitations of our system.

2. PREVIOUS WORK
In this section, we give a brief overview of prior work in acoustic simulation. Acoustic simulation for virtual environments can be divided into three main components: sound synthesis, sound propagation, and audio rendering. In this paper, we focus only on interactive sound propagation and audio rendering.

2.1 Sound Synthesis
Sound synthesis generates audio signals based on interactions between the objects in a virtual environment. Synthesis techniques often rely on physical simulators to generate the forces and object interactions [7, 30]. Many approaches have been proposed to synthesize sound from object interactions using offline [30] and online [32, 47, 48] computations. Anechoic signals in a sound propagation engine can be replaced by synthetically generated audio signals as input. Thus, these approaches are complementary to the presented work and could be combined with RESound for an improved immersive experience.
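Modal synthesis of the kind surveyed above typically sums many exponentially decaying sinusoids excited by contact forces [7, 30]. The following is a minimal sketch of a single such mode; the function name and all parameter values (a 440 Hz "clink" decaying over a quarter second) are made up for illustration, not taken from any cited system.

```python
import math

def impact_mode(freq_hz, damping, duration_s, rate=8000):
    """One exponentially decaying sinusoid: the building block of modal impact sounds."""
    n = int(duration_s * rate)
    return [math.exp(-damping * t / rate) * math.sin(2 * math.pi * freq_hz * t / rate)
            for t in range(n)]

# A hypothetical 440 Hz mode that rings down over 0.25 s at an 8 kHz sample rate.
tone = impact_mode(440.0, damping=40.0, duration_s=0.25)
```

A full synthesizer would mix dozens of modes whose frequencies and dampings come from the object's geometry and material; the resulting signal can then serve as the dry input to a propagation engine such as RESound.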
2.2 Sound Propagation
Sound propagation deals with modeling how sound waves propagate through a medium. Effects such as reflection, transmission, and diffraction are the important components. Sound propagation algorithms can be classified into two approaches: numerical methods and geometric methods.

Figure 3: Example scene showing (a) specular, (b) diffraction, and (c) diffuse propagation paths.

Numerical Methods: These methods [6, 19, 26, 29] solve the wave equation numerically to perform sound propagation. They can provide very accurate results but are computationally expensive. Despite recent advances [31], these methods are too slow for interactive applications and are limited to static scenes.

Geometric Methods: The most widely used methods for interactive sound propagation in virtual environments are based on geometric acoustics. They compute propagation paths from a sound source to the listener and the corresponding impulse response from these paths. Specular reflections of sound are modeled with the image-source method [2, 34]. Image-source methods recursively reflect the source point about all of the geometry in the scene to find specular reflection paths. BSP acceleration [34] and beam tracing [13, 21] have been used to accelerate this computation in static virtual environments. Other methods to compute specular paths include ray tracing based methods [18, 49] and approximate volume tracing methods [5, 23].

There has also been work on complementing specular reflections with diffraction effects. Diffraction effects are very noticeable at corners, as the diffraction causes the sound wave to propagate in regions that are not directly visible to the sound source. Two diffraction models are commonly used: the Uniform Theory of Diffraction (UTD) [17] and a recent formulation of the Biot-Tolstoy-Medwin (BTM) method [41]. The BTM method is more costly to compute than the UTD and has only recently been used in interactive simulation [35]. The UTD, however, has been adapted for use in several interactive simulations [3, 42, 44].

Another important effect that can be modeled with GA is diffuse reflection. Diffuse reflections have been shown to be important for modeling sound propagation [9]. Two common existing methods for handling diffuse reflections are radiosity based methods [37, 38] and ray tracing based methods [8, 16].

The GA methods described thus far are used to render the early reflections. The later acoustic response must also be calculated [15]. This is often done through statistical methods [12] or ray tracing [11].

2.3 Audio Rendering
Audio rendering generates the final audio signal which can be heard by a listener over headphones or speakers [20]. In the context of geometric sound propagation, it involves convolving the impulse response computed by the propagation algorithm with an anechoic input audio signal and introducing 3D cues into the final audio signal to simulate the direction of incoming sound waves. In a dynamic virtual environment, sound sources, the listener, and scene objects may be moving. As a result, the impulse responses change frequently, and it is critical to generate an artifact-free, smooth audio signal. Tsingos [43] and Wenzel et al. [52] describe techniques for artifact-free audio rendering in dynamic scenes. Introducing 3D cues into the final audio signals requires convolution of an incoming sound wave with a Head-Related Impulse Response (HRIR) [1, 22]. This can only be performed for a few sound sources in real-time. Recent approaches based on audio perception [28, 46] and sampling of sound sources [51] can handle 3D sound for thousands of sound sources.

3. SYSTEM OVERVIEW
In this section, we give an overview of our approach and highlight the main components. RESound simulates the sound field in a scene using geometric acoustics (GA) methods.

Figure 2: The main components of RESound: scene preprocessing; geometric propagation for specular, diffuse, and diffraction components; estimation of reverberation from impulse response; and final audio rendering.

3.1 Acoustic modeling
All GA techniques deal with finding propagation paths between each source and the listener. Sound waves travel from a source (e.g. a speaker) and arrive at a listener (e.g. a user) along multiple propagation paths representing different sequences of reflections, diffractions, and refractions at the surfaces of the environment. Figure 3 shows an example of such paths. In this paper, we limit ourselves to reflection and diffraction paths. The overall effect of these propagation paths is to add reverberation (e.g. echoes) to the dry sound signal. Geometric propagation algorithms need to account for different wave effects that directly influence the response generated at the listener.

When a small, point-like sound source generates non-directional sound, the pressure wave expands outward in a spherical shape. If the listener is a short distance from the source, the wave field eventually encounters the listener. Due to the spreading of the field, the amplitude at the listener is attenuated. The corresponding GA component is a direct path from the source to the listener. This path represents the sound field diminished by distance attenuation.

As the sound field propagates, it is likely to encounter objects in the scene. These objects may reflect or otherwise scatter the waves. If an object is large relative to the field's wavelength, the field is reflected specularly, as a mirror does for light waves. In GA, these paths are computed by enumerating all possible reflection paths from the source to the listener, which can be a very costly operation. There has been much research focused on reducing the cost of this calculation [14], as most earlier methods were limited to static scenes with fixed sources. The delay and attenuation of these contributions help the listener estimate the size of the propagation space and provide important directional cues about the environment.

Objects that are similar in size to the wavelength may also be encountered. When a sound wave encounters such an object, the wave is influenced by it. We focus on two such scattering effects: edge diffraction and diffuse reflection.

Diffraction effects occur at the edges of objects and cause the sound field to be scattered around the edge. This scattering results in a smooth transition as a listener moves around edges. Most notably, diffraction produces a smooth transition when the line-of-sight between the source and listener is obstructed. The region behind an edge in which the diffraction field propagates is called the shadow region.

Surfaces that have fine details or roughness on the same order as the wavelength can diffusely reflect the sound wave. This means that the wave is not specularly reflected but reflected in a Lambertian manner, such that the reflected direction is isotropic. These diffuse reflections complement the specular components [9].

As the sound field continues to propagate, the number of reflection and scattering components increases and the amplitude of these components decreases. The initial orders (e.g. up to four or six) of reflection are termed early reflections. These components have the greatest effect on a listener's ability to spatialize the sound. However, the early components are not sufficient to provide an accurate acoustic response for any given scene. The later reverberation effects are a function of the scene size [12] and convey an important sense of space.

aligned bounding boxes and is updated when the objects in the scene move. This hierarchy is used to perform fast intersection tests for discrete ray and frustum tracing. The edges of objects in the scene are also analyzed to determine appropriate edges for diffraction.

Interactive Sound Propagation: This stage computes the paths between the source and the listener. The direct path is quickly found by checking for obstruction between the source and listener. A volumetric frustum tracer is used to find the specular and edge diffraction paths. A stochastic ray tracer is used to compute the diffuse paths. These paths are adjusted for frequency-band attenuation and converted to appropriate pressure components.

Audio Rendering: After the paths are computed, they need to be auralized. A statistical reverberation filter is estimated using the path data. Using the paths and the estimated filter as input, the waveform is attenuated by the auralization system. The resulting signal represents the acoustic response and is output to the system speakers.
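The direct-path GA component described in Section 3.1 reduces to two numbers per source-listener pair: a propagation delay and a distance-attenuation gain. A minimal sketch follows; the 1/r spherical-spreading model matches the discussion above, while the unit clamp near the source and all names are our own simplifying assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature

def direct_path(source, listener):
    """Propagation delay (s) and 1/r spreading gain of an unobstructed direct path."""
    r = math.dist(source, listener)
    delay_s = r / SPEED_OF_SOUND
    gain = 1.0 / max(r, 1.0)  # clamp so the gain stays bounded near the source
    return delay_s, gain
```

For a listener 343 m from the source, this yields a one-second delay and a gain of 1/343; an auralization system would place a correspondingly delayed, scaled impulse into the IR.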
Table 1: Performance: Test scene details and the performance of the RESound components.
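The stochastic ray tracer in the pipeline above scatters rays off rough surfaces in a Lambertian manner (Section 3.1). The standard way to draw such directions is cosine-weighted hemisphere sampling, sketched below; this is textbook sampling rather than RESound's actual tracer, and fixing the surface normal at +z is a simplifying assumption (a full tracer would rotate the sample into the local frame of each hit point).

```python
import math
import random

def diffuse_bounce(rng=random.Random(7)):
    """Cosine-weighted (Lambertian) direction on the unit hemisphere about +z."""
    u1, u2 = rng.random(), rng.random()
    r, phi = math.sqrt(u1), 2.0 * math.pi * u2
    # x^2 + y^2 = u1 and z^2 = 1 - u1, so the result is always a unit vector.
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))
```

Weighting samples by the cosine of the angle to the normal matches the isotropic (Lambertian) reflection model, so each diffuse ray can carry equal energy.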