AR VR Unit 1 - Introduction
INTRODUCTION
Introduction to Virtual Reality and Augmented Reality – Definition – Introduction
to Trajectories and Hybrid Space-Three I’s of Virtual Reality – Virtual Reality Vs
3D Computer Graphics – Benefits of Virtual Reality – Components of VR System –
Introduction to AR-AR Technologies - Input Devices – 3D Position Trackers –
Types of Trackers – Navigation and Manipulation Interfaces – Gesture Interfaces –
Types of Gesture Input Devices – Output Devices – Graphics Display – Human
Visual System – Personal Graphics Displays – Large Volume Displays – Sound
Displays – Human Auditory System.
Components of a VR System:
1. Head-Mounted Display (HMD): The HMD is the primary hardware component of VR. It
typically consists of a screen for each eye, lenses to focus and magnify the display, and
sometimes sensors to track head movement and position.
2. Motion Tracking: To provide an immersive experience, VR systems use sensors to track the
user's head movements in real-time. This tracking allows the virtual environment to respond
as the user looks around and moves.
3. Controllers: VR systems often include handheld controllers that enable users to interact with
the virtual environment. These controllers can have buttons, triggers, touchpads, and even
motion-sensing capabilities.
4. Spatial Audio: Sound is an important part of VR. Spatial audio techniques replicate how
sound behaves in the real world, enhancing the sense of immersion by making sounds appear
to come from specific directions.
Types of Virtual Reality:
1. Non-Immersive or Desktop VR: This type of VR is experienced on a regular computer
screen without the need for specialized headsets. Users interact with the virtual environment
using a keyboard, mouse, or standard game controller.
2. Semi-Immersive VR: These systems use screens or projections to create a partially
immersive experience. Examples include projection-based VR rooms and immersive domes.
3. Fully Immersive VR: This is the most common type of VR, where users wear HMDs that
cover their field of vision and block out the real world entirely. They are fully immersed in
the virtual environment.
Advantages and Applications of Virtual Reality:
1. Gaming: VR gaming offers an unparalleled level of immersion, allowing players to feel like
they're truly inside the game world. Players can interact with the environment and characters
in ways not possible with traditional screens.
2. Training and Simulation: VR is used for training in fields like aviation, medicine, and
military. Simulations can replicate dangerous or complex scenarios without real-world
consequences.
3. Education: VR can take students on virtual field trips, provide immersive history lessons, or
enable interactive exploration of complex scientific concepts.
4. Healthcare: VR is used for pain management, physical therapy, exposure therapy for
phobias, and even in surgery planning.
5. Architecture and Design: Architects and designers use VR to visualize and walk through
virtual models of buildings and spaces before they're constructed.
6. Entertainment and Media: VR films and experiences offer a new way to engage with
storytelling, immersing the viewer in the narrative.
7. Virtual Tourism: VR can provide virtual travel experiences, allowing users to explore
famous landmarks and destinations from the comfort of their homes.
Trajectories
In virtual reality (VR), trajectories refer to the paths that objects, elements, or even users follow
within the virtual environment. These paths can be predefined by designers or developers to guide the
movement of objects or to create specific experiences for users. Trajectories in VR play a crucial role
in shaping the user's interaction and navigation within the virtual world. Here are a few ways
trajectories are utilized in VR:
1. User Movement: Trajectories are used to guide how users move within the virtual
environment. This can include walking, flying, or teleporting. Depending on the VR
experience, trajectories can be linear, branching, or even looped.
2. Object Animation: Objects within the virtual environment can follow predetermined
trajectories to create dynamic and interactive scenes. For example, a virtual character might
follow a trajectory to perform a specific action or movement.
3. Cinematic Experiences: Trajectories are often employed in VR movies and experiences to
guide the viewer's focus and create a directed narrative. The camera's movement along a
trajectory can influence where the user's attention is drawn.
4. Interactive Gameplay: In VR games, trajectories can be used to control the movement of
enemies, projectiles, or other interactive elements, making the gameplay more engaging and
challenging.
5. Training Simulations: Trajectories are crucial in training simulations, where users need to
follow specific paths to learn procedures or practice skills. For instance, a flight simulator
might guide a trainee pilot through the stages of takeoff and landing.
6. Architectural Walkthroughs: In architectural and design contexts, trajectories can be used
to lead users through virtual representations of buildings, giving them a guided tour of the
space.
7. Educational Demonstrations: Trajectories can guide users through educational content or
visualizations, helping them understand complex concepts by providing a structured sequence
of information.
8. Artistic and Creative Experiences: Artists and creators can design unique VR experiences
by using trajectories to control the movement of visual and auditory elements, resulting in
immersive and visually stunning displays.
9. Virtual Tours: Trajectories are commonly used in virtual tours of real-world locations,
allowing users to explore museums, landmarks, or historical sites in a guided manner.
10. Storytelling: In narrative-driven VR experiences, trajectories can guide users through the
story, revealing different scenes or perspectives along the way.
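To make the idea concrete, the sketch below (Python, with illustrative names and waypoints) shows one simple way a predefined trajectory could be represented and sampled each frame: a list of waypoints interpolated by a normalized time parameter. It is a minimal illustration, not the method of any particular VR engine.

```python
import numpy as np

def position_on_trajectory(waypoints, t):
    """Return the 3D position at normalized time t (0..1) along a
    piecewise-linear trajectory defined by a list of waypoints."""
    waypoints = np.asarray(waypoints, dtype=float)
    if t <= 0.0:
        return waypoints[0]
    if t >= 1.0:
        return waypoints[-1]
    # Map t onto one of the (n - 1) segments and interpolate within it.
    n_segments = len(waypoints) - 1
    s = t * n_segments
    i = int(s)        # index of the current segment
    frac = s - i      # fraction of the way through that segment
    return (1.0 - frac) * waypoints[i] + frac * waypoints[i + 1]

# Example: a virtual character walking along a path, sampled at a few frames.
path = [(0, 0, 0), (2, 0, 0), (2, 0, 2), (0, 0, 2)]
for frame in range(5):
    t = frame / 4.0
    print(frame, position_on_trajectory(path, t))
```

The same idea extends to curved (spline) paths or to camera trajectories in cinematic VR; only the interpolation rule changes.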
Hybrid space:
Hybrid space in virtual reality (VR) refers to a concept where elements from both the physical world
and the virtual world are combined to create an integrated and interactive experience. It's about
blending the real-world environment with digital content in a way that allows users to interact with
and experience both simultaneously. This concept bridges the gap between the physical and virtual
realms, creating new opportunities for engagement, creativity, and interaction.
A few examples of how hybrid space can be implemented in VR:
1. Interactive Art Installations: Artists can create installations where physical objects are
combined with virtual elements, allowing users to interact with both the tangible and digital
components.
2. Museum Exhibits: Museums can enhance their exhibits by providing VR experiences that
overlay historical context, animations, or additional information onto physical artifacts.
3. Live Performances: Hybrid space can be used to enhance live performances, concerts, or
theater productions by integrating virtual effects, backdrops, or interactive elements with the
physical stage.
4. Architectural Visualization: In architectural design, hybrid space can be used to showcase
buildings or structures with augmented elements that provide insights into construction
processes, energy efficiency, and more.
5. Navigation and Wayfinding: Hybrid space can assist users in navigating complex
environments by overlaying virtual maps, directions, and points of interest onto the physical
surroundings.
6. Training and Simulations: Industries like healthcare and aviation can benefit from hybrid
space VR by allowing trainees to practice procedures in a physical environment while
interacting with virtual simulations.
7. Collaborative Workspaces: Teams can use hybrid VR to collaborate in virtual environments
while being aware of each other's physical presence, enhancing remote teamwork.
8. Education: Educational institutions can use hybrid space to create immersive learning
experiences that combine physical objects with digital content for enhanced understanding.
9. Retail and Shopping: Retailers can offer interactive product demonstrations or virtual try-
ons by superimposing virtual elements onto real-world displays.
10. Entertainment: Hybrid space can be used to create interactive storytelling experiences where
users engage with characters and objects that appear to coexist with the physical environment.
Virtual Reality vs 3D Computer Graphics:
7. Medium:
VR: Can involve a range of mediums, from fully immersive headsets to augmented reality
glasses, depending on the level of immersion desired.
3D Graphics: Primarily presented on traditional screens such as monitors or cinema screens,
although it can also be incorporated into VR experiences.
In summary, while both Virtual Reality and 3D Computer Graphics involve the creation of digital
environments and objects, VR's primary focus is on creating immersive experiences through
interaction and presence, while 3D graphics are more focused on visual representation for various
media.
Input Devices:
The input devices normally associated with a VR system are the 3D mouse and glove. Other esoteric
devices are being developed but only exist as research tools.
3D Mouse:
A 3D mouse is a hand-held device containing a tracker sensor and some buttons, and is used for
navigating or picking objects within a VE. In navigation the orientation of the mouse can be used to
control the forward speed, while the user's gaze direction dictates the direction of travel.
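As a rough illustration of the navigation scheme just described, the sketch below (hypothetical function and parameter names) advances the viewpoint along the gaze direction at a speed derived from the 3D mouse's tilt. It assumes the tracking system already supplies the gaze vector and the mouse pitch.

```python
import numpy as np

def update_viewpoint(position, gaze_dir, mouse_pitch, dt, max_speed=2.0):
    """Advance the user's viewpoint one frame of gaze-directed travel.

    position    -- current viewpoint position (3-vector, metres)
    gaze_dir    -- vector of the user's gaze (direction of travel)
    mouse_pitch -- tilt of the 3D mouse in radians, used here as a throttle
    dt          -- frame time in seconds
    """
    gaze_dir = np.asarray(gaze_dir, dtype=float)
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    # Map mouse tilt (0..90 degrees) onto forward speed (0..max_speed).
    speed = max_speed * np.clip(mouse_pitch / (np.pi / 2), 0.0, 1.0)
    return np.asarray(position, dtype=float) + speed * dt * gaze_dir

pos = np.array([0.0, 1.7, 0.0])          # eye height of roughly 1.7 m
pos = update_viewpoint(pos, [0, 0, -1], mouse_pitch=np.pi / 4, dt=1 / 60)
print(pos)
```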
Gloves:
Hand gestures are an intuitive way of controlling a VR system, and gloves became very
popular in early systems. Unfortunately, they were expensive and earned a reputation for unreliability. A simple
interactive glove is made from a lightweight material into which transducers are sewn to measure
finger joint angles. The transducers can be strain gauges or fibre optics that change their physical
characteristics when they are stretched. Most modern gloves are very accurate and are used to
communicate hand gestures such as pointing and grasping to the host software, and in some cases
return tactile signals to the user's hand.
Position Trackers:
A position sensor is a device that reports its location and/or orientation to the computer.
Typically there is a fixed piece at a known position and additional unit(s) attached to the object being
tracked. Often position sensors are used to track the participant's head and one of the participant's
hands. The position sensor is the most important tracking device of any VR system. Position tracking
tells the VR system where the users are located within a VR space. There are several types of position
sensors, each with its own benefits and limitations. In this section, we will discuss electromagnetic,
mechanical, optical, videometric, ultrasonic, inertial, and neural position-sensing devices.
In position-sensing systems, three things play against one another (besides cost):
(1) accuracy/precision and speed of the reported sensor position
(2) interfering media (e.g., metals, opaque objects)
(3) encumbrance (wires, mechanical linkages).
Types of trackers:
1. Electromagnetic
2. Mechanical
3. Optical
4. Videometric
5. Ultrasonic
6. Inertial
7. Neural
Electromagnetic tracker:
A commonly used VR tracking technology is electromagnetic tracking. This method uses a
transmitter to generate a low-level magnetic field from three orthogonal coils within the unit. In turn,
these fields generate current in another set of coils in the smaller receiver unit worn by the user. The
signal in each coil in the receiver is measured to determine its position relative to the transmitter. The
transmitter unit is fixed at a known location and orientation so that the absolute position of the
receiving unit can be calculated. Multiple receiving units are generally placed on the user (typically
on the head and one hand), on any props used, and sometimes on a handheld device. Electromagnetic
tracking works because the coils act as antennae and the signal weakens as the receiving antennae
move away from the transmitting antennae. Signals are sequentially pulsed from the transmitter
through each of the coil antennae. The strength of the signal also changes based on the relative
orientation between transmitter and receiver coils. Each receiving coil antenna receives a stronger
signal when its orientation is the same as that of the transmitting antenna.
The major advantage is that electromagnetic systems have no line-of-sight restriction.
One of the limitations of electromagnetic tracking systems is that metal in the environment
can cause magnetic interference. Another limitation is the short range of the generated magnetic
field. The receivers will operate with reasonable accuracy only within 3-8 feet of the transmitter,
depending on the specific model. The accuracy drops off substantially as the user moves toward the
edge of the operating range. Using multiple transmitters to extend the range is possible, but difficult to
implement.
Mechanical Tracking:
Tracking may also be accomplished through mechanical means. For example, an articulated arm like
boom may be used to measure the head position. Users can strap part of the device to their heads, or
they can just put their face up to it and grasp the handles. The boom follows their movements within a
limited range; the angle of each elbow joint and connecting link of the boom is measured to help calculate the
user's position. The rotational and linear measurements of the mechanical linkages can be made
quickly, accurately, and precisely. Using straightforward matrix mathematics, accurate and precise
position values can be quickly calculated.
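The "straightforward matrix mathematics" can be sketched as a chain of homogeneous transforms, one per measured joint. The example below is a simplified planar two-link boom with illustrative link lengths and angles; a real boom would chain transforms for every measured joint axis.

```python
import numpy as np

def rot_z(theta):
    """4x4 homogeneous rotation about the z axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

def translate(x, y, z):
    """4x4 homogeneous translation."""
    m = np.eye(4)
    m[:3, 3] = [x, y, z]
    return m

def boom_end_position(joint_angles, link_lengths):
    """Chain the measured joint rotations and fixed link offsets to get the
    position of the display end of the boom in the base frame."""
    T = np.eye(4)
    for theta, length in zip(joint_angles, link_lengths):
        T = T @ rot_z(theta) @ translate(length, 0, 0)
    return T[:3, 3]      # translation part = tracked position

# Two-link example: 0.5 m links, elbow angles read from the joint encoders.
print(boom_end_position([np.pi / 6, np.pi / 4], [0.5, 0.5]))
```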
The primary disadvantage of this type of system is that the physical linkages restrict the user to a
fixed location in the world. The joints in the boom arm are flexible and allow user movement in the
area within reach of the boom arm. There are some residual effects, because the inertia of a heavy
display takes some effort to move fluidly, especially when a large mass (like a pair of CRT
displays) is attached to the linkages. In addition to being a tracking device, this type of system is
generally used for the head as a visual display or for the hands as a haptic I/O device, but it can't be used
for both head and hands; thus, a second tracking system is usually required if both head- and hand-
tracking are important.
Optical Tracking:
Optical tracking systems make use of visual information to track the user. There are a number of ways
this can be done. The most common is to make use of a video camera that acts as an electronic eye to
"watch" the tracked object or person. The video camera is normally in a fixed location. Computer
vision techniques are then used to determine the object's position based on what the camera "sees." In
some cases, light-sensing devices other than video cameras can be used. When using a single sensing
device, the position of the "watched" point can be reported in only two dimensions; that is, where the
object is in the plane the sensor sees, but without depth information. Watching multiple points or
using multiple sensors allows the system to triangulate the location and/or orientation of the tracked
entity, thereby providing three-dimensional position information.
Single-source, 2-D optical tracking is typically used in second person VR, in which the
participants watch themselves in the virtual world, rather than experiencing the virtual world from the
first person point of view. The video source is used both to determine the user's position within the video
picture and to add the user's image to the virtual world.
Another single-source video-tracking method uses a small camera mounted near a desktop
monitor (such as one used for desktop video teleconferencing). This camera can roughly calculate the
user's position in front of the monitor by detecting the outline of the viewer's head.
Multiple visual input sources can be combined by the VR system to garner additional position
information about the participant. Using three visual inputs, such as three video cameras in different
locations, a full 6-DOF position can be calculated by triangulation. By judiciously aiming the
cameras, one can track multiple objects, or multiple body parts (such as each of the hands and the
feet) of a participant.
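A minimal sketch of the triangulation step, assuming two calibrated cameras whose centres and viewing rays toward the tracked marker are already known: the 3D position is estimated as the point closest to both rays. The camera positions and ray directions below are illustrative values only.

```python
import numpy as np

def triangulate(c1, d1, c2, d2):
    """Estimate the 3D point nearest to two viewing rays.

    Each ray is given by a camera centre c and a direction d (obtained from
    the pixel at which the tracked marker appears). Returns the midpoint of
    the shortest segment joining the two rays.
    """
    c1, d1, c2, d2 = (np.asarray(v, dtype=float) for v in (c1, d1, c2, d2))
    # Solve for t1, t2 minimising |(c1 + t1*d1) - (c2 + t2*d2)|.
    A = np.array([[d1 @ d1, -d1 @ d2],
                  [d1 @ d2, -d2 @ d2]])
    b = np.array([(c2 - c1) @ d1, (c2 - c1) @ d2])
    t1, t2 = np.linalg.solve(A, b)
    p1 = c1 + t1 * d1
    p2 = c2 + t2 * d2
    return (p1 + p2) / 2.0

# Two cameras 2 m apart, both looking at a marker near (1, 1, 3).
print(triangulate([0, 0, 0], [0.30, 0.30, 0.90],
                  [2, 0, 0], [-0.30, 0.30, 0.90]))
```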
Limitation:
However, techniques for enabling the system to keep track of which object is which are
complex. A limitation of optical tracking is that the line of sight between the tracked person or object
and the camera must always be clear.
Keeping the tracked object within the sight of the camera also limits the participant's range of
movement.
Videometric (Optical) Tracking:
An alternate method of optical tracking is referred to as videometric tracking. Videometric tracking is
somewhat the inverse of the cases just described in that the camera is attached to the object being
tracked and watches the surroundings, rather than being mounted in a fixed location watching the
tracked object. The VR system analyzes the incoming images of the surrounding space to locate
landmarks and derive the camera's relative position to them. For example, the camera could be
mounted on a head-based display to provide input to the VR system, which would be able to
determine the locations of the corners of the surrounding room and calculate the user's position from
this information.
Ultrasonic Tracking:
Ultrasonic tracking uses high-pitch sounds emitted at timed intervals to determine the
distance between the transmitter (a speaker) and the receiver (a microphone). As with optical tracking,
three transmitters combined with three receivers provide enough data for the system to triangulate the
full 6-DOF position of an object. Because this method of tracking relies on such common technology
as speakers, microphones, and a small computer, it provides a fairly inexpensive means of position
tracking. Logitech, a manufacturer of this technology, has even embedded ultrasonic trackers directly
in the frame of shutter glasses, providing an economical system for monitor-based VR systems.
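A minimal sketch of how such a system could convert pulse travel times into a receiver position: distances follow from the speed of sound, and subtracting the sphere equations gives a linear least-squares problem. Four transmitters at assumed (non-coplanar) positions are used here so that a single microphone's position is uniquely determined; the full 6-DOF tracking described above would combine several such receivers.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C

def distances_from_times(times_of_flight):
    """Convert measured pulse travel times (seconds) into distances (metres)."""
    return SPEED_OF_SOUND * np.asarray(times_of_flight, dtype=float)

def trilaterate(transmitters, distances):
    """Least-squares receiver position from distances to known transmitters.

    Subtracting the first sphere equation from the others removes the
    quadratic term and leaves a linear system in the unknown position;
    with four or more non-coplanar transmitters the solution is unique.
    """
    P = np.asarray(transmitters, dtype=float)
    d = np.asarray(distances, dtype=float)
    A = 2.0 * (P[1:] - P[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(P[1:] ** 2, axis=1) - np.sum(P[0] ** 2))
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

# Four transmitters at assumed positions (metres) and the pulse travel
# times measured at one microphone (illustrative values).
speakers = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2)]
tof = [0.0047817, 0.0063138, 0.0047817, 0.0040081]
print(trilaterate(speakers, distances_from_times(tof)))   # ~ (0.5, 1.0, 1.2)
```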
Limitation:
Properties of sound do limit this method of tracking. Tracking performance can be degraded
when operated in a noisy environment. The sounds must have an unobstructed line between the
speakers and the microphones to accurately determine the time (and therefore distance) that sound
travels between the two. Trackers built around this technology generally have a range of only a few
feet.
Another limitation of ultrasonic tracking is that triangulating a position requires multiple,
separate transmitters and receivers. These transmitters and receivers must be separated by a certain
minimum distance. This is not generally a problem for transmitters, which can be mounted throughout
the physical environment, but can be a problem for receivers.
Inertial Tracking:
Inertial tracking uses electromechanical instruments to detect the relative motion of sensors
by measuring changes in gyroscopic forces, acceleration, and inclination. These instruments include
accelerometers, which are devices that measure acceleration. Thus accelerometers can be used to
determine the new location of an object that has moved, if you know where it started. Another
instrument is the inclinometer, which measures inclination, or how tipped something is with respect to
its "level" position. It is very much like a carpenter's level except that the electrical signal it provides
as its output can be interpreted by a computer.
The primary benefit is that they are self-contained units that require no complementary
components fixed to a known location, so there is no range limitation. They move freely with the user
through a large space. They work relatively quickly compared with many of the other tracking
methods.
A limitation is that inertial tracking is seldom used in conjunction with stationary visual displays,
because knowledge of the user's head location is required, and inertial tracking by itself does not
provide enough information to determine absolute location.
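The sketch below illustrates why inertial tracking is inherently relative: accelerometer readings are integrated twice from a known starting state, so the estimate is only as good as that starting state and any sensor bias accumulates over time as drift. The sample values and sampling rate are illustrative.

```python
import numpy as np

def dead_reckon(accel_samples, dt, start_pos, start_vel):
    """Integrate accelerometer readings twice to track position over time.

    accel_samples -- iterable of 3-axis acceleration readings (m/s^2),
                     assumed already compensated for gravity and orientation
    dt            -- sampling interval in seconds
    Returns the list of estimated positions, one per sample.
    """
    pos = np.asarray(start_pos, dtype=float)
    vel = np.asarray(start_vel, dtype=float)
    positions = []
    for a in np.asarray(accel_samples, dtype=float):
        vel = vel + a * dt      # first integration: acceleration -> velocity
        pos = pos + vel * dt    # second integration: velocity -> position
        positions.append(pos.copy())
    return positions

# One second of samples at 100 Hz: accelerate forward, then coast.
samples = [[0.0, 0.0, -1.0]] * 50 + [[0.0, 0.0, 0.0]] * 50
track = dead_reckon(samples, dt=0.01, start_pos=[0, 1.7, 0], start_vel=[0, 0, 0])
print(track[-1])    # estimate only; any sensor bias would accumulate as drift
```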
Neural (Muscular) Tracking:
Neural or muscular tracking is a method of sensing individual body-part movement, relative
to some other part of the body. It is not appropriate for tracking the location of the user in the venue,
but it can be used to track movement of fingers or other extremities. Small sensors are attached to the
fingers or limbs, with something like a Velcro strap or some type of adhesive to hold the sensor in
place. The sensor measures nerve signal changes or muscle contractions and reports
the posture of the tracked limb or finger to the VR system.
Wayfinding
Wayfinding refers to methods of determining (and maintaining) awareness of where one is located.
The goal of wayfinding is to help the traveler know where they are in relation to their destination
and to be able to determine a path to arrive there. A major step in achieving this is through the
development of a cognitive map or mental model of the environment through which one is traversing
or plans to traverse.
Travel
Travel is the crucial element that gives users the ability to explore a virtual space. Simple VR interfaces
that constrain the user to a small range of physical motion or to the manipulation of only those objects
within view are not adequate for experiencing most virtual worlds.
In many respects, using physical movement as a means of travel through a virtual space seems quite
natural. After all, for many VR developers, the goal is an interface that mimics physical interactions
with the real world. A young child can learn (even before crawling, walking, or talking) that by
pointing in a given direction while riding in a stroller or on a shoulder, they might be taken in that
direction. From there, the child progresses through scooters, tricycles, bicycles, and skateboards that
translate leg motion to forward or backward movement in a direction controlled by the arms. This is
augmented later in life as various machines allow us to traverse land, sea, and air with relative ease.
Some travellers learn to traverse the world via a joystick, riding in a motorized wheelchair. Many
more learn to use a joystick, button, or mouse to journey through computer-based worlds.
Manipulation Methods:
1. Direct user control: interface gestures that mimic real world interaction
Direct user control is a method of manipulation in which the participant interacts with objects
in the virtual world just as they would in the real world. Many direct user interactions combine
the object selection process with the actual manipulation.
Gesture Output Devices:
1. Visual Feedback:
Visual feedback devices provide users with visual cues and responses that correspond to their gestures
and interactions. These cues can include animations, particle effects, or changes in the virtual
environment. Examples include:
Virtual Object Reactions: When users interact with virtual objects, the objects might change color,
shape, or behavior to indicate the interaction.
Glow and Lighting Effects: Virtual objects might emit light or glow in response to user gestures,
drawing attention to interactions.
Trail Effects: When users move their hands quickly, virtual trails or traces can follow their
movements, creating a dynamic visual effect.
2. Haptic Feedback:
Haptic feedback devices simulate touch and physical sensations, enhancing immersion and allowing
users to "feel" virtual objects. While haptic gloves were mentioned earlier as input devices, they can
also serve as output devices, providing haptic responses that correspond to the user's gestures and
interactions. Examples include:
Vibration: Haptic gloves or controllers can vibrate to simulate the sensation of touching or
interacting with virtual objects.
Resistance: Some haptic gloves can exert resistance or pressure on the user's fingers, simulating the
feeling of gripping or manipulating virtual objects.
3. Auditory Feedback:
Auditory feedback devices provide sound cues and responses that match users' interactions and
gestures. These cues can enhance the sense of presence and provide additional context. Examples
include:
Sound Effects: Virtual objects or actions can trigger sound effects that correspond to the user's
gestures. For instance, a swiping motion might produce a swooshing sound.
4. Particle Effects:
Particle effects are visual and auditory cues that can be triggered by user interactions. These effects
can include sparks, trails, or bursts of light and sound that add dynamism to the VR environment.
5. Object Animation:
In response to user gestures, virtual objects might animate, transform, or move in ways that
correspond to the interaction. For example, a virtual door might swing open when the user makes a
pushing gesture.
6. Dynamic Environment Changes:
User gestures can trigger changes in the virtual environment itself. For instance, a user's hand
movement might cause a wave to ripple across a virtual pond or make leaves on a tree rustle.
Gesture output devices are crucial for creating a sense of realism and feedback within the virtual
environment.
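A minimal, SDK-agnostic sketch of how recognized gestures might be routed to the visual, haptic, and auditory feedback described above; every name here is illustrative rather than part of any real VR toolkit.

```python
# Recognized gestures are dispatched to registered feedback handlers
# (visual, haptic, auditory). All names are illustrative.
FEEDBACK_HANDLERS = {}

def on_gesture(name):
    """Register a function as a feedback handler for a gesture."""
    def register(func):
        FEEDBACK_HANDLERS.setdefault(name, []).append(func)
        return func
    return register

@on_gesture("push")
def swing_door_open(event):
    print(f"visual: door {event['target']} swings open")

@on_gesture("push")
def vibrate_controller(event):
    print(f"haptic: vibrate {event['hand']} controller for 80 ms")

@on_gesture("swipe")
def play_swoosh(event):
    print("audio: play swoosh sound at the hand's position")

def dispatch(event):
    """Send a recognized gesture event to every registered feedback handler."""
    for handler in FEEDBACK_HANDLERS.get(event["gesture"], []):
        handler(event)

dispatch({"gesture": "push", "hand": "right", "target": "door_01"})
dispatch({"gesture": "swipe", "hand": "left"})
```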
Graphics display
Graphics display in virtual reality (VR) is a critical component that directly influences the quality and
realism of the VR experience. VR graphics display involves creating and rendering visual content that
is presented to users through VR headsets, allowing them to perceive and interact with a virtual
environment. Here's how graphics display works in VR:
1. ZScreen:
StereoGraphics manufactures a polarizing panel (the ZScreen) that transforms a projection system or
computer monitor into a 3D display. The ZScreen operates somewhat like shutter glasses: when a left
image is displayed on the monitor the image is polarized, say horizontally, and when the right image
is displayed the polarization is vertical. Now if the viewer is wearing a pair of polarized glasses, their
left and right eyes will see a sequence of corresponding left and right views of a scene.
2. HMD:
HMDs possess a variety of characteristics such as contrast ratio, luminance, field of view (FOV), exit
pupil, eye relief, and overlap. The contrast ratio is the ratio of the peak luminance to the background
luminance, and a value of 100:1 is typical. Luminance is a measure of a screen's brightness, and
ideally should exceed 1000 cd/m². The FOV is a measure of the horizontal and vertical visual range of
the optical system, and ideally should approach that of the human visual system. In general, though,
most HMDs may only provide about 60° FOV for each eye. The exit pupil is the distance the eye can
deviate from the optical centre of the display before the image disappears. This is in the order of 1.2
cm. Eye relief is a measure of the distance between the HMD's optical system and the user's face, and
is in the order of 2 cm. And finally, overlap is a measure of the image overlap used to create a
stereoscopic image.
3. BOOM:
These are stereo-devices supported by a counterbalanced arm, and employ high-resolution CRT
technology. The user, who could be standing or seated, manoeuvres the BOOM display using side
grips into some convenient view of the VE. As the BOOM is moved about, joint angles in the
articulated arm are measured to enable the 3D position of the BOOM to be computed. As this can be
undertaken in real time, the position is supplied to the host computer to fix the viewpoint of the VE.
4. Retinal Displays:
The Human Interface Technology Laboratory at the University of Washington has been developing a retinal display
that directs laser light directly onto the eye's retina. To date, an 800-line monochrome system has been
constructed and work is underway to build a full colour system that works at an even higher
resolution.
5. Virtual Table:
Virtual tables are an excellent idea and consist of a glass or plastic screen that forms a tabletop. Inside
the table a projector displays an image of a VE onto the back of the screen with alternating left and
right images. With the aid of shutter glasses and head tracking one obtains an excellent 3D view of the
VE. The viewpoint is determined by the viewer and is strictly a one-person system; however, it is
possible for two or more people to stand close together and share a common view. It has many uses
especially for military, architectural, and medical applications.
6. CAVE:
A CAVE is constructed from a number of back-projection screens with external projectors projecting
their images. Inside the CAVE a viewer is head tracked and wears shutter glasses, so that wherever he
or she looks, a stereoscopic view is seen. The degree of immersion is very high and, like the virtual
table, it is strictly a one-person system; however, the room is normally large enough to allow other
observers to share in the experience.
Sound Displays
"Sound displays" in the context of VR typically refer to technologies and techniques that
enhance the auditory experience within virtual reality environments. These technologies aim to create
a more immersive and realistic sound environment, making virtual experiences feel more lifelike and
engaging. Here are some aspects and technologies related to sound displays in VR:
1. Spatial Audio: Spatial audio is a crucial component of VR sound displays. It involves creating a
sense of directionality and distance for virtual sounds. By using advanced audio algorithms and
positioning techniques, VR systems can simulate how sound would behave in a real-world
environment, allowing users to perceive where sounds are coming from and how they move in
relation to their position.
2. Binaural Audio: Binaural audio is a technique that uses two microphones to capture sound in a
way that replicates human hearing. When these binaural recordings are played back through
headphones, they create a convincing illusion of three-dimensional sound. This technique is often
used in VR to enhance the sense of presence and immersion.
3. Head-Related Transfer Function (HRTF): HRTF is a set of measurements that capture how a
sound is filtered by the listener's anatomy, particularly the shape of their ears and head. By
applying personalized HRTF data to audio signals, VR systems can tailor the sound experience to
match an individual's unique perception of spatial audio, making the virtual environment sound
more natural.
4. Ambisonics: Ambisonics is a sound recording and reproduction technique that captures a
spherical sound field. In VR, this allows for a full 360-degree sound experience, making sounds
appear to come from any direction. Ambisonics can provide a seamless sense of immersion by
matching the audio with the visual environment.
5. Haptic Audio Feedback: Some VR systems incorporate haptic feedback into the audio
experience. This involves using vibrations or tactile sensations to complement the audio cues. For
example, if a virtual object collides with another, users might feel a slight vibration in their
controllers or headset, adding a multisensory layer to the experience.
6. Real-time Audio Processing: To ensure that audio remains synchronized with the user's
movements and interactions, VR systems often use real-time audio processing. This involves
adjusting audio sources on-the-fly as the user moves, turns, or interacts with virtual objects,
maintaining a consistent and convincing audio environment.
7. Dynamic Sound Environments: VR experiences can create dynamic soundscapes that change
based on user actions and environmental factors. For instance, walking closer to a waterfall might
make the sound of rushing water louder, or moving into a virtual building might result in a change
in the way sound echoes.
8. Personalized Audio: Some VR systems offer personalized audio profiles that consider factors
like the user's hearing sensitivity and preferences. This ensures that the sound display is optimized
for each individual, enhancing their overall experience.
Overall, sound displays play a critical role in immersing users in virtual environments by creating a
realistic and engaging auditory experience that complements the visual aspects of VR. As VR
technology advances, we can expect even more sophisticated and immersive sound display techniques
to emerge.
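As a highly simplified illustration of spatial audio, the sketch below attenuates a mono source with distance and applies an interaural level and time difference based on its direction. Real systems use measured HRTFs; the constants and the Woodworth-style delay approximation used here are illustrative only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m, roughly half a ~17.5 cm ear spacing

def spatialize(mono, sample_rate, azimuth, distance):
    """Produce a crude stereo rendering of a mono signal.

    azimuth  -- source direction in radians (0 = straight ahead, +ve = right)
    distance -- source distance in metres
    Applies 1/distance attenuation, an interaural level difference, and an
    interaural time difference realised as a per-ear sample delay.
    """
    mono = np.asarray(mono, dtype=float) / max(distance, 0.1)
    # Interaural time difference (Woodworth-style approximation).
    itd = HEAD_RADIUS * (azimuth + np.sin(azimuth)) / SPEED_OF_SOUND
    delay = int(round(abs(itd) * sample_rate))
    # Simple level panning: a source to the right makes the right ear louder.
    pan = 0.5 * (1.0 + np.sin(azimuth))
    left = np.concatenate([np.zeros(delay if azimuth > 0 else 0), (1 - pan) * mono])
    right = np.concatenate([np.zeros(delay if azimuth < 0 else 0), pan * mono])
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right], axis=1)

# A 0.1 s, 440 Hz tone placed 45 degrees to the right, 2 m away.
t = np.arange(0, 0.1, 1 / 44100)
stereo = spatialize(np.sin(2 * np.pi * 440 * t), 44100, np.deg2rad(45), 2.0)
print(stereo.shape)
```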
Human Auditory System.
Understanding the human auditory system is crucial for creating immersive and realistic virtual reality
(VR) experiences. The human auditory system is responsible for processing sound, determining its
direction and distance, and providing a sense of spatial awareness. Here's how the auditory system
works and its relevance in VR:
1. Ear:
For our purposes it is convenient to divide the ear into three parts:
the outer ear, the middle ear, and the inner ear.
The outer ear consists of the pinna, which we normally call the ear. It has quite a detailed shape and
plays an important role in capturing sound waves, but more importantly, it shapes the spectral
envelope of incident sound waves. This characteristic will be explored further when we examine
sound direction.
The middle ear consists of the tympanic membrane (the eardrum) and the ossicular system, which
conducts sound vibrations to the inner ear via a system of interconnecting bones.
The cochlea is located in the inner ear, and is responsible for discriminating loudness and frequency.
Finally, signals from the cochlea are interpreted by the brain's auditory cortex as the sensation of
sound.
2. Auditory Nerve: The electrical signals generated by the hair cells are transmitted via the auditory
nerve to the brainstem and then to the auditory cortex in the brain. The brain processes these signals to
interpret the pitch, loudness, direction, and distance of sounds.
3. Direction
When a door bangs shut, pressure waves spread through the air and eventually impinge upon our ears.
The left ear may detect the pressure waves before the right ear, and as our ears are
approximately 20 cm apart, a time delay of up to about 0.6 ms arises, which can be detected by the brain's auditory
cortex. This is something the brain uses to assist in locating the horizontal source of the
sound, but it does not explain how we locate the source in the vertical plane. The interaction of the
sound wave with our head and the pinnae plays a significant role in shaping its spectral content. Sounds
from different directions are influenced differently by the geometry of our head and ears, and our
brains are able to exploit this interaction to localize the sound source.
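A quick check of the interaural delay figure quoted above, assuming a sound arriving from directly to one side so that the extra path to the far ear is roughly the full ear separation:

```python
# Interaural time difference for a sound arriving from the side:
# the extra path to the far ear is roughly the ear separation itself.
ear_separation = 0.20    # metres (approximately 20 cm, as above)
speed_of_sound = 343.0   # m/s
itd = ear_separation / speed_of_sound
print(f"maximum interaural delay ~ {itd * 1000:.2f} ms")   # about 0.58 ms
```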
Human Visual System
2. Colour Receptors
The retina contains two types of light-sensitive cells called rods and cones. The rods are sensitive to
low levels of illumination and are active in night vision; however, they do not contribute towards our
sense of color - this is left to the cone cells that are sensitive to three overlapping regions of the visible
spectrum: red, green, and blue (RGB). Collectively, they sample the incoming light frequencies and
intensities, and give rise to nerve impulses that eventually end up in the form of a coloured image.
The fovea is the center of the retina and is responsible for capturing fine-coloured detail - the
surrounding area is still sensitive to color but at a reduced spatial resolution. The fovea enables us to
perform tasks such as reading, and if we attempt to read outside of this area, the image is too blurred
to resolve any useful detail. The fovea is only 0.1 mm in diameter, which corresponds to
approximately 1° of the eye's field of view (FOV). Towards the edge of the retina the cells become
very sensitive to changes in light intensity, and provide peripheral vision for sensing movement.
3. Visual Acuity
Acuity is a measure of the eye's resolving power, and because the density of cells varies across the
retina, measurements are made at the fovea. An average eye can resolve two bright points of light
separated by 1.5 mm at a distance of 10 m. This corresponds to roughly 40 seconds of arc, and is equivalent
to a distance of 2 micrometres on the retina.
4. The Blind Spot
The brain is connected to each eye via an optic nerve that enters through the back of the eye to
connect to the retina. At the point of entry the distribution of rods and cones is sufficiently disturbed
to create an area of blindness called the blind spot, but this does not seem to cause us any problems.
The blind spot is easily identified by a simple experiment. To begin with, close one eye - for example
the right eye - then gaze at some distant object with the left eye. Now hold up your left-hand index
finger at arm's length slightly left of your gaze direction. While still looking ahead, move your index
finger about slowly. You will see the fingertip vanish as it passes over the blind spot. What is strange
is that although the image of the finger disappears, the background information remains. It is just as
well that this does not create problems for us in the design of head-mounted displays (HMDs).
5. Stereoscopic Vision
If we move towards an object, the ciliary muscles adjust the shape of the lens to accommodate the
incoming light waves to maintain an in-focus image. Also, the eyes automatically converge to ensure
that the refracted images fall upon similar areas of the two retinas. This process of mapping an image
into corresponding positions upon the two retinas is the basis of stereoscopic vision. The difference
between the retinal images is called binocular disparity and is used to estimate depth, and ultimately
gives rise to the sense of three dimensions (3D).
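Under a simple parallel-camera pinhole model of the two eyes, disparity and depth are related by Z = fB/d. The sketch below uses commonly quoted approximate values for eye separation and focal length; it is an illustration of the relationship, not a model of actual retinal geometry.

```python
def depth_from_disparity(disparity, baseline=0.065, focal_length=0.017):
    """Estimate distance to a point from its binocular disparity.

    Simple parallel-camera pinhole model: Z = f * B / d, with
    baseline     -- eye separation in metres (~6.5 cm)
    focal_length -- effective focal length of the eye in metres (~17 mm)
    disparity    -- difference between the two retinal image positions (metres)
    """
    return focal_length * baseline / disparity

# A disparity of 0.1 mm between the retinal images corresponds to roughly:
print(f"{depth_from_disparity(0.0001):.2f} m")   # ~11 m away
```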
6. Stereopsis Cues
In 1832 Charles Wheatstone showed that a 3D effect could be produced by viewing two two-
dimensional (2D) images using his stereoscope. Brewster went on to perfect the device using prisms
instead of the mirrors used by Wheatstone. The 3D image formed by two separate views, especially
those provided by the eyes, enable us to estimate the depth of objects, and such cues are called
stereopsis cues.