Week 02_Object Perception and Recognition
Meaning of Sensations
Naveen Kashyap, PhD
Indian Institute of Technology Guwahati
Email: naveen.kashyap@iitg.ernet.in
Object Perception and Pattern Recognition
The process via which sensory inputs are gathered and meaningfully interpreted is called perception.
There are several forms of perception. When we look at an object we acquire specific bits of
information about it (location, color, shape, texture, etc.).
Is it true that, when we perceive an object, we also acquire information about its function
at the same time?
Picture = depth + figure + ground + texture + semicircular arch + semicircular (door-like) block + ..
Template Matching – Every event, object or stimulus that we want to derive meaning from
is compared to some previously stored pattern or template. The process of perception in
template matching thus involves comparing incoming information to the templates we have
stored and looking for a match.
Limitations: a) requires a huge database of templates to compare against
b) cannot explain the recognition of new objects
c) people recognize many different patterns as more or less the same thing
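As a rough illustration of template matching (a minimal sketch, not a model taken from the lecture), the code below stores a few 3x3 binary patterns as templates and recognizes an input only when it is identical to one of them. The grids, the labels and the function name `match_template` are all assumptions made for this example. The second call shows limitation (c): a slightly distorted pattern no longer matches any stored template.

```python
# Hypothetical 3x3 binary patterns standing in for stored templates.
TEMPLATES = {
    "T": ("111",
          "010",
          "010"),
    "L": ("100",
          "100",
          "111"),
}

def match_template(stimulus):
    """Return the label of the stored template identical to the stimulus, or None."""
    for label, template in TEMPLATES.items():
        if stimulus == template:
            return label
    return None

exact_t = ("111", "010", "010")
distorted_t = ("111", "100", "100")   # same top bar, stem shifted to the left

print(match_template(exact_t))      # T
print(match_template(distorted_t))  # None
```

Covering every acceptable variant this way would need a separate template for each one, which is limitation (a).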
Featural Analysis – Instead of processing stimuli as whole units, we might instead break
them down into their components, using our recognition of those parts to infer what the
whole represents. The parts searched for and recognized are called features. Recognition of
a whole object in this model depends on recognition of its features.
Support for FA model: Studies of frog retinas using microelectrode recordings of single
cells revealed that certain stimuli caused these cells to fire more rapidly than others. Cells
that responded strongly to borders between light and dark were called edge detectors. Other cells
responded selectively to moving edges, and cells that responded most strongly to a small, dark,
moving spot (much like an insect) were dubbed bug detectors.
Irving Biederman’s (1987) theory of object perception – proposes that when people view
objects they segment them into simple geometric components called geons. Biederman proposed a
total of 36 such primitive components.
Biederman draws an analogy between his theory of object perception and speech perception,
which uses phonemes (the basic units of sound, 44 in number, such as /tʃ/ as in chair). As
evidence for this theory, Biederman offers the case of a fictional object that none of us has
seen, yet whose parts we can identify with considerable agreement.
Support for the featural theory also comes from the work of Eleanor Gibson (1969), which showed that
people are more likely to confuse G with C than G with F, since G and C share the feature of a
curved line open to the right.
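The G/C/F confusion pattern can be sketched by describing each letter as a set of features and measuring how much two letters overlap. This is only an illustration: the feature names, the tiny feature inventory and the use of Jaccard similarity are assumptions made here, not part of Gibson's study.

```python
# Hypothetical feature lists; a real feature inventory would be far richer.
FEATURES = {
    "C": {"curve_open_right"},
    "G": {"curve_open_right", "horizontal_bar_middle"},
    "F": {"vertical_line", "horizontal_bar_top", "horizontal_bar_middle"},
}

def feature_overlap(a, b):
    """Jaccard similarity: shared features divided by all features of either letter."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)

print(feature_overlap("G", "C"))  # 0.5
print(feature_overlap("G", "F"))  # 0.25
```

Under these assumed features, G overlaps more with C than with F, which is the direction of the confusions described above.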
According to the prototype matching model, when a sensory device registers a new stimulus, the device
compares it with previously stored prototypes. An exact match is not required; only an approximate match
is expected. Prototype matching models allow for discrepancies between the input and the prototype. An
object is perceived when a match is found.
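One way to picture approximate matching (a sketch under stated assumptions, not the lecture's model) is to treat each prototype as a point in a feature space and accept the nearest prototype only if the stimulus lies close enough to it. The two features, the prototype values, the `tolerance` parameter and the function name `match_prototype` are all hypothetical.

```python
import math

# Hypothetical prototypes described by (height/width ratio, number of legs).
PROTOTYPES = {
    "chair": (1.5, 4.0),
    "table": (0.6, 4.0),
    "stool": (1.2, 3.0),
}

def match_prototype(stimulus, tolerance=0.5):
    """Return the closest prototype label if it lies within tolerance, else None."""
    best_label, best_dist = None, math.inf
    for label, proto in PROTOTYPES.items():
        dist = math.dist(stimulus, proto)   # Euclidean distance in feature space
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= tolerance else None

print(match_prototype((1.4, 4.0)))  # chair
print(match_prototype((5.0, 0.0)))  # None
```

Unlike an exact template comparison, the first stimulus is accepted even though it differs slightly from the stored chair prototype.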
Where do prototypes come from?
Posner & Keele (1968) demonstrated that people can form prototypes very quickly. They found that
people, during an initial classification task, form some sort of mental representation of each class of
items.
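A simple way to think about how a prototype could emerge from examples is averaging. The sketch below is an illustrative assumption, not the mechanism Posner & Keele proposed: it jitters a hypothetical dot pattern to create exemplars, averages them dot by dot, and checks that the average lies at least as close to the never-seen base pattern as the exemplars do on average. All names and values are invented for the example.

```python
import math
import random

def distort(pattern, noise, rng):
    """Return a copy of the dot pattern with each dot jittered by up to `noise`."""
    return [(x + rng.uniform(-noise, noise), y + rng.uniform(-noise, noise))
            for x, y in pattern]

def average_pattern(exemplars):
    """Average the exemplars dot by dot to form a candidate prototype."""
    n = len(exemplars)
    return [(sum(e[i][0] for e in exemplars) / n,
             sum(e[i][1] for e in exemplars) / n)
            for i in range(len(exemplars[0]))]

def mean_dot_distance(a, b):
    """Mean Euclidean distance between corresponding dots of two patterns."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

rng = random.Random(0)                       # fixed seed for repeatability
base = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]  # hypothetical pattern, never shown itself
exemplars = [distort(base, 0.5, rng) for _ in range(20)]
prototype = average_pattern(exemplars)

exemplar_error = sum(mean_dot_distance(e, base) for e in exemplars) / len(exemplars)
prototype_error = mean_dot_distance(prototype, base)
print(prototype_error <= exemplar_error)  # True
```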
Top-Down Processes
In top-down processing (also called theory-driven or conceptually driven processing) the
perceiver's expectations, theories or concepts guide the selection and combination of the
information in the pattern recognition process. For example:
You know from experience that archways generally mark alleys. When you look down the alley and
see it blocked in black, you mostly expect a closed door, and so on.
The context in which patterns or objects appear apparently sets up certain expectations in the perceiver
as to what objects will occur. Both accuracy and the length of time needed to recognize an object vary
with the context.
Top-down or conceptually driven processes are directed by expectations derived from context or
past learning or both.
David Marr – presented an elegant computational model of perception that involves
both bottom-up and top-down processes. According to this model, visual perception proceeds by
constructing three different mental representations:
a) Primal sketch – depicts areas of relative brightness and darkness in a 2D image, as well as
localized geometric structures. This helps in boundary detection.
b) 2 ½ D sketch – using cues such as shading, texture, edges and others, the viewer derives
what the surfaces are and how they are positioned in depth relative to the viewer's vantage
point.
c) Final 3D sketch – involves both recognizing what the objects are and understanding the
meaning of the visual scene.
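The boundary detection mentioned for the primal sketch can be illustrated with a very rough stand-in: marking places where brightness jumps sharply between neighboring pixels. This is a minimal sketch and not Marr's actual operator; the image values, the threshold and the function name `horizontal_edges` are assumptions.

```python
# Hypothetical 4x6 grayscale image: dark region on the left, bright on the right.
IMAGE = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]

def horizontal_edges(image, threshold=50):
    """Mark positions where brightness jumps between horizontally adjacent pixels."""
    return [[abs(row[c + 1] - row[c]) > threshold for c in range(len(row) - 1)]
            for row in image]

# Print one line per image row: "|" marks a boundary, "." marks no change.
for row in horizontal_edges(IMAGE):
    print("".join("|" if edge else "." for edge in row))
```

Each printed row reads `..|..`, marking the single vertical boundary between the dark and bright regions.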
Perceptual Learning – that perception changes with practice has been well documented
(E. J. Gibson, 1969), and this phenomenon is called perceptual learning
(Gibson’s original experiment with round coil cards).
Giving individuals more practice with perceptual stimuli enables them to learn which aspects
of the stimulus to attend to and to try harder to consciously distinguish between different
kinds of stimuli. Using top-down processing, the perceiver's experience guides them in
selecting the features that carry the most information.
Change Blindness
Change blindness (Rensink, 2002) is the inability to detect changes to an object or scene, especially
when given different views of that object or scene, and it illustrates the top-down nature of perception. The
change blindness paradigm reinforces the idea that perception is driven by expectations about meaning.
Instead of keeping track of every visual detail, we seem to represent the overall meaning of the
scene.
James Gibson (1979) and colleagues adopted a view opposite to the constructivist approach and believed
that the perceiver does very little work in perception, mainly because the world offers so much information
that there is little need to construct percepts and draw inferences. This view is called Direct Perception.
According to this view, the light hitting the retina contains highly organized information that requires
little or no interpretation. In the world that we live in, certain aspects of stimuli remain invariant
despite changes over time or in our physical relationship to them.