Bundesen 2015
Vision Research
journal homepage: www.elsevier.com/locate/visres
Article info

Article history:
Received 24 July 2014
Received in revised form 10 November 2014
Available online 29 November 2014

Keywords:
Visual attention
TVA

Abstract

This article reviews the foundations of the theory of visual attention (TVA) and describes recent developments in the theory. TVA is based on the principle of biased competition: All possible visual categorizations ascribing features to objects compete (race) to become encoded into visual short-term memory before it is filled up. Each of the possible categorizations is supported by sensory evidence, but the competition is biased by multiplication with attentional weights (high weights on important objects) and perceptual biases (toward use of important categories). The way sensory evidence and attentional biases interact is specified in the rate and weight equations of TVA, so TVA represents a mathematical formalization of the biased competition principle. In addition to describing TVA as a psychological theory, we present the neural interpretation of TVA, NTVA.

© 2014 Elsevier Ltd. All rights reserved.
In this article, the foundations of the theory of visual attention, TVA (Bundesen, 1990), are reviewed and recent developments in the theory are described. TVA is based on the principle of biased competition (Desimone & Duncan, 1995): All possible visual categorizations ascribing features to objects compete (race) to become encoded into visual short-term memory before it is filled up. Each of the possible categorizations is supported by sensory evidence, but the competition is biased by multiplication with attentional weights (high weights on important objects) and perceptual biases (toward use of important categories). The way sensory evidence and attentional biases interact is specified in the rate and weight equations of TVA, so TVA represents a mathematical formalization of the biased competition principle.

Bundesen (1990) showed how TVA can account for many psychological findings (reaction times and error rates) in the field of visual attention. More recently, Duncan et al. (1999) showed how TVA provides a method to quantify individual variation in attentional abilities, which has been used to study many normal and clinical populations. Later TVA was extended by Bundesen, Habekost, and Kyllingsbæk (2005, 2011) and Bundesen and Habekost (2008) into a neural theory of visual attention (NTVA), which accounts for many effects of attention observed in firing rates of single cells in the visual system of primates.

1.1. Basic assumptions

In TVA, both visual recognition and selection of objects in the visual field consist in making visual categorizations of the form “object x has feature i” or, equivalently, “object x belongs to category i.” A categorization is made (i.e., selected) when it is encoded into visual short-term memory (VSTM). If and when the categorization is made, object x is said to be selected and also to be recognized as a member of category i. Thus, selection and recognition are viewed as two aspects of the same process.

If and when a visual categorization of an object completes processing, the categorization enters VSTM if memory space for the categorization is available in VSTM. The storage capacity of VSTM (parameter K) is normally assumed to be about four independent objects (see, e.g., Luck & Vogel, 1997; Shibuya & Bundesen, 1988); however, K seems to vary not only between individuals but also on a trial-by-trial basis (see Dyrholm et al., 2011). Clearing VSTM opens a race among all objects in the visual field to become encoded into VSTM. An object becomes encoded in VSTM if, and only if, some categorization of the object becomes encoded in VSTM. Thus, each object x may be represented in the encoding race by all possible categorizations of the object.
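The encoding race described in Section 1.1 can be made concrete with a small sketch. It assumes that processing times are exponentially distributed and stochastically independent (the assumption FIRM makes for encoding times); under that assumption, the first categorization of an object to finish arrives at a rate equal to the sum of the rates of that object's categorizations. The function names and the numerical rates are hypothetical and chosen only for illustration, and the VSTM capacity limit K is ignored.

```python
import math


def object_encoding_rate(categorization_rates):
    """Rate at which the first categorization of an object finishes.

    Assumes independent, exponentially distributed processing times
    (an illustrative assumption, not a fitted model).
    """
    return sum(categorization_rates.values())


def prob_encoded_by(rate, t_seconds):
    """Probability that an exponential process with the given rate
    (per second) has finished by time t, ignoring the VSTM limit K."""
    if t_seconds <= 0:
        return 0.0
    return 1.0 - math.exp(-rate * t_seconds)


# Hypothetical rates (per second) for two categorizations of one object.
rates_x = {"is red": 6.0, "is the letter A": 4.0}
v_x = object_encoding_rate(rates_x)       # sum of the two rates
p_100ms = prob_encoded_by(v_x, 0.100)     # chance of encoding within 100 ms
```

Because the object is encoded as soon as any one of its categorizations is encoded, adding further categorizations can only raise the object-level rate in this sketch.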
http://dx.doi.org/10.1016/j.visres.2014.11.005
0042-6989/© 2014 Elsevier Ltd. All rights reserved.
C. Bundesen et al. / Vision Research 116 (2015) 210–218 211
attention in time (see, e.g., Nobre, 2001, 2010; Nobre & Rohenkohl, 2014). Beneficial effects of valid temporal expectations on motoric responses have been demonstrated repeatedly over the past century (Woodrow, 1914; for reviews, see Los, 2010; Niemi & Näätänen, 1981). Vangkilde, Coull, and Bundesen (2012) explored effects of temporal expectation on visual processing speed in a cued single-stimulus recognition paradigm with briefly presented, postmasked stimuli. Temporal expectations were manipulated by different hazard rate functions for the cue-stimulus foreperiod. For example, in one experiment, the length of the foreperiod from the cue to the stimulus was distributed exponentially. For each block of trials, the participants knew from which of two exponential distributions with different hazard rates the foreperiods would be drawn. The hazard rate could be either high (1.33 s$^{-1}$) or low (0.22 s$^{-1}$), corresponding to mean foreperiods of 750 ms and 4500 ms, respectively. In either condition, the probability p(t) of correct report as a function of the stimulus duration (t) was well described by Eq. (3), $t_0$ being the threshold of conscious perception (the longest ineffective exposure duration), and v(x, i) being the speed of encoding into VSTM at times $t > t_0$. As manipulated by the hazard rate, temporal expectation had no effect on the threshold of conscious perception but a strong effect on the speed of subsequent encoding into VSTM. Averaged across participants, the speed of encoding was lowered by 30% in the low hazard rate condition. This effect was found even though no general decrease in processing speed with time-on-task occurred. Thus, the effect was independent of the actual duration of the foreperiod on a given trial, but depended entirely on expectation. These findings were extended in a parametric exploration of the effect of different levels of temporal expectation (Vangkilde, Petersen, & Bundesen, 2013). In this study, the length of the cue-stimulus foreperiod was exponentially distributed with one of six hazard rates, presumably giving rise to one of six levels of temporal expectation within a particular block. Again, we found no effect of expectation on the threshold of conscious perception but a strong and remarkably systematic increase in perceptual processing speed with increasing expectation. Specifically, to a very good approximation, processing speed could be described by a linearly increasing function of the logarithm of the level of expectancy given by the six different hazard rates (see Fig. 1). Similar effects were recently found by Sørensen, Vangkilde, and Bundesen (2014) when the design outlined above was extended to multi-element displays.

Vangkilde, Coull, and Bundesen (2012) and Vangkilde, Petersen, and Bundesen (2013) explained the effect in terms of TVA by assuming that temporal expectations affect perception by changing perceptual biases (values of $\beta$ parameters). Specifically, a strong expectation that a stimulus letter will appear at the next moment should yield an increase in the $\beta$ values of letter types, which should speed the recognition of the stimulus letter if it appears when it is highly expected.

1.3.3. Components of perceptual bias

Bundesen, Vangkilde, and Habekost (in press) proposed the general multiplicative hypothesis that the perceptual bias associated with feature i ($\beta_i$) is itself a product of three terms:

$$\beta_i = A\,p_i\,u_i, \tag{4}$$

where $A$ is the level of alertness, $p_i$ is the subjective prior probability of being presented with feature i, and $u_i$ is the subjective importance (“utility”) of identifying feature i.

By the rate equation of TVA, the rate of processing of a visual categorization that a given object has feature i is directly proportional to $\beta_i$. The multiplicative structure of Eq. (4) implies that $\beta_i = 0$ if, and only if, either $A = 0$, $p_i = 0$, or $u_i = 0$. That $\beta_i = 0$ if $A = 0$ entails the biologically very plausible implication that, if the level of alertness is zero, no visual categorizations are made. The implication that $\beta_i = 0$ if the subjective prior probability of being presented with feature i, $p_i$, is zero means that no objects are seen as members of category i if the subject is absolutely certain that the category is empty. This would seem to characterize an ideal observer. The implication that $\beta_i = 0$ if the subjective importance of identifying feature i, $u_i$, is zero also would seem to be a characteristic of an ideal observer. Humans may not be ideal observers, but the hypothetical description of performance given by Eq. (4) at least provides a benchmark for the analysis of how alertness, prior knowledge, and importance may directly affect human performance.

1.3.4. Selection from multielement displays

TVA provides a mathematical derivation of the fixed-capacity independent race model (FIRM; Shibuya & Bundesen, 1988) for selection from multielement displays consisting of homogeneous elements (in the sense of Bundesen, 1990, p. 524). According to FIRM, the processing of a stimulus display occurs in two stages. At the first stage, an attentional weight is computed for each ele-
© Royal Society 2012
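As a brief illustration of the multiplicative hypothesis in Eq. (4), the sketch below computes a perceptual bias as the product of alertness, subjective prior probability, and subjective utility. The function name, the argument names, and all numerical values are hypothetical; the sketch only demonstrates the arithmetic consequences discussed in Section 1.3.3, namely that the bias vanishes when any one factor is zero and scales in direct proportion to each factor.

```python
def perceptual_bias(alertness, prior_probability, utility):
    """Perceptual bias for one feature as the product in Eq. (4).

    All three inputs are assumed to be non-negative numbers; the
    scales used here are arbitrary illustrative choices.
    """
    for name, value in (
        ("alertness", alertness),
        ("prior_probability", prior_probability),
        ("utility", utility),
    ):
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    return alertness * prior_probability * utility


baseline = perceptual_bias(1.0, 0.5, 0.8)
more_alert = perceptual_bias(2.0, 0.5, 0.8)     # doubling alertness doubles the bias
impossible = perceptual_bias(1.0, 0.0, 0.8)     # feature judged impossible
unimportant = perceptual_bias(1.0, 0.5, 0.0)    # feature judged irrelevant
```

Since the rate of processing is directly proportional to the bias, a zero in any factor implies, under this hypothesis, that no categorization with that feature is ever made.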
(or inefficiency) of top-down selection can be measured by the ratio, $\alpha$, between the attentional weights of a distractor and a target:

$$\alpha = w_{\mathrm{distractor}} / w_{\mathrm{target}}. \tag{6}$$

The time it takes to encode an element into VSTM depends on the amount of processing capacity allocated to the element. Specifically, the encoding time is assumed to be exponentially distributed with a rate parameter equal to the amount of processing capacity allocated to the element. Encoding times for different elements are stochastically independent, and the selected elements are those elements whose encoding processes complete before (a) the stimulus presentation terminates and (b) VSTM has been filled up.

FIRM makes precise predictions of effects of variations in the exposure duration of the stimuli. Shibuya and Bundesen (1988) tested such predictions in a comprehensive study of partial report of digits from mixtures of letters and digits with exposure durations ranging from 10 ms up to 200 ms. The participants were instructed to report as many digits (targets) as possible from the stimulus display while ignoring the letters (distractors). Each display was terminated by a pattern mask. The results from one representative participant are illustrated in Fig. 2. For each combination of exposure duration, number of targets, and number of distractors, the figure shows the probability distribution of the number of correctly reported targets. Each panel shows the results for a particular combination of the numbers of targets (T) and distractors (D). The top curve in the panel shows the probability of reporting at least 1 target correct as a function of exposure duration, the second curve from the top shows the probability of reporting at least 2 targets correct, and so on. According to FIRM, the processing rate of a target and a distractor in each of the display types is given by $C/(T + \alpha D)$ and $\alpha C/(T + \alpha D)$, respectively. The close fit shown by the smooth curves was obtained with VSTM capacity K at a value of 3.7 elements, total processing capacity C at 49 elements/s, distractibility parameter $\alpha$ (the weight ratio of a distractor to a target) at 0.40, and longest ineffective exposure duration $t_0$ at 19 ms. Noninteger values of VSTM capacity K were treated as probability mixtures such that, for example, a value of 3.7 elements for K represented a mixture of values of 3 and 4 with a probability of .7 that K = 4 on a given trial.

Many other well-established results have been closely fit by TVA. The results include findings of stochastic independence (Bundesen, Kyllingsbæk, & Larsen, 2003; Kyllingsbæk & Bundesen, 2007) in recognition of multiple features, effects of object integrality (Duncan, 1984), effects of number and spatial position of targets in studies of divided attention (e.g., Posner, Nissen, & Ogden, 1978; Sperling, 1960, 1967), effects of selection criterion and the number of distractors in studies of focused attention (e.g., Bundesen & Pedersen, 1983; Treisman & Gelade, 1980; Treisman & Gormican, 1988), and effects of consistent practice in search (Schneider & Fisk, 1982; also see Kyllingsbæk, Schneider, & Bundesen, 2001).

1.3.5. Attentional dwell time

The majority of TVA-based studies have investigated attention using brief (e.g., 20–200 ms) exposures of simultaneously presented stimuli. However, Petersen, Kyllingsbæk, and Bundesen (2012) recently elaborated TVA in the temporal domain by presenting a computational model (a theory of temporal visual attention; TTVA) to account for classical findings of attentional
Fig. 2. Relative frequency of scores of j or more correctly reported targets as a function of exposure duration with j, number of targets T, and number of distractors D as parameters in partial report experiment of Shibuya and Bundesen (1988). Data are shown for Subject MP. Parameter j varies within panels; j is 1 (open circles), 2 (open squares), 3 (solid squares), 4 (solid circles), or 5 (triangle). T and D vary among panels. Smooth curves represent a theoretical fit to the data by the FIRM model. For clarity, observed frequencies less than .02 were omitted from the figure. Adapted from “Visual Selection From Multielement Displays: Measuring and Modeling Effects of Exposure Duration,” by H. Shibuya and C. Bundesen, 1988, Journal of Experimental Psychology: Human Perception and Performance, 14, p. 595.
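The quantities behind the FIRM fit in Fig. 2 can be sketched in a few lines. The code below uses the reported parameter values (C = 49 elements/s, a distractor-to-target weight ratio of 0.40, a longest ineffective exposure duration of 19 ms, and K = 3.7) and the rate expressions given in the text. The function names, the choice of a display with 2 targets and 6 distractors, and the single-element completion probability (which ignores the VSTM limit) are illustrative assumptions, not a reimplementation of the published fitting procedure.

```python
import math


def firm_rates(C, T, D, alpha):
    """Processing rates of one target and one distractor under FIRM:
    C / (T + alpha * D) and alpha * C / (T + alpha * D)."""
    denominator = T + alpha * D
    return C / denominator, alpha * C / denominator


def capacity_mixture(K):
    """Treat a noninteger K as a probability mixture of the two
    neighbouring integers, as described in the text."""
    lower = math.floor(K)
    p_upper = K - lower
    if p_upper == 0:
        return {lower: 1.0}
    return {lower: 1.0 - p_upper, lower + 1: p_upper}


def prob_finished(rate, exposure_s, t0_s):
    """Probability that one element's exponential encoding process has
    completed by the end of the exposure, ignoring the VSTM limit."""
    effective = max(0.0, exposure_s - t0_s)
    return 1.0 - math.exp(-rate * effective)


C, alpha, t0 = 49.0, 0.40, 0.019
v_target, v_distractor = firm_rates(C, T=2, D=6, alpha=alpha)
total_capacity = 2 * v_target + 6 * v_distractor   # rates add up to C
mixture = capacity_mixture(3.7)                    # weights on K = 3 and K = 4
p_target_50ms = prob_finished(v_target, 0.050, t0)
```

The check that the rates sum to C reflects the fixed-capacity assumption: adding distractors does not change the total capacity, it only redistributes it according to the weights.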
dynamics when two stimuli are presented in close temporal proximity. Specifically, TTVA accounts for the attentional dwell time phenomenon, where the report of a masked target (T2) is severely impaired when it is presented with a delay of less than 500 ms after a spatially separate masked target (T1) (see Duncan, Ward, & Shapiro, 1994; Ward, Duncan, & Shapiro, 1996). TTVA assumes that a stimulus encoded and retained in VSTM takes up visual processing resources, which could otherwise have been employed for encoding subsequent stimuli into VSTM. These resources are locked until the stimulus in VSTM has been recoded to a nonvisual (e.g., auditory, motoric, or amodal) format, which gives rise to the long dwell time. Fig. 3 shows the observed and predicted performance with T1 and T2 for 3 individual subjects tested by Petersen, Kyllingsbæk, and Bundesen (2012) and for a group of subjects reported in Duncan, Ward, and Shapiro (1994).

Petersen, Kyllingsbæk, and Bundesen (2013) used TTVA to explain the well-known finding that removal of the mask of T1 results in a faster recovery of T2 performance (Moore et al., 1996; Raymond, Shapiro, & Arnell, 1992). This effect has previously been linked to a shorter dwell time on unmasked targets, but Petersen, Kyllingsbæk, and Bundesen (2013) found evidence that both targets and masks lock attentional resources such that masks effectively function as distractors, suggesting that the fast recovery of T2 performance may simply be due to T2 not having to compete with other objects for processing resources after the presentation (and decay) of T1. For stimuli presented in synchrony, TTVA reduces to TVA, so TTVA explains all behavioral findings previously explained by TVA.

Fig. 3. Observed T1 ($p_{T1}$, squares) and T2 ($p_{T2}$, circles) performances as functions of stimulus onset asynchrony (SOA, in ms) for the 3 individual subjects tested by Petersen, Kyllingsbæk, and Bundesen (2012) and for the group average reported in Duncan, Ward, and Shapiro (1994). Only trials without eye movements were analyzed. Bars indicate the observed proportion of trials in which eye movements were registered, $p_{\mathrm{eye}}$. Error bars indicate the standard deviations of $p_{T1}$ and $p_{T2}$ assuming that the responses from the 3 individual subjects were approximately binomially distributed. Dashed lines (predicted T1 performance) and solid lines (predicted T2 performance) show least squares fits of TTVA to the four data sets. From “Measuring and Modeling Attentional Dwell Time” by A. Petersen, S. Kyllingsbæk, and C. Bundesen, 2012, Psychonomic Bulletin & Review, 19, p. 1030.

1.3.6. Relation to other cognitive domains

Gordon Logan has extended TVA to cognitive domains other than visual attention. First, Logan (1996) proposed a combination of TVA with the Contour Detector (CODE) theory of perceptual grouping (gestalt formation) based on nearness that was proposed by van Oeffelen and Vos (1982, 1983). The resulting theory, the CODE Theory of Visual Attention (CTVA), integrates space-based and object-based approaches to visual attention. Thus, the theory explains many effects of spatial distance in visual attention (cf. Logan, 1996; Logan & Bundesen, 1996). CTVA has been formalized by Bundesen (1998).

Second, Logan and Gordon (2001) extended CTVA to ECTVA: a theory of Executive Control of TVA in dual-task situations. This theory accounts for crosstalk, set-switching costs, and concurrence costs in dual-task settings. The theory is based on the assumption that superordinate, executive processes coordinate and control subordinate processes by manipulating their parameters. TVA is used to describe the subordinate processes, and a task set is defined as a set of TVA parameters (primarily $\eta$s, $\beta$s, and $\pi$s) that suffices to configure a subordinate process to perform a task. Set-switching costs are explained in terms of the time it takes to change parameters and the number of parameters that need to be changed.

Finally, by combining ECTVA with the exemplar-based random walk model of categorization proposed by Nosofsky and Palmeri (1997), Logan (2002) created an Instance Theory of Attention and Memory (ITAM). The model of Nosofsky and Palmeri (1997) is itself a combination of Nosofsky’s (1986) generalized context model of categorization and Logan’s (1988) instance theory of automaticity. In ITAM, as in Logan’s (1988) instance theory of automaticity, a category is represented as a set of instances (individual examples or members of the category). Thus, ITAM is an integrated theory of attention, memory, and categorization.

2. A neural interpretation of TVA (NTVA)

2.1. Filtering and pigeonholing at the single-cell level

The neural theory of visual attention (NTVA; Bundesen, Habekost, & Kyllingsbæk, 2005) provides a neural interpretation of the two central equations of TVA, the rate and weight equations. The equations jointly describe two mechanisms of attentional selection: one for selection of objects and one for selection of categories or, equivalently, features. NTVA specifies how the mechanisms work at the single-cell level: Filtering (selection of objects) works by changing the number of cortical neurons in which an object is represented, whereas pigeonholing (selection of features) works by changing the rate of firing in cortical neurons coding for particular features (see Fig. 4). As explained below, in NTVA, the total neural activation representing a visual categorization of the form “object x has feature i” is directly proportional to both (a) the number of neurons representing the categorization, which is controlled by filtering, and (b) the level of activation of the individual neurons representing the categorization, which is controlled by pigeonholing, and the rate equation simply expresses these direct proportionalities.

Filtering makes the number of cells in which an object is represented increase with the behavioral importance of the object. Thus, in NTVA, visual processing occurs in parallel, with differential allocation of resources so that important objects become represented in more cells than do less important objects. Specifically, in NTVA, the probability that a cortical neuron represents a particular object in its classical receptive field equals the attentional weight of the object divided by the sum of the attentional weights across all objects in the neuron’s receptive field (see Bundesen, Habekost, & Kyllingsbæk, 2005, for a hypothetical neural network that may be used for the selection of objects).

The weight equation of TVA describes how attentional weights are computed. The computation must occur before processing resources (cells) can be distributed in accordance with the weights. Therefore, in NTVA, a typical perceptual cycle consists of two waves: a wave of unselective processing, in which attentional weights are computed, followed by a wave of selective processing, once processing resources have been allocated in accordance with the weights. During the first wave, cortical processing resources may be distributed at random (unselectively) across the visual field. At the end of the first wave, an attentional weight has been computed for each object in the visual field and the weight has been stored in a priority map. The weights are used for reallocation of attention (visual processing capacity) by dynamic remapping of receptive fields of cortical neurons. The remapping of receptive fields makes the number of neurons allocated to an object increase with the attentional weight of the object. Thus, during the second wave, cortical processing is selective in the sense that the amount of processing resources allocated to an object (the number of neurons that represent the properties of the object) varies with the attentional weight of the object. Because more processing resources are devoted to behaviorally important objects than to less important ones, the important objects are processed faster, and are therefore more likely to become encoded into VSTM. Judging from data by Chelazzi et al. (1998, 2001), the second wave of processing may begin about 200 ms after the presentation of the stimulus array.

NTVA assumes that a typical neuron in the visual system is specialized to represent a single feature (the feature preferred by the neuron). The feature for which the neuron is specialized can be a more or less simple physical feature or a microfeature in a distributed representation (cf. Hinton, McClelland, & Rumelhart, 1986). NTVA also assumes that a visual neuron responds to the properties of only one object at any given time. In the first wave of processing, the object is selected at random among the objects in the neuron’s classical receptive field, but in the second wave of processing, the probability that the neuron represents a particular object equals the attentional weight of the object divided by the sum of the attentional weights across all objects in the receptive field.

In NTVA, the activation of a neuron by the appearance of an object in its receptive field is defined as the increase in firing rate above a baseline rate representing the spontaneous (undriven)
© American Psychological Association 2005
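The two waves of processing described in Section 2.1 differ only in how a neuron's represented object is chosen. The sketch below contrasts the two allocation rules for the objects in one receptive field: a uniform choice in the first wave and a choice proportional to attentional weight in the second. The function names, the weights, and the pool of 1000 neurons are hypothetical values picked for illustration; the sketch computes expected allocations only and does not model firing rates.

```python
def first_wave_probabilities(objects):
    """First wave: each object in the receptive field is equally likely
    to be the one the neuron represents."""
    n = len(objects)
    if n == 0:
        raise ValueError("receptive field contains no objects")
    return {obj: 1.0 / n for obj in objects}


def second_wave_probabilities(weights):
    """Second wave: probability equals the object's attentional weight
    divided by the summed weights of all objects in the receptive field."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("attentional weights must sum to a positive value")
    return {obj: w / total for obj, w in weights.items()}


# Hypothetical attentional weights for two objects in one receptive field.
weights = {"target": 1.0, "distractor": 0.4}
unselective = first_wave_probabilities(list(weights))
selective = second_wave_probabilities(weights)

# Expected number of neurons, out of a hypothetical pool of 1000,
# representing each object in the second wave.
expected_cells = {obj: 1000 * p for obj, p in selective.items()}
```

With these illustrative weights the target is expected to be represented in roughly 714 of the 1000 neurons in the second wave, compared with 500 in the unselective first wave.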
equation of TVA:

$$v(x, i) = \eta(x, i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z}.$$

When the proportion of feature-i coding neurons representing object x,

$$\frac{w_x}{\sum_{z \in S} w_z},$$

is smaller than 1, then the total activation representing the categorization “object x has feature i” is scaled down by multiplication with this factor on the right-hand side of the equation. The total activation representing the categorization also varies in direct proportion to the level of activation of the individual neurons representing the categorization. The bias parameter $\beta_i$ is a scale factor that multiplies activations of all feature-i coding neurons, so the total activation representing the categorization “object x has feature i” is also directly proportional to $\beta_i$.

In summary, the neural interpretation of TVA’s rate equation essentially reduces to the statement that the total activation representing the categorization “object x has feature i” is directly proportional to both the number of neurons representing the categorization (which is controlled by filtering) and the level of activation of the individual neurons representing the categorization (which is controlled by pigeonholing).

2.2. Related models of attention

The rate equation of TVA summarizes the way the strength $\eta(x, i)$ of the sensory evidence supporting the categorization “object x has feature i” interacts with two types of biases: the relative attentional weight of object x,

$$\frac{w_x}{\sum_{z \in S} w_z},$$

and the perceptual bias associated with feature i, $\beta_i$, in determining the total competitive strength $v(x, i)$ of the given categorization. As explained in the introduction, the rate equation makes TVA a theory of biased competition in the general sense of Desimone and Duncan (1995). Of course, TVA is special in assuming two different types of biases (one associated with objects and one with features) and also in being formalized.

In addition to being a theory of biased competition, NTVA is a feature similarity gain model in the sense of Treue and Martinez-Trujillo (1999) and Martinez-Trujillo and Treue (2004). A gain model of attention is a model in which attention works by multiplicative scaling of neuronal responses by a certain gain factor (see McAdams & Maunsell, 1999). Treue and Martinez-Trujillo proposed a fundamental principle, the feature similarity gain principle, which says that the gain factor of a neuron increases with increasing similarity between the sensory selectivity (the stimulus preferences) of the neuron and the currently attended features. Thus, assuming that a neuron encodes those features it prefers, attention to feature i should increase the responses of neurons encoding feature i or encoding features similar to feature i. In terms of TVA, the gain factor in question is the multiplicative perceptual bias $\beta_i$ applied to neurons that are specialized for signalling feature i.

NTVA also implies the saliency or priority map hypothesis proposed by Itti and Koch (2000), Koch and Ullman (1985), Bisley and Goldberg (2010), and others. By this hypothesis, the visual system contains one or more maps of attentional weights or attentional weight components. Specifically, according to a recent version of the hypothesis, the lateral intraparietal area (LIP) acts as a priority map in which objects are represented by activity proportional to their behavioral priority. Bisley and Goldberg (2010) point at evidence that the priority map combines bottom-up inputs from both the traditional dorsal and ventral streams of visual processing, including areas V2, V3, V3a, MT, MST, V4, and IT, with top-down inputs from a wide range of cortical and subcortical areas, including the frontal eye fields, anterior cingulate cortex, the claustrum, and many thalamic nuclei. As also assumed in NTVA, Bisley and Goldberg assume that the priority (attentional weight) of a stimulus reflects, in part, an evaluation of the subjective importance of the stimulus. The priorities in the map are thought to be used by the oculomotor system to target saccades and by the visual system to guide visual attention. Bisley and Goldberg (2010) seem to assume that visual attention to objects is a serial process, in which objects are processed one by one, but in NTVA this is only true in special cases (see Bundesen, 1990, pp. 536–537).

Finally, TVA is closely related to the normalization models of Reynolds and Heeger (2009; see also Moran & Desimone, 1985; Reynolds, Chelazzi, & Desimone, 1999) and Lee and Maunsell (2009). The model by Reynolds and Heeger was originally suggested as a way of implementing biased competition, which was inspired by analyses of responses of cortical visual neurons to simultaneously presented pairs of stimuli. In the normalization model, the excitatory neural activation caused by the stimuli (the stimulus drive) is represented in a map that shows the activation as a function of both the locations of the receptive-field centers of neurons and their feature preferences. The map of the stimulus drive first becomes multiplied point-by-point with an attention field and then becomes normalized by being divided point-by-point by the suppressive drive to yield the output neural firing rates. The suppressive drive is computed by convolving the point-by-point product of the stimulus drive and the attention field with a Gaussian kernel. Thus, the division by the suppressive drive has the effect that the attention-weighted stimulus drive from a preferred stimulus is normalized with respect to the activity in other neurons that respond to the surrounding spatial and featural context. This model may be able to explain the way in which the shape of contrast-response functions for attended and non-attended stimuli changes depending on both the size of the stimulus and the size of the attentional field. The effect of the normalization is akin to the effect of using relative (i.e., normalized) instead of absolute attentional weights in the rate equation of TVA. On the other hand, for a pair of adequate stimuli both of which are presented within the classical receptive field of a recorded cell, the normalization model implies that the response (firing rate) of the cell is a weighted average of the responses obtained when either stimulus is presented alone (see Reynolds, Chelazzi, & Desimone, 1999), whereas NTVA implies that the response to the stimulus pair is a probability mixture of the responses obtained when the stimuli are presented alone (see Bundesen, Habekost, & Kyllingsbæk, 2005): With probability $p$, the cell responds as if only Stimulus 1 had been presented, and with probability $1 - p$, the cell responds as if only Stimulus 2 had been presented.

2.3. Functional anatomy

NTVA is a fairly abstract neurocomputational model. As emphasized by Bundesen, Habekost, and Kyllingsbæk (2005), the model does not depend in a critical way on particular anatomical localizations of the proposed computations. On the other hand, the model suggests some ways in which visual computations may be distributed across the human brain. One plausible distribution is given by the so-called thalamic version of NTVA, which is illustrated in Fig. 5. For further discussions of the anatomy of visual attention, see Bundesen, Habekost, and Kyllingsbæk (2005, 2011), Habekost and Starrfelt (2009), and Gillebert et al. (2012).

2.4. Applications to single-cell studies

Quantitative applications of NTVA to attentional effects observed in single-cell studies can be found in Bundesen,
Martinez-Trujillo, J. C., & Treue, S. (2004). Feature-based attention increases the selectivity of population responses in primate visual cortex. Current Biology, 14, 744–751.
Matthias, E., Bublak, P., Costa, A., Müller, H. J., Schneider, W. X., & Finke, K. (2009). Attentional and sensory effects of lowered levels of intrinsic alertness. Neuropsychologia, 47, 3255–3264.
Matthias, E., Bublak, P., Müller, H. J., Schneider, W. X., Krummenacher, J., & Finke, K. (2010). The influence of alertness on spatial and nonspatial components of visual attention. Journal of Experimental Psychology: Human Perception and Performance, 36, 38–56.
McAdams, C. J., & Maunsell, J. H. R. (1999). Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. Journal of Neuroscience, 19, 431–441.
Moore, C. M., Egeth, H., Berglan, L. R., & Luck, S. J. (1996). Are attentional dwell times inconsistent with serial visual search? Psychonomic Bulletin & Review, 3, 360–365.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
Niemi, P., & Näätänen, R. (1981). Foreperiod and simple reaction time. Psychological Bulletin, 89, 133–162.
Nobre, A. C. (2001). Orienting attention to instants in time. Neuropsychologia, 39, 1317–1328.
Nobre, A. C. (2010). How can temporal expectations bias perception and action? In A. C. Nobre & J. T. Coull (Eds.), Attention and time (1st ed., pp. 371–392). Oxford University Press.
Nobre, A. C., & Rohenkohl, G. (2014). Time for the fourth dimension in attention. In
Reynolds, J. H., Chelazzi, L., & Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. Journal of Neuroscience, 19, 1736–1753.
Reynolds, J. H., & Heeger, D. J. (2009). The normalization model of attention. Neuron, 61, 168–185.
Schneider, W., & Fisk, A. D. (1982). Degree of consistent training: Improvements in search performance and automatic process development. Attention, Perception, & Psychophysics, 31, 160–168.
Shibuya, H., & Bundesen, C. (1988). Visual selection from multielement displays: Measuring and modeling effects of exposure duration. Journal of Experimental Psychology: Human Perception and Performance, 14, 591–600.
Sørensen, T. A., Vangkilde, S., & Bundesen, C. (2014). Components of attention modulated by temporal expectation. Journal of Experimental Psychology: Learning, Memory, and Cognition. http://dx.doi.org/10.1037/a0037268.
Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs, 74(11, Whole No. 498).
Sperling, G. (1967). Successive approximations to a model for short-term memory. Acta Psychologica, 27, 285–292.
Townsend, J. T., & Ashby, F. G. (1982). Experimental test of contemporary mathematical models of visual letter recognition. Journal of Experimental Psychology: Human Perception and Performance, 8, 834–864.
Townsend, J., & Ashby, F. G. (1983). The stochastic modeling of elementary psychological processes. Cambridge, UK: Cambridge University Press.
Townsend, J. T., & Landon, D. E. (1982). An experimental and theoretical investigation of the constant-ratio rule and other models of visual letter confusion. Journal of Mathematical Psychology, 25, 119–162.
A. C. Nobre & S. Kastner (Eds.), The Oxford handbook of attention (pp. 676–723). Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention.
Oxford University Press. Cognitive Psychology, 12, 97–136.
Nordfang, M., Dyrholm, M., & Bundesen, C. (2013). Identifying bottom-up and top- Treisman, A. M., & Gormican, S. (1988). Feature analysis in early vision: Evidence
down components of attentional weight by experimental analysis and from search asymmetries. Psychological Review, 95, 15–48.
computational modeling. Journal of Experimental Psychology: General, 142, Treue, S., & Martinez-Trujillo, J. C. M. (1999). Feature-based attention influences
510–535. motion processing gain in macaque visual cortex. Nature, 399, 575–579.
Nosofsky, R. M. (1986). Attention, similarity, and the identification– van Oeffelen, M. P., & Vos, P. G. (1982). Configurational effects on the enumeration
categorization relationship. Journal of Experimental Psychology: General, 115, of dots: Counting by groups. Memory & Cognition, 10, 396–404.
39–57. van Oeffelen, M. P., & Vos, P. G. (1983). An algorithm for pattern description on the
Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of level of relative proximity. Pattern Recognition, 16, 341–348.
speeded classification. Psychological Review, 104, 266–300. Vangkilde, S., Coull, J. T., & Bundesen, C. (2012). Great expectations: Temporal
Petersen, A., Kyllingsbæk, S., & Bundesen, C. (2012). Measuring and expectation modulates perceptual processing speed. Journal of Experimental
modeling attentional dwell time. Psychonomic Bulletin & Review, 19, Psychology: Human Perception and Performance, 38, 1183–1191.
1029–1046. Vangkilde, S., Petersen, A., & Bundesen, C. (2013). Temporal expectancy in the
Petersen, A., Kyllingsbæk, S., & Bundesen, C. (2013). Attention dwells on both targets context of a theory of visual attention. Philosophical Transactions of the Royal
and masks. Journal of Vision, 13, 1–12. Society of London, Series B, 368, 20130054. http://dx.doi.org/10.1098/
Posner, M. I., Nissen, M. J., & Ogden, W. C. (1978). Attended and unattended rstb.2013.0054.
processing modes: The role of set for spatial location. In H. L. Pick & I. J. Ward, R., Duncan, J., & Shapiro, K. (1996). The slow time-course of visual attention.
Saltzman (Eds.), Modes of perceiving and processing information (pp. 137–157). Cognitive Psychology, 30, 79–109.
Hillsdale, NJ: Erlbaum. Woodrow, H. (1914). The measurement of attention. Psychological Monographs,
Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual 17(5, Whole No. 76), 1–158.
processing in an RSVP task: An attentional blink? Journal of Experimental
Psychology: Human Perception and Performance, 18, 849–860.