IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 28, NO. 4, APRIL 2006, p. 657

A Texture-Based Method for Modeling the Background and Detecting Moving Objects

Marko Heikkilä and Matti Pietikäinen, Senior Member, IEEE

Abstract—This paper presents a novel and efficient texture-based method for modeling the background and detecting moving objects from a video sequence. Each pixel is modeled as a group of adaptive local binary pattern histograms that are calculated over a circular region around the pixel. The approach provides us with many advantages compared to the state-of-the-art. Experimental results clearly justify our model.

Index Terms—Motion, texture, background subtraction, local binary pattern.

1 INTRODUCTION

Background subtraction is often one of the first tasks in machine vision applications, making it a critical part of the system. The output of background subtraction is an input to a higher-level process that can be, for example, the tracking of an identified object. The performance of background subtraction depends mainly on the background modeling technique used. Natural scenes especially put many challenging demands on background modeling since they are usually dynamic in nature, including illumination changes, swaying vegetation, rippling water, flickering monitors, etc. A robust background modeling algorithm should also handle situations where new objects are introduced to, or old ones removed from, the background. Furthermore, the shadows of the moving and scene objects can cause problems. Even in a static scene, frame-to-frame changes can occur due to noise and camera jitter. Moreover, the background modeling algorithm should operate in real time.

A large number of different methods for detecting moving objects have been proposed, and many different features are utilized for modeling the background. Most of the methods use only the pixel color or intensity information to make the decision. To the authors' knowledge, none of the earlier studies have utilized discriminative texture features in dealing with the problem. Only some simple statistics of neighborhoods may have been considered. This is maybe due to the high computational complexity and limited performance of texture methods.

In this paper, we propose an approach that uses discriminative texture features to capture background statistics. An early version of the method, based on block-wise processing, was presented in [1]. For our method, we chose the local binary pattern (LBP) texture operator [2], [3], which has recently shown excellent performance in many applications and has several properties that favor its usage in background modeling. Perhaps the most important properties of the LBP operator are its tolerance against illumination changes and its computational simplicity. In order to make the LBP even more suitable for real-world scenes, we propose a modification to the operator and use this modified LBP throughout this paper. Unlike in most other approaches, the features in background modeling are computed over a larger area than a single pixel. This approach provides us with many advantages and improvements compared to the state-of-the-art. Our method tries to address all the issues mentioned earlier except the handling of shadows, which turned out to be an extremely difficult problem to solve with background modeling. In [4], a comprehensive survey of moving shadow detection approaches is presented.

2 RELATED WORK

A very popular technique is to model each pixel in a video frame with a Gaussian distribution. This is the underlying model for many background subtraction algorithms. A simple technique is to calculate an average image of the scene, to subtract each new video frame from it, and to threshold the result. The adaptive version of this algorithm updates the model parameters recursively by using a simple adaptive filter. This single Gaussian model was used in [5].

The previous model does not work well in the case of dynamic natural environments, since they include repetitive motions like swaying vegetation, rippling water, flickering monitors, camera jitter, etc. This means that the scene background is not completely static. By using more than one Gaussian distribution per pixel, it is possible to handle such backgrounds. In [6], the mixture of Gaussians approach was used in a traffic monitoring application. The model for pixel intensity consisted of three Gaussian distributions corresponding to the road, vehicle, and shadow distributions.

One of the most commonly used approaches for updating the Gaussian mixture model was presented in [7]. Instead of using the exact EM algorithm, an online K-means approximation was used. Many authors have proposed improvements and extensions to this algorithm. In [8], new update algorithms for learning mixture models were presented. They also proposed a method for detecting moving shadows using an existing mixture model. In [9], not only the parameters but also the number of components of the mixture is constantly adapted for each pixel. In [10], the mixture of Gaussians model was combined with concepts defined by region-level and frame-level considerations.

The Gaussian assumption for the pixel intensity distribution does not always hold. To deal with the limitations of parametric methods, a nonparametric approach to background modeling was proposed in [11]. The proposed method utilizes a general nonparametric kernel density estimation technique for building a statistical representation of the scene background. The probability density function for pixel intensity is estimated directly from the data without any assumptions about the underlying distributions. In [12], a quantization/clustering technique to construct a nonparametric background model was presented. The background is encoded on a pixel-by-pixel basis, and samples at each pixel are clustered into a set of codewords.

Some presented background models consider the time aspect of a video sequence: the decision depends also on the previous pixel values from the sequence. In [13], [14], an autoregressive process was used to model the pixel value distribution over time. In [15], a Hidden Markov Model (HMM) approach was adopted.

Some authors have modeled the background using edge features. In [16], the background model was constructed from the first video frame of the sequence by dividing it into equally sized blocks and calculating an edge histogram for each block. The histograms were constructed using pixel-specific edge directions as bin indices and incrementing the bins with the corresponding edge magnitudes. In [17], a fusion of edge and intensity information was used.

The authors are with the Machine Vision Group, Infotech Oulu, Department of Electrical and Information Engineering, University of Oulu, PO Box 4500, 90014, Finland. E-mail: {markot, mkp}@ee.oulu.fi.
Manuscript received 19 Jan. 2005; revised 22 June 2005; accepted 22 Aug. 2005; published online 14 Feb. 2006. Recommended for acceptance by P. Fua.
For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number TPAMI-0041-0105.
0162-8828/06/$20.00 © 2006 IEEE. Published by the IEEE Computer Society.
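The simple adaptive (running-average) model described at the beginning of Section 2 can be sketched as follows. This is an illustrative sketch, not code from [5]; the function names and the parameters `alpha` and `threshold` are assumptions:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Recursively update the average image of the scene with a
    simple adaptive (exponential) filter."""
    return alpha * frame + (1.0 - alpha) * bg

def detect_foreground(bg, frame, threshold=30.0):
    """Subtract the new frame from the background model and
    threshold the absolute difference; True marks foreground."""
    return np.abs(frame - bg) > threshold

# Toy example: a flat scene with a single changed pixel.
bg = np.full((4, 4), 100.0)           # current average image
frame = bg.copy()
frame[2, 2] = 200.0                   # the "moving object"
mask = detect_foreground(bg, frame)   # True only at (2, 2)
bg = update_background(bg, frame)     # model drifts toward the frame
```

A single-Gaussian model per pixel additionally maintains a variance estimate and thresholds the difference in units of standard deviations rather than gray levels.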

Motion-based approaches have also been proposed for background subtraction. The algorithm presented in [18] detects salient motion by integrating frame-to-frame optical flow over time. Salient motion is assumed to be motion that tends to move in a consistent direction over time. The saliency measure used is directly related to the distance over which a point has traveled with a consistent direction.

Region-based algorithms usually divide an image into blocks and calculate block-specific features. Change detection is achieved via block matching. In [19], the block correlation is measured using the NVD (Normalized Vector Distance) measure. In [16], an edge histogram calculated over the block area is used as a feature vector describing the block.

3 TEXTURE DESCRIPTION WITH LOCAL BINARY PATTERNS

LBP is a gray-scale invariant texture primitive statistic and a powerful means of texture description. The operator labels the pixels of an image region by thresholding the neighborhood of each pixel with the center value and considering the result as a binary number (binary pattern). The basic version of the LBP operator considers only the eight neighbors of a pixel [2], but the definition can be easily extended to include all circular neighborhoods with any number of pixels [3]:

LBP_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\,2^p, \quad s(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0, \end{cases} \qquad (1)

where g_c corresponds to the gray value of the center pixel (x_c, y_c) of a local neighborhood and g_p to the gray values of P equally spaced pixels on a circle of radius R. By extending the neighborhood, one can collect larger-scale texture primitives. The values of neighbors that do not fall exactly on pixels are estimated by bilinear interpolation. In practice, (1) means that the signs of the differences in a neighborhood are interpreted as a P-bit binary number, resulting in 2^P distinct values for the binary pattern. The 2^P-bin histogram of the binary patterns computed over a region is used for texture description. See Fig. 1 for an illustration of the LBP operator.

Fig. 1. Calculating the binary pattern.

LBP has several properties that favor its usage in background modeling. Due to the invariance of the LBP features with respect to monotonic gray-scale changes, our method can tolerate the considerable gray-scale variations common in natural images, and no normalization of input images is needed. Unlike many other approaches, the proposed features are very fast to compute, which is an important property from the practical implementation point of view. LBP is a nonparametric method, which means that no assumptions about the underlying distributions are needed. Furthermore, the operator does not require many parameters to be set and has high discriminative power.

A limitation of the LBP is that it does not work very robustly on flat image areas such as sky, where the gray values of the neighboring pixels are very close to the value of the center pixel. This is due to the thresholding scheme of the operator. Think of a case where g_p and g_c have the values 29 and 30, respectively. From (1), we see that, in this case, s(x) outputs a value of 0. If the values were 30 and 30, s(x) would return 1. In order to make the LBP more robust against these negligible changes in pixel values, we propose to modify the thresholding scheme of the operator by replacing the term s(g_p - g_c) in (1) with the term s(g_p - g_c + a). The bigger the value of |a| is, the bigger the changes in pixel values that are allowed without affecting the thresholding result. In order to retain the discriminative power of the LBP operator, a relatively small value should be used. In our experiments, a was given a value of 3. The results presented in [1] show that good results can also be achieved by using the original LBP. With the modified version, our background subtraction method consistently behaves more robustly and thus should be preferred over the original one.

4 A TEXTURE-BASED APPROACH

In this section, we introduce our approach to background subtraction. The algorithm can be divided into two phases, background modeling and foreground detection, described in Sections 4.1 and 4.2. In Section 4.3, some guidelines for how to select the parameter values are given.

4.1 Background Modeling

Background modeling is the most important part of any background subtraction algorithm. The goal is to construct and maintain a statistical representation of the scene that the camera sees. As in most earlier studies, the camera is assumed to be nonmoving. We model each pixel of the background identically, which allows for a high-speed parallel implementation if needed. We chose to utilize texture information when modeling the background, and the LBP was selected as the measure of texture because of its good properties. In the following, we explain the background model update procedure for one pixel; the procedure is identical for each pixel.

We consider the feature vectors of a particular pixel over time as a pixel process. The LBP histogram computed over a circular region of radius R_region around the pixel is used as the feature vector. The radius R_region is a user-settable parameter. The background model for the pixel consists of a group of adaptive LBP histograms, {m_0, ..., m_{K-1}}, where K is selected by the user. Each model histogram has a weight between 0 and 1 so that the weights of the K model histograms sum up to one. The weight of the kth model histogram is denoted by ω_k.

Let us denote the LBP histogram of the given pixel computed from the new video frame by h. At the first stage of processing, h is compared to the current K model histograms using a proximity measure. We chose to use the histogram intersection as the measure in our experiments:

\cap(a, b) = \sum_{n=0}^{N-1} \min(a_n, b_n), \qquad (2)

where a and b are the histograms and N is the number of histogram bins. This measure has an intuitive motivation in that it calculates the common part of two histograms. Its advantage is that it explicitly neglects features which only occur in one of the histograms. The complexity is very low, as it requires only very simple operations, and it is linear in the number of histogram bins: O(N). Notice that it is also possible to use other measures, such as the chi-square distance. The threshold for the proximity measure, T_P, is a user-settable parameter.
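To make the operator and feature concrete, the modified LBP of (1) (with the offset a), the region histogram used as the feature vector, and the histogram intersection of (2) can be sketched as follows. This is a minimal reference sketch, not the authors' implementation: the function names are assumptions, the circular region is scanned via a bounding square, and image-border handling is omitted for brevity.

```python
import numpy as np

def modified_lbp(img, xc, yc, P=8, R=1.0, a=3.0):
    """Modified LBP code of (1): each neighbor is thresholded with
    s(g_p - g_c + a) instead of s(g_p - g_c)."""
    gc = img[yc, xc]
    code = 0
    for p in range(P):
        # P equally spaced sampling points on a circle of radius R
        x = xc + R * np.cos(2.0 * np.pi * p / P)
        y = yc - R * np.sin(2.0 * np.pi * p / P)
        # bilinear interpolation for points that fall between pixels
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        fx, fy = x - x0, y - y0
        gp = ((1 - fx) * (1 - fy) * img[y0, x0]
              + fx * (1 - fy) * img[y0, x0 + 1]
              + (1 - fx) * fy * img[y0 + 1, x0]
              + fx * fy * img[y0 + 1, x0 + 1])
        if gp - gc + a >= 0:          # s(g_p - g_c + a)
            code |= 1 << p
    return code

def lbp_histogram(img, xc, yc, r_region, P=8):
    """Normalized 2^P-bin histogram of modified-LBP codes over the
    circular region of radius r_region around pixel (xc, yc)."""
    hist = np.zeros(2 ** P)
    for y in range(yc - r_region, yc + r_region + 1):
        for x in range(xc - r_region, xc + r_region + 1):
            if (x - xc) ** 2 + (y - yc) ** 2 <= r_region ** 2:
                hist[modified_lbp(img, x, y, P=P)] += 1
    return hist / hist.sum()

def intersection(a, b):
    """Histogram intersection of (2): sum of bin-wise minima."""
    return float(np.minimum(a, b).sum())
```

On a perfectly flat region, every neighbor satisfies g_p - g_c + a >= 0, so all P bits are set and the histogram mass collects into the all-ones code, which is exactly the stability on flat areas that the offset a provides.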

If the proximity is below the threshold T_P for all model histograms, the model histogram with the lowest weight is replaced with h. The new histogram is given a low initial weight; in our experiments, a value of 0.01 was used. No further processing is required in this case.

More processing is required if matches were found. We select the best match as the model histogram with the highest proximity value. The best-matching model histogram is adapted with the new data by updating its bins as follows:

m_k = \alpha_b h + (1 - \alpha_b)\,m_k, \quad \alpha_b \in [0, 1], \qquad (3)

where α_b is a user-settable learning rate. Furthermore, the weights of the model histograms are updated:

\omega_k = \alpha_w M_k + (1 - \alpha_w)\,\omega_k, \quad \alpha_w \in [0, 1], \qquad (4)

where α_w is another user-settable learning rate and M_k is 1 for the best-matching histogram and 0 for the others. The adaptation speed of the background model is controlled by the learning rate parameters α_b and α_w: the bigger the learning rate, the faster the adaptation.

Not all of the model histograms are necessarily produced by background processes. We can use the persistence of a histogram in the model to decide whether it models the background or not. It can be seen from (4) that the persistence is directly related to the histogram's weight: the bigger the weight, the higher the probability of being a background histogram. As a last stage of the updating procedure, we sort the model histograms in decreasing order according to their weights and select the first B histograms as the background histograms:

\omega_0 + \ldots + \omega_{B-1} > T_B, \quad T_B \in [0, 1], \qquad (5)

where T_B is a user-settable threshold.

4.2 Foreground Detection

Foreground detection is done before updating the background model. The histogram h is compared against the current B background histograms using the same proximity measure as in the update algorithm. If the proximity is higher than the threshold T_P for at least one background histogram, the pixel is classified as background. Otherwise, the pixel is marked as foreground.

4.3 Selection of Parameter Values

Because of the huge number of different combinations, finding a good set of parameter values must be done more or less empirically. In this section, some guidelines for how to select values for the parameters of the method are given.

The radius R_region defines the region for histogram calculation. A small value makes the information encoded in the histogram more local. If the shape information of moving objects is of interest, a smaller region size should be used. Choosing an LBP operator with a large value of P makes the histogram long and, thus, computing the proximity becomes slow. Using a small number of neighbors makes the histogram shorter but also means losing more information. Furthermore, the memory consumption of the method depends on the choice of P. Since correlation between pixels decreases with distance, much of the textural information can be obtained from local neighborhoods. Thus, the radius R of the LBP operator is usually kept small. The proximity threshold T_P for histogram comparison is easy to find by experimenting with different values. Values between 0.6 and 0.7 gave good results for all of our test sequences.

A proper value for the parameter K depends on the scene being modeled. In the case of a unimodal scene, a small value for K is sufficient. For multimodal scenes, more histograms are needed to learn the background. According to the experiments, in most cases a good value for K lies between 2 and 5. It should be noticed that the bigger the value of K, the slower the processing and the greater the memory requirements. The parameter T_B is closely related to K. It is used to select the background histograms as described in Section 4.1. A small value is sufficient for unimodal scenes, whereas a larger value is required in the multimodal case.

The proposed method adapts itself to changes of the scene by continuously updating the background model. The adaptation speed is controlled by the two learning rate parameters α_b and α_w, shown in (3) and (4), respectively. The bigger the parameter values, the more weight is given to recent observations and the faster the adaptation. According to the experiments, in most cases the best results are achieved with small values for these parameters.

Fig. 2. Some detection results of our method. The first row contains the original video frames. The second row contains the corresponding processed frames. The image resolution is 320 x 240 pixels.

5 EXPERIMENTS

To the authors' knowledge, there is still a lack of globally accepted test sets for the extensive evaluation of background subtraction algorithms. Due to this fact, most authors have used their own test sequences, which makes the comparison of the different approaches rather difficult. In [13], an attempt to address this problem was made. In most cases, the comparison of different approaches is done based on visual interpretation, i.e., by looking at the processed images provided by the algorithms. Numerical evaluation is usually done in terms of the number of false negatives (the number of foreground pixels that were missed) and false positives (the number of background pixels that were marked as foreground). The ground truth is obtained by manually labeling some frames from the video sequence.

In [1], we used both visual and numerical methods to evaluate our approach. We compared our method to four other methods presented in the literature. The test sequences included both indoor and outdoor scenes. According to the test results, the overall performance of our method was better than the performance of the comparison methods for the test sequences used. The method used block-wise processing. In certain applications, when accurate shape information is needed, this kind of approach cannot

be used. Making the decision pixel-wise usually offers more accurate results for shape extraction. In Section 4, we extended our algorithm for pixel-wise processing, and all the tests presented in this paper are carried out using this method.

Fig. 2 shows some results for our method using some of the test sequences from [1]. The first two frames on the upper left are from an indoor sequence where a person is walking in a laboratory room. Background subtraction methods that rely only on color information will most probably fail to detect the moving object correctly because of the similar color of the foreground and the background. The next two frames are from an indoor sequence where a person walks toward the camera. Many adaptive pixel-based methods output a huge number of false negatives on the inner areas of the moving object because the pixel values stay almost the same over time. The proposed method gives good results because it exploits information gathered over a larger area than a single pixel. The first two frames on the lower left are from an outdoor sequence which contains relatively small moving objects. The original sequence has been taken from the PETS database (ftp://pets.rdg.ac.uk). The proposed method successfully handles this situation, and all the moving objects are detected correctly. The last two frames are from an outdoor sequence that contains heavily swaying trees and rippling water. This is a very difficult scene from the background modeling point of view. Since the method was designed to handle also multimodal backgrounds, it manages the situation relatively well. The values for the method parameters are given in Table 1. For the first three sequences, the values were kept untouched. For the last sequence, the values of the parameters K and T_B were changed to adjust the method for the increased multimodality of the background.

TABLE 1
The Parameter Values of the Method for the Results in Figs. 2 and 3

In [13], a test set for evaluating background subtraction methods was presented. It consists of seven video sequences, each addressing a specific canonical background subtraction problem. In the same paper, 10 different methods were compared using the test set. We tested our method against this test set and achieved the results shown in Fig. 3. When compared to the results in [13], the overall performance of our method seems to be better than that of the other methods. We did not change the parameter values of our method between the test sequences, although better results could be obtained by customizing the values for each sequence. See Table 1 for the parameter values used. Like most of the other methods, our method was not capable of handling the Light Switch problem. This is because we do not utilize any higher-level processing that could be used to detect sudden changes in the background.

Fig. 3. Detection results of our method for the test sequences presented in [13]. The image resolution is 160 x 120 pixels.

We also compared the performance of our method to the widely used method of Stauffer and Grimson [7] by using the five test sequences presented in [1]. The sequences include both indoor and outdoor scenes, and five frames from each sequence are labeled as the ground truth. The results are shown in Fig. 4. The numbers of error classifications were obtained by summing the errors from the five processed frames corresponding to the ground truth frames. For all five sequences, our method gave fewer false negatives than the comparison method. In the case of false positives, our method was better in two cases. For the remaining three sequences, the difference is very small. It should be noticed that, for the proposed method, most of the false positives occur on the contour areas of the moving objects (see Fig. 2). This is because the features are extracted from the pixel neighborhood. According to the overall results, the proposed method outperforms the comparison method for the used test sequences.

Since our method has relatively many parameters, a question naturally arises: How easy or difficult is it to obtain a good set of parameter values? To see how sensitive the proposed method is to small changes of its parameter values, we calculated the error classifications for different parameter settings. Because of the huge number of different combinations, only one parameter was varied at a time. The measurements were made for several video sequences, including indoor and outdoor scenes. The results for the first sequence of Fig. 2 are plotted in Fig. 5. It can be clearly seen that, for all parameters, a good value can be chosen across a wide range of values. The same observation was made for all the measured sequences. This property significantly eases the selection of parameter values. Furthermore, the experiments have shown that

Fig. 4. Comparison results of the method presented in this paper (TBMOD) and the method of Stauffer and Grimson [7] (GMM). (a) The test sequences. (b) The test results. FN and FP stand for false negatives and false positives, respectively.

Fig. 5. Number of false negatives (FN) and false positives (FP) for different parameter values for the first sequence of Fig. 2. While one parameter was varied, the other parameters were kept fixed at the values given in Table 1. The results are normalized between zero and one: x = (x - \min(x)) / \max(x - \min(x)).
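Where each of the user-settable parameters examined in Fig. 5 (K, T_P, T_B, α_b, α_w) enters the algorithm can be made concrete with a compact sketch of the per-pixel update and detection rules of (2)-(5). This is an illustrative sketch under the paper's definitions, not the authors' implementation; the class name, method names, and default values are assumptions:

```python
import numpy as np

def intersection(a, b):
    # histogram intersection proximity measure of (2)
    return float(np.minimum(a, b).sum())

class PixelModel:
    """Background model of one pixel: K adaptive LBP histograms
    with weights summing to one (Sections 4.1-4.2)."""

    def __init__(self, K=3, n_bins=256, T_P=0.65, T_B=0.8,
                 alpha_b=0.01, alpha_w=0.01):
        self.hists = np.full((K, n_bins), 1.0 / n_bins)
        self.weights = np.full(K, 1.0 / K)
        self.T_P, self.T_B = T_P, T_B
        self.alpha_b, self.alpha_w = alpha_b, alpha_w

    def is_background(self, h):
        """Compare h against the highest-weight histograms whose
        weights sum past T_B, i.e., the B histograms of (5)."""
        cum = 0.0
        for k in np.argsort(self.weights)[::-1]:
            if intersection(h, self.hists[k]) > self.T_P:
                return True
            cum += self.weights[k]
            if cum > self.T_B:   # only the first B histograms are used
                break
        return False

    def update(self, h):
        prox = np.array([intersection(h, m) for m in self.hists])
        if prox.max() <= self.T_P:
            # no match: replace the lowest-weight histogram with h
            k = int(np.argmin(self.weights))
            self.hists[k] = h
            self.weights[k] = 0.01            # low initial weight
        else:
            k = int(np.argmax(prox))
            # (3): adapt the best-matching histogram toward the new data
            self.hists[k] = self.alpha_b * h + (1 - self.alpha_b) * self.hists[k]
            # (4): raise the best match's weight (M_k = 1), decay the others
            M = np.zeros_like(self.weights)
            M[k] = 1.0
            self.weights = self.alpha_w * M + (1 - self.alpha_w) * self.weights
        self.weights /= self.weights.sum()    # keep weights summing to one
```

A full subtractor would hold one such model per pixel and, for each frame, call `is_background(h)` before `update(h)`, since detection is done before the model update (Section 4.2).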

a good set of parameters for a sequence usually performs well also for other sequences (see Table 1).

We also measured the speed of the proposed method. For the parameter values used in the tests, a frame rate of 15 fps was achieved. We used a standard PC with a 1.8 GHz processor and 512 MB of memory in our experiments. The image resolution was 160 x 120 pixels. This makes the method well suited to systems that require real-time processing.

6 CONCLUSIONS

A novel approach to background subtraction was presented, in which the background is modeled using texture features. The features are extracted by using the modified local binary pattern (LBP) operator. Our approach provides us with several advantages compared to other methods. Due to the invariance of the LBP features with respect to monotonic gray-scale changes, our method can tolerate the considerable illumination variations common in natural scenes. Unlike many other approaches, the proposed features are very fast to compute, which is an important property from the practical implementation point of view. The proposed method belongs to the nonparametric methods, which means that no assumptions about the underlying distributions are needed.

Our method has been evaluated against several video sequences including both indoor and outdoor scenes. It has proven to be tolerant to illumination variations, the multimodality of the background, and the introduction/removal of background objects. Furthermore, the method is capable of real-time processing. Comparisons to other approaches presented in the literature have shown that our approach is very powerful when compared to the state-of-the-art.

Currently, the method requires a nonmoving camera, which restricts its usage in certain applications. We plan to extend the method to support also moving cameras; the preliminary results with a pan-tilt-zoom camera are promising. The proposed method also has relatively many parameters. This could be a weakness, but, at the same time, it allows the user extensive control over the method's behavior. A proper set of parameters can be easily found for a given application scenario.

ACKNOWLEDGMENTS

This work was supported by the Academy of Finland. The authors also want to thank Professor Janne Heikkilä for his contribution.

REFERENCES

[1] M. Heikkilä, M. Pietikäinen, and J. Heikkilä, "A Texture-Based Method for Detecting Moving Objects," Proc. British Machine Vision Conf., vol. 1, pp. 187-196, 2004.
[2] T. Ojala, M. Pietikäinen, and D. Harwood, "A Comparative Study of Texture Measures with Classification Based on Feature Distributions," Pattern Recognition, vol. 29, no. 1, pp. 51-59, 1996.
[3] T. Ojala, M. Pietikäinen, and T. Mäenpää, "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-987, July 2002.
[4] A. Prati, I. Mikic, M.M. Trivedi, and R. Cucchiara, "Detecting Moving Shadows: Algorithms and Evaluation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 7, pp. 918-923, July 2003.
[5] C.R. Wren, A. Azarbayejani, T. Darrell, and A.P. Pentland, "Pfinder: Real-Time Tracking of the Human Body," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780-785, July 1997.
[6] N. Friedman and S. Russell, "Image Segmentation in Video Sequences: A Probabilistic Approach," Proc. Conf. Uncertainty in Artificial Intelligence, pp. 175-181, 1997.
[7] C. Stauffer and W.E.L. Grimson, "Adaptive Background Mixture Models for Real-Time Tracking," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 246-252, 1999.
[8] P. KaewTraKulPong and R. Bowden, "An Improved Adaptive Background Mixture Model for Real-Time Tracking with Shadow Detection," Proc. European Workshop Advanced Video Based Surveillance Systems, 2001.
[9] Z. Zivkovic, "Improved Adaptive Gaussian Mixture Model for Background Subtraction," Proc. Int'l Conf. Pattern Recognition, vol. 2, pp. 28-31, 2004.
[10] Q. Zang and R. Klette, "Robust Background Subtraction and Maintenance," Proc. Int'l Conf. Pattern Recognition, vol. 2, pp. 90-93, 2004.
[11] A. Elgammal, R. Duraiswami, D. Harwood, and L.S. Davis, "Background and Foreground Modeling Using Nonparametric Kernel Density Estimation for Visual Surveillance," Proc. IEEE, vol. 90, no. 7, pp. 1151-1163, 2002.
[12] K. Kim, T.H. Chalidabhongse, D. Harwood, and L. Davis, "Background Modeling and Subtraction by Codebook Construction," Proc. IEEE Int'l Conf. Image Processing, vol. 5, pp. 3061-3064, 2004.
[13] K. Toyama, J. Krumm, B. Brumitt, and B. Meyers, "Wallflower: Principles and Practice of Background Maintenance," Proc. IEEE Int'l Conf. Computer Vision, vol. 1, pp. 255-261, 1999.
[14] A. Monnet, A. Mittal, N. Paragios, and R. Visvanathan, "Background Modeling and Subtraction of Dynamic Scenes," Proc. IEEE Int'l Conf. Computer Vision, vol. 2, pp. 1305-1312, 2003.
[15] J. Kato, T. Watanabe, S. Joga, J. Rittscher, and A. Blake, "An HMM-Based Segmentation Method for Traffic Monitoring Movies," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 9, pp. 1291-1296, Sept. 2002.
[16] M. Mason and Z. Duric, "Using Histograms to Detect and Track Objects in Color Video," Proc. Applied Imagery Pattern Recognition Workshop, pp. 154-159, 2001.
[17] S. Jabri, Z. Duric, H. Wechsler, and A. Rosenfeld, "Detection and Location of People in Video Images Using Adaptive Fusion of Color and Edge Information," Proc. Int'l Conf. Pattern Recognition, vol. 4, pp. 627-630, 2000.
[18] L. Wixson, "Detecting Salient Motion by Accumulating Directionally-Consistent Flow," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 774-780, Aug. 2000.
[19] T. Matsuyama, T. Ohya, and H. Habe, "Background Subtraction for Non-Stationary Scenes," Proc. Asian Conf. Computer Vision, pp. 622-667, 2000.
