Pfister Thesis Full
Pfister Thesis Full
Pfister Thesis Full
Thesis by
Samuel T. Pfister
2006
(Defended April 14, 2006)
ii
c
2006
Samuel T. Pfister
All Rights Reserved
iii
Acknowledgements
I wish to extend warm thanks to my adviser, Joel Burdick, for his constant support. Many
thanks to Stergios Roumeliotis for early guidance in all subjects central to this work. Thanks
to Kristo Kriechbaum and to the rest of the robotics group for their valuable assistance and
feedback. Thanks to the Mars Exploration Rover team and the DARPA Grand Challenge
team for such wonderful and unique experiences outside of research.
I’d like to thank my family and friends for their love and support, and Lucy and Praline
for always being there for me. Finally, loving thanks to my wife, Heidi, for making sure this
work didn’t get the best of me.
iv
Abstract
Contents
Acknowledgements iii
Abstract iv
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Summary of Contributions and Related Work . . . . . . . . . . . . . . . . . 4
2 Background 9
2.1 Chi-Square Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Maximum Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 A Simple ML Example . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Hough Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5.1 Odometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5.2 Range Scanner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Bibliography 157
List of Figures
3.1 Geometry of the range sensing process. The robot acquires dense range scans
in poses i and j. The circles represent robot position, while the x-y axes denote
the robot’s body-fixed reference frames. . . . . . . . . . . . . . . . . . . . . . 23
3.2 Representation of the uncertainty of selected range scan points. . . . . . . . . 25
3.3 A) Experiments with initial displacement perturbations between scans taken
at a single pose. B) Close-up of robot pose with results. . . . . . . . . . . . . 43
3.4 A) Experiments with initial displacement perturbations between scans taken
at different poses. B) Close-up of pose 2 with results. . . . . . . . . . . . . . 45
3.5 A) Experiments with initial displacement perturbations in a non-static envi-
ronment. B) Close-up of pose 2 with results. . . . . . . . . . . . . . . . . . . 46
3.6 A) Experiments with initial displacement perturbations in a hallway environ-
ment. B) Close-up of pose 2 with results. . . . . . . . . . . . . . . . . . . . . 47
3.7 A) A 109-pose, 32.8-meter loop path. B) Close-Up of final path poses, shown
the covariance estimates of the weighted and unweighted algorithms. . . . . . 52
3.8 A) An 83-pose, 24.2-meter loop path. B) Close-up of final loop poses. . . . . 53
4.1 Example of line segment fit: data points (left) and fitted line with a represen-
tation of its uncertainty (right). . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2 Infinite polar line L representation. . . . . . . . . . . . . . . . . . . . . . . . 59
xi
4.3 Line coordinates with respect to a global frame and a frame at pose i. . . . . 61
4.4 Line coordinates with respect to a global frame and a frame at pose i. The
values δρi and δψi represent the pose i displacement in the “ρ − ψ” frame. . 61
4.5 Projection of pose uncertainty into line L. . . . . . . . . . . . . . . . . . . . . 63
4.6 Infinite line and line covariance representation. ~u P represents the center of
rotational uncertainty. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.7 Segment S representation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.8 Segment S representation with multiple endpoint pairs. . . . . . . . . . . . . 66
4.9 Segment S representation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.10 Segment S and endpoint covariance representation. . . . . . . . . . . . . . . 72
4.11 Segment S and endpoint covariance representation with small σ α . . . . . . . 73
4.12 A) Raw range scan points. B) Extracted line segment features. . . . . . . . 74
4.13 A) Hough space accumulator. B) Extracted infinite line. . . . . . . . . . . . 76
4.14 A) Grouped set of collinear points. B) Optimally fit line. . . . . . . . . . . . 77
4.15 A) Hough space accumulator. B) Extracted infinite line. . . . . . . . . . . . . 83
4.16 A) Grouped set of collinear points. B) Optimally fit line. . . . . . . . . . . . 83
4.17 Range data: A) Raw points and selected point covariances. B) Fit lines and
line uncertainties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.18 Range data from two poses: A) Raw points and selected point covariances. B)
Fit lines and line covariances. C) Merged lines and line covariances. . . . . . 95
4.19 Range data from eight poses – A) Raw points and selected point covariances.
B) Fit lines and line covariances. C) Merged lines and line covariances. . . . 96
4.20 A) Raw points and selected point covariances. B) Fit lines and line covariances.
C) Merged lines and line covariances. . . . . . . . . . . . . . . . . . . . . . . 97
4.21 Line map built with presented SLAM techniques: A) Full raw point data. B)
Full line segment map representation. . . . . . . . . . . . . . . . . . . . . . . 106
4.22 Full line map built with line segment length restriction. . . . . . . . . . . . . 107
4.23 Hallway data set: A) Raw data points. B) Full set of fit lines. C) Restricted
set of fit lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.1 Multi-scale range scan representation: A) scale tree graph. B–E) A sequence
of increasingly fine scale representations of the data. . . . . . . . . . . . . . . 110
xii
5.2 Block B representation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.3 Block B and block covariance representation. . . . . . . . . . . . . . . . . . 114
5.4 Block B representation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.5 Block B with subsegment and subpoint features shown. . . . . . . . . . . . 119
5.6 Multi-scale extraction of ρa ,ρb : A) Raw scan points B) Hough transform. . . 122
5.7 Fine-scale extraction of ρa ,ρb : A) Block ρ boundary detection at fine scale. B)
Detected infinite block. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.8 Coarse-scale extraction of ρa ,ρb : A) Block ρ boundary detection at coarse
scale. B) Detected infinite block . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.9 End extraction at the fine scale . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.10 Subsequent block extraction. . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.11 Example of scale-based difference in the block feature representation of iden-
tical point data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.12 Multi-scale range scan representation: A) Scale tree for pose 1. B) Finest scale
features pose 1. B) Scale tree for pose 2. D) Finest scale features for pose 2.
E) Corresponding features from pose 1, pose 2. . . . . . . . . . . . . . . . . . 139
5.13 Multi-scale localization example where the blue circle is pose 1, the red circle
is the estimated pose 2, and the black circle is the actual pose 2. A) Initial
pose estimates and raw scans. B) Coarse feature fit. C–G) Intermediate pose
estimates and feature correspondences at each scale. . . . . . . . . . . . . . . 141
5.14 Kidnapped robot problem data: Multi-scale map and candidate scan repre-
sentation at scales of 200 mm, 100 mm, and 50 mm. . . . . . . . . . . . . . . 148
5.15 Kidnapped robot problem data: Multi-scale map representation and candidate
scan representation at scales of 25 mm and 12.5 mm. . . . . . . . . . . . . . 149
5.16 A selection of four hypotheses invalidated at the coarsest scale. . . . . . . . . 150
5.17 A selection of two hypotheses with partial validation at the coarsest scales but
invalidated at the 50 mm scale. . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.18 A solution to the kidnapped robot problem with validated hypotheses at all
scales. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
xiii
List of Tables
Chapter 1
Introduction
1.1 Motivation
Autonomous robot navigation has long been a goal of researchers for applications rang-
ing from military supply convoys, to space exploration, to autonomous highway driving.
A critical requirement of higher level navigation applications is that the robot has some
reasonable knowledge of its current position with respect to a fixed reference frame. For
example, in navigation applications that entail motion to a target position, the robot needs
an accurate estimate of its current position to plan a path to the goal and to confirm suc-
cess. Similarly, for exploration applications, position information can be used and recorded
to avoid redundant coverage. The process of position estimation with respect to a fixed
reference frame is defined as the localization of the robot.
A mobile robot can localize itself using two different classes of on-board sensors: pro-
prioceptive sensors and exteroceptive sensors. Proprioceptive sensors, such as encoders or
inertial measurement units (IMUs), measure the motion of the robot, acquiring data that
can be integrated to estimate relative robot displacement. This method of localization is
called odometry, or dead reckoning, and when used alone, the integrated error in global
position grows without bound over time.
Exteroceptive sensors, such as laser range scanners or cameras, take measurements from
the external environment. This data can be correlated at subsequent robot positions to
compute relative pose or displacement estimates, which can improve and sometimes replace
odometry. Externally sensed data may also be correlated with data from a global map,
giving a global position measurement and bounding the overall position error. If a global
map is not initially available, it is possible for the robot to build a global map with the
2
externally sensed data, while using this map to localize. This approach is commonly called
simultaneous localization and mapping (SLAM).
This dissertation focuses on localization and mapping using exteroceptive sensors, so
the rest of the material is prefaced with a discussion of the several methods used to localize
a robot with externally acquired data.
One method of exteroceptive sensor–based localization involves modifying the environ-
ment through the placement of easily detectable and identifiable passive or active beacons
in known locations, which can then be used as references to triangulate the robot’s position
[MM92]. Global positioning system (GPS) based localization falls into this category, as the
known global reference map consists of the instantaneous positions of the satellites, and the
on-board exteroceptive sensor is the GPS receiver. The simplicity and improving accuracy
of GPS-based localization makes it popular for outdoor autonomous applications, but for
indoor applications, near tall buildings or high terrain, or in inclement weather, GPS sig-
nals can degrade significantly or drop out completely. For truly robust navigation, complete
reliance on GPS for localization, even in an outdoor environment, may not be adequate.
Another exteroceptive sensor-based localization method involves comparing the on-
board sensor data to a known map representing the geometry of the surrounding envi-
ronment. This map can be taken from a blueprint, satellite photos, or other previously
built map [BEFW97]. This method is more flexible but more complex than the artificial
beacon method, as the correlation of sensed data with the known global map requires a
non-trivial solution to the data association problem, while the beacon methods often ben-
efit from uniquely identifiable features by design. Though this localization method does
not require preconditioning of the environment through beacon placement, it is constrained
by the need for prior knowledge of the environment and is therefore not suitable for many
applications.
Alternatively, it is possible to develop a method which localizes the robot without any
prior knowledge or conditioning of the environment. In this case the map is created as
the robot navigates through the environment. This method is the most flexible in that it
requires no prior knowledge or restructuring of the environment, but it is also the most
difficult to implement efficiently and robustly. This problem has been the primary focus
of localization and mapping research in the last 20 years and has resulted in a range of
methods with different map representations and estimation schemes [Thr02].
3
One early mapping method represents the map as an occupancy grid [Elf89]. As obsta-
cles are detected, the corresponding cell in the rasterized map is incremented to create a
gridded representation of the environment. In contrast, early work by Chatila and Laumond
[CL85] introduces a feature-based representation using polyhedra to describe the environ-
ment boundaries. Feature-based map representations are often more complex to construct
than grid-based maps, as the features need to be extracted from the raw data. Yet these
maps are not constrained in their precision like the grid maps, whose precision is limited
by choice of cell size. This means that a sparse feature map can hold a much more efficient
and accurate representation of the environment than a grid-based counterpart.
Of course, the data representation underlying these feature-based mapping methods can
vary drastically, and may depend on the type of external sensor or sensors being used. Some
algorithms use cameras as the primary sensor and use features extracted from camera frames
as the basis for mapping and localization [AF89, AH93]. Sonar range sensors [Cro89] are
commonly used, as are laser radar scanners or ladars [ABL + 01]. A variety of feature types
have been developed for extraction from laser range scanner data. These include corners
[AMTS04], lines [CT99, AD04], principal components [VK99], or even the raw range data
points themselves [LM97b]. Multi-sensory methods also exist which merge and combine
information from both range and camera sensors [NTHS99, NW00].
The most successful schemes to estimate map coordinates and robot position have in-
volved probabilistic techniques. These schemes implement a SLAM approach where data is
collected from an uncertain position and assembled into a map while using that uncertain
map to assist in localization. Early work by Smith and Cheeseman [SC86] introduced a
probabilistic framework for map building and localization. One current SLAM technique
uses an expectation maximization (EM) algorithm to build the map and localize the robot
[TFB98]. This algorithm can be used to focus on robust determination of feature correspon-
dences [DSTT01]. Another common approach to SLAM uses Kalman filters to estimate the
robot position and build the map [RB00a, CMNT99, MDWD02, DDWB00].
There are three primary challenges common to localization and mapping methods, which
are the focus of this dissertation:
Problem 1) Sensor noise compensation: Without sensor noise, dead reckoning
would be a sufficient localization method. Unfortunately, small errors in dead reckoning
integrate and cause the position error to grow over time. Though external sensors can be
4
used to help bound this error, these sensors also provide noisy measurements, which must
be addressed.
Problem 2) Data association accuracy: For methods that localize the robot using
external sensor information, it is necessary to establish correspondences between data col-
lected across different robot positions. If accurate correspondences cannot be determined,
then the sensor information is useless for localization. The details of the data association
problem differ between the approaches mentioned above, but it is essential for accurate
and robust localization and mapping. Data association is made especially difficult in the
presence of the sensor noise discussed above.
Problem 3) Robustness to unmodeled errors: In any real-world application there
will be events or sensory readings that are outliers when compared to normal operations. For
example, wheel slippage due to difficult terrain can cause unmodeled odometry error when
integrating wheel rotation. Also a closed door or moved table can introduce discrepancies in
external sensor measurements, which are far larger than can be explained by sensor process
noise. It is critical that localization methods be robust to, and recover from, these types of
unmodeled errors.
The following section summarizes the localization and mapping methods I have de-
veloped, and highlights how these methods contribute to the current state of research in
robotics.
Chapter 2
Background
Several common methods used repeatedly in this work are introduced here. First the basics
of the chi-square hypothesis are reviewed, which is a critical component of my feature
correspondence methods. The maximum likelihood framework for parameter estimation is
then introduced, as well as the equations for the extended Kalman filter. Finally the Hough
transform based pattern detection method is described, which is the basis of the feature
extraction approaches used in Chapters 4 and 5.
The chi-square test is a common method used to test the goodness of fit of a hypothetical
assumption. This test consists of the computation of a chi-square distributed random
variable D, corresponding to the measurement and model being tested. The value is then
compared with chi-square distribution probability tables to determine the probability that
the error seen would be generated by the given model. A threshold can be set such that
measurements that are determined to have a low probability of being generated by the
model result in the dismissal of the model hypothesis as false.
Given an n-dimensional random variable V and a measurement of this variable V̂ , the
assumption to be validated is that V has a zero mean, normal distribution with a covariance
represented by an n × n covariance matrix P V . The D 2 value can be calculated as follows:
Note that with the assumptions of a normal distribution on the random variables, the
10
value of D is equivalent to the Mahalanobis distance, which is a unitless distance metric
between random variables that is weighted by the relative uncertainty.
The threshold can then be set to a value χ 2 (P, n), determined by a chi-square distribu-
tion table, where n is the degrees of freedom and P is the probability that the given value V̂
of V could be generated by the model which has been assumed to be true. In my methods,
I set the threshold for a very low probability P as I don’t want to throw away a hypothesis
that has any reasonable chance of being true. The hypothesis can be invalidated if
It is important to note that while a high value of D 2 does imply a low probability of model
accuracy, the converse is not necessarily true. Even in the case where a single measurement
exactly matches the model such that D 2 = 0, one can only claim with any degree of
certainty that there is not a significant reason to abandon the model. The test alone does
not imply that there is a high probability of the model being accurate. The chi-square test
is an effective method of eliminating untrue hypotheses, but model validation methods may
require an additional, alternate test to detect and reduce instances of wrongly accepting a
false hypothesis.
The maximum likelihood (ML) method is a framework for parameter estimation given a set
of independent measurements of a random variable. In this method, a likelihood function is
defined as the product of the likelihood functions of each of n independent measurements.
Given a set of measurements {x̂1 , ..., x̂n } and the parameters to be estimated g = g 1 , ..., gm
the likelihood function for each measurement can be represented as f (x̂ i , g). The likelihood
function is then
L({x̂k }|g) = f (x̂1 , g)f (x̂2 , g) · · · f (x̂n , g). (2.3)
The goal is to determine the value of g that maximizes the likelihood function L. To do this
L is differentiated with respect to g and arrive at the following conditions for maximizing
the likelihood function:
∂L ∂L
= 0, · · · , = 0. (2.4)
∂g1 ∂gm
11
It is often beneficial to consider the log of the likelihood function when determining these
partial derivatives as the likelihood and log-likelihood have the same extrema.
∂ ln(L) ∂ ln(L)
= 0, · · · , = 0. (2.5)
∂g1 ∂gm
Then set the log of the likelihood function equal to zero and solve the partial derivative
with respect to x:
∂ ln(L({x̂k }|x)) ∂M
= =0 (2.8)
∂x ∂x
Xn
x̂i − x
=0 (2.9)
σ
k=1
n
1X
x= x̂i . (2.10)
n
k=1
The result is simply that the sample average of the measured values has the maximum
likelihood of being the true value of x. In this work, the same approach is used to address
more complex problems of parameter estimation.
The Kalman filter (KF) is a common tool used in localization and mapping procedures
for mobile robots [CMNT99, DDWB00]. Specifically the Kalman filter allows a method of
estimating the state of a robot, which can include both the robot’s position and orienta-
tion (or pose) as well as the positions of the sensory data of the surrounding environment
collected by the robot. The filter equations effectively bookkeep the uncertainties of the
12
state variables and, with each update, the overall mean squared error is minimized. The
KF enables a simultaneous localization and mapping (SLAM) approach where the robot
builds the map while it uses this dynamically built map to localize.
The fundamental modes of the KF are propagation and update. Propagation occurs
as the robot moves with dead reckoning and the pose uncertainty increases according to
the propagation process noise. Updates occur when new measurements are made of the
environment and this information is incorporated into the filter resulting in a reduction of
overall uncertainty.
The primary assumptions for KF optimality are that the system is linear and all random
processes have a normal distribution. Due to the nonlinearities introduced by the coupling
between the orientation and position measurements, it is common to utilize the extended
Kalman filter (EKF) in SLAM implementations, which linearizes the KF equations but
relaxes the assumptions of optimality.
In discrete time the evolution of the state vector X is
where f is a nonlinear function of the state X, the control inputs u and the process noise
w. Also consider a measurement z at time t k to be governed by
where h is a nonlinear function of the state and measurement process noise v. The resulting
filter equations for the propagation step are
The estimate X̂k|k−1 represents the estimate of state X at step k given all sensed inputs up
to step k − 1, while the values of X̂k−1|k−1 and Pk−1|k−1 represent the state and covariance
estimates, respectively, at time step k − 1. As the actual value of w may not be known,
the propagation of the state is equivalent to the state evolution equation, Eq. (2.11), with
13
zero process noise. Pk|k−1 is the covariance matrix for the state estimate where A k and
Wk are Jacobian matrices of partial derivatives of f with respect to X and w. The update
equations for the EKF are
−1
Kk = Pk|k−1 HkT Hk Pk|k−1 HkT + Vk Rk VkT , (2.15)
where Hk and Vk are the Jacobian matrices of partial derivatives of h with respect to X
and v calculated at each step k. The state and covariance is represented as X̂k|k and Pk|k ,
respectively, and I is the identity matrix. See [WB95] for an overview of the derivation of
the general EKF equations.
The Hough transform is a general pattern detection technique commonly used in machine
vision applications [DH72]. In this process, each of a set of scan points {~u k }, k = 1, ..., n
is transformed into a discretized curve in the Hough space and accumulated in a discrete
rasterized space. This is a voting scheme and the resultant peaks in the Hough space corre-
spond to patterns that “agree” with many points. Though the method can be generalized
to a wide range of patterns, the focus here is on the implementation as a line detector.
Consider an infinite polar line L with a normal distance to the origin, ρ, and a normal
angle, α, as shown in Figure 2.1. These parameters ρ and α define the dimensions of the
Hough space. Each point in the set of scan points is represented in Cartesian form with the
k th point ~uk = (xk , yk ).
Define a Hough space H(i, j) as a two-dimensional raster with integer indices i and j and
define the discrete values in each dimension as α(j) and ρ(i), respectively. α(j) is discretized
in increments of Dα on the range [−π/2, π/2] and ρ(i) is discretized in increments of D ρ
on the range [−ρmax , ρmax ] where ρmax is the maximum range value in the point set. The
{i, j} cell of the discretized Hough space therefore represents the range of line coordinates
[ρ(i) ± Dρ /2, α(j) ± Dα /2]. Initially the value at each cell in the Hough space is zero. For
14
L
ρ
α
a given point, for all i, the position ρ ik of the line at angle α(i) can be calculated such that
the line would pass through point ~u k :
From the value of ρik , determine the index j such that ρ(j) − D ρ /2 < ρik ≤ ρ(j) + Dρ /2.
Then increment the value at Hough space cell H(i, j). After multiple point inputs, the cell
in Hough space with the highest incremented value corresponds to the line that has the
most contributing points.
2
2500
1.5
2000
L2 L1
L3 1
1500 L1
0.5
α (radians)
L2
Y (mm)
1000 v1 0
L3
−0.5
500
−1
0
−1.5 v1
−500
−2
−500 0 500 1000 1500 2000 2500 −2000 −1500 −1000 −500 0 500 1000 1500 2000 2500
X (mm) R (mm)
Figure 2.2: Line representations through a single point in Cartesian space and Hough space.
Figure 2.2 shows the Hough transform of a single point v 1 . On the left, the point and
three lines that intersect that point are shown. On the right, the curve representing the
point in Hough space is shown. Also plotted are the three lines that are mapped to points
on the curve in Hough space. Figure 2.3 on the left shows two points and a single common
line intersecting both. The Hough space on the right shows the curve generated by each
15
2
1.5
L1
1500
1
0.5 L1
1000 v1
α (radians)
Y (mm)
500 −0.5
−1
0 v2
−1.5 v1 v2
−2
−500 0 500 1000 1500 2000 −2000 −1500 −1000 −500 0 500 1000 1500 2000 2500
X (mm) R (mm)
Figure 2.3: Line representation through two points in Cartesian space and Hough space.
point and the line determined by the peak of the Hough space at the intersection of the
curves.
In Section 4.3.1 the Hough transform is used to group collinear points and to extract
the initial guess for the line feature estimation method. In Section 5.3.1 I extend the basic
Hough transform to develop a multi-scale feature extraction method.
2.5 Sensors
My motive is to achieve effective mobile robot localization using onboard sensors. There are
two basic categories of sensors: proprioceptive and exteroceptive. Proprioceptive sensors
(such as encoders or IMUs) sense the motion of a robot and are the basis for odometry
or dead reckoning. Exteroceptive sensors (such as laser scanners or cameras) sense the
surrounding environment. In this work odometry is used as a proprioceptive sensor, and
a dense laser range scan is used as an exteroceptive sensor. In this section I outline the
geometric representation of the sensory signal, and the sensor signal uncertainty models
that will be referenced in subsequent chapters.
2.5.1 Odometry
Pose gj yi j
xi j
Pose gi φi
yi
g0 xi
where xi , yi , and φi are described in Figure 2.4. In this representation the robot heading is
denoted to be at angle φi such that the forward direction on the robot corresponds to the
positive x direction.
The odometry system generally estimates relative changes in the robot pose by inte-
grating over the internally measured motion of the actuators or the signals of an inertial
measurement unit (IMU). If the robot starts at pose i and moves to pose j the resulting
local displacement measurement with respect to pose i is denoted g ij where
xij
gij = gi−1 gj = yij (2.20)
φij
as seen in Figure 2.4. The value for g ij is a random variable sampled from the robot
odometry system by integrating wheel velocity or from other methods of dead reckoning.
It is common to assume that the actuator or IMU noise is Gaussian, and with linear equa-
tions, gij will follow a Gaussian distribution. In any case, the covariance matrix of g ij can
17
be defined as
Pxx Pxy Pxφ
Pgij = Pyx Pyy Pyφ . (2.21)
Pφx Pφy Pφφ
This is a general covariance matrix. The form of the actual covariance matrix depends
on the model of the odometry method being used. A simple, often used, model assumes the
noise in x, y, and φ is independent for a small displacement g ij , in which case Pgij simplifies
to
σx2 0 0
Pgij = 0 σy2 0 , (2.22)
0 0 σφ2
where the values σx2 , σy2 and σφ2 represent the variance in x, y and φ, respectively. As this
is represented in the robot local frame, the uncertainty in x corresponds roughly to the
uncertainty in the distance moved straight ahead while the uncertainty in y corresponds
to the possible side slip of the robot. In practice, over longer distances of integrating the
displacement, even small errors in odometry orientation estimation modeled by σ φ2 tend to
have a significant effect on the overall uncertainty of the robot pose due to the integration
of lever arm effects.
Given this representation for a local displacement and displacement covariance, the pose
transformations that are commonly used in localization and mapping methods are briefly
introduced.
Pose Transformations
Given an initial pose gi in the global frame and a measured displacement g ij in a frame
local to gi , the current pose estimate gj can be calculated in the global frame as follows:
xj xi cos(φi ) − sin(φi ) 0 x
ij
gj = yj = yi + sin(φi ) cos(φi ) 0 yij . (2.23)
φj φi 0 0 1 φij
Similarly, given two poses in the global frame g i and gj , the relative displacement be-
18
tween them gij measured with respect to pose gi can be calculated as follows:
xij cos(−φi ) − sin(−φi ) 0 xi − x j
gij = yij = sin(−φi ) cos(−φi ) 0 yi − y j . (2.24)
φij 0 0 1 φi − φ j
Given an initial pose gi and covariance of that pose measurement P gi and a local displace-
ment measurement gij with local covariance Pgij , the combined covariance Pgj in the global
frame can be calculated as
Pgj = QPgi QT + KPgij K T , (2.25)
where
cos(φi ) − sin(φi ) 0
K = sin(φi ) cos(φi ) 0 (2.26)
0 0 1
and
1 0 −y cos(φi ) − x sin(φi )
Q = 0 1 −y sin(φi ) + x cos(φi ) . (2.27)
0 0 1
j Wall
i uk+1
i uk uk+1 Xj
j u
uik−1uk−1 k X
X
Range
Measurements
β
β
dj
k
Pose i θki j
θk
Pose j
where dk is the measured distance to the environment’s boundary in the direction denoted
by θk (see Fig. 2.5).
Range sensors can be subject to both random noise effects and bias. For a discussion of
bias, see [PKRB02]. Here I briefly review a general model for measurement noise. Recall
the representation of scan data in terms of polar coordinates (d k , θk ) in Eq. (2.28). Let the
range measurement, dk , be comprised of the “true” range, d k , and an additive noise term,
εd :
dˆk = dk + εd . (2.29)
Also assume that error exists in the measurement of θ̂k , i.e., the actual scan angle differs
(slightly) from the reported or assumed angle. Thus,
θ̂k = θk + εθ , (2.30)
where θk is the “true” angle of the k th scan direction, and εθ is again an additive noise
ˆk can be represented as
term. Hence the measured point ~u
cos(θk + εθ )
ˆk = (dk + εd )
~u . (2.31)
sin(θk + εθ )
Zero bias assumption. I will first derive the scan point covariance with the assumption
of a negligible bias term in all random variable noise terms. In this case, the noise ε d is
20
assumed to be a zero-mean Gaussian random variable with variance σ d2 and the noise εθ is
assumed to be a zero-mean Gaussian random variable with variance σ θ2 (see, e.g., [AP96]
for justification). If I further assume that ε θ 1o (which is a good approximation for most
laser scanners), I can make the assumptions cos(ε θ ) = 1 and sin(εθ ) = εθ . The measured
ˆk can be thought of as consisting of the true component, ~u k , and the uncertain
scan point ~u
component, δ~uk :
ˆk = ~uk + δ~uk .
~u (2.32)
ˆk − ~uk yields
Expanding Eq. (2.31) and using the relationship δ~u k = ~u
− sin θk cos θk
δ~uk = (dk )εθ + εd . (2.33)
cos θk sin θk
Assuming that εθ and εd are independent, the covariance of the range measurement process
for the k th range point is
4
Puk = E[δ~uk (δ~uk )T ]
(dk ) σθ 2 sin2 θk
2 2 − sin 2θk σd22
2 cos θk sin 2θk (2.34)
= + .
2 − sin 2θ 2
2 cos θk 2 2
sin 2θk 2 sin θk
k
For practical computation, θ̂k and dˆk can be used as a good estimates for the quantities θ k
and dk .
The term δ~uk is generally well modeled by a zero-mean Gaussian noise process as outlined
above in Eq. (2.33). The bias ~bk is an unknown offset that can be approximated by a term
~ok corrupted by a zero-mean additive Gaussian noise δ~bk [AP96]. The covariance of this
noise component reflects the level of confidence in the value ~o k :
Chapter 3
yi β β j
Pose i θk
i
x θk y
i j
Pose j x
j
Figure 3.1: Geometry of the range sensing process. The robot acquires dense range scans
in poses i and j. The circles represent robot position, while the x-y axes denote the robot’s
body-fixed reference frames.
measures the range to the boundary of its nearby environment along rays that are separated
by a uniform1 angle, β (see Fig. 3.1). As described below, I allow for various uncertainties
in this range measurement.
Let the set of Cartesian coordinates of the n i scan points taken in the ith robot pose,
be denoted by {~uik }, k = 1, . . . , ni . The scan point coordinates are described in the robot’s
body fixed reference frame. Typically, the Cartesian coordinate of the scan point is derived
from range data according to the expression:
xik cos θki
~uik = = lki , (3.1)
yki sin θki
where lki is the measured distance to the environment’s boundary along the k th measuring
ray. The measuring ray is oriented in the direction denoted by θ ki , where θki is the angle
made by the k th measuring ray with respect to the x-axis of the body fixed reference frame
(see Fig. 3.1).
The main goal is to accurately estimate the robot’s displacement between poses by
matching range data obtained in sequential poses. This displacement estimate can be used
as the basis for a form of odometry, or fused with conventional odometry and/or inertial
1
The extension to non-uniform angle β is straightforward.
24
measurements to obtain better relative robot pose estimates. These estimates in turn can
support localization and mapping procedures. First, assume that the range scans at poses
i and j have a sufficient number of corresponding points to be successfully matched (see
Section 3.4). Let {~uik , ~ujk } for k = 1, . . . , nij be the set of corresponding matched scan point
pairs, where nij is the number of corresponding pairs. From these pairs I first want to
estimate the relative displacement between poses i and j: g ij = gi−1 gj = (Rij , pij ), where
cos φij − sin φij xij
Rij = ~ij =
p , (3.2)
sin φij cos φij yij
i.e., the displacement between poses i and j is described by a translation (x ij , yij ) and a
rotation, φij .
Next, I consider the covariance, P ij , of the displacement estimate. This covariance has
two uses. First, it reflects the quality of the displacement estimates. Large diagonal elements
of the covariance matrix indicate increased uncertainty. Any localization process should be
aware of the level of confidence in its computed pose estimates. Second, the covariance is
also needed when combining displacement estimates with measurements provided by other
sensors. More accurate and realistic estimates of the contributing covariances lead to more
accurate overall estimates in a sensor fusion algorithm, such as a Kalman filter.
My approach differs from prior work in that the contribution of each scan point to
the final displacement estimate is individually weighted according to that point’s specific
uncertainty. The scan point uncertainties are estimated using sensor measurement noise
models as well as models of specific geometric issues within the matching process itself. To
better understand these issues, examine Figs. 3.1 and 3.2. Fig. 3.1 depicts the situation
when a range sensor (e.g., a laser range finder) samples points on a nearby wall. The
boundary points sampled in pose i are indicated by circles, and labeled by ~u ik−1 , ~uik , and
~uik+1 . The nearby boundary points sampled in pose j are indicated by X’s and are labeled
by ~ujk−1 , ~ujk , and ~ujk+1 . Prior range matching methods (e.g., [GG97, VRB02, Cox91])
have made the simplifying assumption that the range scans of different poses sample the
environment’s boundary at exactly the same points—i.e., point ~u ik is assumed to be exactly
the same point as ~ujk , etc. This assumption is generally not true. In this chapter, I model
this correspondence error and incorporate this effect into the matching algorithm.
25
Scan Points
Selected Scan Points
Robot Pose
100 x Point Covariance (3σ)
As described in Sections 3.3.1 and 3.3.3, the range measurements are corrupted by
noise and possibly a bias term that is a function of the range sensing direction, θ ki , and
the sensor beam’s incidence angle, α ik (Fig. 3.1). Figure 3.2 shows the 95% confidence
level ellipses associated with the covariance estimates (calculated using the methods that
I will introduce later) of selected data points from an actual laser range scan. Clearly,
the wide variation in uncertainties seen in Fig. 3.2 strongly suggests that not all range
data points are of equal precision. Hence, this potentially large variability must be taken
into account in the estimation process. While the existence of these uncertainty sources
has previously been suggested [BB01, ABL + 01, Cox91, Ada00, AP96], my algorithm is the
first to explicitly model and account for their effects within the estimation process. Some
prior work has no explicit noise modeling (e.g., [GG97]), or apply a uniform uncertainty to
all contributing points. The most complete existing methods [BB01] and [LM97a] employ
statistical methods to calculate displacement estimate uncertainty. These methods do not
take sensor uncertainty models into account in the displacement estimation process and
use an unweighted assumption for the contributing points. Also [BB01] and [LM97a] do
not use any specific sensor noise characteristics as a basis for calculating uncertainty but
instead use a numerical sample of perturbations to extract an estimate of covariance. I
am able to demonstrate significant improvements over previous unweighted methods by
26
developing physically based uncertainty models for each individual point and incorporating
these models in both the displacement estimation process and the covariance calculation.
The principle behind this approach generally applies to any case of dense range data,
such as sonars, infrareds, cameras, radars, etc. The weighted matching formulation and its
solution given in Section 3.2 are independent of any sensor specifics. To use the general
results, specific models of sensor uncertainty are needed. Some detailed sensor models are
developed in Section 3.3. Since some of the assumptions underlying these sensor models are
best suited to laser range scanners, the application of the detailed sensor model formulas
is best suited to the use of laser scanners in indoor environments, though they can be
extended to structured outdoor environments. However, the general approach of Section
3.2 should work for other range sensors and other operating environments with reasonable
modifications to the sensor models.
This chapter is structured as follows: Section 3.2 describes a general weighted point
feature matching problem and its solution. Section 3.3 develops correspondence and range
measurement error models. Sections 3.4 and 3.5 summarize the point pairing selection and
sensor incidence angle estimation procedures. Experiments in Section 3.6 demonstrate the
algorithm’s accuracy, robustness, and convergence range. Direct comparisons with previous
methods (e.g., [LM97b, LM97a]) validate the effectiveness of the approach.
This section describes a general point feature matching problem and its solution.
Let the sets of Cartesian range scan data points acquired in poses i and j be denoted by
ˆi } and {~u
the {~u ˆj }, respectively. These measurements will be imperfect. Let {~u i } and {~uj }
k k k k
be the “true” Cartesian scan point locations. As discussed in Section 2.5.2 and Eq. (2.35),
range scan point measurements can generally be decomposed into the following terms:
introduced in Section 2.5.2 and discussed in more detail with regard to the weighted sensor
ˆi , ~u
matching problem in Section 3.3.3. Let (~u ˆj
k k ) be points that are deemed to correspond in
the range scans at poses i and j. As shown in Fig. 3.1, these points are not necessarily the
same physical point, but the closest corresponding points. Accounting for the fact that scan
data is measured in a robot-fixed frame, the error between the two corresponding points is
εij ˆi − Rij ~u
ˆj − pij
k =~
u k k (3.4)
for a given displacement (Rij , pij ) between poses. Substituting Eq. (3.3) into Eq.(3.4)
results in
εij uik − Rij ~ujk − pij ) + (δ~uik − Rij δ~ujk ) + (~bik − Rij~bjk ) .
k = (~ (3.5)
| {z } | {z } | {z }
(i) (ii) (iii)
A relative pose estimation algorithm aims to estimate the displacement g ij = (Rij , pij ) that
suitably minimizes Eq. (3.5) over the set of all correspondences. If the dense range scans
do sample the exact same boundary points, then ~u ik − Rij ~ujk − pij = 0 when Rij and pij
assume their proper values. However, ~u ik and ~ujk generally do not correspond to the same
boundary point. Hence, term (i) in Eq. (3.5) is the correspondence error, denoted by c ij
k:
due to the measurement process noise, and (iii) the measurement bias error.
For the sake of simplicity, I ignore the bias offsets for now (i.e., I assume that ~bik = ~bjk =0),
but consider their effect again in Section 3.3.3.
28
3.2.2 A General Covariance Model
For subsequent developments, a generalized expression must be derived for the covariance
of the measurement errors:
4
h i
Pkij = E εij
k (ε ij T
k ) (3.7)
h i
= ij
E (ck + δ~uik − Rij δ~ujk )(cij
k + δ~
u i
k − R ij δ~
u j T
k ) ,
where E[·] is the expectation operator, and I am ignoring bias effects for now. P kij captures
the uncertainty in the error between corresponding range point pairs. Because the range
measurement noise is assumed to be zero mean, Gaussian, and independent across measure-
ments, E[δ~uik (δ~ujk )T ] = E[δ~ujk (δ~uik )T ] = 0. Practically speaking, one would expect the range
measurement noise of the k th scan point in pose i to be uncorrelated to the measurement
noise of the k th corresponding range point in pose j. Hence, this is a fine assumption in
practice.
The correspondence error, cij
k , is generally a deterministic variable that is in turn a
function of the geometry of the robot’s surroundings. However, since I do not assume that
the geometry of the environment is known ahead of time, in this work I make a reason-
able probabilistic approximation to this term that accounts for the fact that the geometry
of the surroundings is a priori unknown. In this probabilistic approximating model, the
correspondence error and sensor measurement error terms are independent, and therefore
E[cij uik )T ] = E[cij
k (δ~ ujk )T ] = E[δ~uik (cij
k (δ~
T ujk (cij
k ) ] = E[δ~
T
k ) ] = 0. See Section 3.3.2 for a
4
h i h i
Pkij = E εijk (ε ij T
k ) = E c ij ij T
(c
k k ) + E δ~uik (δ~uik )T
h i
+ Rij E δ~ujk (δ~ujk )T Rij T
= C
Pkij + N Pki + Rij N Pkj Rij
T
(3.8)
= Qij ij T
k + Rij Sk Rij , (3.9)
29
where
C
Pkij = covariance associated with the approximating
terms of Pkij . As shown below, the correspondence errors depend upon the sensor beam’s
incidence angle. The noise covariances will also generally be a function of the variables θ ki ,
θkj , lki , and lkj . Thus, the covariance matrix Pkij would be expected to vary for each scan
point pair (see Figure 3.2 for an illustration). Hence, it is not suitable to assume, as in prior
work (e.g., [LM97a, LM97b]), that P kij is a constant matrix for all scan point pairs.
I employ a maximum likelihood (ML) framework to formulate a general strategy for estimat-
ing the robot’s displacement from a set of nonuniformly weighted point correspondences.
Let L({εij
k }|gij ) denote the likelihood function that captures the likelihood of obtaining the
the k = 1, . . . , nij range pair measurements are independent 2 and therefore the likelihood
can be written as a product:
L({εij ij ij ij
k }|gij ) = L(ε1 |gij )L(ε2 |gij ) · · · L(εnij |gij ). (3.10)
2
Possible dependencies of these measurements will be briefly considered in Section 3.3.2. Generally, the
only effect that will lead to dependence is possible couplings in the correspondence error that arise if the
geometry of the environment is a priori known.
30
Recall that the measurement noise is considered to be a zero-mean Gaussian process. Fi-
nally, as it is shown in Section 3.3.2, the correspondence noise can be approximated by a
zero-mean Gaussian process. Neglecting the bias offset for the moment (see Section 3.3.3),
the above assumptions imply that L({ε ij
k }|gij ) takes the form:
L({εij
k }|gij ) = q = , (3.11)
D ij
k=1 2π det Pkij
nij
ij 1 X ij T ij −1 ij
where M = (εk ) (Pk ) εk , (3.12)
2
k=1
nij
Y q
D ij = 2π det Pkij . (3.13)
k=1
The optimal displacement estimate is the one that maximizes the value of L({ε ij
k }|gij ) with
ln[L({εij ij ij
k }|gij )] = −M − ln(D ) (3.14)
and from the numerical point of view, it is often preferable to work with the log-likelihood
function.
Before discussing the solution to this estimation problem, I first compare this formu-
lation with prior work. Most previous algorithms that take an “unweighted” approach to
the displacement estimation problem assume that all of the covariance matrices P kij are
uniformly the 2 × 2 identity matrix. Consequently, the maximization of the log-likelihood
function reduces to a standard least-squares problem. However, as Fig. 3.2 and the exper-
iments in Section 3.6 show, such a simplistic covariance approximation for all data points
is typically not a theoretically sound one. A scalar weighting term is allowed in [VRB02],
though no guidance was provided on how to select the value of the scalar.
The weighted estimation problem has some inherent structure that leads to efficiency in
the maximization procedure. Appendix A.1 proves that the optimal estimate of the robot’s
translation can be computed using the following closed form expression.
31
Proposition 1 The weighted scan match translational displacement estimate, p̂ ij , is
nij
X
p̂ij = Ppp ˆi − R̂ij ~u
(Pkij )−1 (~u ˆj ) , (3.15)
k k
k=1
nij !−1
X
Ppp = (Pkij )−1 . (3.16)
k=1
There is not an exact closed form expression for estimating the rotational displacement
φij . However, there are two efficient approaches to computing this estimate. In the first
approach, the translational estimate of Eq. (3.15) is substituted into Eq. (3.11) (or equiva-
lently, into Eq. (3.14)). Since the resulting expression is a function of the single variable φ ij ,
the estimation procedure reduces to numerical maximization over a single scalar variable
φij , for which there are many efficient algorithms.
Alternatively, one can develop (Appendix A.2) the following second order iterative so-
lution to the nonlinear estimation problem:
where
0 −1 qk ˆj
= R̂ij ~u k
J = , (3.18)
1 0 ˆi − p̂ij − R̂ij ~u
pk = ~u ˆj .
k k
Using various experimental data, I have found that this approximation agrees with the exact
numerical solution up to 5 significant digits. However, the approximation is computationally
more efficient to implement.
Props. 1 and 2 suggest an iterative algorithm for estimating displacement. An initial guess
φ̂−
ij for φij is chosen. A translation estimate p̂ ij is computed using Prop. 1. This estimate
32
can be used with an exact numerical optimization procedure or with Prop. 2 to update the
current rotational estimate φ̂− +
ij . The improved φ̂ij is the basis for the next iteration. The
odometry is not necessary for the method to work. An open-loop estimate of the robot’s
displacement based on the known control inputs that generate the displacement will often
provide sufficient accuracy for an initial guess. I show in Section 3.6.1 that the algorithm’s
performance is not hampered by quite large errors in the initial value of the displacement
used as a seed for the algorithm. Note that if odometry does provides the initial guess, there
will be no correlation between the estimate arising from my scan matching algorithm and
the odometry estimate since the accuracy of the latter is not considered in the estimation
process. This simplifies subsequent fusion of these estimates that may be desired for some
applications.
I prefer an iterative algorithm for two reasons. First, nonlinear ML problems are suited
to iterative computation. Second, the correct correspondence between point pairs cannot
be guaranteed in the point correspondence problem (see Section 3.4). This is especially true
in the first few algorithm iterations, where some inaccurate initial pairings are unavoidable.
My iterative approach allows for continual readjustment of the point correspondences as
the iterations proceed.
Letting p̃ij = pij − p̂ij , φ̃ij = φij − φ̂ij (i.e, p̃ij , φ̃ij are translational and the rotational
displacement error estimates), a direct calculation yields the following:
The proofs for Prop. 3 are given in Appendix A.3. For a given sensor, one must derive
appropriate uncertainty models, which are then substituted into the above procedure.
Corollary 4 Matching of distant features (in the limit features at infinite distance from
the current location) minimizes the expected error in the orientation displacement estimate.
In the limit, the relative orientation error is zero.
Note 3: Since all matrices Pkij , k = 1, . . . , nij , in Eq. (3.16) are positive definite, the
covariance of the translational estimate, P pp , can be written as
nij
X
(Ppp )−1 = (Pkij )−1 > (Pkij )−1 ⇔
k=1
Ppp < Pkij , k = 1, . . . , nij . (3.23)
Here I used the notation X > Y to indicate that the difference X − Y is a positive definite
matrix. Eq. (3.23) leads to the following corollary:
34
Corollary 5 Let U ij = mink=1,...,nij Pkij denote the minimum covariance over all corre-
sponding point pairs. The translational covariance estimate P pp given by Eq. (3.16) is
bounded above by U ij : Ppp < U ij .
This corollary states that the covariance of the translational estimate will always be less
than the best single covariance associated with any corresponding point pair.
In order to derive explicit expressions for the covariances of Eq. (3.9), this section develops
models for the errors inherent in the range scan matching process. Most of the models are
quite general, though I do make a few assumptions at some points that are most appropriate
for laser range scanners.
Many range sensing methods are based on the time of flight (e.g., ultrasound and some
laser scanners) or modulation of emitted radiation [AP96, ABL + 01]. The circuits governing
these measurement methods are subject to noise. These effects often can be well-modeled
in a simple way, enabling the simple computation of the covariance contributions NPi and
k
N P j. In Section 2.5.2 I derive a model for general range scan point process noise. I use Eq.
k
(2.34) with a zero bias assumption to define the values of the two noise terms as
N (dik )2 σθ2 2 sin2 θki − sin 2θki σd2 2 cos2 θki sin 2θki
Pki = + , (3.24)
2 − sin 2θki 2 cos2 θki 2 sin 2θki 2 sin2 θki
j j j j
(djk )2 σθ2 2 sin2 θk − sin 2θk σd2 2 cos2 θk sin 2θk
N
Pkj = + , (3.25)
2 − sin 2θkj 2 cos2 θkj 2 sin 2θkj 2 sin2 θkj
where dik and djk are the range vales to the k th scan points sensed from pose i and pose j,
respectively. Similarly, θki and θkj are the heading values for these scan points. I define σ d2
and σθ2 as the variance terms in range and angle for the modeled range sensor.
35
3.3.2 Correspondence Error
Here I analyze the correspondence error described in Section 3.2.1. I then derive a prob-
abilistic approximation to this error. My derivation assumes that the sensor beam strikes
an environmental boundary that is locally a straight line segment (Fig. 3.1). However, this
derivation can be extended to other boundary geometries, or it can serve as an excellent
tangent approximation for moderately curved boundaries.
I first develop a formula for the maximum possible correspondence error that can occur
due to the fact that the exact same boundary points are not sampled in two successive
ˆi
range scans. Consider how nearby scan points will be matched in the vicinity of points ~u k
ˆj in Fig. 3.1. Let
and ~u k
i
δ+ ˆi − ~u
= ||~u ˆi ||, i
δ− ˆi − ~u
= ||~u ˆi || (3.26)
k+1 k k k−1
denote the distance to the adjacent scan points (from pose i’s scan) near the candidate
matching point ~u ˆj − ~u
ˆi (see Fig. 3.1). Similarly, let δ j = ||~u ˆj − ~u
ˆj || and δ j = ||~u ˆj ||
k + k+1 k − k k−1
denote the distances to the adjacent scan points (from pose j’s scan) near the candidate
ˆj . The maximum distance (or error) between any pair of points that are
matching point ~u k
chosen to be in correspondence will be half of the minimum distance between adjacent scan
points. If the error is greater than this value, the point will be matched to another point,
i + δ i )/4
or it will not be matched at all. On average, this error will be the minimum of (δ + −
j j
or (δ+ + δ− )/4. Simple geometric analysis of Fig. 3.1 shows that
i + δi
δ+ − lki sin β 1 1
= +
4 4 sin(αik + β) sin(αik − β)
lki sin β sin αik cos β
= . (3.27)
2 sin2 αik − sin2 β
j j
Substituting j for i yields the analogous formula for (δ + + δ− )/4.
I now propose a probabilistic model for the correspondence errors, and develop explicit
formulas for its first two moments. For simplicity, and without loss of generality, let the
i +δ i < δ j +δ j (i.e., the correspondence error is defined by pose
robot be situated so that δ+ − + −
i). Recall the correspondence error formula of Eq. (3.5): c ij uik − Rij ~ujk − pij . Letting
k = ~
ˆi , the correspondence error is locally a
x be the position along the boundary relative to ~u k
36
function of x. With no correspondence error, x = 0. Since the correspondence error is
locally collinear with the boundary’s tangent, let µ ij ij ij
k = ck · tk be the projection of ck onto
ˆi . The vector tk is positive pointing from ~u
the unit boundary tangent vector, t k , at ~u ˆi to
k k
ˆi . Hence, µij is a signed quantity, and cij = µij tk . The expected value (mean) of the
~u k+1 k k k
i , δ i ] is
error in the interval x ∈ [−δ− +
Z i
δ+
E[µij
k] = µij
k (x)P(x)dx, (3.28)
i
−δ−
where P(x) is the probability that the k th scan point from pose j will be located at x.
I assume that the geometry of the robot’s surroundings is not previously known. There-
fore, it is not possible to know a priori the probabilistic distribution of the correspondence
errors, P(x). I reasonably assume that P(x) has an a priori uniform probability. That is,
ˆj that is matched to ~u
the scan point ~u ˆi could lie anywhere in the interval [−δ i , δ i ] with
k k − +
i + δ i ). Realizing that µij (x) = x in the interval
no preferred location. Hence P(x) = 1/(δ + − k
i , δ i ], evaluation of Eq. (3.28) yields
[−δ− +
i )2 − (δ i )2
(δ+ −
E[µij
k] = i + δi
i
= δ+ i
− δ−
δ+ −
lki sin2 β cos αik
= −2 . (3.29)
sin2 αik − sin2 β
Note that when the incidence angle is not normal (α ik 6= 90o ), the mean is non-zero. How-
ever, since the mean is proportional to sin2 β, this term is negligible when the magnitude
of β is small. Hence, the correspondence error can be practically considered to be a zero-
mean quantity when β is small (this holds for the experiments described in Section 3.6).
To compute the variance of the correspondence error (using the zero-mean assumption),
Z i
δ+ i )3 + (δ i )3
x2 (δ+ −
E[(µij 2
k) ] = i + δi dx = i + δi ) . (3.30)
i
−δ− δ+ − 3(δ + −
Letting ηki = αik + θki , and keeping the above results in mind, the covariance of the corre-
37
spondence error, CP i of Eq. (3.9), can be found as
k
C
Pki = E[cij ij T ij 2
k (ck ) ] = E[(µk ) ]tk tk
T
(3.31)
i )3 + (δ i )3 cos 2 ηi cos η i sin η i
(δ+ − k k k
= .
3(δ+i + δi ) i i 2 i
− cos ηk sin ηk sin ηk
Note that this expression is a function of the sensor beam’s incidence angle, α ik . In Section
3.5 I discuss how to estimate this quantity from the range scan data.
Because I do not want to assume prior knowledge of the environment’s geometry, I
consider the correspondence errors to be independent. This assumption is conservative in
that there is no assumption of structure in the environment beyond the immediate geometry
of the local point pairs. It would be possible to predict subsequent correspondence errors
along a wall (or other regular geometric structure) given the knowledge that the subsequent
corresponding point pairs did indeed come from the same exactly straight wall. With a
proper line fitting method (e.g., see [PRB03]), the correlations between correspondence
errors could be estimated from the line fitting method’s uncertainty model. 3
In general, knowing that adjacent corresponding pairs lie along a common wall will
significantly reduce the magnitude of Eq. (3.30), which in turn will lead to lower variances
for most of the points along the wall. In this case, the correspondence error variance becomes
dominated by the uncertainty in the wall’s geometry, which in turn is a function of the line
fitting method. These effects can fit easily within my framework if desired, leading to even
better displacement estimates and tighter estimate covariances. However, I choose to take
a conservative approach where it is not assumed that the robot’s surrounding geometry is a
priori known. Moreover, since the reduction in uncertainty will only occur for points along
one line (or other geometric feature), in even modestly complex environments, the amount
of precision to be gained by using this approach is unlikely to be worth the complexity of
implementing these more advanced methods.
Range measurement bias is an artifact of some range sensing methods (e.g., see [AP96]).
Since bias models will strongly depend upon the given range sensing method, it is not
3
In the case of correspondence error correlations, the likelihood model of Eq. (3.10) will no longer take
a product form. The form of the likelihood model in this case will depend upon the line fitting method.
38
possible to give a complete summary of bias models for common sensing methods. Instead,
I consider a general approach for calculating the effect of bias on the displacement estimate.
As introduced in Section 2.5.2, Eq. (2.36), the point bias approximations ~bik and ~bjk can
be decomposed into the following form:
4
To analyze the bias effect, let ε̃ij ij ij ij
oik − Rij ~ojk is the total constant
k = εk + õk , where õk = ~
(that ignored the constant bias term). Incorporating the bias offsets, the likelihood function
takes the form
where P̃kij is the covariance matrix with bias uncertainty taken into account:
P̃kij = Q̃ij ij T
k + Rij S̃k Rij , (3.34)
i BP j ,
where Q̃ij ij B ij ij
k = Qk + P k and S̃k = Sk + k with BP i
k = E[δ~bik (δ~bik )T ] and BP j
k =
E[δ~bjk (δ~bjk )T ]. That is, the covariance formula is updated to include uncertainty in the
bias term. To obtain these results, it is again assumed that the bias noise is uncorrelated
with the range measurement noise and the correspondence error (since variance in bias is
typically a function of the variability of the surface properties, rather than measurement
noise).
Following the derivations that lead to Prop. 1, one can show that the translation
estimate in this case is
nij
X
p̂ij = P̃pp ˆi − R̂ij ~u
(P̃kij )−1 (~u ˆj + õij ) . (3.35)
k k k
k=1
Formulas analogous to Eq. (3.17) can be derived for the orientation estimate as well. The
previous covariance formulas take the same structure, with Q ij ij ij
k and Sk modified to Q̃k
and S̃kij (i.e., to include possible bias uncertainty terms). Clearly, Eq. (3.35) shows that
39
bias effects can influence the displacement estimate. However, bias models can be used to
compensate for bias effects in the estimate.
The focus of this work is to improve displacement estimation via more accurate consid-
erations of the noise and uncertainty inherent in the estimation process. However, the
displacement estimation process clearly depends upon the ability to successfully match cor-
responding points from range scans taken in adjacent poses. In order to isolate the benefits
of my estimation method, I use a simple “closest-point” rule similar to the one in [LM97b].
ˆi } and {~u
Given two scan sets {~u ˆj }, the outliers are removed in the first step. These are
k k
the points visible in one scan, but not in the other (see [LM97b] for details). After removing
the outliers, the algorithm attempts to find correspondences between scan point pairs in
the two poses. For every point in pose i, the algorithm searches for a corresponding scan
point in pose j that satisfies a range criterion: the corresponding point must lie within a
ˆi − ~u
given distance: ||~u ˆj || < d. If no points in pose j satisfy this criterion, then the point is
k k
The correspondence error model of Section 3.3.2 assumes knowledge of each scan point’s
incidence angle. While any method of incidence angle estimation can be used, I have chosen
40
a method that estimates the local geometry of the scan points using a Hough transform. The
Hough transform [DH72] is a general pattern detection technique that we use to determine
an estimate of the supporting line segment about a point. The incidence angle can then
be estimated from the configuration of the line segment. In the general Hough transform
line finding technique, each scan point {x k , yk } is transformed into a discretized curve in
the Hough space. The transformation is based on the parametrization of a line in polar
coordinates with a normal distance from the line to the origin, d L , and a normal angle, φL
Values of φL and dL are discretized with φL ∈ {0, π} and dL ∈ {−D, D}, where D is the
maximum sensor distance reading. The Hough space is comprised of a two-dimensional hash
table of discrete bins, where each bin corresponds to a single line in the scan point space.
For each scan point, the bins in Hough space that correspond to lines passing through that
point are incremented. Peaks in the Hough space correspond to lines in the scan data set.
As the bins in the Hough space are incremented, a history of the contributing scan point
coordinates is maintained in the bin, so that when a peak is determined to represent a line,
the contributing set of points can be recovered. The incidence angles can then be estimated
for every point in the line.
The algorithm is only precise up to the level of discretization chosen for the line pa-
rameters. Both computational complexity and the memory needed for the hash table grow
with finer discretization so it is important to establish a reasonable balance between pre-
cision and computing resources. For my implementation I found a line angle measurement
precise to the nearest degree to be adequate for incidence angle estimation. Discretization
in distance was set to 10 mm, though this choice of this value is less significant as I am only
using the orientation of the fit lines.
The Hough transform is not limited only to straight line detection. It can also be used
to detect and fit simple curves such as circles and ellipses and even arbitrary shapes [Bal81].
The tangent vectors to these curves (and subsequently the incidence angle) can then easily
be estimated from the transform. For most indoor environments the line fitting method is
sufficient to determine incidence angles. More accurate line fitting methods (e.g., [PRB03]
and references therein) can be used to get more accurate estimates of incidence angle, but
41
the extra computation is typically not balanced by sufficiently better estimation accuracy.
For points that are not found to be clustered into a line, an incidence angle estimate
is not calculated. These points are weighted only according to the computed measurement
noises such that the covariance of the matching error at the k th point correspondence of
poses i and j from Equation (3.8) becomes
4
Pkij = N Pki + Rij N Pkj Rij
T
, (3.37)
I implemented my method on a Nomadics 200 mobile robot equipped with a Sick LMS-
200 laser range scanner. This sensor measures the range to points in a plane at every
half degree over a 180-degree arc, as seen in Figure 3.2. For the purpose of comparison,
I implemented an unweighted least-squares scan matching algorithm analogous to that of
Lu and Milios [LM97b], hereafter called the “UWLS.” Both the weighted and unweighted
estimation algorithms used the same point correspondence algorithm so that the comparison
could fairly focus on the relative merits of both estimation schemes. Section 3.6.1 compares
the robustness and accuracy of the algorithms in four different environment geometries.
Section 3.6.2 compares results from two longer runs. Section 3.6.3 presents the estimated
computational costs of the algorithms. My experiments used the values β = 0.5 o , σl = 5
mm, and σθ = 10−4 radians obtained from the Sick LMS-200 laser specifications.
The experiments reported in this section focus on two aspects of estimation performance:
the robustness with respect to errors in the initial displacement estimate that seeds the
iterations of the algorithm, and the accuracy of the displacement estimates. A more robust
algorithm can successfully recover from a wider range of errors in the initial displacement
guess. In practice, such errors in the initial displacement estimate come from large odometry
errors, or might arise in the absence of odometry when the initial guess is provided by an
open loop estimate of the robot’s motion response.
42
Unperturbed Trial: Unperturbed Trial:
Final Error in Final Error in
Position (mm) Orientation (mrad)
Test Weighted UWLS Weighted UWLS
Fig 3.3 0.19 1.33 0.23 8.8
Fig 3.4 1.5 3.6 0.43 1.4
Fig 3.5 2.5 9.8 0.57 16.0
Fig 3.6 1.8 4.1 0.0334 0.31
Table 3.2: Statistics for perturbed trials in the robustness and accuracy comparison tests.
To test for robustness, I ran each algorithm through multiple trials with the same pair
of scans, each time only perturbing the initial displacement guess. Some initial guesses were
sufficiently poor that the algorithm converged to an erroneous solution. An estimate was
deemed successful when the true measured displacement lay within the 3σ deviation range
as defined by the algorithm’s calculated covariance (the UWLS covariance was calculated
using the formula given in [LM97a]). The initial displacements ranged from 0 to 600 mm
at 8 radial directions (every π/4 radians) at increments of 200 mm in position, and ranged
from -0.6 o 0.6 radians in orientation, at increments of 0.02 radians. For each of the 25
discrete initially perturbed positions, I tested 61 initially perturbed orientations to generate
1,525 unique initial condition perturbations. These perturbations were added to the true
displacement to create initial conditions for the 1,525 trials for each algorithm and each
environmental condition described below.
I also compared the overall accuracy of each algorithm’s displacement measurement.
The true displacements were measured by hand with an uncertainty of less than 2 mm in
displacement and 0.002 radians in orientation. I ran this robustness and accuracy test over
four different scan pairs.
43
A
Robot Pose
Unweighted
989 estimates (64.9%)
converge to within 3σ Weighted
of true displacement 1388 estimates (91.0%)
converge to within 3σ
of the true displacement
Figure 3.3: A) Experiments with initial displacement perturbations between scans taken at
a single pose. B) Close-up of robot pose with results.
The first experiment shown in Fig. 3.3 tests for robustness and accuracy while isolating the
effects of my modeling of the point correspondence error (Section 3.3.2). In this test, two
scans were taken from the exact same robot pose (i.e., the robot was not moved between
scans), with one scan comprised only of the even scan points and the second scan comprised
only of the odd scan points. In this way, correspondence errors are artificially introduced
into the two scans.
The two scans and the initially perturbed positions are shown in Fig. 3.3A. The dis-
placement estimates of the successfully converged estimates are shown in Fig. 3.3B. The
results of the two runs with unperturbed initial guesses are shown with boldfaced markers,
44
along with the 3σ uncertainty boundary of these estimates (shown as dashed ellipses). Of
the 1,525 runs with initial displacement perturbations, my algorithm converged successfully
in 91.0% of the cases while the UWLS algorithm was successful in 64.9% of the cases. The
average error for successful weighted estimates was 0.63 mm and 0.00079 radians while the
average error for successful UWLS algorithm estimates was 1.8 mm and 0.0086 radians.
The error for the case when the initial displacement guess was unperturbed was 0.19 mm
and 0.00023 radians for my weighted algorithm and 1.33 mm and 0.0088 for the UWLS
algorithm. Though the true displacement between the poses was exactly zero (since the
scans were taken at the same robot pose), due to the even/odd nature of the scans no two
corresponding scan points sample the exact boundary points of the environment. The effect
of this correspondence error on the UWLS algorithm can be visualized in the presence of
three distinct local minima in Fig. 3.3B. This multi-modal result surrounding the value is
often seen in UWLS algorithm robustness test results.
Fig. 3.4 shows results from initial condition robustness testing on two scans taken in our
lab with true position and orientation displacements of 683 mm and 0.467 radians. Fig.
3.4A shows the robot poses and scans under consideration, as well as the initial perturbed
displacement guesses. Fig. 3.4B shows the results obtained by starting the algorithms from
the 1,525 different initial displacement perturbations. My algorithm successfully converged
in 82.0% of the cases while the UWLS algorithm was successful in 56.9% of the cases. The
average error for successful weighted estimates was 1.8 mm and 0.00067 radians while the
average error for successful UWLS algorithm estimates was 6.0 mm and 0.0026 radians.
The error for the case when the initial displacement guess is unperturbed was 1.5 mm and
0.00043 radians for my weighted algorithm and 3.6 mm and 0.0014 for the UWLS algorithm.
Fig. 3.5 shows the results of the same type of testing performed on a pair of scans in
which the environment changed between scans. Note that the horizontal double wall on
the lower left side of the figure is actually a table at almost exactly laser height. The first
scan sampled the wall behind the table while the second scan sampled the front edge of the
table due to small changes in floor geometry. The additional nearby obstruction to the left
45
A
Pose 1
Pose 2
Weighted
1252 estimates (82.0%)
converge to within 3σ
of the true displacement
Initially unperturbed
Pose 2
unweighted estimate
Initially unperturbed
weighted estimate
True Pose 2 Displacement
Unweighted Displacement Estimates
Weighted Displacement Estimates
Pose 2 Measurement Covariance (3σ)
Unweighted Covariance (3σ)
Weighted Covariance (3σ)
−280 −278 −276 −274 −272 −270 −268 −266 −264
(mm)
Figure 3.4: A) Experiments with initial displacement perturbations between scans taken at
different poses. B) Close-up of pose 2 with results.
of the robot was caused by a person who moved between the two scans. The range points
associated with these non-repeating objects represent 29.2% of the total number of scan
points. For the 1,525 trials with different initial displacement perturbations, my algorithm
was successful in 95.5% of the cases, while the UWLS algorithm was successful in 31.2%
of the cases. The average error for successful weighted estimates was 2.5 mm and 0.00057
radians while the average error for successful UWLS algorithm estimates was 11.1 mm and
0.016 radians. The error for the case when the initial displacement guess is unperturbed is
2.5 mm and 0.00057 radians for my weighted algorithm and 9.8 mm and 0.016 for the UWLS
algorithm. These results show that my method’s emphasis on weighting each scan point
results in superior robustness to the presence of a significant number of non-corresponding
46
A Pose 1
Moving Person
Pose 2
−6000 −5000 −4000 −3000 −2000 −1000 0 1000 2000 3000 4000
(mm)
range points.
Fig. 3.6 shows the results of analogous testing done in a nearly symmetrical hallway.
In a perfectly symmetrical hallway with no discernible details along the walls, no scan-
based algorithm can effectively correct initial displacement errors in the direction along the
hallway’s main axis. In this test, a single door is open at a slight angle on the left side of
the hallway. The presence of this feature allows for possible scan matching convergence.
For the set of 1,525 initial displacement perturbations, my algorithm successfully converged
in 75.1% of the cases while the UWLS algorithm was successful in only 3.0% of the cases.
The average displacement estimate error for the successful weighted estimates was 3.1 mm
and 3.92 ∗ 10−5 radians while the average error for successful UWLS algorithm estimates
47
A B Closeup : Pose 2
Pose 1
Pose 2
Unweighted
46 estimates (3.0%)
converge to within 3σ
of true displacement
Initially
unperturbed
unweighted
estimate
Pose 2
Initially
unperturbed
weighted
estimate
Weighted
1145 estimates (75.1%)
converge to within 3σ
of true displacement
was 14.5 mm and 0.00047 radians. The error for the case when the initial displacement
guess is unperturbed is 1.8 mm and 3.34 ∗ 10 −5 radians for my weighted algorithm and
4.1 mm and 0.00031 radians for the UWLS algorithm. In effect, the weighted algorithm
better uses the hallway’s small non-symmetries to correct the position estimation along
the hallway axis. This significantly better performance is primarily due to my approach of
modeling the correspondence errors, which discounts the contributions along the hallway’s
axis (since there is very low certainty in that direction). Instead, the small asymmetries are
effectively accentuated. Conversely, in the UWLS algorithm the contributions of the non-
symmetries are effectively lost, resulting in very poor correction of position errors along the
48
hallway. The plots of the uncertainty ellipses in Fig. 3.6B also show how only my weighted
algorithm’s calculated covariance reflects a greater uncertainty in the direction parallel to
the hallway, as would be expected.
The above results showed not only the improvement in robustness of my algorithm over the
UWLS algorithm, but also a significant improvement in the overall accuracy of the successful
final displacement estimates. This improvement in accuracy is best seen in longer runs with
multiple displacement estimates added end to end.
Fig. 3.7 shows a 32.8-meter loop path consisting of 109 poses with the final pose the same
as the starting pose. Because of the difficulty of hand measuring each pose I analyze and
compare only the initial and final positions. For each step the current and previous scans
are processed by each algorithm with odometry supplying the initial guess, and updated
displacement and covariance estimates are calculated. In order to maintain statistical inde-
pendence in my estimates, two scans were taken at each pose, with scan 1 used to match
with the pose behind and scan 2 used to match with the pose ahead. In practical ap-
plications, such a dual scan procedure would not be necessary, as a Kalman filter could
incorporate the scans while accounting for the correlation between successive displacement
estimates. However, I do not use that approach here so that I can focus directly on the
properties of the displacement estimate, and not worry about the impact of the filter on my
results.
In order to close the loop, the second scan taken at the last pose is matched with the
first scan taken at the initial pose. Therefore a perfect series of displacement estimates
added tip to tail would result in exactly a zero overall displacement estimate. For the run
shown in Fig. 3.7, the final odometry error is 1.817 meters and 0.06 radians. The final
UWLS algorithm error is 0.271 meters and 0.021 radians while the final weighted algorithm
error is 0.043 meters and 0.0029 radians. The ratio of the final translation error to total
path length is 5.54% for odometry, 0.82% for the UWLS algorithm, and 0.131% for my
weighted algorithm. Perhaps more importantly, as shown in Fig. 3.7B, the final covariance
49
calculation for my algorithm clearly encompasses the true final pose within the 3σ bounds,
while the covariance calculation of the UWLS algorithm does not.
This improvement over the UWLS algorithm is even more pronounced in the presence of
poor odometry estimates. Fig. 3.8 shows an actual run where one of the odometry readings
was substantially corrupted as the robot rolled over a doorjamb when heading into the room
in the upper right hand corner of the plot. This path is a 24.2-meter loop consisting of 83
poses with the scans taken and loops closed as in the previous path. For the path shown
in Fig. 3.8 the final odometry error is 1.040 meters and 0.354 radians. The final UWLS
algorithm error is 0.919 meters and 0.200 radians while the final weighted algorithm error
is 0.018 meters and 0.013 radians. The ratio of the final translation error to total path
length is 4.30% for odometry, 3.80% for the UWLS algorithm, and 0.074% for the weighted
algorithm.
I implemented both algorithms in Matlab and analyzed their computational demands using
the Matlab Profiler on a desktop computer with a Pentium 4, 1.80 GHz CPU with 512
MB RAM. Within each iteration, computation is divided between the point correspondence
phase (which usually consumes the bulk of the computation) and the estimation phase.
The number of iterations required to reach convergence also affects the overall cost of
computation.
In the 109 steps of run 1 shown in Fig. 3.7, the correspondence method used on
both algorithms comprises 81.0% of the total UWLS algorithm computation time of 0.112
seconds/iteration and 44.3% of my weighted algorithm computation time of 0.205 sec-
onds/iteration. For the relatively low initial odometry errors in run 1, the UWLS algorithm
converges in an average of 12.78 iterations for an average computation time of 1.43 seconds
per displacement while my algorithm converges in an average of 10.36 iterations with a total
average computation time of 2.12 seconds per pose displacement. For larger initial odom-
etry errors, especially in orientation, the difference in iterations to convergence increases
to the point where my weighted algorithm is actually faster than the UWLS algorithm.
For the data shown in Fig. 3.4, when the orientation error is greater than 0.2 radians the
50
UWLS algorithm converges in an average of 42.98 iterations for an average computation
time of 4.81 seconds per displacement while my weighted algorithm converges in an average
of 22.60 iterations for an average computation time of 4.63 seconds per displacement.
In summary, my experiments show that in real world indoor environments, my method
provides significantly greater estimation accuracy and robustness as compared to an un-
weighted approach without a significant increase in computational cost. Clearly, the com-
putational demands in the estimation phase are larger for my algorithm (as compared to
an unweighted algorithm). However, since the computations required by the estimation
part of the algorithm account for only a small portion of each iteration, and my algorithm
often converges in fewer iterations compared to the UWLS algorithm, the total run time is
reduced.
This chapter introduced a new method for estimating robot displacement based on dense
range measurements. In particular, I investigated the effects of different error and noise
sources on the convergence and accuracy properties of these motion from structure algo-
rithms. My experiments showed that careful attention to the details of error modeling can
significantly enhance overall displacement and covariance estimation accuracy.
The first part of the chapter gave a general formulation of the displacement estimation
problem using weighted point pair correspondences. A general solution to the estimation
problem and formulas for the covariance of the displacement estimate were then derived.
I gave general models for range measurement noise, bias error, and correspondence error
to apply to this problem. Although parts of this analysis were mainly aimed at planar
laser range sensors, the methods can likely be extended to algorithms for non-planar laser
scanners [LR02, JM00], where detailed uncertainty modeling has not been considered, and
other range sensors such as stereo cameras, radar, ultrasound, etc. These techniques should
also be useful for methods that use both planar laser range finders and cameras to estimate
three-dimensional motion parameters [TG02, SNC97]. The specifics of the analysis must
be modified to incorporate the appropriate error/noise models for each particular sensor.
The accurate displacement estimates afforded by this method can be fused with odome-
try estimates [RB02] to provide better robot localization capability. Similarly, the improved
51
displacement estimation afforded by this method should in the future lead to more accurate
mapmaking and localization procedures.
52
−4000 −3000 −2000 −1000 0 1000 2000 3000 4000 5000 6000 7000
(mm)
Figure 3.7: A) A 109-pose, 32.8-meter loop path. B) Close-Up of final path poses, shown
the covariance estimates of the weighted and unweighted algorithms.
53
Figure 3.8: A) An 83-pose, 24.2-meter loop path. B) Close-up of final loop poses.
54
Chapter 4
This chapter presents line feature-based methods of mobile robot mapping and localization.
First the general advantages of feature based approach are outlined, followed by the moti-
vation for my specific choice of feature representation. Methods are then presented for line
feature extraction and comparison which achieve a significant improvement over prior work
through the rigorous treatment of sensor and feature uncertainty. This approach allows for
more accurate and detailed mapping and localization methods as presented in the results
at the end of this chapter.
A line segment is a simple feature. Hence, line-based maps represent a middle ground be-
tween highly reduced feature maps and massively redundant raw sensor-data maps. Clearly,
line-based maps are most suited for indoor applications, or structured outdoor applications,
where straight-edged objects comprise many of the environmental features. My novel ap-
proach to line segment feature extraction and comparison also allows for probabilistically
accurate treatment of much shorter line segments than is seen in prior work, even down to
56
single points. These shorter segments can then be used to localize the robot from aspects
of the environment with arbitrary curved contours. The result is a hybrid feature represen-
tation that takes advantage of a structured environment but has the flexibility to represent
every point in the data set.
The idea of fitting lines to range data is not a new one. The solution to the problem of
fitting a line to a set of uniformly weighted points can be found in textbooks (e.g., [P + 92],
[FP02]), and others have presented algorithms for extracting line segments from range data
(e.g., [MNRS97, BA00, GMR98]). However, since these algorithms do not incorporate noise
models of the range data, the fitted lines do not have a sound statistical interpretation.
Several authors have used the Hough transform to fit lines to laser scan or sonar data (e.g.,
[JC98, FLW95, IN99]), but the Hough transform alone does not take noise and uncertainty
into account when estimating the line parameters.
The recursive mode of the Kalman-Filter was used by Ayache and Faugeras [AF89] to
extract and fit line segments to groups of noisy pixels, and has since been applied to range
data in [RB00b] and [CT99]. The methods in [AF89] and [RB00b] both specify constant
weighting for all point contributions. Castellanos and Tardos in [CT99, CMNT99] account
for the individual point uncertainties in estimating the parameters of the line. However,
they choose to calculate the covariance of the line parameters using an ad-hoc approach
that uses only the uncertainty of the line segment endpoints, and ignores the uncertainty
contribution of the interior points.
To my knowledge, the line fitting procedure presented here for a polar line representation
in the case of range data with varied uncertainty is new. A key feature and contribution of
this approach are the concrete formulas for the covariance of the line segment fits, and the
allowance of individual weighting of each measured point. This accurately modeled feature
uncertainty allows other algorithms that use the line-maps to appropriately interpret and
incorporate the line segment data. As an example, a simple set of weighted points and the
corresponding feature is shown in Fig. 4.1.
Ayache and Faugeras, as well as Castellanos and Tardos, also present methods to merge
line segments across multiple scans using a Kalman filter. Ayache and Faugeras use line
representation based on endpoint coordinates, which is far less robust to errors from occlu-
sion when mapping a real-world environment. Their focus is primarily on three-dimensional
mapping using stereo vision sensors. Castellanos and Tardos focus on mapping from planar
57
0 0
−200 −200
−400 −400
−600 −600
−800 −800
−1000 −1000
−1200 −1200
−1400 −1400
1000 1200 1400 1600 1000 1200 1400 1600 1800
Figure 4.1: Example of line segment fit: data points (left) and fitted line with a represen-
tation of its uncertainty (right).
range scans, and use a similar polar line-based representation. The work of Castellanos and
Tardos therefore offers the best point of comparison for the following benefits of my work:
1) The more accurate feature covariances computed from my method, allow for more
accurate line-based merging and mapping in a statistically sound fashion.
2) The rigorous treatment of the feature uncertainty allows line segments with high
orientation uncertainty to be accurately compared and merged. Castellanos and Tardos
make a small angle assumption in the orientation uncertainty, and their mapping methods
are therefore restricted to longer line features. Points that don’t form long lines are ignored.
The added flexibility of my approach enables the use of shorter line segments that can
describe more arbitrary contours, and allow for the use of the entire data set when localizing
and mapping.
3) Unlike prior line-based mapping methods, my methods allow for partial feature corre-
spondence, which enables three results for the matching hypothesis tests for a line segment:
correlation of just the underlying infinite line; correlation of the line and one endpoint; and
correlation of the line and both endpoints. The merging and mapping methods take advan-
tage of the partial correspondences by selectively merging the portions of the features that
match. In the line-based mapping methods of Castellanos and Tardos, the possibility of
endpoint correspondences are ignored. Alternatively the methods of Ayache and Faugeras,
58
which require endpoint correlation for a positive feature match, would be susceptible to
missed matches due to occlusion effects.
4) In Chapter 5, I augment this feature representation and develop a multi-scale ap-
proach to mapping and localization. The single-scale methods developed in this chapter
form the foundation on which I build multi-scale algorithms that introduce further im-
provements to existing approaches.
Chapter Overview
This chapter first introduces the line feature representation and associated definitions.
Methods to extract these features from range point data are then presented in two steps:
1) Initial detection using a Hough transform based approach to collinear point grouping,
and 2) Optimal line fitting using a novel approach that computes feature parameters using
a detailed model of the sensor noise. Next, methods of establishing correlations of features
across scans are defined, followed by methods to merge lines and examples. The Kalman
filter equations are then derived as the basis for a line feature based SLAM method. Results
are presented for the SLAM method along with comparisons with the analogous approach
of Castellanos and Tardos.
This section introduces a basic line feature representation. My specific choice of feature
representation is motivated by the need for feature robustness in real-world sensor-based
applications. Even small changes in robot position can result in large differences in the raw
sensor data due to occlusion effects. A changing environment also can greatly increase the
magnitude of discrepancies across data collected at different times. A robust feature can be
defined as one that can be reliably and repeatably detected in a given environment in the
presence of occlusion effects and a moderately changing environment.
The primary parameters I use for feature comparison are the orientation and normal
position of the underlying infinite lines of features extracted from range data. As I will show
in my experimental results, these parameters are very robust when compared across range
scans, especially with regard to occlusion effects. I use a two parameter, polar representation
for the underlying infinite line. The line segment representation augments the infinite
59
line representation with added endpoint parameters. I will also introduce the uncertainty
representation for the parameters of each of these features and specifically address the
nonlinearities associated with feature orientation uncertainty.
L
ρ
α
The core of my feature representation is the polar form of the infinite line. Not only is
the polar form of the line a minimal representation of the two-dimensional feature, but it
also allows for easy comparison of line orientation and normal position. I define the infinite
line L as
α
L= , (4.1)
ρ
where ρ and α represent the magnitude and heading of the vector that extends from the
origin to the line and is perpendicular to the line. Thus α defines the orientation of line L,
while ρ defines the normal distance to the origin or the position of the infinite line L. See
Figure 4.2 for a graphical representation.
Let the estimated line orientation measurement α̂ be defined as the sum of the “true”
orientation α and an error term εα :
α̂ = α + εα . (4.2)
Similarly let the line position measurement ρ̂ be defined as the sum of the “true” position
ρ and an error term ερ :
ρ̂ = ρ + ερ . (4.3)
60
I assume the error terms εα and ερ to be zero-mean, Gaussian random variables. In
Section 4.3 I derive methods for calculating these error terms based the feature extraction
process and justify the normal distribution assumption. I define the covariance matrix
associated with line L as follows:
PL = E εL (εL )T , (4.4)
so
E ε2α E [εα ερ ] Pαα Pαρ
PL = = . (4.6)
E [ερ εα ] E ε2ρ Pρα Pρρ
Pρρ is the variance in the position ρ of the line, P αα is the variance in the orientation α of
the line, and Pρα and Pαρ are the cross-correlation terms. Since the covariance matrix is,
by definition, positive definite and symmetric, P ρα = Pαρ .
In Sections 2.5.1 and 2.5.1 I reviewed basic frame transformations for poses in SE(2) and
their covariances. In this section I outline the frame transformations for an infinite line
feature L and covariance PL .
Coordinate Transformation
Consider an infinite line measured in this local reference frame i where L i = [αi , ρi ]. To
61
L
ρ
ρ i
0
αi
yi pose i φi
α0
xi
Figure 4.3: Line coordinates with respect to a global frame and a frame at pose i.
transform the line coordinates into a global frame, where the line is denoted by L 0 ,
α0 αi + φ i
L0 = = , (4.8)
ρ0 ρi + δρi
where δρi is the projection of the displacement g i into the dimension normal to the line L i ,
and is defined as
δρi = xi cos(αi + φi ) + yi sin(αi + φi ), (4.9)
L δψ
i
ρ ρ
i
0
αi
pose i
α0
δρi
Figure 4.4: Line coordinates with respect to a global frame and a frame at pose i. The
values δρi and δψi represent the pose i displacement in the “ρ − ψ” frame.
62
Covariance Transformations
Consider a covariance matrix PLi of infinite line Li = [αi , ρi ] measured with respect to a
pose gi . The matrix PLi can be transformed to the global frame at pose as follows:
with
1 0
Hi = , (4.11)
δψi 1
It is important to note that this is a translation transformation and that the eigenvalues
of the covariance matrix are independent of the reference pose. Carrying through this
transformation yields
1 0 Pαα Pαρ 1 δψi
P L0 = (4.13)
δψi 1 Pρα Pρρ 0 1
Pαα Pρα + δψi Pαα
= .
Pρα + δψi Pαα Pρρ + 2δψi Pρα + (δψi )2 Pαα
It follows that for a general Pgi defined in Eq. (2.21), one can further compute
Pxx cos2 (α0 ) + 2 cos(α0 ) sin(α0 )Pxy + sin2 (α0 )Pyy Pxγ cos(α0 ) + Pyγ sin(α0 )
P̂g0 =
Pxγ cos(α0 ) + Pyγ sin(α0 ) Pγγ
P̂ρρ P̂ργ
= . (4.18)
P̂γρ P̂γγ
^
P ρρ
pose i
Pg i
In Fig. (4.5) the position covariance ellipse for P gi is shown, as well as the projection of
this uncertainty P̂ρρ in the direction normal to the line. I can then define the transform of
the local covariance measurement to a global notion of uncertainty for the infinite line by
64
combining the pose and line covariances as follows:
Eq. (4.11) defined a transformation H i of the covariance of line Li from reference frame gi
to reference frame g0 . For any line covariance matrix PL there exists a special value of H
defined as follows:
1 0
HP = , (4.20)
δψP 1
or
σα2 0 1 0 1 −δψP
= PL , (4.23)
0 σρ2 −δψP 1 0 1
where σα2 and σρ2 represent variances with respect to a reference frame at which the random
~P as the point along line L that lies at
variables α and ρ are independent. I then define V
this point of random variable independence for α and ρ. I will refer to this point as the
center of rotational uncertainty and it can be computed for line L(α, ρ) and P L as follows:
xP ρ cos(α) − δψP sin(α)
~P =
V = . (4.24)
yP ρ sin(α) + δψP cos(α)
Figure 4.6 shows a representation of line L with uncertainty bounds. The figure rep-
resents the orientation uncertainty bounds by red dotted lines crossed at the center of
~P . The uncertainty bounds in position ρ are represented by the
rotational uncertainty V
65
L ψP
σα
VP ρ
σρ
α
Figure 4.6: Infinite line and line covariance representation. ~u P represents the center of
rotational uncertainty.
red dotted lines parallel to the original line. The green hyperbola represents the combined
uncertainty bounds for position and orientation of the line, with asymptotes defined by σ α
and the distance between curves defined as 2σ ρ . Note the graphical representation of ψ P
~P from the intersection at the closest approach
as the projected distance along the line to V
vector.
ψb
ρ S
ψa
axis
+ρ
+ψ
axis
α
ρ
ρψ frame
My notion of a segment feature builds upon the infinite line feature developed above.
The ends of a line segment coincident with the infinite line are represented as a scalar value
pair [ψa , ψb ]. For a given feature with orientation α, I define a “ρ − ψ” frame by rotating
the origin frame through angle α as shown in Figure 4.7. The scalar values ψ a and ψb are
measured with respect to the ψ axis and can take positive or negative values. I define a line
66
segment S as
α
ρ
S= . (4.25)
ψa
ψb
ψ b2
ψ a2
S ψ b1
ψ a1
ρ
S
α
Note that segment S can be augmented with additional endpoint pairs to represent, in
a single feature, multiple segments that share the same underlying infinite line.
α
ρ
ψa1
S= ψb1 , (4.26)
...
ψan
ψbn
where n is the number of endpoint pairs. See Figure 4.8 for a graphical example with n = 2.
For future plots, derivations, and experimental results I assume n = 1 when referring to
line segment features, unless stated otherwise.
67
4.2.6 Line Segment Covariance
Let the estimated line endpoint measurements ψ̂a and ψ̂b , be defined as the sum of the
“true” distances ψa ,ψb and error terms εψa , εψa :
and
ψ̂b = ψb + εψb . (4.28)
The error terms εψa and εψa are assumed to have a zero-mean, Gaussian distribution.
See Section 4.3.5 for more details of how these error terms are modeled in the feature
extraction methods. The line segment covariance matrix P S can be written as
PS = E εS (εS )T , (4.29)
E εL (εL )T E [εL εψa ] E [εL εψb ] PL PLψa PLψb
h i
PS = E εψa (εL )T E ε2ψa E [εψa εψb ] = P ψa L P ψa ψa P ψa ψb ,
h i
E εψb (εL )T E [εψb εψa ] E ε2ψb P ψb L P ψb ψa P ψb ψb
where PL is the covariance of the underlying infinite line as defined in Eq. (4.6).
Here I extend the frame transformations introduced above in Section 4.2.3 for infinite lines
to account for the augmented line segment representation with endpoint terms.
68
Coordinate Transformation
Consider a line segment Si measured in a local frame with respect to pose i and transform
the line segment coordinates to the global frame as follows:
α0 αi + φ i
ρ0 ρi + δρi
S0 =
0 = i
,
(4.31)
ψa ψa + δψi
ψb0 ψbi + δψi
where δρi and δψi are the coordinates of the displacement g i projected into the 00 α − ρ00
frame as defined above in Eqs. (4.9) and (4.12), respectively.
Covariance Transformations
Consider a covariance matrix, PSi , of the line segment Si measured with respect to an
uncertain pose gi whose pose covariance matrix is Pgi with respect to the global reference
frame. The matrix PSi can be transformed to the global frame at pose i as follows:
1 0 0 0
δψi 1 0 0
H Si =
,
(4.33)
−δρi 0 1 0
−δρi 0 0 1
69
where δψi and δψi are defined in Eqs. (4.9) and (4.12). Using the same approach outlined
in Eq. (4.17) define KSi to be
0 0 1
cos(−αi + φi ) −sin(−αi + φi ) 0
1 0 0
K Si
=
(4.34)
sin(−αi + φi ) cos(−αi + φi ) 0
0 1 0
0 0 1
0 1 0
0 0 1
cos(αi + φi ) sin(αi + φi ) 0
=
.
(4.35)
−sin(αi + φi ) cos(αi + φi ) 0
−sin(αi + φi ) cos(αi + φi ) 0
For completeness the notion of rotational uncertainty for a line introduced in Section 4.2.4
is extended to a line segment. The equations still hold for the line segment’s underlying
infinite line, but the transformation matrix H P is extended:
1 0 0 0
δψP 1 0 0
HP =
P
,
(4.36)
δρa 0 1 0
δρPb 0 0 1
with
and δψP defined above in Eq. (4.21). As a consequence of the above definitions
σα2 0 0 0
0 σρ2 0 0 T
PS = H P
HP .
(4.39)
0 0 σψ2 a 0
0 0 0 σψ2 b
70
One can also compute
σα2 0 0 0
0 σρ2 0 0
= H −1 PS (H −1 )T , (4.40)
P P
0 0 σψ2 a 0
0 0 0 σψ2 b
where
1 0 0 0
−δψP 1 0 0
HP−1 =
.
(4.41)
−δρPa 0 1 0
−δρPb 0 0 1
This transformation will determine the independent variances for all parameters in a line
segment covariance matrix.
This section considers how to extract subfeatures from the line segment. When comparing
pairs of features across scans for possible correspondence, it can be useful to isolate specific
aspects of the features to develop a partial feature comparison. Section 4.4 introduces a fea-
ture matching method that uses partial feature comparisons to achieve improved robustness
feature variation. The subfeatures considered are the underlying line in polar coordinates
and the endpoints in Cartesian coordinates. Methods to extract these subfeatures from
the description of a line segment S are introduced as well as the means of extracting the
covariances associated with these subfeatures.
Underlying Line
Given a line segment representation, S, and its covariance, P S , the underlying line, L, and
covariance, PL , can be extracted as follows. Let
L = HS (4.42)
and
PL = HPS H T , (4.43)
71
where
1 0 0 0
H= . (4.44)
0 1 0 0
Endpoint Coordinates
ub ψb
S ψa
yb ua
ρ
ya
α
xb xa
For a given segment S the Cartesian coordinates of the segment endpoints can be cal-
culated as follows:
xa
~ua = = Rα Ha S, (4.45)
ya
with
0 1 0 0
Ha = (4.46)
0 0 1 0
and
cos(α) − sin(α)
Rα = . (4.47)
sin(α) cos(α)
with
0 1 0 0
Hb = (4.50)
0 0 0 1
so
xb ρ cos(α) − ψb sin(α)
~ub = = . (4.51)
yb ρ sin(α) + ψb cos(α)
Endpoint Covariance
The covariance matrices associated with endpoints ~u a and ~ub can be defined as follows:
and
Pub = Rα Hb PS HbT (Rα )T , (4.53)
with Ha and Hb defined in Eqs. (4.46) and (4.50) and R α the transformation matrix defining
a rotation through α.
σψb
σρ σ ψa
ub
σα
ρ S
α ua
Fig. 4.10 shows a graphical representation of the endpoint uncertainty as ellipses defined
by a fixed confidence interval. The uncertainty bounds of the underlying line are shown as
73
σρ σψb
ub S
σ ψa
ρ
σρ
α
ua
a hyperbola with a dotted line. The bounds defined by the endpoint uncertainty are shown
as dashed lines. Fig. 4.11 shows a similar plot for a line segment feature with very small
value of σα .
This section outlines a process to extract line segment features from a set of range scan
points. I represent the raw range data set as introduced in Section 2.5.2, wherein U consists
of n range points with U = {~uk }, k = 1, ..., n where ~uk is the k th Cartesian point in the
range scan. An example of a simulated range scan is shown in Figure 4.12A. The goal of this
procedure is to detect m groups of points that are subsets of U (defined as {U Si }, i = 1, ..., m)
such that the points in each subset U S are collinear within a specific margin of error. From
each of these point subsets one can compute the optimal line segment parameters S and
the associated covariance matrix P S defined in Eqs. (4.25) and (4.31). The result is a set
of line segments {Si , PSi }, i = 1, ..., m, which can be used to represent the original point
data, as seen in Figure 4.12B. Note that m is not a predetermined value.
The next chapter extends this feature extraction process to allow for computationally
efficient extraction of features at multiple geometric scales. Here my primary focus is on
accurate feature extraction and noise modeling at a single scale. Extraction is an iterative
process consisting of a preliminary step and six primary steps that are repeated for each
extracted feature.
Step 1) Initial line guess: Given Ur , calculate the dominant infinite line L̂ = [α̂, ρ̂] using
74
A B
Robot pose
1000 mm 1000 mm
Figure 4.12: A) Raw range scan points. B) Extracted line segment features.
Step 2) Point grouping: Determine the subset of points U L where UL ⊂ Ur and the
subset consists of points along the line L̂ as determined by the Hough transform.
Step 3) Point noise modeling: Compute the set of point covariances P UL for the points
in UL from a range sensor noise model.
Step 4) Weighted line fitting: Calculate optimal line segment parameters S = [α, ρ, ψ a , ψb ]
given UL , PUL , and initial estimates from the Hough transform (α̂, ρ̂). For these calcula-
tions the contribution of each point is weighted according to its individual covariance. The
optimal fitting procedure is described below.
Step 6) Subsequent feature extraction: Determine the subset of points U S that lie
within the uncertainty bounds of the line segment S where U S ⊂ UL . Remove US from Ur
and, if points remain in Ur , go to step (1) to extract additional features.
75
Steps 1 through 5 define the extraction of a single line segment feature and step 6
removes the points used in the new feature from consideration and checks if subsequent
feature extraction is warranted. The following subsections address each of these steps in
more detail.
I use a Hough transform as the basis for the initial line guess. The Hough transform is a
pattern detection method that is effective at estimating the existence and position of lines in
noisy point data. It is a voting method that uses a discretized accumulator in a transformed
space called a Hough space. Each scan point votes in this space and a peak corresponds to
the line parameters that define the line with the most points in it. Section 2.4 introduces
the Hough transform in more detail.
Figure 4.13A shows the rasterized Hough space with accumulated votes from all range
scan points shown in Figure 4.12A. The darker colors represent Hough space cells with more
votes. The peak cell lies at coordinates (ρ̂, α̂) in Hough space as shown. Figure 4.13B shows
the infinite line defined as L̂ = [α̂, ρ̂] in Cartesian space corresponding to this peak. The
data sets in these figures have been simulated, with noise added to create a clear example
for reference.
As discussed in Section 2.4, the choice of discretization level of the Hough space can
have a significant effect on final point groupings and computational time. If the Hough
space cells are too small, only perfectly aligned points would accumulate votes. Because
of the imperfection in the data from sensor noise, even a scan of a perfectly straight wall
would spread votes out across many cells and could result in false peak detection. Also the
computational time to increment the Hough space is proportional to the number of cells in
the α dimension. If the Hough space cells are large, there is some computational benefit,
but the resolution of the line guesses decreases. In this approach, the discretization size in
ρ is defined as Dρ and is set to be equal to the 3σ value of the modeled point noise. The
discretization size in α is defined as the following function of D ρ :
Dρ
Dα = arctan , (4.54)
ρmax
where ρmax is the maximum sensing range. Dα is roughly the range of orientation for a
76
long line given a normal variation in end position of D ρ .
A B
0 114
1.5
1 0.5
α (rad)
0
α0
−0.5
ρ0
(ρ0, α0)
−1
1000 mm
Given the estimated infinite line L̂, the subset containing points that lie along the line can
be defined as UL . This set can be computed explicitly by first calculating the distance δ k
of each point in Ur to line L̂ = [α̂, ρ̂] as follows:
where uk = [xk , yk ] is the k th point in Ur . The points where δk is less than a distance
threshold can then be grouped into U L . A reasonable value for the threshold is D ρ , which
is equivalent to the Hough space discretization cell size in the ρ dimension. Alternatively,
some computational efficiency can be gained by maintaining a list of pointers at each cell
in Hough space of the points that contributed a vote to that cell. In this case U L is simply
determined to be the set of points associated with the peak cell in Hough space found
in Section 4.3.1. Figure 4.14A shows the selected points along the infinite line estimate
shown in Figure 4.13B. The output is a collinear subset containing s points defined as
77
UL = {uLk }, k = 1, ..., s.
A range point noise model was described in some detail in Section 2.5.2. The covariances for
each point in UL , are estimated according to Eq. (2.34), repeated again here for reference:
(dk )2 σθ2 2 sin2 θk − sin 2θk σd2 2 cos2 θk sin 2θk
P uk = + .
2 − sin 2θk 2 cos2 θk 2 sin 2θk 2 sin2 θk
The variables dk and θk represent the range and heading values for the scan points, and σ d2
and σθ2 are the variance terms in range and angle for the range sensing noise model. Figure
4.14A shows these covariance bounds for selected points. The dashed lines represent the 3σ
bounds on the line’s uncertainty, multiplied by 200 for visibility.
A B
1000 mm 1000 mm
The goal of the weighted line fitting method is to estimate the line segment S that best
fits the set of points UL where the k th point is individually weighted by the inverse of the
covariance matrices Puk . A rough initial guess of the line parameters of the underlying
78
infinite line L̂ = (α̂, ρ̂) is given by the Hough transform calculation. For derivations in this
chapter the subscript L will be dropped for convenience and I will refer to the set of points
to be fit as U = {uk }, k = 1, ..., j and the covariances for these points as P U = {Puk }, k =
1, ..., j.
As defined in Section 4.2.5, Eq. (4.25) the representation of a segment S that will be
extracted from the point data as follows:
α
ρ
S= ,
ψa
ψb
where α and ρ are the polar coordinates of the segment’s underlying line. I formulate the
weighted line fitting process using a maximum likelihood approach, which is used to refine
the underlying infinite line orientation α and normal position ρ. Due to the nonlinearities
introduced by the orientation estimation process, an iterative technique is used to implement
the weighted line fitting problem:
Step 1) Compute the distance, ρ, to the line.
Step 2) Compute the center of rotational uncertainty ψ P for the point set associated with
the line segment.
Step 3) Compute the line angle estimate, α, about ψ P .
Step 4) Compute endpoint parameters ψ a and ψb for S.
Step 5) Calculate convergence criterion and repeat from step 1 if the criterion isn’t met.
Consider the k th measured scan point with measured range dˆ and measured scan angle θ̂,
along with the current best guess of the parameters of the infinite line (α̂, ρ̂). To formulate
the maximum likelihood estimation of ρ I define the virtual measurement δρ k as the normal
distance between the k th point and the hypothetical line:
Recall that the measurement noise is assumed to arise from zero-mean Gaussian processes,
and that δρk is a function of zero-mean Gaussian random variables. Thus, L({δρ k }|L) takes
the form
Yn 1 T −1
e− 2 (δρk ) (Pδρk ) δρk e−M
L({δρk }|L) = p = , (4.58)
2π det P δρ k
D
k=1
n
1X
where M = (δρk )T (Pδρk )−1 δρk (4.59)
2
k=1
n
Y p
D = 2π det Pδρk . (4.60)
k=1
The optimal estimate of the displacement maximizes L({δρ k }|L̂) with respect to line repre-
sentation parameters ρ, and α. Note that maximizing Eq. (4.58) is equivalent to maximizing
the log-likelihood function:
and from the numerical point of view, it is often preferable to work with the log-likelihood
function. Using the log-likelihood formula, I prove in Appendix B.4 that the optimal esti-
mate of the radial position ρ can be computed as
80
Pn dˆk cos(α̂−θ̂k )
k=1 Pδρk
ρ= Pn 1 . (4.62)
k=1 Pδρ
k
Unlike the calculation of the line position ρ, the calculation of the line heading α depends
on the reference frame in which the calculation takes place. This is due to the lever arm,
which effects the virtual measurements from Eq. (4.56) when computing the best fit line
orientation. Because of this effect, it is beneficial to define a reference frame that minimizes
the effect of the combined lever arms. Note that the effect of the lever arm is due to distances
along the ψ axis, parallel to the line. Therefore only the ψ position of the reference frame
is important.
I define a value along the ψ axis of the current line estimate called ψ P , which refers to the
center of rotational uncertainty of the line. This term was first introduced in Section 4.2.4
when introducing aspects of infinite line rotational uncertainty. The value ψ P is calculated
such that about that value, the weighted lever arm effects from all virtual measurements
from each point are balanced. Start by considering the value of the virtual measurement
δρk in the presence of a small perturbation ω in the orientation of the underlying infinite
line about a point at ψP :
δρk = (ψ̂k − ψP ) sin(ω), (4.63)
where ψ̂k is the point position along the line calculated as follows:
Then compute the value of ψP that would minimize the set of weighted δρ k errors under
the perturbation ω to arrive at
Pn ψˆk
k=1 Pδρk
ψP = Pn 1 . (4.65)
k=1 Pδρ
k
There is not an exact closed form formula to estimate α. However, there are two efficient
approaches to this problem. First, the estimate of α can be found by numerically maximizing
Eq. (4.58) (or Eq. (4.61)) with respect to α for a constant ρ calculated according to Eq.
(4.62). This procedure reduces to numerical maximization over a single scalar variable α,
for which there are many efficient algorithms. Alternatively, one can develop the following
second order iterative solution to this nonlinear optimization problem:
The weighted line fitting estimate for the line’s orientation α is updated as α = α̂ + δα,
where Pn
δρk δψk
k=1 Pδρk
δα = − P , (4.67)
n (δψk )2
k=1 Pδρk
with δρk , Pδρk , and δψk defined in Eqs. (4.56), (4.57), and (4.66). Using experimental data,
this approximation agrees with the exact numerical solution.
Endpoint, ψa ψb Estimation
Once the parameters of the infinite line ρ and α have been updated, the relevant line segment
bounding points are defined by the contributing points with the maximum and minimum
values of ψ̂k as calculated from Eq. (4.64). In the line segment representation of S, ψ a is
set to be the minimum value of the set of points ψ̂k , and ψb is set to be the maximum value.
Convergence Criterion
If the adjustment to the angle estimate δα is greater than some threshold α , then conver-
gence has not been established. In that case, the current estimates for α and ρ are used as
initial input guesses to the next iteration. In practice, with a reasonable initial guess for α̂
and ρ̂ from the Hough transform, the algorithm converges to a steady estimate for S in just
one or two iterations. Figure 4.14B shows the estimated line segment S computed from the
weighted points in Figure 4.14A. The shown covariance of the segment is computed in the
following section.
82
4.3.5 Line Segment Covariance Estimation
The covariance of the estimated underlying infinite line L was defined in Eq. (4.6) and
takes the form
Pαα Pαρ
PL = . (4.68)
Pρα Pρρ
This matrix can be calculated from the set of weighted points as follows:
1
Pαα = Pn (4.69)
(δψk )2
k=1 Pδρk
1
Pρρ = Pn (4.70)
1
k=1 Pδρk
n
X
δψk
Pρα = −Pρρ Pαα , (4.71)
Pδρk
k=1
where δρk and Pδρk are defined in Eqs. (4.66) and (4.57), respectively. See Appendix C for
derivations of these equations. The full line segment S covariance P S augments PL with
endpoint uncertainty:
P 0 0
L
P S = 0 P ψa 0 , (4.72)
0 0 P ψb
where Pψa and Pψb are the components of the upper and lower bound point uncertainties
along the ψ axis.
The methods outlined above extract a single feature from a set of individually weighted
points. When a feature is extracted, the associated points are removed from consideration
and subsequent features are extracted from the remaining points. Figures 4.15A, B show
the Hough space accumulator and resulting infinite line estimate for the set of unused points
after the removal of the points from the first feature in 4.14A.
Figures 4.15A, B show the weighted points and the optimally fit subsequent line in this
example. These methods are repeated until there are no unused data points left in the scan.
The final line map extracted from this data set is shown in Figure 4.12B.
83
A B
0 84
1.5
1
ρ
0
0.5 (ρ , α )
0 0
α (rad)
α
0
0
−0.5
−1
1000 mm
A B
1000 mm 1000 mm
Grouped points
Point uncertainty bounds Extracted line segment
Prior lines Line uncertainty bounds
This section develops methods to determine whether two line segment features, detected
independently, represent the same aspect of the environment. This is a data association
84
problem that is critical to the effectiveness of any sensor-based localization and mapping
algorithm, as external sensors bring no new information to the state of the robot if cor-
respondences cannot be made between the data. My approach consists of a multi-step
hypothesis test that enables piecewise matching of the line segment features. Also, the
matching approach compensates for nonlinearities introduced by large uncertainties in line
orientation, which allows for accurate matching of short line segments.
Generally, localization methods treat the feature correspondence problem as a hypothesis
test with two possible outcomes.
Hypothesis: Feature A from pose i and feature B from pose j represent measurements of
the exact same feature in the environment
Negative Test Result: Rejection of the hypothesis and the features are considered inde-
pendent.
There are two classes of error that can be introduced by this process. A type I error
rejects the true hypothesis. A type II error, known as a false positive, is the error associated
with validating a false hypothesis. Unfortunately, improving robustness of a hypothesis test
to one type of errors can increase the chances of the other type, so care must be taken when
designing an effective and robust test. In a sensor-based procedure that is not limited to
very simple and structured environments, these existence of even a few of these errors can
be a significant limitation in the overall effectiveness of the procedure.
In most real-world localization and mapping applications, type II errors, or false posi-
tives, are more damaging than type I errors. In the case of a Kalman filter based estimator
introduced in Section 2.3, a false positive match results in the integration of unmodeled,
erroneous information into the estimator, which not only can corrupt the accuracy of the
overall state estimation, but also corrupts the covariance of the state estimation. The result
is a high confidence in a state with unmodeled errors, which can corrupt future estima-
tions. A type I error, in contrast, leaves out useful information in the estimator, but does
not corrupt the estimate, and the covariance of the estimate remains valid for subsequent
use. Also, the existence of other correspondences among alternate features at a single pose
85
can mitigate the effect of a missed feature correspondence due to redundant information
pertaining to the robot pose estimate. Therefore it can be considered preferable to have a
type I error over a type II error in a localization scheme.
While it is important to bias the test against accepting false positives, it can be very
difficult to establish a single threshold where the test sufficiently rejects type II errors
without allowing far too many type I errors. In such a case, almost no correspondences
are made and the estimator is useless. My approach is to develop a series of hypothesis
tests that allows for partial, piecewise matching of parameters in feature. I develop the
following multi-step hypothesis tests, which form the basis for the line segment feature
correspondence methods:
Hypothesis 1: Line segment Si from pose i and line segment Sj from pose j share the
same underlying infinite line.
Hypothesis 2: The line segments Si and Sj overlap and partially represent the same
portion of the environment. Hypothesis 1 must be true.
Hypothesis 3: One of the endpoints of line segments S i and Sj correspond to the same
point in the environment. Hypothesis 2 must be true.
Hypothesis 4: The other endpoint of line segments S i and Sj correspond to the same
point in the environment. Hypothesis 3 must be true.
Each hypothesis depends on the previous and each introduces an additional aspect of
potential correspondence. They also transition from comparing the most robust aspects
of the line segment feature to the least robust. I develop tests based on each of these
hypotheses in the following sections. The most important aspect of these hypothesis tests
is that they can enable partial correspondence, where only a portion of the full feature
information is validated in a match and subsequently used in a localization procedure. I
am therefore able to develop a set of very tight tests for each individual hypothesis, which
effectively rejects false positives, yet the cascading tests allow for possible partial matches
that would be otherwise rejected. This helps helps to mitigate the effect of type I errors
while keeping a very low threshold for accepting type II errors.
The following tests assume two line segment features S i and Sj defined in Section 4.2.5
86
and reproduced here as follows:
αi αj
ρi ρj
Si =
i
Sj =
j .
(4.73)
ψa ψa
ψbi ψbj
The tests also assume that the full covariance matrices P Si and PSj for these line segments
(defined in Eq. (4.31)) have been calculated. Further, all parameters must be represented
with respect to a common reference frame. Section 4.3 outlined a method of extracting
these features and covariances, and Section 4.2.7 outlined methods of transforming them
across frames.
Hypothesis 1 deals with the most robust parameters of the line segment, the orientation
α and normal position ρ of the underlying line L. These parameters usually depend on
many range scan points, as described in Section 4.3.4. This dependence reduces the effect
of sensor noise. Also, more importantly, the parameters of the underlying line are more
robust to changes due to occlusion than the endpoint measurements. A chi-square test,
introduced in Section 2.1, is used as the basis for the hypothesis test. This test developed
here represents an improvement over previous work in that it doesn’t impose a small-angle
requirement for line orientation uncertainty.
Given two line segments Si and Sj , first extract the underlying lines L i and Lj from each
segment as outlined in Section 4.2.9. Similarly, extract the covariance of the underlying
lines PLi and PLj from the full covariance of the segments P Si and PSj , respectively, using
Eq. (4.43). Then combine the covariance matrices to calculate the total relative uncertainty
of the difference between the two features:
This comparison test is sufficient for a pair of line segment features that both have a
very small uncertainty in orientation. When this requirement does not hold, it is important
to consider the effect of the nonlinearities involved with coupled orientation and position
uncertainties. Depending on the reference frame, a small perturbation in the difference in
orientation between lines can result in a large value of δρ due to a lever arm effect. It is
common practice in prior line-based mapping methods to use a small-angle approximation
in orientation uncertainty, which allows this cross-coupling to be adequately approximated
by a linear process. This has meant that only longer lines with sufficiently small orientation
uncertainty could be used as features.
In order to effectively use line segments with higher rotational uncertainty, the nonlin-
earities introduced by this cross-coupling must be reduced. This can be done by comparing
the features represented with respect to a reference frame from which the cross-coupling is
minimized. The position of this reference frame is defined to be at the center of rotational
uncertainty for the combined covariance matrix P δL defined in Section 4.2.4. I define this
point as ~uPδL , and calculate its location using Eqs. (4.21) and (4.24). The line parame-
ters and covariances are transformed into a frame centered at this point using equations
from Section 4.2.7. Lines Li and Lj are transformed to L̃i and L̃j , and covariance PδL is
transformed to P̃δL . The final chi-square calculation is therefore
δ α̃ α̃i − α̃j
δ L̃ = L̃i − L̃j = = , (4.77)
δ ρ̃ ρ̃i − ρ̃j
I then compare this calculated value of D̃ with a threshold set using the probabilities
88
of a chi-square distribution with two degrees of freedom (for the two-dimensional infinite
line), which I define as χ2 . The following is a criterion to reject the hypothesis:
D̃ 2 > χ2 . (4.79)
If the D̃ value is less than the threshold then it can be determined within the chosen
probability that the difference between line L i and line Lj can be explained by the modeled
noise and the hypothesis is not rejected. Otherwise the hypothesis is rejected outright and
no subsequent testing is done for the feature pair.
The second hypothesis test measures the overlap of the two line segments that were deter-
mined above to be collinear. Given the segments S i and Sj , it is first necessary to select the
segment with the least uncertain orientation estimate to use as the base segment. In this
section, it is assumed that Si is the base segment, which in practice is most often the longer
segment. The endpoints are then extracted from S j according to the equations introduced
in Section 4.2.9 and referred to as ~u ja = [xja , yaj ] and ~ujb = [xjb , ybj ]. These points are then
projected points into the underlying infinite line associated with S i with orientation αi as
follows:
where ψ̃aj and ψ̃bj are the lower and upper scalar end measurements, respectively, of line
segment Sj transformed to the frame of line segment S i . One can then compare these with
the lower and upper scalar end measurements of line segment S i which are ψai and ψbi ,
respectively. The uncertainty measure for each of these coordinates is also considered in
this comparison. According to the structure of the line segment covariance matrices P Si and
PSj as shown in Eq. (4.31), the relevant variance terms are extracted and defined as P ψi a ψa
and Pψi b ψb directly from line segment i. The values of P ψj a ψa and Pψj b ψb for line segment
j are extracted from a projection of the endpoint covariance calculated from Eqs. (4.52)
89
and (4.53) projected into the base line segment S i . This projection method of endpoint
comparison is valid for line orientation differences that are too large to apply the small-
angle approximation used in prior work.
Because this test considers the overlap of features that have some non-zero length in
the ψ dimension, there is an interval of relative positions where the distance or cost should
be zero. I therefore develop a piecewise Mahalanobis distance metric in the ψ dimension
that is then used as the basis for an overlap chi-square test. The center values of ψ ci and ψcj
along the line segments can be computed as
ψai + ψbi
ψci = , (4.82)
2
j ψaj + ψbj
ψc = , (4.83)
2
∆ij
ψ is defined as the maximum distance between the center values that would result in any
`i + ` j
∆ij
ψ = . (4.86)
2
D 2 = 0, (4.87)
2
(ψci − ψcj + ∆ij
ψ)
2
D = . (4.89)
Pψi b ψb + Pψj a ψa
This is a smooth piecewise function and as the lengths of the line segments ` i and `j
approach zero, the distance function approaches the standard metric for distance between
point values. I then compare D 2 with a threshold χ2 set from a chi-square distribution with
a single degree of freedom. Therefore if D 2 > χ2 the hypothesis can be safely invalidated
and one can conclude that the features do not overlap. If the test fails, the subsequent
hypotheses tests are not applied for this feature pair.
This test compares the values of the line segment endpoints to determine if there are zero,
one, or two matches. Here again the projected endpoint values are computed for line segment
Sj and compared with the endpoint values of S i using a chi-square test. The following two
chi-square tests are considered:
(ψai − ψ̃aj )2
Da2 = , (4.90)
Pψi a ψa + Pψj a ψa
(ψbi − ψ̃bj )2
Db2 = . (4.91)
Pψi b ψb + Pψj b ψb
Each of these values is compared to a threshold determined by the desired probability limit
χ2 for a single degree of freedom system. Hypothesis 3, which assumes at least one end
matches, is supported if
Da2 ≤ χ2 or Db2 ≤ χ2 . (4.92)
Recall that each of these tests requires that the previous hypothesis was validated, so the
test of Hypothesis 4 is akin to a full feature comparison in that the hypothesis is true if all
parameters of the feature match. The separation of the line segment feature comparison
91
method into this tiered comparison approach allows for partial matching of line segment
features. The feature merging and localization methods developed below take advantage of
these partial matches.
This section describes how to merge line segments found in the same scan, or across scans
taken at distinct poses. This merging allows compression and simplification of large maps
without sacrificing the precision or the knowledge of map uncertainty that is gained from
the weighted line fitting algorithm. The process of merging lines across two pose data sets
is considered in detail. Merging across multiple data sets is a natural extension. The basic
approach is simple. The candidate line pairs are first transformed into a common refer-
ence frame, and the hypothesis tests developed above are applied. A maximum likelihood
approach is then used to determine the best estimate of the line pairs to be merged. The
merging process itself depends on the results of the partial match hypothesis tests developed
in Section 4.4.
Full Merge
If the pair of line segments Si and Sj pass all of the hypothesis tests outlined above, all of
the coordinates of the features are merged. The final merged line, S m , can be estimated
using a maximum likelihood formulation with the necessary condition for most likely line
Sm as follows:
∂M
= 0, (4.94)
∂Sm
N
X
M (Sm ) = (Sk − Sm )T (PSk )−1 (Sk − Sm ). (4.95)
k=1
92
In at the minima of Eq. (4.95), the following holds
∂M (Sm )
= 0⇔
∂(Sm )
N
X
(PSk )−1 (Sk − Sm ) = 0 ⇔
k=1
N N
!
X X
(PSk )−1 (Sk ) = (PSk )−1 Sm ⇔
k=1 k=1
N
!−1 N
X X
−1
Sm = (PSk ) (PSk )−1 (Sk )
k=1 k=1
i
Sm = PSi m (PSi )−1 Si + (PSj )−1 Sj (4.96)
−1
PSi m = (PSi )−1 + (PSj )−1 . (4.97)
If the pair passes Hypothesis Tests 1 and 2 outlined above in Sections 4.4.1 and 4.4.2, then it
can be assumed that the two line segments share a common underlying infinite line and have
at least some overlap in the ψ dimension along the lines. If the pair fails Hypothesis Tests 3
and 4 such that neither endpoints overlap, then only the underlying infinite line is merged.
The underlying infinite lines, Li and Lj , are extracted from each segment using Eq. (4.42).
The covariance matrices of the underlying lines, P Li and PLj , are extracted from the full
covariance of the segments PSi and PSj , respectively, using Eq. (4.43). The corresponding
maximum likelihood calculation for these terms results in the following merged infinite line
Lm and covariance PLm :
Lim = PLi m (PLi )−1 Li + (PLj )−1 Lj , (4.98)
−1
PLi m = (PLi )−1 + (PLj )−1 . (4.99)
The merged end pairs, ψam and ψbm , can then be selected based on the values of the
segment endpoints, which would result in the effective union of the two segments. So
Similarly the final covariance matrix P Sm is arrived at by augmenting the merged infinite
line covariance matrix PLm with the rows and columns of PSi and PSj corresponding to the
selected endpoints taken in Eq. (4.100).
In some cases it can be useful to merge line segments that have no overlap, as a pair of
segments sharing a common wall, but separated by a doorway. In this case both endpoint
pair values are simply appended to the line segment representation as introduced in Eq.
(4.26).
Consider the case where the line segments pass all hypothesis tests but test 4 outlined
in Section 4.4.3 such that only one of the two endpoints are determined to match. The
values of Si ,Sj ,PSi and PSj corresponding to the unmatched endpoint are removed and the
optimal estimates for the remaining coordinates are calculated using the above Eqs. (4.96)
and (4.97). The unmatched end values are then reinserted according to Eq. (4.100) to
reconstruct Sm and PSm .
These merging methods were implemented on a Nomadics 200 mobile robot equipped with
a Sick LMS-200 laser range scanner. In the experiments, the sensor noise values of σ d = 5
mm, σφ = 10−4 radians were used, as obtained from the Sick LMS-200 laser specifications.
The value of the χ2 threshold was set to merge lines within the 3σ deviance threshold.
Figs. 4.17, 4.18, 4.19 show a sequence of increasingly complex data sets that were
gathered in the hallway outside of a laboratory. Fig. 4.17 graphically depicts the results
of fitting lines to a single scan taken in the hallway. The left figure shows the raw range
data along with the 3σ confidence region of selected points as calculated from the sensor
noise model. The right figure shows the fit lines along with the 3σ confidence region in ρ.
94
Range Scan Points Fit Lines
Robot Poses 50 x Covariance (3 σ )
Line Endpoints
A B
5
−1
−2
−3
−1 0 1 2 3 4 −1 0 1 2 3 4
(m) (m)
Figure 4.17: Range data: A) Raw points and selected point covariances. B) Fit lines and
line uncertainties.
All uncertainty values have been multiplied by 50 for clarity. From the 720 raw range data
points the algorithm fit 9 lines. If it is assumed that a line segment can be represented
by the equivalent of two data points, this method have effectively compressed the data by
97.% over a point-based map. This compression not only reduces map storage space, but
it can also serve to reduce the complexity of any relevant algorithm (e.g., scan matching)
that scales to the order of number of features. Unlike other feature finders such as corner
detectors, the lines abbreviate a large portion of the data set, so overall far less information
95
A B C
−1
−2
−3
−4
−1 0 1 −1 0 1 −1 0 1
(m) (m) (m)
Figure 4.18: Range data from two poses: A) Raw points and selected point covariances. B)
Fit lines and line covariances. C) Merged lines and line covariances.
is lost in compression.
Merging lines across scans further improves compression of data. Fig. 4.18 graphically
depicts the results of fitting lines to scans taken at two poses in a hallway. The left figure
shows the raw range data, the center figure shows the lines fit to the two scans, and the right
figure shows the resulting merged lines. From the 1440 raw range data points the algorithm
fits 20 lines without merging, and 14 lines after merging. The merging step compresses the
data a further 30% for a total compression of 98.0% from the original data. Note that the
three vertical segments on the right are found to be collinear and are “merged” even though
they do not overlap.
Compression achieved by line fitting and merging is equally pronounced in large data
sets. Fig. 4.19 depicts the results of fitting lines to scans taken at eight poses in the hallway.
96
A B C
−2
−4
−6
−8
−10
−14
−16
−1 0 1 2 −1 0 1 2 −1 0 1 2
Figure 4.19: Range data from eight poses – A) Raw points and selected point covariances.
B) Fit lines and line covariances. C) Merged lines and line covariances.
As above, the left figure shows the raw range data, the center figure shows the lines fit to
the ten scans, and the right figure shows the resulting merged lines. From the 5760 raw
range data points the algorithm fits 93 lines without merging and 29 lines after merging.
The merging step here compresses the data a further 68% for a total compression of 98.9%
from the original data. Note that many of the jogs in the lower portion of the hallway arise
from recessed doorways, water fountains, and other features. Note also how the method
effectively merges the broken line defined by the right wall of the hallway.
Clearly the level of compression depends upon the environment. Hallways will likely
have very high compression due to long walls that can be merged over many scans. In
more cluttered environments, the compression may not be as high, but it can still be very
effective. Fig. 4.20 shows the results of fitting lines to range scans taken at ten poses in
our laboratory. Fig. 4.20A shows the raw scan points, Fig. 4.20B shows the fitted lines,
97
A
−2 0 2 4 6 8 10 12
(m)
−2 0 2 4 6 8 10 12
(m)
−2 0 2 4 6 8 10 12
(m)
Figure 4.20: A) Raw points and selected point covariances. B) Fit lines and line covariances.
C) Merged lines and line covariances.
98
and Fig. 4.20C shows the resulting merged lines. From the 7200 raw range data points,
the algorithm fit 141 lines without merging, and 60 lines with merging. The merging step
compresses the data a further 57% for a total compression of 98.9% from the original data.
99
4.6 Line Segment–Based Kalman Filter
This section introduces an extended Kalman filter based SLAM (simultaneous localization
and mapping) algorithm that uses line segment features as the primary representation of the
environment. Some general background information on the extended Kalman filter (EKF)
was introduced in Section 2.3. This method not only merges line segments together, but
also computes an updated estimate for the robot’s position as the map is constructed. In
this section, the equations for the extended Kalman filter itself are developed. This includes
methods to use the cascading hypothesis tests outlined in Section 4.4 to selectively update
the filter based on the type of feature correspondence established.
The EKF state vector X at time step k takes the following form:
x
y
φ
Xk =
,
(4.102)
S1
...
Sn
k
where [x, y, φ] represents the robot’s pose and S 1 ...Sn the line segment features. All variables
are represented with respect to a common global reference frame and the representation of
the line segment feature is shown in Eq. (4.25). Note that the length of the state vector is
m = 4n + 3. The covariance for the EKF is defined as P Xk and is represented as an m × m
matrix that maintains all cross-correlations of each feature and the robot’s pose.
The control input for mobile robot propagation can be represented as the change in pose
uk−1 = [dx, dy, dφ] with process noise Pu , both represented in a local reference frame with
respect to the robot’s heading at step k − 1. This is generally the form of the odometry
measurement for mobile robots and is introduced in Section 2.5.1. The transformation of
the state X thorough this dead reckoning update u k−1 with measurement perturbations
100
wk−1 = [x , y , φ ] can be calculated as follows:
The propagation of the robot state due to robot motion and dead reckoning updates can
be described by the following equation:
In the process of mapping, consider a sensed line segment feature in the environment, S̄,
measured in the local coordinate frame. If this feature is confirmed to correspond with
the cth existing feature denoted as Sc(k) , then the following process is used to incorporate
that information to update the robot state X and covariance P X . Multiple variations of
101
the update equations are presented that depend on the results of the cascaded hypothesis
tests introduced in Section 4.4. Similar to the line merging approach in Section 4.5, partial
correspondences of features is allowed, and the filter is only updated with the information
given by the parameters verified to match.
The following nonlinear function h k () represents the coordinates of measurement S̄ in
terms of the current state X and a set of measurement perturbations v k = [α , ρ , ψa , ψb ]:
S̄ = hk (Xk , vk )
αl αc(k) + α − φk
ρl ρc(k) + ρ − xk cos(−αc(k) ) + yk sin(−αc(k) )
= . (4.108)
a a
ψl ψc(k) + ψa − xk sin(−αc(k) ) − yk cos(−αc(k) ) + α ρl
ψlb b
ψc(k) + ψb − xk sin(−αc(k) ) − yk cos(−αc(k) ) + α ρl
The general update equations for the extended Kalman filter are as follows:
−1
Kk = Pk|k−1 HkT Hk Pk|k−1 HkT + Vk PS̄ VkT , (4.109)
Pk = (I − Kk Hk )Pk|k−1 , (4.111)
where Hk and Vk are the Jacobian matrices of partial derivatives of h k () with respect to Xk
and vk calculated at each step k. It is these matrices, H k and Vk that change depending on
which parameters of the line segment have been determined to correspond.
First consider the case where the feature pair has passed a test for hypothesis 4 outlined in
Section 4.4.3 and therefore all parameters of the line segment (the underlying infinite line
and both endpoints) correspond between the sensed feature S̄ and the stored feature Sc .
For simplicity the subscript k, which denotes the k th time step of the filter, will be dropped
for this derivation. The Jacobian matrix of partial derivatives of h() with respect to X can
be computed in two parts as follows:
102
0 0 1
− cos(αc ) − sin(αc ) 0
Hg =
,
(4.112)
sin(αc ) − cos(αc ) 0
sin(αc ) − cos(αc ) 0
1 0 0 0
x sin(αc ) − y cos(αc ) 1 0 0
HSc =
,
(4.113)
x cos(αc ) + y sin(αc ) 0 1 0
x cos(αc ) + y sin(αc ) 0 0 1
with Hg relating to the derivatives of h() with respect to the pose portion of the state, and
HSc relating to the derivatives with respect to the c th feature of the state. The complete
H matrix can therefore be assembled as
h i
H = Hg 0 ... HSc ... 0 , (4.114)
where the position of HSc in the matrix corresponds to the position of S c in X. The matrix
V can similarly be calculated
1 0 0 0
0 1 0 0
V = , (4.115)
ρl 0 1 0
ρl 0 0 1
where
ρl = ρc(k) − xk cos(−αc(k) ) + yk sin(−αc(k) ). (4.116)
For a full feature match, these values of H and V are applied to Eqs. (4.110) and (4.111)
to calculate the updated state and covariance.
In the case where a subset of the feature coordinates have been determined to correspond
in a feature pair, the H and V matrices are adjusted by setting the row corresponding to
103
the unmatched coordinate to be all zeros. For example, in the common case when just the
underlying infinite line (α and ρ) is merged, H and V can be calculated as follows:
0 0 1
− cos(αc ) − sin(αc ) 0
Hg =
,
(4.117)
0 0 0
0 0 0
1 0 0 0
x sin(αc ) − y cos(αc ) 1 0 0
HSc =
,
(4.118)
0 0 0 0
0 0 0 0
where
h i
H = Hg 0 ... HSc ... 0
and
1 0 0 0
0 1 0 0
V =
.
(4.119)
0 0 0 0
0 0 0 0
Similar adjustments are made in the case where only one end matches.
Endpoint Update
In the case of partial feature update where the endpoints do not correspond, it is beneficial
to extend the stored line feature given the new measurement. After merging the underlying
infinite line, either or both endpoints in the state vector can be effectively replaced with
the measured endpoint values if the new measured values extend the overall length of the
line segment. When replacing these parameters, the covariance matrix must be adjusted to
reflect the uncertainly of this new measurement as well.
My approach to line segment–based SLAM utilizes the methods developed above. Given an
initial state X and PX consisting only of the robot pose and pose covariance, the mapping
process can be described in the following set of steps:
Step 1) At the current pose a range scan is taken and a set of features is extracted using
methods from Section 4.3.
Step 2) Each extracted feature is compared with the stored features in the Kalman filter
state X using the methods in Section 4.4.
Step 3) Features that are shown to correspond are updated according to the process out-
lined in Section 4.6.3.
Step 4) The set of sensed features that are isolated are added to the state as shown in
Section 4.6.3.
Step 5) Propagate the state given a robot displacement g according to the methods in
Section 4.6.2. Go to Step 1.
Using this process, maps can effectively be built that accurately and efficiently describe
the environment while localizing the robot.
In this section I present results using the line segment–based SLAM algorithm. As in the
previous experiments shown, all data sets were collected with a Sick LMS-200 laser range
scanner and using the sensor noise model developed in Section 2.5.2. Figure 4.21B shows
the map built with the same data presented in the previous chapter in Figure 3.8 and again
represented in raw form in Figure 4.21A. The weighted scan matching method developed
105
in Chapter 3 estimates relative displacement between poses. The SLAM method presented
here estimates the global position of the robot. The benefits of this method can be seen
in the accuracy in the final robot localization after traveling the 83-pose, 24.2-meter path.
Note that the final data set in the path is taken from the exact spot that the initial data
set was taken, so the final displacement estimate should be exactly zero. The estimated
displacement for the final position is 4.5 mm while the estimated orientation is 3.3 × 10 −5
radians. This shows better performance than the weighted scan matching estimates of 18
mm and 0.013 radians, but the primary benefit of the SLAM based method is that the error
will remain bounded over multiple loops unlike the displacement estimation method.
In this line segment feature based mapping method, no range data points are unrepre-
sented. Every point from the original 83 scans was fit to a line, even if that “line” was fit
to an isolated point and had zero length. The resulting data compression for this map rep-
resentation is still significant, and of the original 29,880 scan points, the final map contains
251 line segments for a total compression of over 98%.
This ability to represent lines of any length with the same feature and still merge and
localize with all of the data is a significant improvement over prior work in line feature
based localization and mapping. The methods of Castellanos and Tardos [CT99, CMNT99]
are limited in their ability to accurately compare and merge small lines as they depend on
the small-angle assumption in line orientation uncertainty. Figure 4.22 shows a map built
with a SLAM method that enforces a lower bound on the allowable feature size as would be
required in prior work. In this environment an effective map is still built and the accuracy is
comparable at the final position. In fact, in many cases it is beneficial to ignore the smaller
features, as often only the dominant lines are needed to localize the robot. My method
certainly allows this further pruning of the line map if desired, but there are cases where
the forced dropping of data in the Castellanos and Tardos method can hurt the localization
process.
For example, Figure 4.23A shows a series of hallway scans where the only discernible
feature besides the side walls is the round trash can on the left. Figure 4.23B shows the
results of localization using my approach where the round can is adequately represented and
used to correct the position of the robot in the direction along the hallway. Figure 4.23C
shows the representation and localization results for the feature representation of Castellanos
and Tardos. The can is lost and the process is unable to correct the position of the robot
106
1000 mm
Robot Poses
Range Point Data
1000 mm
Robot Poses
Line Segments
Figure 4.21: Line map built with presented SLAM techniques: A) Full raw point data. B)
Full line segment map representation.
along the hallway. In these experiments the odometry estimates were perturbed by 50 mm
at each step for both methods to accentuate the differences in localization accuracy. The
error in the final position using my method was 6.5 mm while the error for the Castellanos
and Tardos method is 192.2 mm.
107
1000 mm
Robot Poses
Line Segments
Figure 4.22: Full line map built with line segment length restriction.
A B C
Figure 4.23: Hallway data set: A) Raw data points. B) Full set of fit lines. C) Restricted
set of fit lines.
Computational Cost
All algorithms are implemented in Matlab and therefore not optimized for speed or efficiency.
Still, even in the best case, the computation cost for extracting these line segment features
108
and performing SLAM is not insignificant. For the data set shown in Figure 4.21, operating
on a 2.0 Ghz Athlon PC with 1 GB RAM, the average computation time for complete line
extraction from data taken at a single pose was 0.83 seconds. The average time to match
and merge the line data was 5.2 seconds per pose where the computation time was heaviest
at the later poses when the map was large. These numbers are not ideal for some real time
applications, but they are certainly adequate for laptop based mobile robot applications.
A significant speed gain can be expected with the transition of the algorithms to C++ as
most computation time is spent inside loops that Matlab is unable to effectively optimize.
Chapter 5 offers further benefits to computational efficiency of these methods by introducing
a multi-scale approach.
This chapter introduced novel approaches to line feature representation, extraction, and
data association. These approaches represent significant improvements over existing fea-
ture based mapping and localization methods. The use of features introduces improvements
over point based methods discussed in the previous chapter. The choice of line segment
representation facilitates comparison with the underlying infinite line, which is the most
robust portion of the feature. The extraction method applies a weighted line fitting algo-
rithm that takes into account the individual uncertainty of each scan point when computing
the optimal line fit and covariance. The matching methods introduce a tiered comparison
test that enables the partial matching and merging of features for finer grained selection of
feature correspondence. Also these methods compensate for the nonlinearities associated
with polar line segments to enable the matching and merging of short segments with high
orientation uncertainty. While these methods display significant improvements in quality
and accuracy, the computational cost for correspondence and mapping using these features
can be unfeasible for some high-speed or low-complexity applications. The following chapter
builds on the line segment–based approach to develop a multi-scale representation, which
offers significant benefits to computational cost and complexity while maintaining accuracy.
109
Chapter 5
This chapter introduces multi-scale feature extraction and data correspondence methods
that have significant computational benefits when applied to a number of problems related
to localization and mapping. In Chapter 4, a method was introduced to fit lines to the
range data points in order to reduce the data complexity. In this chapter a multi-scale
version of the line segment feature is presented that represents range data as a line segment
with a non-zero width that are defined as block features. The notion of the feature scale is
associated with the block width, with the block width increasing as scale becomes coarser
as shown in Figure 5.1. The block features can encompass very short segments or even
single points in the case of unstructured data. The interrelations between features at each
scale are represented as a tree structure (Fig. 5.1A, where the red nodes and branches in
the tree correspond to the red features in the Figs. 5.1B–E).
This multi-scale tree based feature representation can be used to improve any combi-
natorial comparison or search that arises in localization and mapping methods. The most
common example is in feature correspondence. A coarse-to-fine data traversal can simplify
the computations required to determine if pairs of features, and if entire scans are a match.
To properly implement such a matching method, one needs a corresponding multi-scale
chi-square test, which is provided in this chapter.
Another example of a problem that can benefit from the multi-scale representation is the
“kidnapped robot problem” [Eng94]. This problem describes a situation where the robot
110
Multi−scale graph connections
Feature nodes
Selected feature nodes
200
Scale (mm)
100
25 50
Scale = 200 mm Scale = 100 mm
Robot pose
1000 mm 1000 mm
Data points
Feature bounds Feature bounds
Selected feature bounds Selected feature bounds
Extracted bounds uncertainty Extracted bounds uncertainty
Scale = 50 mm Scale = 25 mm
1000 mm 1000 mm
Figure 5.1: Multi-scale range scan representation: A) scale tree graph. B–E) A sequence of
increasingly fine scale representations of the data.
is blindly moved and needs to relocalize in an environment. The solution to the kidnapped
robot problem invariably involves matches between a sensor scan from the robot’s current
configuration with a database of geometric data to find the most likely robot configuration.
The coarse scale representation allows a very efficient search of the database. Many possible
mismatches can be eliminated at the coarse scales, at which the computational complexity
of the matching procedure is small.
111
Relation to prior work. It has long been recognized that the computational com-
plexity of mapping and localization can scale prohibitively with the number of landmarks
encountered and stored by the robot during its mapping and localization processes. A num-
ber of authors have focused on different techniques to reduce the computational complexity
associated with various aspects of mapping and localization. For example, sparsification of
the information matrix can significantly increase the efficiency of Kalman filter based SLAM
methods [TLK+ 04, WEL04]. Rao Blackwellization can increase the efficiency of particle fil-
tering based SLAM algorithms (e.g., the FastSLAM algorithm[MTKW02]). Finally, there
are feature based SLAM methods that attempt to reduce computational requirements by
reducing the feature set [NBL03]. The approach presented in this chapter uses multi-scale
feature extraction and multi-scale correspondence to reduce computational complexity, and
is most similar to the feature set reduction method.
Prior work in this area touches on some aspects of multi-scale data processing for lo-
calization, mapping, and navigation. Madhavan et al. [MDWD02] apply the classical scale-
space approach of Gaussian smoothing to range data. A number of authors use multi-scale
methods for efficient environment representation or planning [PR98, TMK04, MT04]. How-
ever, none of these works attempt multi-scale feature extraction and correspondence.
The vision community has a far richer collection of scale-space approaches. Some meth-
ods extract point-like features at multiple scales for the purpose of object matching [KB01,
Low99], and other work explores multi-scale edge detection and filtering [PM90, WRV98].
These applications fall in the same family of algorithms as my work; but with different
sensors and application goals, my implementation and focus is somewhat different.
My work uses a multi-scale Hough transform to efficiently extract multi-scale line seg-
ment features from planar range data. In the field of computer vision there are prior efforts
to develop a multi-scale Hough transform [MO01, MO00, OM99], though my particular
version of the multi-scale Hough transform is unique. The motivation for using a multi-
scale approach in much of the prior work in the computer vision field is to increase the
efficiency of the feature extraction process [PYIK89]. Similar efficiency benefits arise in my
feature extraction methods but the primary contributions of my work are the algorithms
that utilize the multi-scale feature representation. Using the multi-scale features, I intro-
duce algorithms for multi-scale feature correspondence, displacement estimation, and the
kidnapped robot problem, that take advantage of the multi-scale data structure to improve
112
robustness and reduce computational complexity.
Chapter Overview
This chapter is structured as follows: Section 5.2 introduces the representation for a “block
feature,” which builds on the line segment representation developed in the previous chapter
by adding a notion of scale. Section 5.3 describes the multi-scale Hough transform process
for extracting features at each scale. Section 5.4 describes methods of matching the block
features. Section 5.5 reviews the scale tree dendrogram that organizes the multi-scale fea-
tures extracted from the Hough transform. Section 5.6 describes methods to use the scale
tree to enable efficient feature correspondence at the finest scales. Section 5.7 describes a
localization example that benefits from the scale-based approach. Section 5.8 outlines a
block based Kalman filter and SLAM method. Section 5.9 outlines the multi-scale solution
to the kidnapped robot problem and compares the computational cost with a single scale
solution.
This section introduces a block feature representation that forms the core of a multi-scale
feature based localization and mapping approach. The block feature extends the notion of
a line segment (Chapter 4) to a multi-scale setting and allows for flexibility in representing
complex sets of point data. Just as the line segment feature introduced in the previous
chapter extends a point feature into a length dimension, the block feature extends a line
segment feature into the width dimension.
The motivation for developing the block feature is to enable for more flexibility in
representing range data sets. The width element of a block allows for abstraction of groups
of somewhat aligned points that stray from collinearity due to environment geometry, which,
unlike sensor noise, is not adequately modelable a priori. While I do assume a Gaussian
distribution on the uncertainty of the position of the block boundaries, no assumptions
or abstractions are made on the distribution of the set of points represented inside the
block. One can therefore develop algorithms using the block feature that are less sensitive
to unmodeled effects from the scanning process and geometry and still representative of the
underlying environment. This approach enables adequate representation of an environment
113
at a coarser scale with wider block features, while still maintaining the ability to represent
fine, straight line features at the finest scales as the block width approaches zero.
This feature representation also allows for the effective abstraction of groups of points
that do not form a discernible line. A block feature with a length-to-width ratio at or near
a value of 1 can be used to represent blob-like groupings of points though, the orientation
element of such a block loses relevance in localization procedures.
ψb
ψa
ρa
ρb B
α
Let the covariance of block B be denoted by P B . It extends the covariance definition for
a line segment, PS , from Eq. (4.31) by adding a row and column corresponding to the
114
additional ρb term. The full covariance matrix PB can be written as follows:
Pαα Pαρa Pαρb Pαψa Pαψb
P ρa α P ρa ρa P ρa ρb P ρa ψ a P ρa ψ b
P B = P ρb α P ρb ρa P ρb ρb P ρb ψ a P ρb ψ b . (5.2)
P ψ a α P ψ a ρa P ψ a ρb P ψa ψa P ψa ψb
P ψ b α P ψ b ρa P ψ b ρb P ψb ψa P ψb ψb
The uncertainty of ρa and the uncertainty of ρb need not be the same. See Fig. 5.3 for an
example of block covariance where the uncertainty of ρ b is greater than the uncertainty of
ρa .
σ ψb
B σ ρb
σα
σ ρa
σ ψa
σα
T
PSa = HSa PB HSa , (5.3)
T
PSb = HSb PB HSb , (5.4)
T
PSc = HSc PB HSc , (5.5)
with HSa , HSb , and HSc defined above in Eqs. (5.24),(5.25), and (5.27).
115
5.2.3 Block Frame Transformations
The frame transformations introduced in Section 4.2.7 for line segments can be naturally
extended for the block feature representation.
Coordinate Transformation
Consider a block Bi measured in a local frame with respect to pose g i = [xi , yi , φi ]. The
block coordinates are transformed to a global frame representation B 0 , as follows:
α0 αi + φ i
0 i
ρa ρa + δρi
0 i
B0 = ρb = ρb + δρi , (5.6)
0 i
ψa ψa + δψi
ψa0 i
ψb + δψi
where δρi and δψi are the coordinates of the displacement g i projected into the 00 α − ρ00
frame defined as
δψi = yi cos(αi + ψi ) − xi sin(αi + ψi ), (5.7)
Covariance Transformations
Consider a covariance matrix, PBi , of block Bi measured with respect to an uncertain pose,
gi , whose pose covariance matrix is Pgi . The matrix PBi can be transformed to the global
frame at pose i as follows:
1 0 0 0 0
δψi 1 0 0 0
H Bi = δψi 0 1 0 0 , (5.10)
−δρi 0 0 1 0
−δρi 0 0 0 1
where δψi and δψi are defined in Eqs. (5.8) and (5.7). Using the same approach outlined
in Section 4.2.7, define KBi to be
0 0 1
1 0 0 cos(−αi + φi ) −sin(−αi + φi ) 0
K Bi = 1 0 0 sin(−αi + φi ) cos(−αi + φi ) 0 (5.11)
0 1 0 0 0 1
0 1 0
0 0 1
cos(αi + φi ) sin(αi + φi ) 0
= cos(αi + φi ) sin(αi + φi ) 0 . (5.12)
−sin(αi + φi ) cos(αi + φi ) 0
−sin(αi + φi ) cos(αi + φi ) 0
The notion of the center of rotational uncertainty for a line segment introduced in Section
4.2.8 can be naturally extended to a block. Augment the transformation matrix H P as
follows:
1 0 0 0 0
P
δψa 1 0 0 0
H PB = δψbP 0 1 0 0 , (5.13)
P
δρa 0 0 1 0
δρPb 0 0 0 1
117
with
where the various covariance terms P αα , Pρa α , etc. were defined in Eq. (5.2). It follows that
σα2 0 0 0 0
0 σρ2a 0 0 0
T
P B = H PB 0 0 σρ2b 0 0 H PB . (5.18)
0 0 0 σψ2 a 0
0 0 0 0 σψ2 b
where
1 0 0 0 0
−δψaP 1 0 0 0
HP−1 = −δψbP 0 1 0 0 . (5.20)
B
−δρPa 0 0 1 0
−δρPb 0 0 0 1
In the case where δψaP = δψbP and δρPa = δρPb , the diagonalizing transformation H PB
is of the form of a pose transformation matrix outlined in Eq. (5.13). The location of the
point ~uP in Cartesian coordinates, which represents the center of rotational uncertainty for
118
the block covariance, can be found as
xP δρPa cos(α) − δψaP sin(α)
~uP = = . (5.21)
yP δρPa sin(α) + δψaP cos(α)
B
uP
ρP
ψ
P
Because the uncertainties in position and orientation are uncoupled about a reference
frame whose origin lies at ~uP , the nonlinearities introduced by the lever arm effect are
minimized when the covariance matrix is represented with respect to that frame. Therefore,
when comparing or merging block features, the effects of inherent nonlinearities can be
minimized by performing these operations at or near a frame centered at point ~u P .
This section present methods to extract subfeatures from the block. Subfeatures are ge-
ometric elements of the block that may be useful to isolate for a targeted comparison or
feature merge. Here I present the simple processes to extract the underlying line segments
from the original block B. Section 4.2.9 introduced methods for extracting endpoints from
line segment features.
From this block feature one can extract the inner and outer line segments:
Sa = HSa B, (5.22)
Sb = HSb B, (5.23)
119
u
bb
u cb
Sb ψc
u ab Sc u cc
u ba
Sa
u ca
ρ
c
u aa
α
with
1 0 0 0 0
0 1 0 0 0
HSa =
.
(5.24)
0 0 0 1 0
0 0 0 0 1
1 0 0 0 0
0 0 1 0 0
HSb =
.
(5.25)
0 0 0 1 0
0 0 0 0 1
The line segment Sc , which corresponds to the center of the block, can be calculated as
Sc = HSc B, (5.26)
with
1 0 0 0 0
0 .5 .5 0 0
HSc =
.
(5.27)
0 0 0 1 0
0 0 0 0 1
The endpoints of these segments can also be easily calculated as shown in Eqs. (4.45)
and (4.49). See Fig. 5.5 for a graphical representation of these subfeatures of a given block
B.
120
5.3 Block Feature Extraction
The goal of the feature extraction process is to sort a set of n range points U = {u 1 , ..., un }
into m roughly collinear point subsets U k , k = 1, . . . , m while at the same time estimating
the block parameters (B, PB ). Note that m is not a predetermined value, and it will depend
upon the scale. Extraction of the m feature coordinates is an iterative, two step process. In
the first step, the underlying line coordinates [α, ρ a , ρb ] are extracted from the data using
an augmented Hough transform (see below). In the second step, the endpoints [ψ a , ψb ] are
estimated. Because of truncation performed during the endpoint estimation, the second
step can have an effect on the optimal estimation of the underlying infinite line coordinates.
Therefore, these steps are repeated iteratively until the process stabilizes, usually in three
or fewer iterations.
To extract the underlying infinite line coordinates [α, ρ a , ρb ], the range points are first
transformed using a multi-scale version of the Hough transform. Let us first briefly review
the classical Hough transform. Define a Hough space H(i, j) as a two-dimensional raster
with integer indices i and j indexing the variables ρ(i) and α(j), respectively. The variable
α(j) is discretized in increments of D α on the range [−π/2, π/2] and the variable ρ(i) is
discretized in increments of Dρ on the range [−dmax , dmax ], where dmax is the maximum
range value to be expected in the data point set. The discretization level D α is chosen as
a function of the discretization level D ρ and the maximum sensor range lmax :
Cell {i, j} of the discretized Hough space therefore represents the range of line coordinates
[ρ(i) ± Dρ /2 α(j) ± Dα /2]. The content of each cell in the Hough raster is initially set to
zero. For each range point, for all i one calculates the position ρ ik of the line at angle α(i)
that would pass through point k:
where Pδρk is the projection of the modeled range sensor measurement noise for point u k
onto the ρ axis and n is the total number of points in the block. Therefore the uncertainty
122
0 90
1.5
1 0.5
α (rad)
Robot pose
0
−0.5
α
0
−1
1000 mm
ρ (mm)
−4000 −2000 0 2000 4000
Figure 5.6: Multi-scale extraction of ρ a ,ρb : A) Raw scan points B) Hough transform.
Convolution Basis
α
0
ρa ρb ρ0
Hough space data at angle α0
Convolved data
1000 mm
Extracted block bounds (ρa ρb)
Extracted bounds uncertainty
ρ (mm) Extracted block bounds
−3000−2000−1000 0 1000 2000 3000 Extracted bounds uncertainty
Figure 5.7: Fine-scale extraction of ρ a ,ρb : A) Block ρ boundary detection at fine scale. B)
Detected infinite block.
Convolution Basis
α
0
ρ
ρ ρ 0
a b
Figure 5.8: Coarse-scale extraction of ρ a ,ρb : A) Block ρ boundary detection at coarse scale.
B) Detected infinite block
Endpoint detection is an analogous process but performed on the raw data points instead
of in Hough space. Project all points contributing to the line [α, ρ a , ρb ] onto the underlying
line. Then convolve this signal with the derivative of a Gaussian at a given scale σ ψ and
detect the maximum and minimum peaks. These peaks determine the endpoints of the
feature ψa and ψb . Fig. 5.9 shows the endpoint extraction process for the fine scale infinite
block shown in Fig. 5.7.
Like the ρ covariance terms, the variance terms P ψa ψa and Pψa ψb in PB are set to be
124
equal to the sum of the variance of the given Gaussian basis and the process noise:
ψa
Pψa ψa = (σψ )2 + Pnoise , (5.32)
ψb
Pψb ψb = (σψ )2 + Pnoise , (5.33)
ψ
where the Pnoise terms are calculated by projecting the measurement noise uncertainty of
the distal points into the ψ axis. The cross terms involving the variable α in the covariance
matrix PB are computed assuming the center of rotational uncertainty of the block is
anchored at the center of the block. The remaining cross-coupling terms in the covariance
matrix PB are set to zero at the time when the feature is extracted, as I assume that the
noise contributions from the range sensor independently effects the variables ρ a , ρb , ψa , and
ψb .
Convolution Basis
ψa ψb
As each feature is detected, the points contributing to that feature that lie inside the
bounds of the block are removed from the candidate point set and the algorithm is repeated,
detecting features from the set of unselected points until no points remain. The result is
125
a set of m blocks and covariances {B k , PBk }σρ , k = 1, ..., m extracted at scale σρ with a
corresponding set of point groups {u k }, k = 1, ..., m. Each point group uk is the set of range
data points associated with the k th block feature. By design, the point groups are disjoint,
so that different features at the same scale cannot share underlying points. Fig. 5.10 shows
the Hough space and extracted block for the subsequent feature extracted from the data in
Figs. 5.6 and 5.9.
α0
0.5
α (rad)
0
−0.5
−1
1000 mm
Prior block
ρ (mm) Extracted block bounds
−4000 −2000 0 2000 4000 Extracted bounds uncertainty
The following three methods achieve additional computational savings for the multi-scale
Hough transform based extraction method introduced above.
Prior Estimation: A prior estimate for the bounds of the block can be used to improve
efficiency. This estimate can come from a feed-forward approach from prior detected data,
or from an extracted feature from the current data at a coarser scale. In these cases, the
Hough space can be centered on the coordinates of the input feature, and bound the size
of the Hough space by the input feature bounds in ρ and α. The extracted block feature is
then transformed back into the global frame.
This section extends the line segment data association approach introduced in Section 4.4
the case of block features. Practically speaking, it is important to know if two blocks are
likely to arise from similar underlying range data. This basic matching problem is used in
many mapping and localization procedures. A block is a feature that defines a bounded
area, and they are used to represent a one-dimensional contour in the environment defined
by the boundary of the sensed obstacles. While this gives us the benefits of flexibility and
abstraction of the underlying contour, it also can mean that even equivalent contours may
be represented by a different but equally accurate set of block representations. Therefore
consider the following two sources of possible differences when comparing blocks across
scans:
1) Error in the precision of the extraction process and in the relative pose estimate between
the frames at which each feature was detected.
Figure 5.11: Example of scale-based difference in the block feature representation of identical
point data.
Note that the scale-based difference does not represent an estimate error, and I do not
model this difference in my block extraction methods in Section 5.3. I instead develop a
two-stage method of comparing blocks across scans, which compensates for possible scale-
based differences while still maintaining a narrow threshold of comparison in the case where
there is no scale-based difference.
The first stage of testing attempts to validate the “scale overlap hypothesis.” This
test matches pairs of blocks that may describe the same underlying contour within the
flexibility afforded by block scale and length as well as process noise. If two features pass
this test they are considered a valid match, but there could exist a scale-based discrepancy
in block coordinates between them such that the pair is not a good candidate to merge and
contribute to robot localization. Therefore I develop a second stage of tests that validate
the “parameter match hypotheses.” These tests check the correspondence of the parameters
of the block and determine what aspects of the geometry of the block representations are
equivalent within process noise uncertainty. It is the equivalences found in the second stage
of tests that define the matchable aspects of two blocks in localization and mapping schemes.
The first part of this section introduced definitions and assumptions used throughout
the section. Next a set of hypotheses are developed to test if a pair of block features might
describe the same underlying data. This set of tests allows for differences between the blocks
due to feature scale. After methods are presented to test the block subfeatures for partial
matches. Finally a test that can be used to establish confidence in a potential match is
introduced.
128
5.4.1 Matching Definitions and Assumptions
The following tests of these hypotheses assume there is a candidate match of two block
features Bi and Bj with representation defined in Section 5.2.1 and reproduced here as
follows:
αi αj
i j
ρa ρa
Bi = ρib Bj = ρjb . (5.34)
i j
ψa ψa
ψbi ψbj
The covariance matrices PBi and PBj for these blocks are calculated according to Eq.
(5.2). Section 5.3 outlined a method of extracting these features and covariances. A further
assumption is that all parameters are represented with respect to a common reference frame.
It can be beneficial due to nonlinearities associated with the polar block representation to
perform the comparisons and subsequent merging steps in a common reference frame with
respect to the center of rotational uncertainty for the combined covariance of the blocks.
This entails calculating the Cartesian point ~u P associated with the covariance PBi + PBj
according to the equations in Section 5.2.4 and setting the common reference frame origin
~P . Section 5.2.3 outlined methods of transforming the blocks and covariances
to be at V
across frames.
The first set of hypotheses addresses whether it is possible that two block features describe
the same underlying physical contour. This is the broader of the two sets of tests, as it
compares block coordinates up to the resolution allowed by the inherent sensing process
noise as well as the scale of the blocks. Just like the notion of “overlap” in the ψ dimension
for a line segment found in Section 4.4.2, I now extend that idea into the orientation α
dimension and the normal position ρ dimension. The following hypotheses are developed
to test the parameter overlap of two block features:
Overlap Hypothesis 1: Block Bi from pose i and block Bj from pose j overlap in the
range of plausible orientations of the underlying contour.
Overlap Hypothesis 2: The blocks Bi and Bj define ranges in normal position ρ that
129
have some overlap. Hypothesis 1 must be true.
Overlap Hypothesis 3: The blocks Bi and Bj describe ranges in the parallel position ψ
that have some overlap. Hypothesis 2 must be true.
The hypotheses are tested via methods based on the chi-square distribution.
When describing an underlying one-dimensional contour using a feature with some non-zero
width, there can be a range of alignments of the contour that still fall within the bounds of
the feature. The block feature representation does not hold an explicit range of orientation
due to the feature scale, but an estimate can be derived by computing a margin for the
range of orientations that the block can take to describe a contour that is an infinite line.
The angles of the block diagonals of blocks B i and Bj can be calculated as follows:
−1 ρib − ρia
∆iα = tan , (5.35)
ψbi − ψai
!
−1 ρjb − ρja
∆jα = tan . (5.36)
ψbj − ψaj
These values of ∆iα and ∆jα correspond to the maximum relative orientation that an infinite
line could have and still pass through both end segments of the block and not intersect
the upper and lower normal block boundaries. So for block B i there is a range of possible
orientations of the underlying contour defined by [−∆ iα , ∆iα ] and for block Bj the range is
[−∆jα , ∆jα ]. The hypothesis is considered to be valid if there is some overlap in these angle
ranges.
A threshold for the scale-based flexibility in orientation of the block pair can be defined
as ∆ij
α , where
∆ij i j
α = ∆α + ∆α (5.37)
and then develop the following piecewise Mahalanobis distance metric for the block angle:
If |αi − αj | ≤ ∆ij
α then
D 2 = 0, (5.38)
130
and if |αi − αj | > ∆ij
α then
(|αi − αj | − ∆ij
α)
2
D2 = . (5.39)
i + Pj
Pαα αα
else, D measures the difference in orientation. For very narrow blocks extracted at a very
fine scale, the calculation approaches the standard Mahalanobis distance metric in α. D 2 is
then compared with a threshold χ2 set from a chi-square distribution with a single degree of
freedom. If D 2 > χ2 the hypothesis is invalidated, and one can conclude that the features
do not overlap. If the test fails, the subsequent hypotheses for this feature pair are not
tested.
An overlap in the ρ dimension is tested for those block features with a non-zero width.
Similarly a piecewise Mahalanobis distance metric in the ρ dimension is used as the basis
for a chi-square test. The test requires computing the values of ρ ic and ρjc , which are the
distances to the center of the block:
ρia + ρib
ρic = , (5.40)
2
ρja + ρjb
ρjc = , (5.41)
2
(5.42)
Let ∆ij
ρ be the maximum distance between the center values that would result in any overlap:
wi + wj
∆ij
ρ = . (5.45)
2
D 2 = 0, (5.46)
The test for the block length overlap in the φ dimension is identical to the test outlined for
a line segment in Section 4.4.2. For reference, the piecewise Mahalanobis distance metric
in the ψ dimension is repeated here. First, the center values of ψ ci and ψcj along the line
segments are computed as
ψai + ψbi
ψci = , (5.49)
2
ψaj + ψbj
ψcj = , (5.50)
2
overlap:
`i + ` j
∆ij
ψ = . (5.53)
2
D 2 = 0, (5.54)
2
(ψci − ψcj − ∆ij
ψ)
2
D = , (5.55)
Pψi a ψa + Pψj b ψb
2
(ψci − ψcj + ∆ij
ψ)
2
D = . (5.56)
Pψi b ψb + Pψj a ψa
If D 2 > χ2 , one can safely invalidate the hypothesis and conclude that the features do
not overlap.
By validating all three overlap hypotheses, two blocks may describe the same underlying
structure, up to the flexibility allowed given this scale and length of the feature. If a pair
of blocks passes these three hypothesis tests, it is assigned as a match. These test are quite
flexible, especially when comparing features at a coarse scale. The tests developed below
check for the possible correspondences of the individual parameters that define the block
boundaries, which are more tightly defined.
(αi − αj )2
χ2 < j
. (5.57)
Pαα i
+ Pαα
Matching Hypothesis 2: The inner block normal bounds ρ ia and ρja correspond between
the blocks. Matching hypothesis 1 must also hold. The hypothesis is false if the following
holds:
T −1
i i j j
α i − α j P αα P αρ P αα P αρ a α i − α j
χ2 < a
+ . (5.58)
i j i i j j i j
ρa − ρ a P ρa α P ρa ρa P ρa α P ρa ρa ρa − ρ a
Matching Hypothesis 3: The outer block normal bounds ρ ib and ρjb correspond between
the blocks. Matching hypothesis 1 must also hold. The hypothesis is false if the following
holds:
T −1
i i j j
α i − α j P P P αα P αρ α i − α j
χ2 < αα αρb
+
b
. (5.59)
j j j
i
ρb − ρ b i i
P ρb α P ρb ρb P ρb α P ρb ρb ρib − ρjb
Matching Hypothesis 4: The lower block end bounds ψ ai and ψaj correspond between
the blocks. Matching hypothesis 1 must also hold. The hypothesis is false if the following
holds:
T −1
i i j j
αi − α j Pαα Pαψ Pαα Pαψ αi − α j
χ2 < a + a . (5.60)
ψai − ψaj Pψi a α Pψi a ψa Pψj a α Pψj a ψa ψai − ψaj
Matching Hypothesis 5: The upper block end bounds ψ bi and ψbj correspond between
the blocks. Matching hypothesis 1 must also hold. The hypothesis is false if the following
134
holds:
T −1
i i j j
αi − α j Pαα Pαψ Pαα Pαψ αi − α j
χ2 < b + b . (5.61)
ψbi − ψbj Pψi b α Pψi b ψb Pψj b α Pψj b ψb ψbi − ψbj
For a perfect block correspondence, all five parameters of the blocks would be determined
to match. In practice, often a smaller subset of the parameters match either due to the
differences in scaled feature fitting discussed above, or due to actual differences in the
underlying data from occlusion effects or a changing environment. In my merging and
mapping methods, I use these parameter matching tests to guide which elements of the
block pair will be merged to help localize the robot.
It is important to note that in the case where multiple parameters pass the above tests,
for simplicity I do not take into account the possible cross coupling between the ρ and ψ
parameters in the block feature covariance matrices. The result is a more lenient test with
the possibility of accepting a set of hypotheses that would be rejected if the full covariance
of each of these terms were applied in a combined chi-square test. In general, the effect of
ignoring this coupling when establishing correspondences is small. If needed, it is possible to
introduce a second level of chi-square testing that assembles all terms which pass the above
tests and compares them using the corresponding covariance matrices. For example, in the
case where all of the above tests pass, the testing could be further refined by computing
−1
χ2 < (Bi − Bj )T PBi + PBj (Bi − Bj ), (5.62)
where χ2 is the chi-square threshold defined for a five degree of freedom system. Similar tests
can be constructed for other combinations of matches. In the case where the combined test
fails, the parameter with the highest Mahalanobis distance calculated above can be thrown
out and the combined comparison repeated with the remaining parameters.
This section develops a test that complements the chi-square based tests above, by devel-
oping metrics that assess the confidence in a match or set of matches. As mentioned in
Section 2.1, the chi-square test is effective at filtering out possible bad matches, but it does
not alone give a measure of confidence in the truth of the given hypothesis. For example, a
135
block feature measurement with a huge level of uncertainty in all dimensions would match
a large range of differing blocks with a very low set of Mahalanobis distance D calculations,
but these low distance metric calculations give us no hint that a false positive match is very
likely for that feature. This effect is especially strong when considering my scale overlap
comparison test outlined above, which increasingly opens up the matching method to the
possibility of false positives, especially at a coarse scale.
At the core of the match confidence test is an estimation of the probability of a false
positive between two features given their scale and noise properties. First I define the full
configuration space of a block feature S as a three-dimensional space bounded by [−π, π]
in the α dimension and by [−dmax , dmax ] in the ρ and φ dimensions where dmax is the
maximum spatial dimension of the area that is being compared.
First consider the probability that block orientations will match by chance. Consider
the maximum angle difference that will pass the first hypothesis of angle overlap outlined
in Section 5.4.2. To calculate this the value of D 2 in Eq (5.39) is set to be equal to the
chosen threshold value χ2 as follows:
(|αi − αj | − ∆ij
α)
2
χ2 = . (5.63)
i + Pj
Pαα αα
q
¯ ij = ∆ij +
|αi − αj | = ∆ i + P j ).
χ2 (Pαα (5.64)
α α αα
in the above calculation depend on the actual block position, only on inherent properties of
each block. The probability that two blocks will match in angle given a uniform distribution
¯ ij
in possible difference in angles can be denoted as the ratio ∆ α /(2π). Similarly in the ρ and
The overall probability of a match M ij between blocks Bi and Bj can therefore be estimated
136
as !
¯ ij
∆ ¯ ij
∆ ¯ ij
∆
α ρ ψ
P(Mij ) = . (5.67)
2π 2dmax 2dmax
This value of P(Mij ) gives us a notion of what the significance of a given match between two
block features would be, with a lower probability suggesting a rarer occurrence by chance
and therefore a more meaningful match. This notion of match significance is useful when
considering groups of matches, and is used in developing a threshold to determine whether a
set of matches at a certain scale is good enough to have confidence in the position estimate
from the current scale or whether it is necessary to proceed to a finer scale.
A scale tree is a tree of block features extracted from a common data set across multiple
scales. Parent-child connections on the tree are established for features at different scales
wherein the child feature at a finer scale has been extracted from a subset of the data
encompassed by the parent. Before construction, the set of discrete scales that corresponds
to each level of the tree is determined. As is common in multi-scale methods [Lin94, Low99],
a set of scales is generated that increases at each step from fine to coarse by factors of 2.
There are two methods of building a scale tree. The first is bottom-up construction
started by extracting features at the finest scales and combining them into coarser scale
features while climbing the tree. The second is top-down construction, started by extracting
the coarsest features and then extracting finer and finer scale subfeatures from the coarser
features.
If the finest scale of features needed is known a priori, a scale tree can be efficiently con-
structed from the bottom-up. First a full set of features is extracted at the finest scale as per
the methods in Section 5.3. The features can be extracted at subsequent scales either from
the raw data or by combining the features themselves. In the raw data approach the blocks
are extracted at each scale just as before but some computational efficiency can be gained
at each subsequent scale by reusing and sub-sampling the Hough transform calculations
computed for the previous finer scale (much like building a Gaussian scale tree). Alterna-
137
tively the finer scaled features can be merged together into coarser features by expanding
each feature by the scale difference and then matching and merging the coarser features.
A scale tree can also efficiently be built using top-down construction where the coarsest
scales are extracted first. This method is especially effective in applications where it isn’t
initially known how fine a scale will be needed to go to achieve success. In this construction
method a set of features is first extracted from the raw data at the coarsest scale. The raw
data points are then separated into subgroups corresponding to the coarse features that
encompass the data. Subsequent feature extraction at finer scales is performed on these
subgroups of points independently. This allows for improved computational efficiency at
the finer scales as the smaller range of these subgroups of points can result in a narrower
parameter search as discussed in Section 5.3.4. Each feature at the coarser scale has a
parent-child relationship in the scale tree with the finer scale features extracted from its
point subgroup.
Another significant computational benefit can arise if the application can be effective
when using a subset of data at a finer scale. In this case, the top-down scale tree can be
constructed such that the child features of only some of the features at the coarser scale are
computed.
This work focuses on the top-down construction method, the results of which can be
seen in Fig. 5.1A–E.
When establishing correspondence using scale trees, knowledge of the tree’s structure can
help to maximize matching efficiency and enable increased robustness to unmodeled errors.
The most straightforward method of establishing correspondence is to build scale trees of
two data sets and establish correspondences between features across the sets. This method
traverses from the coarse to fine scales on both trees, establishing correspondences at each
scale down to the finest scale features. In this case each search for a correspondence pair
can be abbreviated to only compare the children of two features that have corresponded at
a coarser scale.
138
While the relative computational improvement of a tree based method over the direct
calculation of correspondence of the features at the finest scale will depend upon the data
set, the improved efficiency is inherent in this method, as shown in the following simplified
analysis. Assume that the goal is to determine the correspondence between N features in
two different range scans. For the sake of simplicity, assume that N = 2 m for some integer m.
The process of finding correspondences between the N features in two scans has complexity
βN 2 , where β is a scaling coefficient that depends upon the details of the correspondence
method. Now consider the computation involved in finding correspondences with the scale
trees. Assume for the sake of a simplistic argument that the scale tree is dyadic — each node
has two children nodes. At the coarsest scale, only search for correspondences between two
pairs of features, which results in a computational cost of β2 2 . At the next finer scale, there
are four features to check for possible matches. However, it is only necessary to check for
correspondences among the features of the children that descend from the parent nodes that
were found to be in correspondence at the coarser scale. This results in a computational cost
of 2β22 . Continuing in this fashion for log(m − 1) levels in the tree, it can be determined
that the multi-scale version of correspondence requires computational effort of β(N − 1)2 2 .
Thus, for the specific case of a dyadic tree, my method should scale linearly with the number
of features, as opposed to quadratically for a fine scale analysis. While not all scale trees will
be dyadic, clearly there are substantial computational savings to be had with this approach.
Fig. 5.12 presents an example using data collected from a Sick LMS-200 range scanner
in an indoor office environment. All computations and timing estimates are done using
the Matlab programming environment running on a 2.0 Ghz Athlon PC with 1 GB RAM.
The first set of data, termed “scan 1,” is actually the same data found in Fig. 5.1B. The
second set of data, termed “scan 2,” was taken at a nearby robot pose. Figs. 5.12A,B show
the scale tree and the finest features extracted from scan 1 and Figs. 5.12C,D show the
scale tree and finest features extracted from scan 2. For this example I assume a known
and somewhat accurate estimate of the displacement between the two poses (such as might
be provided by odometry). The goal is to determine the correspondences between the
features at the finest scale of each pose. There are 52 fine features detected in scan 1 and
53 fine features detected in scan 2. The total number of feature-to-feature comparisons
139
200
200
Scale (mm)
Scale (mm)
100
100
25 50
25 50
Matched feature nodes Matched feature nodes
Robot pose 2
Robot pose 1
1000 mm 1000 mm
Figure 5.12: Multi-scale range scan representation: A) Scale tree for pose 1. B) Finest
scale features pose 1. B) Scale tree for pose 2. D) Finest scale features for pose 2. E)
Corresponding features from pose 1, pose 2.
carried out at the finest scale by an exhaustive search is 2756, which takes 0.55 seconds
of processing time. The multi-scale version proceeds by using the chi-square test at every
scale to check for matches between features. When feature matches are found (i.e., the
140
feature pairs pass the chi-square test), then the child nodes of the matching features are
compared with the chi-square test. This approach significantly reduces the search space.
When comparing the same data set and taking advantage of the scale tree structure, the
same set of correspondences is extracted using only 470 comparisons in 0.18 seconds. I
repeated this example for 100 pairs of unique scans in similar environments. The results
showed an average time for the exhaustive search of 0.329 seconds for 1708 comparisons,
and an average computational time for the multi-scale matching of 0.086 seconds for 274
comparisons. These results show a nearly fourfold decrease in computation time and more
than a sixfold decrease in computational complexity using my multi-scale approach.
Here I consider multi-scale matching for relative robot displacement estimation. I show
that the multi-scale approach leads to improved robustness with respect to perturbed initial
conditions.
This example focuses on the most basic process of registering two scans, which can be used
in a scan-matching odometry process (as in Chapter 3), or as part of the solution to the
kidnapped robot problem. I consider two scans taken at different poses (the same scans as in
the last example), and seek to estimate the relative displacement between these poses. My
method starts at the coarsest scale and extracts block features from each scan and computes
feature correspondences given an initial (but not necessarily accurate) displacement estimate
(e.g., from odometry). These initial correspondences are used to correct the displacement
estimate, which is then applied when extracting features and computing correspondences at
the next finer scale. This method is repeated at each scale down the scale tree to finest scale
features, where the most accurate displacement estimate is computed. While the same pair
of range scans is used as seen in Figure 5.1, I have purposely introduced a significant error
in the initial estimate of displacement between the poses. The error introduced is consists
of 2 meters in displacement and 10 degrees of relative rotation. In this way, the robustness
of the matching process to a poor initial displacement guess can be tested.
Figure 5.13A shows the two scans overlaid with this initial displacement error and the
141
Scale = 400 mm
True pose 2
Perturbed pose 2
Pose 1
Features, pose1
Features, pose2
Figure 5.13: Multi-scale localization example where the blue circle is pose 1, the red circle
is the estimated pose 2, and the black circle is the actual pose 2. A) Initial pose esti-
mates and raw scans. B) Coarse feature fit. C–G) Intermediate pose estimates and feature
correspondences at each scale.
142
initial corresponding coarse scale features. The green ellipses in these plots represents
the 3σ bounds of uncertainty of the poses. Figure 5.13B shows the poses after initial
correction from the correspondences at the coarsest scale. The subsequent Figs. 5.13C–G
show the corresponding feature sets at increasingly fine scales as the displacement estimate
improves. Figure 5.13H shows the final point overlay with the corrected displacement
estimate. Note that conventional single-scale correspondence and displacement estimation
algorithms [LM97b, PKRB02], were unable to establish correspondences between the fine
scale features. Thus, this example shows that the multi-scale approach can significantly
improve robustness to initial displacement errors while maintaining accurate displacement
estimates.
This section introduces my extended Kalman filter based SLAM (simultaneous localization
and mapping) algorithm, which uses scaled block features as the primary representation
of the environment. Some general background information on the extended Kalman filter
(EKF) is introduced in Section 2.3 and the derivation of my line segment–based EKF is
introduced in Section 4.6. This section will extend the line segment EKF to account for
block features.
The state vector X at time step k of the EKF takes the following form:
x
y
φ
Xk =
,
(5.68)
B1
...
Bn
k
where [x, y, φ] represents the robot’s pose and B 1 ...Bn the block features added to the filter.
All variables are represented with respect to a common global reference frame and the
representation of the block feature is shown in Eq. (5.1). Note that the length of the state
143
vector is m = 5n + 3. The covariance for the EKF is defined as P Xk and is represented as
an m × m matrix that maintains all cross-correlations of each feature and the robot’s pose.
The propagation equations for the block based EKF are identical to those derived for the
line segment–based EKF in Section 4.6.2.
Here I consider a sensed block feature in the environment B̄ measured in the local coordinate
frame. If this feature is confirmed to correspond with the c th existing feature denoted as
Bc(k) , then the following process is used to incorporate that information to update the
robot state X and covariance PX . I present multiple variations of the update equations
that depend on the results of the parameter hypothesis tests introduced in Section 5.4.3.
Similar to my line based EKF update approach in Section 4.6.3, my method allows for
partial correspondences of features, and only updates the filter with the information given
by the parameters verified to match.
The following nonlinear function h k () represents the coordinates of measurement B̄ in
terms of the current state X and a set of measurement perturbations v k = [α , ρa , ρb , ψa , ψb ]:
B̄ = hk (Xk , vk )
αl αc(k) + α − φk
a a
ρl ρc(k) + ρa − xk cos(−αc(k) ) + yk sin(−αc(k) )
b b
ρl = ρc(k) + ρb − xk cos(−αc(k) ) + yk sin(−αc(k) ) . (5.69)
a a
ψl ψc(k) + ψa − xk sin(−αc(k) ) − yk cos(−αc(k) ) + α ρl
ψlb b
ψc(k) + ψb − xk sin(−αc(k) ) − yk cos(−αc(k) ) + α ρl
The general update equations for the extended Kalman filter are as follows.
−1
Kk = Pk|k−1 HkT Hk Pk|k−1 HkT + Vk PB̄ VkT , (5.70)
Pk = (I − Kk Hk )Pk|k−1 , (5.72)
144
where Hk and Vk are the Jacobian matrices of partial derivatives of h k () with respect to Xk
and vk calculated at each step k. It is these matrices, H k and Vk that change depending on
which parameters of the block that have been determined to correspond.
I first develop the case where the feature pair has passed all tests for parameter correspon-
dence in Section 5.4.3 and therefore all parameters of the block correspond between the
sensed feature B̄ and the stored feature Bc . For simplicity I drop the subscript k from each
term that denotes the k th time step of the filter. The Jacobian matrix of partial derivatives
of h() with respect to X can be computed in two parts as follows:
0 0 1
− cos(αc ) − sin(αc ) 0
Hg = − cos(αc ) − sin(αc ) 0 , (5.73)
sin(αc ) − cos(αc ) 0
sin(αc ) − cos(αc ) 0
1 0 0 0 0
x sin(αc ) − y cos(αc ) 1 0 0 0
HBc = x sin(αc ) − y cos(αc ) 0 1 0 0 , (5.74)
x cos(αc ) + y sin(αc ) 0 0 1 0
x cos(αc ) + y sin(αc ) 0 0 0 1
with Hg relating to the derivatives of h() with respect to the pose portion of the state, and
HBc relating to the derivatives with respect to the c th feature of the state. The complete
H matrix can therefore be assembled as
h i
H = Hg 0 ... HBc ... 0 , (5.75)
where the position of HBc in the matrix corresponds to the position of B c in X. The matrix
V can similarly be calculated
145
1 0 0 0 0
0 1 0 0 0
V = 0 0 1 0 0 , (5.76)
ρl 0 0 1 0
ρl 0 0 0 1
where
ρl = ρc(k) − xk cos(−αc(k) ) + yk sin(−αc(k) ). (5.77)
For a full feature match, these values of H and V are applied to Eqs. (5.71) and (5.72) to
calculate the updated state and covariance.
In the case where a subset of the feature coordinates has been determined to correspond in
a feature pair, the H and V matrices are adjusted by setting the row corresponding to the
unmatched coordinate to be all zeros. See Section 4.6.3 for a more detailed discussion on a
partial feature update for the EKF. In a partial feature update where some block boundaries
do not correspond, it may still be useful to extend the stored block feature given the new
measurement. After merging the portions of the block that were found to correspond, the
block boundary parameters can be extended in the state vector by applying the measured
block boundary values. When replacing these parameters, the covariance matrix can be
adjusted to reflect the uncertainly of this new measurement as well.
My approach to block based SLAM utilizes the methods developed above. Given an initial
state X and PX consisting only of the robot pose and pose covariance, the mapping process
can be described in the following set of steps:
Step 1) At the current pose a range scan is taken and a set of features is extracted using
methods from Section 5.3.
Step 2) Each extracted feature is compared with the stored features in the Kalman filter
state X using the methods in Section 5.4.
Step 3) Features that are shown to correspond are updated according to the process out-
lined in Section 5.8.3.
Step 4) The set of sensed features that are isolated are added to the state as shown in
Section 5.8.3.
Step 5) Propagate the state given a robot displacement g according to the methods in
Section 5.8.2. Go to Step 1.
Using this process maps can be built that accurately and efficiently describe the envi-
ronment while localizing the robot during map construction. Note that this method only
builds the KF with features at a single chosen scale. A multi-scale map can be built by
maintaining a map representation at every scale. This can be done by running multiple
KFs, but the errors would be larger at the coarser scale and so a scale tree would be dif-
ficult to construct for these data. Alternatively, one can apply this SLAM method only at
the finest scale, and with every new data set, solve for the current position. This position
can then be used as a fixed measurement at which to merge the coarser scale data to the
corresponding coarse map representations.
This section formulates the kidnapped robot problem, and presents a multi-scale solution
that offers significant performance benefits over single-scale methods. The kidnapped robot
problem describes the rare but challenging situation when the robot loses complete knowl-
edge of position while navigating. Once a new set of range data is taken, the algorithm must
147
determine if the robot is somewhere on the map, and if so then it must determine where the
robot is. A solution to the kidnapped robot solution can be thought of as a search algorithm
where, at the worst case, all possible combinations of position and orientation are checked
to see if the new collected data would match with the map data at that hypothetical pose.
I apply a multi-scale approach to feature representation and comparison to greatly reduce
the computational complexity of this search.
To implement a test of the multi-scale kidnapped robot problem, I start with the rep-
resentation of the prior map built using a Kalman filter described above at the finest scale.
At each step in the filter successively coarse scale from the data is merged as well using the
position solution computed at the coarse scale. The result is a multi-scale representation of
the full map as shown in Figures 5.14 and 5.15 on the left side. The data on the right side
of these figures represent the new kidnapped robot data at each scale.
The multi-scale search starts at the coarsest scale. Candidate poses are selected that
consist of aligning one end of a chosen pair of features. These hypotheses are then tested
using a simple test of block overlap. If the new data overlap more than some threshold
across the old data then the hypothesis passes, and the search continues at that pose. If
the test at the coarsest scale fails, then a new hypothesis is generated using a different
coarse feature pair. Figure 5.16 shows only a few of the hypotheses that are rejected at the
coarsest scales. For hypotheses that pass the initial test, the data sets are both compared
at the next finer scale using the same pose hypothesis. These tests continue until there is
a failure or the finest scale is positively validated. Figure 5.17 shows two of the hypotheses
that pass the first two hypothesis tests at the coarsest scales, but fail the third, while Figure
5.18 shows the alignment for the successful solution. This same solution can also be found
using an exhaustive search of candidate poses directly at the finest scale, but experimental
results show this to be computationally impractical.
This algorithm is implemented and profiled in Matlab on a 2.0 Ghz Athlon PC with 1 GB
RAM. I ran the kidnapped robot algorithm for 50 separate data sets and all successfully
converged to the proper solution. The full search over possible locations took an average
time of 9.65 seconds per run to complete. The average time until the first solution was found
is 2.74 seconds. The average number of hypotheses tested at each scale is 126.4 at 200 mm,
148
1000 mm 1000 mm
Block Features Block Features
Uncertainty Bounds Range Data Points
1000 mm 1000 mm
1000 mm 1000 mm
Figure 5.14: Kidnapped robot problem data: Multi-scale map and candidate scan repre-
sentation at scales of 200 mm, 100 mm, and 50 mm.
149
1000 mm 1000 mm
1000 mm 1000 mm
Figure 5.15: Kidnapped robot problem data: Multi-scale map representation and candidate
scan representation at scales of 25 mm and 12.5 mm.
150
1000 mm 1000 mm
1000 mm 1000 mm
1000 mm 1000 mm
1000 mm 1000 mm
1000 mm 1000 mm
Figure 5.17: A selection of two hypotheses with partial validation at the coarsest scales but
invalidated at the 50 mm scale.
152
1000 mm 1000 mm
1000 mm 1000 mm
1000 mm
Figure 5.18: A solution to the kidnapped robot problem with validated hypotheses at all
scales.
153
27.26 at 100 mm, 6.9 at 50 mm, 1.68 at 25 mm, and 1.0 at 12.5 mm. That means that all
of the final hypotheses passed the final check and no false hypotheses made it through the
25 mm scale test.
A single-scale exhaustive search at the fine scale is not computationally feasible. There
are 411 features in the full map at the finest scale and an average of 45 features in the
candidate data at the finest scale. Given that there are two hypotheses possible for each
pair of lines the total number of hypotheses to check is 36,990, which would require over
a half an hour of computation time.. This compares to an average total of 163.24 scan
comparisons for the multi scale approach. It is possible to formulate this exhaustive, single
scale search in a computationally feasible way for a more useful comparison. If the initial
candidate poses are selected from the longest lines first, and the algorithm quits as soon
as a match is found, then the single-scale kidnapped robot problem completes in a much
more reasonable average of 25.3 seconds for these same 50 runs. This result is still nearly
a factor of 10 slower than the 2.74 seconds average it takes for the multi-scale approach to
determine the first solution.
A second set of tests were run with 30 scans that were taken in different rooms, not on
the map. In these tests there were no false positives and the average processing time per
run was down to 8.3 seconds due to the early rejection of the majority of hypotheses. The
single-scale kidnapped robot approach is not a useful test in this case where the scan may
not lie in the map at all.
154
5.10 Multi-Scale Mapping and Localization Conclusions
This chapter presented novel methods to extract, compare, and merge a new multi-scale
feature. The multi-scale feature representation extends the line segment feature represen-
tation used in the previous chapter and inherits many of the similar traits of that feature.
The addition of the notion of scale allows for more flexibility in representing range scan
data. This chapter also introduces a notion of a scale tree structure containing multiple
representations of the same data at different scales. For applications such as feature cor-
respondence and the kidnapped robot problem where the number of comparisons can be
height, the use of a scale tree structure can result in significant computational benefits.
155
Chapter 6
I have developed and presented three approaches to localization and mapping: the point
based weighted scan matching method of Chapter 3; the line segment feature method of
Chapter 4; and the multi-scale feature based method of Chapter 5. Each of these methods
was implemented in software, and the benefits over comparable prior work were demon-
strated with real data.
A common theme in my approach has been the rigorous treatment of noise modeling.
The point based weighted scan matching method uses detailed sensor noise models directly
in the localization method to improve accuracy and robustness. Both the single- and multi-
scale feature-based approaches use these detailed sensor models in the feature extraction
process. This rigorous treatment of noise modeling results in an accurate model of feature
uncertainty, which in turn benefits the entire localization and mapping process. Though
accurate sensor modeling can degrade computational performance at times, my work shows
that for accuracy and overall robustness of the localization and mapping method, it desirable
to develop accurate models. A fast localization method is of no use if it is inaccurate.
Another common theme has been the careful consideration of the methods used to
establish correlation across data sets taken at different robot positions. The weighted scan
matching method compensates for correspondence errors through the explicit modeling of
that error. The feature based methods outline a set of correspondence tests that allow for
partial correspondence of features. This ability to establish partial correspondence enables
a finer grained data association process, which results in more and better matches. The
accurate modeling of the data uncertainty is also critical for the data association problem,
as the chi-square based hypothesis tests are only as good as the modeled uncertainty values.
Likewise the localization and mapping methods are only as good as the underlying data
156
association approach.
My contribution to line segment feature localization and mapping methods offers the ca-
pability of fully and efficiently representing arbitrary data sets. Through the representation
and compensation of nonlinear effects, I have developed a flexible feature that can be used
to represent very long line segments or short point-like segments. This approach also allows
for the representation of intermittently spaced line segments. The benefits of this approach
were demonstrated in Section 4.7.1 by an example that compares my method with prior
methods. In future work and testing, this flexible representation will naturally lend itself to
outdoor applications, where it will not suffer as much as other line based methods for the
lack of structure in the environment. Also, it will be more robust to occlusions found in an
environment where many mobile objects move in and out of the field of view. Further tests
and developments can be done to test the limits of this robustness.
The multi-scale localization and mapping approach offers a more significant departure
from prior work in the field. A novel approach to the feature extraction process was intro-
duced, along with the methods to compare and merge features at any scale. The result is
a detailed framework of block feature based methods that are flexible and can be applied
to existing localization and mapping techniques. The multi-scale feature showed significant
computational benefits when applied to feature correspondence determination and the kid-
napped robot problem. In future work, the multi-scale approach can be used to further
improve the efficiency of overall localization and mapping methods by enabling only a par-
tial construction of the scale tree. In such a case, only the necessary subset of features
would be extracted at finer scales for localization. Also, the multi-scale methods could be
further tested in a less structured, outdoor environment. The coarse scale features would
allow for more efficient representation of difficult to model obstacles, e.g., foliage or other
textured obstacles.
Though each of the three methods presented here stand on their own with significant
contributions to the field, the multi-scale approach is the culmination of the lessons learned
from the point-based and line segment–based approaches. The results of the multi-scale
approach suggest significant potential for its continued use in localization and mapping
methods.
157
Bibliography
[ABL+ 01] M.C. Amann, T. Bosch, M. Lescure, R. Myllylä, and M. Rioux. Laser ranging:
a critical review of usual techniques for distance measurement. Opt. Eng.,
40(1):10–19, Jan. 2001.
[Ada00] M.D. Adams. Lidar design, use, and calibration concepts for correct environ-
mental detection. IEEE Trans. Robotics and Automation, 16(6):753–761, Dec.
2000.
[AH93] S. Atiya and G.D. Hager. Real-time vision-based robot localization. IEEE
Trans. on Robotics and Automation, 9:785–800, Dec. 1993.
[AP96] M.D. Adams and P.J. Probert. The interpretation of phase and intensity data
from AMCW light detection sensor for reliable ranging. Int. J. of Robotics
Research, 15(5):441–458, Oct. 1996.
[BA00] G.A. Borges and M.J. Aldon. A split-and-merge segmentation algorithm for line
extraction in 2-D range images. In Proc. 15th Int. Conf. on Pattern Recognition,
Barcelona, Sept. 2000.
158
[Bal81] D. Ballard. Generalizing the Hough transform to detect arbitrary shapes. Pat-
tern Recognition, 13(2):111–122, 1981.
[BEFW97] J. Borenstein, H.R. Everett, L. Feng, and D. Wehe. Mobile robot positioning:
Sensors and techniques. Journal of Robotic Systems, 14(4):231–249, 1997.
[CL85] R. Chatila and J. Laumond. Positioning reference and consistent world mod-
eling for mobile robots. In Proc. IEEE International Conference on Robotics
and Automation, pages 138–145, 1985.
[CMNT99] J.A. Castellanos, J.M.M. Montiel, J. Neira, and J.D. Tardos. The spmap: A
probabilistic framework for simultaneous localization and map building. IEEE
Trans. on Robotics and Automation, 15(5):948–952, 1999.
[Cro89] J.L. Crowley. World modeling and position estimation for a mobile robot using
ultrasonic ranging. In Proc. IEEE Int. Conf. on Robotics and Automation,
pages 674–680, 1989.
[CT99] J.A. Castellanos and J.D. Tardos. Mobile Robot Localization and Map Building:
A Multisensor Fusion Approach. Kluwer Academic Publishers, 1999.
[DH72] R.O. Duda and P.E. Hart. Use of Hough transform to detect lines and curves
in pictures. Communications of the ACM, 15(1):11–15, 1972.
159
[DSTT01] F. Dellaert, S. Seitz, C. Thorpe, and S. Thrun. Feature correspondence: A
markov chain monte carlo approach. In Advances in Neural Information Pro-
cessing Systems 13 (NIPS 2000), 2001.
[Eng94] S. Engelson. Passive Map Learning and Visual Place Recognition. PhD thesis,
Dept. of Computer Science, Yale University, New Haven, CT, 1994.
[FLW95] J. Forsberg, U. Larsson, and A. Wernersson. Mobile robot navigation using the
range-weighted Hough transform. IEEE Robotics and Automation Magazine,
pages 18–26, March 1995.
[FP02] D.A. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice
hall, Saddle River, NJ, 1st ed. edition, 2002.
[GG97] J. Gonzalez and R. Gutierrez. Mobile robot motion estimation from a range
scan sequence. In Proc. IEEE Int. Conf. on Robotics and Automation, pages
1034–9, New York, NY, Apr. 20–25 1997.
[GMR98] J. Gomes-Mota and M.I. Ribeiro. Localisation of a mobile robot using a laser
scanner on reconstructed 3d models. In Proc. 3rd Portuguese Conf. on Auto-
matic Control, pages 667–672, Coimbra, Portugal, Sept. 1998.
[IK87] J. Illingworth and J. Kittler. The adaptive Hough transform. IEEE Trans. on
Pattern Analysis and Machine Intelligence, 9(5):690–697, 1987.
[IN99] L. Iocchi and D. Nardi. Hough transform based localization of mobile robots.
In Proc. IMACS/IEEE Int. Conf. Circuits, Systs., Comm., Computers, 1999.
[JC98] P. Jensfelt and H.I. Christensen. Laser based position acquisition and tracking
in an indoor environment. In Proc. Int. Symp. Robotics and Automation, 1998.
[JM00] A.E. Johnson and A. Miguel San Martin. Motion estimation from laser ranging
for autonomous comet landing. In Proc. IEEE Int. Conf. on Robotics and
Automation, pages 1788–1795, San Francisco, CA, Apr. 24–28 2000.
160
[KB01] T. Kadir and M. Brady. Saliency, scale and image description. International
Journal of Computer Vision, 45(2):83–105, 2001.
[LM97a] F. Lu and E. Milios. Globally consistent range scan alignment for environment
mapping. Autonomous Robots, 4:333–349, 1997.
[Low99] D.G. Lowe. Object recognition from local scale-invariant features. In Proc. of
the International Conference on Computer Vision, Corfu, Sept. 1999.
[LR02] Y. Liu and M.A. Rodriques. Accurate registration of structured data using two
overlapping range images. In Proc. IEEE Int. Conf. on Robotics and Automa-
tion, pages 2519–24, Washington D.C., May 11-15 2002.
[MM92] Y. Mesaki and I. Masuda. A new mobile robot guidance system using optical
reflectors. In Proc. Intelligent Robots and Systems, volume 1, pages 628 – 635,
July 1992.
[MO00] E. Magli and G. Olmo. Integrated compression and linear feature detection in
the wavelet domain. In Proc. Int. Conf. Image Processing, Vancouver, Canada,
Sept. 2000.
161
[MO01] E. Magli and G. Olmo. On high resolution positioning of straight patterns
via multi-scale matched filtering of the Hough transform. Pattern Recognition
Letters, 22(6/7):705–713, 2001.
[NTHS99] J. Neira, J.D. Tardós, J. Horn, and G. Schmidt. Fusing range and intensity
images for mobile robot localization. IEEE Trans. on Robotics and Automation,
15(1):76–84, Feb. 1999.
[NW00] J. Nygards and A. Wernersson. Model based fusion of laser and camera: range
discontinuities and motion consistency. In Proc. Int. Conf. on Information
Fusion, Paris, France, Jul. 10-13 2000.
[P+ 92] W.H. Press et al. Numerical Recipes in C: The art of scientific computing.
Cambridge Univ. Press, Cambridge, 2nd ed. edition, 1992.
[Pfi02] S.T. Pfister. Weighted line fitting and merging. Technical re-
port, California Institute of Technology, 2002. Available at:
http://robotics.caltech.edu/∼ sam/TechReports/LineFit/linefit.pdf.
[PKRB02] S.T. Pfister, K.L. Kriechbaum, S.I. Roumeliotis, and J.W. Burdick. Weighted
range sensor matching algorithms for mobile robot displacement estimation. In
Proc. IEEE Int. Conf. on Robotics and Automation, Washington, D.C., May
2002.
162
[PM90] P. Perona and J. Malik. Scale-space and edge detection using anisotropic dif-
fusion. IEEE Trans. on Pattern Analysis and Machine Intelligence, 12(7):629–
639, July 1990.
[PR98] D.K. Pai and L.M. Reissell. Multiresolution rough terrain motion planning.
IEEE Trans. on Robotics and Automation, pages 19–33, Feb. 1998.
[PRB03] S.T. Pfister, S.I. Roumeliotis, and J.W. Burdick. Weighted line fitting algo-
rithms for mobile robot map building and efficient data representation. In Proc.
IEEE Int. Conf. on Robotics and Automation, Taipei, Taiwan, Sept. 2003.
[RB00a] S.I. Roumeliotis and G.A. Bekey. Bayesian estimation and Kalman filtering:
A unified framework for mobile robot localization. In Proc. IEEE Int. Conf.
on Robotics and Automation, pages 2985–2992, San Fransisco, CA, Apr. 24–28
2000.
[RB00b] S.I. Roumeliotis and G.A. Bekey. SEGMENTS: A layered, dual-Kalman filter
algorithm for indoor feature extraction. In Proc. IEEE/RSJ Int. Conf. on
Intelligent Robots and Systems, pages 454–461, Takmatsu, Japan, 2000.
[RB02] S.I. Roumeliotis and J.W. Burdick. Stochastic cloning: A generalized frame-
work for processing relative state measurements. In Proc. IEEE Int. Conf.
on Robotics and Automation, pages 1788–1795, Washington D.C., May 11-15
2002.
[SC86] R.C. Smith and P. Cheeseman. On the representation and estimation of spatial
uncertainty. Int. J. of Robotics Research, 5(4):56–68, 1986.
163
[SNC97] P.W. Smith, N. Nandhakumar, and C.H. Chien. Object motion and structure
recovery for robotic vision using scanning laser range sensors. IEEE Trans. on
Robotics and Automation, 13(1):74–80, Feb. 1997.
[TG02] S. Takahashi and B.K. Ghosh. Motion and shape identification with vision and
range. IEEE Trans. on Robotics and Automation, 47(8):1392–6, Aug. 2002.
[TLK+ 04] S. Thurn, Y.F. Liu, D. Koller, Z. Ghahramai, and H. Durrant-Whyte. Simulta-
neous localization and mapping with sparse extended information filters. Int.
J. Robotics Research, pages 693–716, Jul-Aug 2004.
[VK99] N. Vlassis and B. Krose. Robot environment modeling via principal component
regression. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 1999.
Appendix A
Recall the log-likelihood formula of Eq. (3.14). Since D ij is independent of xij and yij ,
the necessary condition for an extremal in the log-likelihood function with respect to the
variable pij = [ xij yij ]T is
∇pij (M ij ) = 0 ⇔
nij
X
∇pij (εij
k ) T
(P ij −1 ij
k ) ε k = 0 ⇔
k=1
nij h i
X
2 ∇pij (εij
k ) T
(P ij −1 ij
k ) ε k = 0 ⇔
k=1
nij h i
X
−2 I (Pkij )−1 εij
k = 0 ⇔
k=1
nij h i
X
(Pkij )−1 (uik − Rij ujk − pij ) = 0.
k=1
∂M ij (φ)
= 0. (A.1)
∂φ
nij
ij 1 X ij T ij −1 ij
M = (εk ) (Pk ) εk .
2
k=1
Here we derive an exact expression for the quantity M ij as a function of δφ. From the
Taylor series expansion for the functions sin and cos we have
1 1
cos φ = cos φb − sin φb δφ − cos φb δφ2 + ...
1! 2!
1 1
sin φ = sin φb + cos φb δφ − sin φb δφ2 − ....
1! 2!
defined in Eq. (3.4), can be described as a function of the orientation error δφ:
εij
k = uik − pij − Rij ujk (A.4)
1
= uik − pij − R̂ij ujk − J R̂ij ujk δφ
1!
1
+ R̂ij ujk δφ2 + ....
2!
The covariance matrix for the matching error at the k th point correspondence of poses i
167
and j in Eq. (3.9) can also be described as a function of δφ:
where
The inverse Ikij (δφ) = (Pkij (δφ))−1 of the covariance matrix can be computed using Taylor
series expansion as
with
ij(n) ∂ n (I ij (δφ))
Ik (0) = ,
∂(δφ)n
δφ=0
where
ij(0)
Ik (0) = (Pkij (0))−1 = (Pkij )−1 = (Qij eij −1
k + Sk )
ij(1)
Ik (0) = −(Qij eij −1 eij eij ij eij −1
k + Sk ) (J Sk − Sk J)(Qk + Sk )
ij(2) ij(1) ij(1)
Ik (0) = 2Ik (0)Pkij (0)Ik (0) + 2(Sekij + J Sekij J).
168
By substituting from Eq.s (A.4), (A.6) to Eq. (3.12) we have
nij
1 X T ij
M ij = {pk Ik (0)pk
2
k=1
h i
ij(1)
+ −2pTk Ikij (0)Jqk + pTk Ik (0)pk δφ
h
+ pTk Ikij (0)qk − qkT JIkij (0)Jqk
ij(1) 1 T ij(2)
− 2pTk Ik (0)Jqk + pk Ik (0)pk δφ2
2
+ ... }, (A.7)
where
Note that there has been no approximation made up to this point. Eq. (A.7) is a complete
expression of the cost function Mij , expressed as an infinite series of terms polynomial in
the orientation estimation error δφ. In order to minimize this function, we approximate
it after considering a limited number of terms. For small errors in the initial orientation
estimate (δφ < π/6), a second-order approximation is sufficient when a large number of
point correspondences are available. Higher-order approximations are necessary as the
number of point correspondences decreases.
By substituting Eq. (A.7) in Eq. (A.2) and employing Eq. (A.10) 1 we derive the
expression for the orientation displacement error of Eq. (3.17).
Here we consider the estimation problem where n ij measurements Z = [Z1T ... ZnTij ]T with
Zk = [(uik )T (ujk )T ]T are processed to derive an estimate of a vector λ of the motion param-
1
Eq. (A.10) expresses the fact that the point correspondence errors are very small compared to the
distances to these points.
169
eters
p̂ij hp (Z)
λ̂ = = = h(Z), (A.11)
φ̂ij hφ (Z)
with the expressions for functions h p and hφ given by Eqs. (3.15) and (3.17). A first-order
approximation of the error in the estimate of the parameter vector λ̂ is given by
nij
X
ελb = ∇TZ h(Z) εZ = ∇TZk h(Zk ) εZk , (A.12)
k=1
with
h i
∇TZ h(Z) = ∇TZ1 h(Z) ... ∇TZn h(Z) (A.13)
ij
and
∇TZk hp (Z)
∇TZk h(Z) = . (A.14)
∇TZk hφ (Z)
Note that
b is
The covariance of the estimate λ
where
P Z1 . 0
PZ = E{εZ εTZ } = . . (A.16)
0 . PZnij
170
and
δuik h i
P Zk = E{εZk εTZk } = E{ (δuik )T (δujk )T }
δujk
Qij
k 0
= . (A.17)
0 Skij
nij
X
Pλb = ∇TZk h(Z) PZk ∇Zk hT (Z)
k=1
nij
X ∇TZk hp (Z)
= PZk ∇Zk hTp (Z) ∇Zk hTφ (Z)
k=1∇TZk hφ (Z)
Ppp Ppφ
= . (A.18)
Pφp Pφφ
nij
X
Pξζ = ∇TZk hξ (Z)PZk ∇Zk hTζ (Z) (A.19)
k=1
nij
X
= (∇Tui hξ ) Qij T
k (∇ui hζ )
k k
k=1
+ (∇Tuj hξ ) Skij (∇uj hTζ ) ,
k k
nij !−1
X
∇Tui hp = ij −1
(Pm ) (Pkij )−1 , (A.20)
k
m=1
nij !−1
X
∇ uj h p = − ij −1
(Pm ) (Pkij )−1 R̂ij , (A.21)
k
m=1
1
∇Tui hφ ' − qk J(Pkij )−1 , (A.22)
k rT
1
∇Tui hφ ' − qk J(Pkij )−1 R̂ij , (A.23)
k rT
with
Pkij = Qij ij T
k + R̂ij Sk R̂ij ,
qk = R̂ij ujk ,
nij
X
rT = − qkT J(Pkij )−1 Jqk .
k=1
In Eqs. (A.22), (A.23) we employed the approximation made in Eq. (A.10). The interested
reader is referred to [Rou01] for the details of these derivations.
By substituting Eqs. (A.20) to (A.23) in Eq. (A.19) the submatrices of the covariance
matrix for the estimated motion vector λ̂T = [ p̂Tij φ̂ij ] in Eq. (A.18) can now be computed.
The final expressions are given by Eqs. (3.16)–(3.21).
172
Appendix B
Given a set of measured points in polar form {( dˆk , φ̂k )}, k = 1...n we wish to estimate the
optimal infinite line L = (α, ρ) in polar form where α is the orientation of the line and ρ is the
normal position of the line. We define a virtual measurement δˆk to be the distance between
the k th point and the line L, and we minimize the total χ 2 cost of the virtual measurements
to calculate estimates for α and ρ. As discussed in Section 2.2, the minimization of the χ 2
cost is an equivalent calculation to the maximum likelihood approach.
Note that these derivations are carried out using the raw polar form of the range scan
point, and assume the general noise model outlined in Section 2.5.2. See [Pfi02] for a similar
derivation with Cartesian points and a generalized point uncertainty.
Let
εdk = dˆk − dk , (B.1)
where εdk is the error of the measurement dˆk with respect to the “true” point distance d k .
Similarly
εφk = φ̂k − φk . (B.2)
The virtual measurement representing the distance from the k th point to the line L = (α, ρ)
with no error is defined as
δk = dk cos(α − φk ) − ρ
= 0. (B.3)
173
For small εφk , εdk we use the approximations
In practice, when calculating Pδk , we can use the estimated values of φ̂k and dˆk in place of
the “true” values of φk and dk , which we would not know exactly. We can therefore rewrite
the covariance for the virtual measurement P δk as
This section derives the calculation of ψ P defined as the position along a candidate line
estimate L about which the combined rotational contributions of the virtual measurements
is zero. Pn ψˆk
k=1 Pδρ
k
ψP = Pn 1 , (B.10)
k=1 Pδρk
n
X
2 (δ̂k )2
χ (L) =
P δk
k=1
Xn
(dˆk cos(α − φ̂k ) − ρ)2
= , (B.12)
k=1
σd2k cos2 (α − φ̂k ) + σ 2 dˆ2 sin2 (α − φ̂k )
φk k
where
ρ
L= (B.13)
α
Given an estimate of the heading to the line α̂ , Eq. (B.12) can be written in terms of the
unknown ρ:
n
X n
X (δk )2
2 (dˆk cos(α̂ − φ̂k ) − ρ)2
χ (ρ) = = (B.14)
k=1
σd2k cos2 (α̂ − φ̂k ) + σ 2 dˆ2 sin2 (α̂ − φ̂k )
φk k
P δk
k=1
175
In order to minimize Eq. (B.14) we have to set
∂χ2 (ρ)
= 0⇔
∂(ρ)
n
X (−2)(dˆk cos(α̂ − φ̂k ) − ρ)
= 0⇔
P δk
k=1
Xn n
dˆk cos(α̂ − φ̂k ) X ρ
− = 0⇔
P δk P δk
k=1 k=1
n ˆ n
!
X dk cos(α̂ − φ̂k ) X 1
= ρ
P δk P δk
k=1 k=1
or
Pn dˆk cos(α̂−φ̂k )
k=1 P δk
ρ= Pn 1 , (B.15)
k=1 Pδ
k
Pδk = σd2k cos2 (α̂ − φ̂k ) + σφ2 k dˆ2k sin2 (α̂ − φ̂k ). (B.16)
Given an estimate of the distance to the line ρ̂ , Eq. (B.12) can be written in terms of the
unknown heading to line parameter α:
n
X X (δk )2 n
2 (dˆk cos(α − φ̂k ) − ρ̂)2
χ (α) = = . (B.17)
k=1
σd2k cos2 (α − φ̂k ) + σ 2 dˆ2 sin2 (α − φ̂k )
φk k
P δk
k=1
∂χ2 (α)
= 0. (B.18)
∂α
Instead of directly computing the gradient of χ 2 (α) with respect to α, we will calculate it
as follows:
so
n
X
χ2 (α) = χ2 (b
α + δα) = Gk (δα), (B.21)
k=0
with
(dˆk cos(b
α + δα − φ̂k ) − ρ̂)2
Gk (δα) = . (B.22)
σd2k α + δα − φ̂k ) + σ 2 dˆ2 sin2 (b
cos2 (b φk k α + δα − φ̂k )
1 0 1 00 1 000
Gk (δα) = Gk (0) + Gk (0)δα + Gk (0)δα2 + Gk (0)δα3 + ... (B.23)
1! 2! 3!
and let
ck = cos(b
α + δα − φ̂k ),
sk = sin(b
α + δα − φ̂k ) ,
so
ak (0)
Gk (0) = (B.25)
bk (0)
and
0 0
0 bk (0)ak (0) − ak (0)bk (0)
Gk (0) = (B.26)
(bk (0))2
177
and
0 0 00 0 00
00 −2ak (0)bk (0)(bk (0))2 + ak (0)(bk (0))3 + 2ak (0)bk (0)(bk (0))2 − ak (0)(bk (0))2 bk (0)
Gk (0) =
(bk (0))4
00 00
0 0
0
ak (0)bk (0) − ak (0)bk (0) bk (0) − 2 ak (0)bk (0) − ak (0)bk (0) bk (0)
= , (B.27)
(bk (0))3
so
n
X
2 1 0 1 00 1 000
χ (δα) = {Gk (0) + Gk (0)δα + Gk (0)δα2 + Gk (0)δα3 + ...} (B.28)
1! 2! 3!
k=1
Note: There is no approximation made up to this point. The previous equation is the
complete analytical expression of the cost function. It is expressed as an infinite series of
polynomial terms of the orientation estimation error δα. In order to minimize this function
we have to approximate it after considering a limited number of terms.
Finally, by substituting the previous expression with Eq. (B.26) and Eq. (B.27) in Eq.
(B.18) and solving for δα we have
Pn 0
k=1 Gk (0)
δα = − Pn 00 . (B.30)
k=1 Gk (0)
Appendix C
with distance to line covariance Pρρ , heading to line covariance Pαα and cross-correlation
covariance terms Pρα = Pαρ derived in the following sections.
Note that these derivations are carried out using the raw polar form of the range scan
point, and assume the general noise model outlined in Section 2.5.2. See [Pfi02] for a similar
derivation with Cartesian points and a generalized point uncertainty.
where
Y1
.
Y =
.
Yn
179
and
dˆk
Yk = , (C.3)
φ̂k
with
dˆk h i σd2k 0
PYk Yk = E{Yk TYk } = E{ dˆk φ̂k }= . (C.4)
φ̂k 0 σφ2 k
and therefore
cos(α̂−φ̂k )
Pδ
∇dˆk gρ = Pn k 1 , (C.6)
j=1 Pδj
dˆk sin(α̂−φ̂k )
Pδ
∇φ̂k gρ = Pn k 1 , (C.7)
k=j Pδj
180
so
n
X
Pρρ = ∇TYk gρ PYk Yk (∇Yk gρ )
k=1
∇dˆk gρ
n h
X i σd2 0
= ∇dˆk gρ | ∇φ̂k gρ
−
k=1 0 σφ2
∇φ̂k gρ
n
X n
X
= (∇dˆk gρ )2 σd2 + (∇φ̂k gρ )2 σφ2
k=1 k=1
2 2
n cos(α̂−φ̂k ) n dˆk sin(α̂−φ̂k )
X Pδ 2 X Pδ 2
= Pn k 1 σd + Pn k 1 σφ
k=1 k=1 Pδk k=1 k=1 Pδk
cos2 (α̂−φ̂k ) dˆ2k sin2 (α̂−φ̂k )
n
X (Pδ )2 2 (Pδ )2 2
= P k 2 σd + P k 2 σφ
n 1 n 1
k=1 k=1 Pδ k=1 Pδ
k k
Pn σd2 cos2 (α̂−φ̂ 2 ˆ2 sin2 (α̂−φ̂
k )+σφ dk k)
k k
k=1 (Pδk )2
= P 2
n 1
k=1 Pδk
Pn 1
k=1 Pδ
k
= P 2
n 1
k=1 Pδ
k
1
= P (C.8)
n 1
k=1 Pδ
k
dˆk
Yk = , (C.9)
φ̂k
Pαα = E{α Tα }
T
= E{ ∇TY gα Y ∇TY gα Y }
T
= ∇TY gα E{Y TY } ∇TY gα
= ∇TY gα PY Y (∇Y gα ) . (C.10)
181
From Eqs. (B.26), (B.27), (B.30),
gα = α = α
b + δα
Pn 0
k=1 Gk (0)
b − Pn
= α 00
k=1 Gk (0)
Pn 0
bk ak −ak bk
0
k=1 0
(bk )2
= α
b− P , (C.11)
n (a00k bk −ak b00k )bk −2(a0k bk −ak b0k )b0k
k=1 (bk )3
where
and α
b is a constant (the current estimate of orientation computed in the last step of the
ML algorithm) so
P P P P
n 0 n 00 n 0 n 00
∇dˆk j=1 Gj (0) j=1 Gj (0) − j=1 Gj (0) ∇dˆk j=1 Gj (0)
∇dˆk gα = − P 2
n 00
j=1 Gj (0)
P P
0 n 00 n 0 00
−∇dˆk (Gk (0)) j=1 Gj (0) + j=1 Gj (0) ∇dˆk (Gk (0))
= P 2
n 00
j=1 Gj (0)
0
1 0 G 00
= − 00 ∇dˆk (Gk (0)) + 00T 2 ∇dˆk (Gk (0)), (C.14)
GT (GT )
where
n
X n
X
0 0 00 00
GT = Gj (0) , GT = Gj (0).
j=1 j=1
Similarly,
0
1 0 GT 00
∇φ̂k gα = − 00 ∇φ̂k (Gk (0)) + 00 2 ∇φ̂ (Gk (0)) (C.15)
GT (GT ) k
182
and From Eq. (C.10), substituting from Eqs. (C.4), (C.14), (C.15):
n
X
Pαα = ∇TYk gα PYk Yk (∇Yk gα )
k=1
∇dˆk gα
n h
X i σd2 0
= ∇dˆk gα | ∇φ̂k gα
−
k=1 0 σφ2
∇φ̂k gα
n
X n
X
= (∇dˆk gα )2 σd2 + (∇φ̂k gα )2 σφ2
k=1 k=1
n
!2
X 1 0 GT 00
0
0 00
C.2.1 Complete Gk (0) Gk (0)
0 0
0 ba − ab
Gk (0) =
b2
and 0 0
00 00 0
00
a b − ab b − 2 a b − ab b
Gk (0) = ,
b3
with
c = cos(b
α − φ),
s = sin(b
α − φ),
a = (dc − ρ̂)2 ,
0 ∂a(δα)
a = , = −2ds(dc − ρ̂)
∂δα
00 ∂ 2 a(δα)
a = = 2d2 s2 − 2dc(dc − ρ̂),
(∂δα)2
b = σd2k c2 + σφ2 k d2 s2 ,
0 ∂b(δα)
b = = 2(d2 σφ2 k − σd2k )cs,
∂δα
00 ∂ 2 b(δα)
b = = 2(d2 σφ2 k − σd2k )(c2 − s2 ), (C.17)
(∂δα)2
183
we can calculate
0
0 ∂(G (0))
∇d (G (0)) =
∂d0 0
∂ ba b−ab
2
=
∂d
0 0 0 0 0 0
(a bd + ad b − b ad − bd a)b − (a b − b a)2bd
= , (C.18)
b3
0
0 ∂(G (0))
∇φ (G (0)) =
∂φ
0 0
∂ ba b−ab
2
=
∂φ
0 0 0 0 0 0
(a bφ + aφ b − b aφ − bφ a)b − (a b − b a)2bφ
= , (C.19)
b3
00
00 ∂(G (0))
∇d (G (0)) =
∂d
“ 00 00
” “ 0 0
” 0!
a b−ab b−2 a b−ak bk b
∂ b3
= , (C.20)
∂d
00
00 ∂(G (0))
∇d (G (0)) =
∂φ
“ 00 00
” “ 0 0
” 0!
a b−ab b−2 a b−ak bk b
∂ b3
= , (C.21)
∂φ
0 00
C.2.2 Approximate Gk (0) Gk (0)
Assume small errors such that |δ| << |r|, i.e., the distance from a point to the line is small
compared to the distance from that point to the origin, where
|δ| = |r cos(α − φ) − ρ|
184
dˆk cos φ̂k
~ k = k
|r| = kV k
dˆk sin φ̂k
given
0
ad >> a,
0 0
ad >> a ,
0
ad >> ad ,
so
0 0 0 0 0 0
0 (a bd + ad b − b ad − bd a)b − (a b − b a)2bd
∇d (G (0)) =
b3
0 2
ad b
'
b3
0
ad
' (C.30)
b
185
with
Similarly,
0
aφ >> a,
0 0
aφ >> a ,
0
aφ >> aφ
so
0
0 aφ
∇φ (G (0)) ' (C.32)
b
with
and 0 0
00 00 0
00
a b − ab b − 2 a b − ab b
Gk (0) =
b3
186
Consider Eqs. (C.22)–(C.29) and
00
a = −2dc(dc − ρ̂) + 2d2 s2 = −2dcδ + 2d2 s2 ∼ O(d2 ), (C.34)
00
b = 2(d2 σφ2 k − σd2k )(c2 − s2 ) ∼ O(d2 ), (C.35)
0 O(d3 δ) − O(d2 δ 2 )
Gk (0) ∼ ∼ O(δ/d), (C.36)
O(d4 )
00 O(d4 ) − O(d2 δ 2 ) O(d2 ) − 2 O(d3 δ) − O(d2 δ 2 ) O(d2 )
Gk (0) ∼ (C.37)
O(d6 )
∼ O(d/d) (C.38)
so
00 0 00 0
Gk (0) >> Gk (0) ⇒ GT >> GT
0
GT
⇒ 00 ' 0, (C.39)
(GT )2
so from Eqs. (C.30), (C.32), (C.33), and (C.39) we can approximate Eq. (C.16) as
n
X 2
1 0
Pαα ' − 00 ∇dˆk (Gk (0)) σd2
k=1
GT
X n 2
1 0
+ − 00 ∇φ̂k (Gk (0)) σφ2
k=1
GT
n
!2 n 0 !2
1 X X
0
adi 1 a φi
= 00 2 − σd2 + 00 2 − σφ2
(GT ) k=1 bk (GT ) k=1 bk
n
!2
1 X 2dˆk cos(α − φ̂k ) sin(α − φ̂k )
= 00 σd2
(GT )2 k=1 bk
n
!2
1 X −2d̂2k sin2 (α − φ̂k )
+ 00 2 − σφ2
(GT ) bk
k=1
n
!
1 X 4dˆ2k sin2 (α − φ̂k ) 2
= (σd cos2 (α − φ̂k ) + σφ2 dˆ2k sin2 (α − φ̂k )) .
b2k
00
(GT )2 k=1
187
Use definition of bk (Eq. (C.13)) to get
n
!
1 X 4dˆ2k sin2 (α − φ̂k )
Pαα ' (bk )
b2k
00
(GT )2 k=1
n
!
1 X 4dˆ2k sin2 (α − φ̂k )
= 00
(GT )2 k=1 bk
n
!
1 X 4dˆ2k sin2 (α − φ̂k )
= .
(σ 2 cos2 (α − φ̂k ) + σ 2 dˆ2 sin2 (α − φ̂k ))
00
(GT )2 k=1 d φ k
n
X
Pρα = ∇TYk gα PYk Yk (∇Yk gρ )
k=1
∇dˆk gρ
n h
X i σd2 0
= ∇dˆk gα | ∇φ̂k gα
−
k=1 0 σφ2
∇φ̂k gρ
n
X n
X
= (∇dˆk gα )σd2 (∇dˆk gρ ) + (∇φ̂k gα )σφ2 (∇φ̂k gρ )
k=1 k=1
0 0
n a cos(α−φ̂k ) n aφ dˆk sin(α−φ̂k )
X −( bkd ) b
X −( bk ) b
= 00 σd2 Pn k 1 + 00
2
σφ Pn k 1
k=1
GT j=1 bj k=1
GT k=j bj
n
! !
1 X 2dˆk cos(α − φ̂k ) sin(α − φ̂k ) cos(α − φ̂ k )
= P [ σd2
GT
00 n 1 bk bk
k=j bj k=1
! !
2dˆ2k sin2 (α − φ̂k ) 2 dˆk sin(α − φ̂k )
+ σφ ]
bk bk
" ! #
2dˆk sin(α − φ̂k ) 2
n
X
1
= P σd cos (α − φ̂k ) + σφ dˆk sin (α − φ̂k )
2 2 2 2
GT
00 n 1 b2k
k=j bj k=1
n
" #
1 X 2dˆk sin(α − φ̂k )bk
= P
GT
00 n 1 b2k
k=j bj k=1
n
" #
1 X 2dˆk sin(α − φ̂k )
= P
GT
00 n 1 bk
k=j bj k=1