Robotics Research The International Journal Of: Bounding On Rough Terrain With The Littledog Robot

Download as pdf or txt
Download as pdf or txt
You are on page 1of 25

The International Journal of

Robotics Research http://ijr.sagepub.com/

Bounding on Rough Terrain with the LittleDog Robot


Alexander Shkolnik, Michael Levashov, Ian R. Manchester and Russ Tedrake
The International Journal of Robotics Research published online 7 December 2010
DOI: 10.1177/0278364910388315

The online version of this article can be found at:


http://ijr.sagepub.com/content/early/2010/11/16/0278364910388315

Published by:

http://www.sagepublications.com

On behalf of:

Multimedia Archives

Additional services and information for The International Journal of Robotics Research can be found at:

Email Alerts: http://ijr.sagepub.com/cgi/alerts

Subscriptions: http://ijr.sagepub.com/subscriptions

Reprints: http://www.sagepub.com/journalsReprints.nav

Permissions: http://www.sagepub.com/journalsPermissions.nav

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


The International Journal of
Robotics Research
Bounding on Rough Terrain with the 00(000) 1–24
© The Author(s) 2010
LittleDog Robot Reprints and permission:
sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0278364910388315
ijr.sagepub.com

Alexander Shkolnik, Michael Levashov, Ian R. Manchester and Russ Tedrake

Abstract
A motion planning algorithm is described for bounding over rough terrain with the LittleDog robot. Unlike walking gaits,
bounding is highly dynamic and cannot be planned with quasi-steady approximations. LittleDog is modeled as a planar
five-link system, with a 16-dimensional state space; computing a plan over rough terrain in this high-dimensional state
space that respects the kinodynamic constraints due to underactuation and motor limits is extremely challenging. Rapidly
Exploring Random Trees (RRTs) are known for fast kinematic path planning in high-dimensional configuration spaces in
the presence of obstacles, but search efficiency degrades rapidly with the addition of challenging dynamics. A computa-
tionally tractable planner for bounding was developed by modifying the RRT algorithm by using: (1) motion primitives to
reduce the dimensionality of the problem; (2) Reachability Guidance, which dynamically changes the sampling distribu-
tion and distance metric to address differential constraints and discontinuous motion primitive dynamics; and (3) sampling
with a Voronoi bias in a lower-dimensional “task space” for bounding. Short trajectories were demonstrated to work on
the robot, however open-loop bounding is inherently unstable. A feedback controller based on transverse linearization was
implemented, and shown in simulation to stabilize perturbations in the presence of noise and time delays.

Keywords
Legged locomotion, rough terrain, bounding, motion planning, LittleDog, RRT, reachability-guided RRT, transverse
linearization

1. Introduction
While many successful approaches to dynamic legged loco-
motion exist, we do not yet have approaches which are
general and flexible enough to cope with the incredible
variety of terrain traversed by animals. Progress in motion
planning algorithms has enabled general and flexible solu-
tions for slowly moving robots, but we believe that in order
to quickly and efficiently traverse very difficult terrain,
Fig. 1. LittleDog robot, and a corresponding five-link planar
extending these algorithms to dynamic gaits is essential.
model.
In this work we present progress towards achieving agile
locomotion over rough terrain using the LittleDog robot.
LittleDog (Figure 1) is a small, 3-kg position-controlled stereotyped dynamic maneuvers that relied on an intermit-
quadruped robot with point feet and was developed by tent existence of a support polygon to regain control and
Boston Dynamics under the DARPA Learning Locomo- simplify planning (Byl et al. 2008). In this paper, we present
tion program. The program ran over several phases from a method for generating a continuous dynamic bounding
2006 to 2009, and challenged six teams from universities gait over rough terrain.
around the United States to compete in developing algo-
rithms that enable LittleDog to navigate known, uneven
Computer Science and Artificial Intelligence Lab, MIT, Cambridge, MA,
terrain, as quickly as possible. The program was very suc-
USA
cessful, with many teams demonstrating robust planning
Corresponding author:
and locomotion over quite challenging terrain (e.g. Pongas Alexander Shkolnik, Computer Science and Artificial Intelligence Lab,
et al. 2007; Rebula et al. 2007; Kolter et al. 2008; Zucker MIT, 32 Vassar Street, Cambridge, MA 02139, USA.
2009), with an emphasis on walking gaits, and some short Email: shkolnik@csail.mit.edu

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


2 The International Journal of Robotics Research 00(000)

Achieving bounding on LittleDog is difficult for a num-


ber of reasons. First, the robot is mechanically limited:
high-gear ratio transmissions generally provide sufficient
torque but severely limit joint velocities and complicate
any attempts at direct torque control. Second, a stiff frame
complicates impact modeling and provides essentially no
opportunity for energy storage. Finally, and more gener-
ally, the robot is underactuated, with the dynamics of the
unactuated joints resulting in a complicated dynamical rela-
tionship between the actuated joints and the interactions
with the ground. These effects are dominant enough that
they must be considered during the planning phase.
In this work, we propose a modified form of the Rapidly
Exploring Random Tree (RRT) (LaValle and Kuffner 2001)
planning framework to quickly find feasible motion plans
for bounding over rough terrain. The principal advan-
tage of the RRT is that it respects the kinematic and
dynamic constraints which exist in the system, however for
high-dimensional robots the planning can be prohibitively
slow. We highlight new sampling approaches that improve
the RRT efficiency. The dimensionality of the system is Fig. 2. Dog bounding up stairs. Images from video available at
addressed by biasing the search in a low-dimensional task http://www.flickr.com/photos/istolethetv/3321159490/
space. A second strategy uses reachability guidance as a
heuristic to encourage the RRT to explore in directions that experimental results. Section 6 concludes the paper, and
are most likely to successfully expand the tree into previ- discusses remaining open problems.
ously unexplored regions of state space. This allows the
RRT to incorporate smooth motion primitives, and quickly 2. Background
find plans despite challenging differential constraints intro-
duced by the robot’s underactuated dynamics. This planner The problem of fast locomotion over rough terrain has been
operates on a carefully designed model of the robot dynam- an active research topic in robotics, beginning with the sem-
ics which includes the subtleties of motor saturations and inal work by Raibert in the 1980s (Raibert 1986; Raibert
ground interactions. et al. 1986). The research can be roughly divided into two
Bounding motions over rough terrain are typically not categories. The first category uses knowledge of the robot
stable in open loop: small disturbances away from the nom- and environment within a motion planning framework. This
inal trajectory or inaccuracies in physical parameters used approach is capable of versatile behavior over rough terrain,
for planning can cause the robot to fall over. To achieve reli- but motion plans tend to be slow and conservative. The
able bounding it is essential to design a feedback controller other category is characterized by a limit-cycle approach,
to robustly stabilize the planned motion. This stabilization which is usually blind to the environment. In this approach,
problem is challenging due to the severe underactuation of more dynamic motions may be considered, but typically
the robot and the highly dynamic nature of the planned tra- over only a limited range of behavior. In this section we
jectories. We implemented a feedback controller based on review these approaches in turn.
the method of transverse linearization which has recently
emerged as an enabling technology for this type of con-
2.1. Planning Approaches
trol problem (Hauser and Chung 1994; Shiriaev et al. 2008;
Manchester et al. 2009; Manchester 2010). Planning algorithms have made significant headway in
The remainder of this paper is organized as follows: in recent years. These methods are particularly well devel-
Section 2 we begin by reviewing background information, oped for kinematic path planning in configuration space,
including alternative approaches to achieve legged loco- focusing on maneuvers requiring dexterity, obstacle avoid-
motion over rough terrain with particular attention given ance, and static stability. Sampling-based methods such as
to motion planning approaches. In Section 3 we present the RRT are very effective in planning in high-dimensional
a comprehensive physics-based model of the LittleDog humanoid configuration spaces. The RRT has been used
robot and the estimation of its parameters from experi- to plan walking and grasping trajectories amidst obsta-
ments. Section 4 discusses motion planning for bounding, cles by searching for a collision-free path in configura-
with a detailed description of the technical approach tion space, while constraining configurations to those that
and experimental results. In Section 5, we describe the are statically stable (Kuffner et al. 2002, 2003). The robot
feedback control design and show some simulation and is statically stable when the center of mass (COM) is

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


Shkolnik et al. 3

directly above the support polygon, therefore guarantee- and walking over varied terrain with the HRP-2 humanoid
ing that the robot will not roll over as long as the motion robot (Hauser et al. 2008; Hauser 2008). In that work,
is executed slowly enough. After finding a statically feasi- contact points and equilibrium (static) stances, acting as
ble trajectory of configurations (initially ignoring velocity), way points, were chosen by using a Probabilistic Road
the trajectory is locally optimized for speed and smooth- Map (Kavraki et al. 1996), an alternative sampling-based
ness, while maintaining the constraint that at least one foot planning strategy. The planner searched for paths through
remains flat on the ground at all times. This approach potentially feasible footholds and stances, while taking into
has been extended to account for moving obstacles and account contact and equilibrium constraints to ensure that
demonstrated on the Honda Asimo (Chestnutt et al. 2005). the robot maintains a foot or hand hold, and does not slip.
An alternative approach is to first generate a walking pat- Motion primitives were then used to find quasi-static local
tern while ignoring obstacles and collisions, and then use motion plans between stances that maintain the non-slip
random sampling to modify the gait to avoid obstacles constraints. With a similar goal of traversing very rough
while verifying constraints to ensure the robot does not terrain with legged robots, the DARPA Learning Locomo-
fall (Harada et al. 2007). Current methods are adept at tion project, utilizing the LittleDog robot, has pushed the
planning in high-dimensional configuration spaces, but typ- envelope of walking control by using careful foot place-
ically only for limited dynamic motions. Sampling-based ment. Much of the work developed in this program has
planning algorithms are in general not well suited for plan- combined path planning and motion primitives to enable
ning fast dynamic motions, which are governed largely by crawling gaits on rough terrain (e.g. Pongas et al. 2007;
underactuated dynamics. Rebula et al. 2007; Kolter et al. 2008; Ratliff et al. 2009).
The use of static stability for planning allows one to Similar to the approaches of the other teams, during the first
ignore velocities, which halves the size of the state space, two phases of the program, MIT utilized heuristics over the
and constrains the system to be fully actuated, which greatly terrain to assign a cost to potential footholds. A* search was
simplifies the planning problem. Statically stable motions then used to find a trajectory of feasible stances over the
are, however, inherently conservative (technically a robot is lowest cost footholds, and a ZMP-based body motion and
truly statically stable only when it is not moving). This con- swing-foot motion planner was used to generate trajectories
straint can be relaxed by using a dynamic stability criteria to walk over rough terrain.
(see Pratt and Tedrake (2005) for review of various metrics). Use of stability metrics such as ZMP have advanced the
These metrics can be used either for gait generation by the state of the art in planning gaits over rough terrain, how-
motion planner, or as part of a feedback control strategy. ever these stability constraints result in very conservative
One popular stability metric requires the center of pressure, dynamic trajectories, for example by requiring that at least
or the Zero Moment Point (ZMP), to be within the support one foot is always flat on the ground. Because ZMP-based
polygon defined by the convex hull of the feet contacts on control systems do not reason about underactuated dynam-
the ground. While the ZMP is regulated to remain within ics, the humanoid robots do not perform well when walking
the support polygon, the robot is guaranteed not to roll over on very rough or unmodeled terrain, cannot move nearly
any edge of the support polygon. In this case, the remain- as quickly as humans, and use dramatically more energy
ing degrees of freedom can be controlled as if the system is (appropriately scaled) than a human (Collins et al. 2005).
fully actuated using standard feedback control techniques Animals do not constrain themselves to such a regime of
applied to fully actuated systems. Such approaches have operation. Much more agile behavior takes place precisely
been successfully demonstrated for gait generation and exe- when the system is operating in an underactuated regime,
cution on humanoid platforms such as the Honda Asimo for example during an aerial phase, or while rolling the foot
(Sakagami et al. 2002; Hirose and Ogawa 2007), and the over the toe while walking. In such regimes, there is no
HRP series of walking robots (Kaneko et al. 2004). Lower- support polygon, so the legged robot is essentially falling
dimensional “lumped” models of the robot can be used and catching itself. Furthermore, underactuated dynamics
to simplify the differential equations that define ZMP. In might otherwise be exploited: for example, a humanoid
Kajita et al. (2003), the HRP-2 robot was modeled as a cart robot can walk faster and with longer strides by rolling the
(rolling point mass) on top of a table (with a small base foot over the toe.
that should not roll over). Preview control was then applied
to generate a COM trajectory which regulates the desired
ZMP position. Note that although the ZMP is only defined
2.2. Limit-cycle Approach
on flat terrain, some extensions can be applied to extend the Somewhat orthogonal to the planning approaches, a signif-
idea to 3D, including using hand contacts for stability, for icant body of research focuses on limit-cycle analysis for
example by considering the Contact Wrench Sum (CWS) walking. Tools developed for limit-cycle analysis allow one
(Sugihara 2004). to characterize the behavior of a particular gait, typically on
The works described so far were mostly demon- flat terrain. Stable limit-cycle locomotion can be achieved
strated on relatively flat terrain. Other quasi-static plan- by using compliant or mechanically clever designs that
ning approaches were applied to enable climbing behavior enable passive stability using open-loop gaits (e.g. Collins

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


4 The International Journal of Robotics Research 00(000)

et al. 2005), or otherwise through the use of reflexive con- The LittleDog robot has 12 actuators (two in each hip,
trol algorithms that tend to react to terrain. Recent appli- one in each knee) and a total of 22 essential degrees of
cations of these strategies, for example on the Rhex robot freedom (six for the body, three rotational joints in each
(Altendorfer et al. 2001) and BigDog (Raibert et al. 2008), leg, and one prismatic spring in each leg). By assuming that
have produced impressive results, but these systems do not the leg springs are over-damped, yielding first-order dynam-
take into account knowledge about upcoming terrain. ics, we arrive at a 40-dimensional state space (18 × 2 + 4).
Feedback control of underactuated “dynamic walking” However, to keep the model as simple (low-dimensional) as
bipeds has recently been approached using a variety of possible, we approximate the dynamics of the robot using
control methods, including virtual holonomic constraints a planar five-link serial rigid-body chain model, with rev-
(Chevallereau et al. 2003; Westervelt et al. 2003, 2007) with olute joints connecting the links, and a free base joint, as
which impressive results have been demonstrated for a sin- shown in Figure 3. The planar model assumes that the back
gle limit-cycle gait over flat terrain. In this paper, we use an legs move together as one and the front legs move together
alternative method based on the combination of transverse as one (see Figure 1). Each leg has a single hip joint, con-
linearization and time-varying linear control techniques necting the leg to the main body, and a knee joint. The foot
(Shiriaev et al. 2008; Manchester et al. 2009; Manchester of the real robot is a rubber-coated ball that connects to the
2010). This allows one to stabilize more general motions, shin through a small spring (force sensor), which is con-
however a nominal trajectory is required in advance, so this strained to move along the axis of the shin. The spring is
feedback controller must be paired with a motion planning stiff, heavily damped, and has a limited travel range, so it is
algorithm which takes into account information about the not considered when computing the kinematics of the robot,
environment. but is important for computing the ground forces. In addi-
tion, to reduce the state space, only the length of the shin
spring is considered. This topic is discussed in detail as part
2.3. Dynamic Maneuvers of the ground contact model.
In order to use sample-based planning for a highly dynamic, The model’s seven-dimensional configuration space, C =
underactuated robot, the search must take place in the com- R2 × T5 , consists of the planar position of the back foot
plete state space, as velocities play an important role in ( x, y), the pitch angle ω, and the 4 actuated joint angles
the dynamics. This effectively doubles the dimension of the q1 , . . . , q4 . The full state of the robot, x = [q, q̇, l] ∈ X , has
search. Furthermore, when there are underactuated dynam- 16 dimensions and consists of the robot configuration, the
ics, the robot cannot accelerate in arbitrary directions, and corresponding velocities, and the two prismatic shin-spring
therefore can only move in state space in very limited direc- lengths, l = [l1 , l2 ], one for each foot. The control com-
tions. This makes straightforward application of sample- mand, u, specifies reference angles for the four actuated
based planning extremely challenging for these types of joints. The robot receives joint commands at 100 Hz and
systems. In the second phase of the LittleDog program, our then applies an internal PD controller at 500 Hz. For sim-
team began to integrate dynamic lunging, to move two feet ulation, planning and control purposes, the dynamics are
at a time (Byl et al. 2008; Byl and Tedrake 2009), into the defined as
otherwise quasi-static motion plans to achieve fast loco- x[n + 1] = f ( x[n], u[n]) , (1)
motion over rough terrain. This paper describes the MIT where x[n + 1] is the state at t[n + 1], x[n] is the state at t[n],
team approach in the third (final) phase of the project. We and u[n] is the actuated joint position command applied
show that careful foot placement can be combined with during the time interval between t[n] and t[n+1]. We some-
highly dynamic model-based motion planning and feedback times refer to the control time step, T = t[n + 1] − t[n] =
control to achieve continuous bounding over very rough 0.01 seconds. A fixed-step fourth-order Runge–Kutta inte-
terrain. gration of the continuous Euler–Lagrange dynamics model
is used to compute the state update.
A self-contained motor model is used to describe the
3. LittleDog Model movement of the actuated joints. Motions of these joints
An essential component of any model-based planning are prescribed in the five-link system, so that as the dynam-
approach is a sufficiently accurate identification of the sys- ics are integrated forward, joint torques are back-computed,
tem dynamics. Obtaining an accurate dynamic model for and the joint trajectory specified by the model is exactly fol-
LittleDog is challenging owing to subtleties in the ground lowed. This model is also constrained so that actuated joints
interactions and the dominant effects of motor saturations respect bounds placed on angle limits, actuator velocity lim-
and transmission dynamics. These effects are more pro- its, and actuator torque limits. In addition, forces computed
nounced in bounding gaits than in walking gaits, due to the from a ground contact model are applied to the five-link
increased magnitude of ground reaction forces at impact chain when the feet are in contact with the ground. The
and the perpetual saturations of the motor; as a result, we motor model and ground contact forces are described in
required a more detailed model. In this section, we describe more detail below. The actuated joints are relatively stiff,
our system identification procedure and results. so the model is most important for predicting the motion of
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 5

Fig. 3. LittleDog model. The state space is X = [q, q̇, l], where q = [x, y, ω, q1 , q2 , q3 , q4 ], and l = [l1 , l2 ] are feet spring lengths used
in the ground contact model. The diagram also illustrates the geometric shape of the limbs and body, information used for collision
detection during planning.

the unactuated degrees of freedom of the system, in partic- seen from the figure, the motor model does well in tracking
ular the pitch angle, as well as the horizontal position of the the actual joint position and velocity. Under large dynamic
robot. loads, such as when the hip is lifting and accelerating the
whole robot body at the beginning of a bound, the model
might slightly lead the actual joint readings. This can be
3.1. Motor Model seen in Figure 4 (top) at 5.4 s. For the knee joint and for
The motors on LittleDog have gear ratios of approximately less aggressive trajectories with the hip, the separation is
70 : 1. Because of the high gear ratio, the internal second- not significant. In addition, note that backlash in the joints
order dynamics of the individual motors dominate in most is not modeled. The joint encoders or located on the motors
cases, and the rigid-body dynamics of a given joint, as well rather than the joint axes, which makes it very difficult to
as effects of inertial coupling and external forces on the measure and model backlash.
robot can be neglected. The combination of the motor inter-
nal dynamics with the PD controller with fixed PD gains 3.2. Ground Interaction Model
can be accurately modeled as a linear second-order system:
LittleDog is mostly incapable of an aerial phase due to the
q̈i = −bq̇i + k( ui − qi ) , (2) velocity limits in the joints, so at least one foot is usually
in contact with the ground at any time. The ground interac-
where q̈i is the acceleration applied to the ith joint, given tion is complicated, because the robot’s foot may roll, slide,
the state variables [qi , q̇i ] and the desired position ui . To stick, bounce, or do some combination of these. A contin-
account for the physical limitations of actual motors, the uous, elastic ground interaction model is used, where the
model includes hard saturations on the velocity and accel- foot of the robot is considered as a ball, and at each point in
eration of the joints. The velocity limits, in particular, have time the forces acting on the foot are computed. The ground
a large effect on the joint dynamics. plane is assumed to be compressible, with a stiff non-linear
Each of the four actuated joints is assumed to be con- spring damper normal to the ground that pushes the foot
trolled by a single motor, with both of the knee joints having out of the terrain. A tangential friction force, based on a
one pair of identical motors, and the hip joints having a dif- non-linear model of Coulomb friction is also assumed. The
ferent pair of identical motors (the real robot has different normal and friction forces are balanced with the force of
mechanics in the hip versus the knee). Owing to this, two the shin spring at the bottom of the robot’s leg. The rate of
separate motor parameter sets: {b, k, vlim , alim } are used, one change of the shin spring, l̇, is computed by the force bal-
for the knees, and one for the hips. ancing and used to update the shin-spring length, l, which is
Figure 4 shows a typical fit of the motor model to real tra- a part of the state space. The resulting normal and friction
jectories. The fits are consistent across the different joints of forces are applied to the five-link model.
the robot and across different LittleDog robots, but depend Appendix A discusses the details of the foot roll calcula-
on the gains of the PD controller at each of the joints. As tion, the normal force model, the friction model, and their
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
6 The International Journal of Robotics Research 00(000)

Position (rad)

−0.5

−1
Encoder
Commanded
Motor Model
−1.5
5.2 5.4 5.6 5.8 6 6.2 6.4
Time (s)

8
6
4
Velocity (rad/s)

2
0
−2
−4
−6 Encoder
−8 Motor Model
−10
5.2 5.4 5.6 5.8 6 6.2 6.4
Time (s)

Fig. 4. Example of a hip trajectory, demonstrating position command (thin dashed red), motor model prediction (solid magenta), and
actual encoder reading (thick dashed blue).

use in computing ground contact forces. The ground contact fit and lists the parameters and their values. In total, 34
model is also illustrated in Figure 21 of the Appendix. parameters were measured or fit for the model.

3.4. Model Performance


3.3. Parameter Estimation Figure 5 shows a comparison of a bounding trajectory in
There are many coupled parameters that determine the simulation versus 10 runs of the same command executed
behavior of the model. In theory, they could all be fit on the real robot. The simulated trajectory was generated
to a large enough number of robot trajectories, but it using the planning algorithm described in the next section,
would require thoroughly exploring relevant regions of the which used the developed model. The control input and the
robot’s {STATE-SPACE × ACTION-SPACE}. This is difficult, starting conditions for all open-loop trajectories in the fig-
because LittleDog cannot easily be set to an arbitrary point ure were identical, and these trajectories were not used for
in state space, and the data we collect only covers a tiny fitting the model parameters.
fraction of it. An easier and more robust approach relies on Three of the four plots are of an unactuated coordi-
the model structure to separate the fitting into sub-problems nate (x, y, and body pitch), the fourth plot is of the back
and to identify each piece separately. The full dynamical hip, an actuated joint. The figure emphasizes the difference
model of the robot consists of the five-link rigid body, between directly actuated, position controlled joints com-
the motor model, and the ground force model. A series pared with unstable and unactuated degrees of freedom.
of experiments and a variety of short bounding trajecto- While the motor model tracks the joint positions almost per-
ries were used to fit the model parameters to actual robot fectly, even through collisions with the ground, the unactu-
dynamics by minimizing quadratic cost functions over sim- ated coordinates of the open-loop trajectories diverge from
ulation error. Appendix B discusses the approach for the each other in less than 2 seconds. Right after completing
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 7

0.15 0.14

0.1 0.12

0.05 0.1
x position (m)

y position (m)
0 0.08

−0.05 0.06

−0.1 0.04

−0.15 0.02

−0.2 0
0 0.5 1 1.5 2 2.5 3 3.5 0 0.5 1 1.5 2 2.5 3 3.5
Time(s) Time(s)
(a) x position (b) y position

4.2 −0.5

3.8
−1
Body pitch (rad)

Back hip (rad)

3.6

3.4

3.2
−1.5
3

2.8

2.6 −2
0 0.5 1 1.5 2 2.5 3 3.5 0 0.5 1 1.5 2 2.5 3 3.5
Time(s) Time(s)
(c) Body pitch (d) Back hip joint position

Fig. 5. The unactuated coordinates for a bounding motion (x, y, and body pitch) and a trajectory of one of the joints. The thick red line
shows a trajectory generated by planning with RRTs using the simulation model. The thin blue lines are 10 open-loop runs of the same
trajectory on a real LittleDog robot. In (c), the “x” shows where trajectories begin to separate, and the “o” show where trajectories finish
separating.

the first bounding motion, at about 1.5 s, the trajectories in Coulomb friction does not always hold for LittleDog
separate as LittleDog is lifting its body on the back feet. At feet. Modeling these effects is possible, but would involve
about 1.9 s, in half of the cases the robot falls forward and adding a large number of additional states with non-linear
goes through a second bounding motion, while in the rest high-frequency dynamics to the model, making it much
of the cases it falls backward and cannot continue to bound. harder to implement and less practical overall. In addition,
The horizontal position and body pitch coordinates are both the new states would not be directly observable using cur-
highly unstable and unactuated, making it difficult to stabi- rently available sensors, so identifying the related param-
lize them. The control problem is examined in more detail eters and initializing the states for simulation would be
later in this paper. difficult.
The most significant unmodeled dynamics in LittleDog In general, for a complex unstable dynamical system
include backlash, stiction in the shin spring, and more such as LittleDog, some unmodeled effects will always
complex friction dynamics. For example, even though the remain no matter how detailed the model is. Instead of cap-
friction model fits well to steady-state sliding of Little- turing all of the effects, the model approximates the overall
Dog, experiments on the robot show that during a bound- behavior of the system, as seen from Figure 5. We believe
ing motion there are high-frequency dynamics induced in that this model is sufficiently accurate to generate relevant
the legs that reduce the effective ground friction coeffi- motion plans in simulation which can be stabilized using
cient. Also, the assumption of linearity in the normal force feedback on the real robot.

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


8 The International Journal of Robotics Research 00(000)

model, x[n + 1] = f( x[n], u[n]), avoids collisions with the


virtual obstacles, ( x), does not violate the bounds on joint
positions, velocities, accelerations, and torques, and reaches
the goal position.
Given this problem formulation, it is natural to consider
a sample-based motion planning algorithms such as RRTs
due to their success in high-dimensional robotic planning
Fig. 6. Sketch of a virtual obstacle function, m ( x), in relation to problems involving complex geometric constraints (LaValle
the ground, γ ( x). and Branicky 2002). However, these algorithms perform
poorly when planning in state space (where the dynam-
ics impose “differential constraints”) (Cheng 2005; LaValle
4. Motion Planning Algorithm 2006), especially in high dimensions. When applied directly
to building a forward tree for this problem, they take pro-
4.1. Problem Formulation hibitive amounts of time and fail to make any substantial
progress towards the goal. In the following sections, we
Given the model described in the previous section, we can
describe three modifications to the basic algorithm. First,
formulate the problem of finding a feasible trajectory from
we describe a parameterized “half-bound” motion primitive
an initial condition of the robot to a goal region defined by
which reduces the dimensionality of the problem. Second,
a desired location of the COM. We describe the terrain as
we describe the Reachability-Guided RRT, which dynami-
a simple height function, z = γ ( x), parameterized by the
cally changes the sampling distribution and distance metric
horizontal position, x. We would like the planned trajectory
to address differential constraints and discontinuous motion
to avoid disruptive contacts with the rough terrain, however
primitive dynamics. Finally, we describe a mechanism for
the notion of “collision-free” trajectories must be treated
sampling with a Voronoi bias in the lower-dimensional task
carefully since legged locomotion requires contact with the
space defined by the motion primitive. All three of these
ground in order to make forward progress.
approaches were necessary to achieve reasonable run-time
To address this, we define a virtual obstacle function,
performance of the algorithm.
( x), which is safely below the terrain around candidate
foothold regions, and above the ground in regions where
we do not allow foot contact (illustrated in Figure 6). In our
previous experience with planning walking gaits (Byl et al.
4.2. Macro Actions/Motion Primitive
2008; Byl and Tedrake 2009), it was clear that challenging The choice of action space, e.g. how an action is defined for
rough terrain could be separated into regions with useful the RRT implementation, will affect both the RRT search
candidate footholds, as opposed to regions where footholds efficiency, as well as completeness guarantees, and, per-
would be more likely to cause a failure. Therefore, we had haps most importantly, path quality. In the case of planning
developed algorithms to pre-process the terrain to iden- motions for a five-link planar arm with four actuators, a
tify these candidate foothold regions based on some simple typical approach may be to consider applying a constant
heuristics, and we could potentially use the same algorithms torque (or some other simple action in joint space) that is
here to construct ( x). However, in the current work, which applied for a short constant time duration, T. One draw-
makes heavy use of the motion primitive described in the back of this method is that the resulting trajectory found by
following sections, we found it helpful to construct sepa- the RRT is likely be jerky. A smoothing/optimization post-
rate virtual obstacles, m ( x), parameterized by the motion processing step may be performed, but this may require
primitive, m, being performed. Once the virtual obstacles significant processing time, and there is no guarantee that
became motion primitive dependent, we had success with the local minima near the original trajectory is sufficiently
simple virtual obstacles as illustrated in Figure 6. The col- smooth. Another drawback of using a constant time step
lision function illustrated is defined relative to the current with such an action space is that in order to ensure com-
position of the feet. In the case shown in the figure, the vir- pleteness, T should be relatively small (for LittleDog
tual function forces the swing leg to lift up and over the bounding, 0.1 seconds seems to be appropriate). Empiri-
terrain, and ensures that the back foot does not slip, which cally, however, the search time increases approximately as
characterizes a successful portion of a bound. As soon as 1/T, so this is a painful trade-off.
the front feet touch back down to the ground after com- For a stiff PD-controlled robot, such as LittleDog, it
pleting this part of the bound, a new collision function is makes sense to have the action space correspond directly
defined, which takes into account the new footholds, and to position commands. To do this, we generate a joint tra-
forces the back feet to make forward progress in the air. jectory by using a smooth function, G, that is parameterized
We are now ready to formulate the motion planning by the initial joint positions and velocities, [q( 0) , q̇( 0)], a
problem for LittleDog bounding: find a feasible solu- time for the motion, Tm , and the desired end joint posi-
tion, {x[0], u[0], x[1], u[1], . . . , x[N]}, which starts in the tions and velocities, [qd ( Tm ) , q̇d ( Tm )]. This action set
required initial conditions, satisfies the dynamics of the requires specifying two numbers for each actuated degree
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 9

in a straight line. Because the front feet are light, and not
weight bearing, it is useful to “tuck” the feet, while mov-
ing them from the start to the end pose, in order to help
avoid obstacles. To take advantage of a rotational moment
produced by accelerating the stance leg, the back leg begins
moving a short time before the front foot leg starts moving.
Once both the hind and the front legs have reached their
desired landing poses, the remaining trajectory is held con-
stant until the front legs impact the ground. An example of
such a trajectory is shown in Figure 7.
Fig. 7. First half of a double bound. The robot is shown moving A similar approach is utilized to generate motions for
from right to left. The back leg “opens up”, pushing the robot for- the second phase of bounding, with the difference that the
ward (to the left) while generating a moment around the stance hip angles are “contracting” instead of “opening up”. As
foot causing the robot to rear-up. Meanwhile, the front legs tuck- the front leg becomes the stance leg, the front foot begins
in, and then un-tuck as they get into landing position. Axes are in moving backwards just before impact with the ground. The
meters. back leg movement is delayed for a short period to allow the
system to start rocking forward on to the front legs. When
both legs reach their final positions, the pose is held until
the back leg touches down. The back leg tucks in while
swinging into landing position to help avoid obstacles. The
resulting motions are shown in Figure 8.
Note that for the start and end conditions of the motion
primitive, the actuated-joint velocities are zero, a factor
which reduces the action space further. Using these motion
primitives requires a variable time step, Tbound , because
this has a direct influence on accelerations and, there-
fore, moments around the passive joints. However, for each
phase, one only needs to specify four degrees of freedom,
Fig. 8. Second half of a double bound. The front (left) foot moves corresponding to the pose of the system at the end of
towards the back (right) of the robot, helping to pull the robot for- the phase. Using this motion primitive, the entire bound-
ward. The back leg meanwhile swings forward, while tucking and ing motion is achieved with a simple, jerk-free actuator
un-tucking to help avoid collision with the terrain. trajectory. Because these primitives are naturally smooth,
post-processing the RRT generated trajectory becomes an
optional step.
of freedom: one for the desired end position and one for the
The motion primitive framework used here is somewhat
desired end velocity. A smooth function generator which
similar in approach to the work in Frazzoli et al. (2005).
obeys the end point constraints, for example a cubic-spline
In addition to formally defining motion primitives in the
interpolation, produces a trajectory which can be sampled
context of motion planning, that work proposed planning
and sent to the PD controller.
in the framework of a Maneuver Automaton. The automa-
If one considers bounding in particular and examines
ton’s vertices consist of steady-state trajectories, or trim
how animals, and even some robots such as the Raibert
primitives (in the context of helicopter flight), which are
hoppers, are able to achieve bounding behavior, it becomes
time invariant and with a zero-order hold on control inputs.
apparent that some simplifications can be made to the action
The edges of the automaton are maneuver primitives, which
space. We can define a motion primitive that uses a much
are constrained so that both the start and the end of the
longer T and, therefore, a shorter search time, while also
maneuver are compatible with the trim primitives. This is
producing smooth, jerk-free joint trajectories. The insight
illustrated in the context of LittleDog bounding in Figure
is based on the observation that a bound consists of two
9. Of course, feet collisions in LittleDog act like impulses
phases: (1) rocking up on the hind legs while moving the
or hybrid dynamics, which negate the underlying invariance
front legs forward and (2) rocking up on the front legs, while
assumptions of a trim primitive in Frazzoli et al. (2005) and
the hind legs move forward. In the first half-bound primi-
Frazzoli et al. (2002), but the idea is still representative.
tive, the hind legs begin moving backwards, propelling the
body forwards. This forward acceleration of the COM also
generates a rotational moment around the pseudo-pin joint
at the hind foot–ground interface. In this case, the front
4.3. Reachability-guided RRT Overview
legs come off the ground, and they are free to move to a Sample-based planning methods such as the RRT can be
position as desired for landing. In this formulation, the hind very fast for certain applications. However, such algorithms
legs move directly from the starting pose to the ending pose depend on a distance metric to determine distance from
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
10 The International Journal of Robotics Research 00(000)

which would invalidate the use of a proper Euclidean dis-


tance metric that assumes a smooth continuous space of
actions. The idea of Reachability can be an especially pow-
erful notion for sample-based planning in the context of
macro actions, as the reachable region does not have to
be local to its parent node. A continuous motion primitive
action space can be approximated by sampling in the action
space, resulting in a discrete set of reachable nodes, which
might be far in state space from their parent node. This
idea suggests that the RG-RRT also naturally extends to
models with hybrid dynamics, simply by sampling actions
and integrating through the dynamics when generating the
reachable set.
Definition 1. For a state x0 ∈ X , we define its reachable
Fig. 9. Maneuver automaton for LittleDog bounding. set, Rt ( x0 ), to be the set of all points that can be achieved
from x0 in bounded time, t ∈ [Tlow , Thigh ], according
to the state equations (1) and the set of available control
inputs, U.
samples to nodes in the tree. The Euclidean distance metric
is easy to implement, and works well for holonomic path
Algorithm 1 T ← BUILD-RG-RRT( xinit )
planning problems, but it breaks down when constraints
imposed by the governing equations of motion restrict the 1: T ← I NITIALIZE T REE ( xinit );
direction that nodes can grow in. To deal with this, we 2: R ← A PPROX R( [ ], T , xinit );
developed a modified version of the RRT algorithm called 3: for k = 1 to K do
the Reachability-Guided RRT (RG-RRT) (Shkolnik et al. 4: RejectSample ← true;
2009). The algorithm uses rejection sampling, which was 5: while RejectSample do
previously shown to be beneficial for growing trees in con- 6: x rand ← RANDOMSTATE( ) ;
figuration space through narrow tunnels (Yershova et al. 7: [ ], distT ←  NEARESTSTATE( xrand , T ) ;
2005). Reachability guidance extends this idea to planning 8: rnear , distR ← NEARESTSTATE( xrand , R) ;
for dynamic systems in state space. The RG-RRT biases 9: if distR < distT then
the RRT search toward parts of state space that are locally 10: RejectSample ← false;
reachable from the tree. Reachability information is stored 11: xnear ← ParentNode( rnear , R, T ) ;
in each node in the tree; the algorithm works by sampling 12: end if
and then measuring the Euclidean (or scaled-Euclidean) 13: end while
distance to each node in the tree, as well as to each reach- 14: u ← SOLVECONTROL( xnear , xrand ) ;
able set of the tree. If a sample is closer to the tree itself, 15: [xnew , isFeasible] ← NEWSTATE( xnear , u) ;
rather than to the reachable set of the tree, then there are no 16: if isFeasible then
actions which are able to extend that node in the direction 17: T ← INSERTNODE( xnew , T );
of the sample. This sample–node pair is therefore not useful 18: R ← APPROXR( R, T , xnew );
for exploration, and is discarded. On the other hand, if the 19: if ReachedGoal(xnew ) then
sample is closer to a reachable region, the parent node in the 20: return T
tree, which generated the closest reachable region, is grown 21: end if
towards the sample. Thus, by keeping track of locally reach- 22: end if
able parts of the tree and using a simple Euclidean-type 23: end for
distance metric, the algorithm is able to overcome many 24: return [ ]
of the inherent difficulties when implementing an RRT for
dynamic systems. The structure of the RG-RRT algorithm is outlined in
In this work, we show that the RG-RRT algorithm can Algorithm 1. Given an initial point in state space, xinit , the
be useful for motion planning in a system with motion first step is to initialize the tree with this state as the root
primitives. Conceptually, a motion primitive in a dynamic node. Each time a node is added to the tree, the APPROXR( )
system is one which occurs over a fairly substantial dura- function solves for Rt ( xnew ), an approximation of the set
tion of time, e.g. an action which moves the system from of reachable points in the state space that are consistent with
one state to another state that is relatively distant from the differential constraints (1). The approximated reach-
the current state. Therefore, the states produced by tak- able set for the whole tree is stored in a tree-like structure,
ing macro actions or motion primitives may be discon- Rt ( T ), or simply R for shorthand notation, which con-
tinuous or even discrete in relation to a generating state, tains reachable set information as well as pointers to the
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 11

Fig. 10. Two consecutive steps in a single iteration of the RG-RRT, as demonstrated for the underactuated pendulum. In (a), a random
sample is drawn from the state space and paired with the nearest node in the tree and the closest point in its reachable set according to
the Euclidean distance metric. Shown in (b), the algorithm then expands the node towards the point in the reachable set and adds that
point to the tree. Only the reachable region of the newly added point is shown here for clarity.

parent nodes for each node in the corresponding tree, T . be efficiently tested for collisions before they are added to
For many systems of interest, the approximate bounds of the reachable set approximation. This reduces the likelihood
the reachable set, Rhull , can be generated by sampling over of trajectories leaving free space as part of the exploration
the limits in the action space, and then integrating the cor- phase.
responding dynamics forward. For these systems, assuming After a node and its corresponding reachable set is added
a relatively short action time step, t, integrating actions to the tree, a random state-space sample, xrand , is drawn.
between the limits results in states that are within, or rea- The NEARESTSTATE( xrand , [T , R]) function compares the
sonably close to being within, the convex hull of Rhull . In distance from the random sample to either the nodes of
such cases it is sufficient to consider only the bounds of the tree, T , or the reachable set of the tree, R. The func-
the action set to generate the reachability approximation. tion returns the closest node as well as the distance from
When the dimension of the action space becomes large, it the node to the sample. Samples that are closer to the tree,
may become more efficient to approximate the reachable T , rather than the reachable set, R, are rejected. Sampling
set with a simple geometric function, such as an ellipsoid. continues until a sample is closer to R, at which point, the
We found that the reachable set approximation does not ParentNode function returns the node in T , which is the
need to be complete, and even a crude approximation of parent of the closest reachable point in R.
the set can vastly improve planner performance in systems Consider planning on a simple torque-limited pendulum,
with dynamics. Another benefit of approximating the reach- which has interesting dynamics, and can be easily visual-
able set by sampling actions is that the resulting points can ized, because the state space is two dimensional. A single

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


12 The International Journal of Robotics Research 00(000)

by discretizing the action space. However, to achieve Lit-


tleDog bounding, even a reduced four-dimensional action
space becomes too large to discretize efficiently. For exam-
ple, using only three actions per dimension (and assuming
a constant motion primitive time step), there would be 81
actions to apply and simulate in order to approximate the
reachable set for each node.
Instead of discretizing in such a high-dimensional action
space, the reachable set can be approximated by under-
standing the failure modes of bounding. Failures may occur
primarily in one of three ways: (1) the robot has too much
energy and falls over; (2) the robot has too little energy,
so the stance foot never leaves the ground, violating our
assumption that one foot always leaves the ground in a
bounding gait; or (3) a terrain collision occurs. In the case
when the system has a lot of kinetic energy, the best thing
to do is to straighten all of the limbs, which raises the COM
and converts the kinetic energy into potential energy. At
the other extreme, if the robot does not have much kinetic
energy, an action that lowers the COM while accelerating
the limbs inwards tends to produce a rotational moment, if it
is possible to do so. Thus, two extreme motions can be gen-
erated, for each phase of bounding, which prevent the two
common failure modes. The reachable set is then generated
by interpolating joint positions between the two extremes.
Fig. 11. RG-RRT Voronoi diagrams for a pendulum (best viewed
This one-dimensional discretization is usually rich enough
in color online). The diagram in (a) corresponds to the RG-RRT
to capture the energy related failure modes and generate
after 30 nodes have been added while that in (b) corresponds to
bounds which strike the right balance to continue bounding
the tree with 60 nodes. Magenta dots are discretely sampled reach-
further.
able points affiliated with each tree. Green regions are areas where
The reachable set helps the RG-RRT to plan around
samples are “allowed”, and correspond to Voronoi areas associ-
Regions of Inevitable Collisions (RIC) (LaValle and
ated with the reachable set. Samples that fall in gray areas are
Kuffner 2001; Fraichard and Asama 2003; Fraichard 2007;
discarded.
Chan et al. 2008). Nodes which have empty reachable sets,
which may occur because the node has too much or too lit-
expansion step of the RG-RRT is shown in Figure 10. The tle energy, in the case of LittleDog bounding, are in XRIC ,
tree nodes are shown in black, with the reachable approx- even if the node itself is collision free (Xfree ). The RG-RRT
imation shown in gray. Figure 11 shows the associated takes advantage of this, because when a node has an empty
Voronoi regions of the combined set of { R, T } for an RG- reachable set, the node serves as a “blocking” function for
RRT growing on a torque-limited pendulum. The rejected sampling. Owing to the rejection sampling of the RG-RRT,
sampling region is shown in gray, while the allowed regions such a node cannot be expanded upon, and any samples that
are shown in green. Note that the sampling domain in the map to this node will be discarded, encouraging sampling in
RG-RRT is dynamic, and adapts to the tree as the tree other areas that can be expanded.
expands, producing a Voronoi bias that is customized by
the system dynamics defined by the tree. Also note that
although the Euclidean distance is used, we are warping this 4.5. Sampling in Task Space
distance metric by computing the distance to the reachable In the RRT algorithm, sampling is typically performed
set rather than to the tree. This warped metric produces a uniformly over the configuration space. An action is cho-
Voronoi bias to “pull” the tree in directions in which it is sen which is expected to make progress towards the sam-
capable of growing. ple. The sampling creates a Voronoi bias for fast explo-
ration by frequently selecting nodes of the tree near unex-
plored regions, while occasionally refining within explored
4.4. Approximating the Reachable Set regions. We have previously shown that the Voronoi bias
We have shown that Reachability Guidance can improve can be implemented in the configuration (or state) space
RRT planning time in underactuated systems such as the or alternatively the bias can be in a lower-dimensional task
torque-limited pendulum and Acrobot (Shkolnik et al. space (Shkolnik and Tedrake 2009). This can be achieved
2009). In those systems, the reachable set was approximated simply by sampling in task space, and finding the node
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 13

in the configuration-space tree, the projection of which results were found by specifying a change in joint
is closest to the task-space sample. Reducing the dimen- positions (qa ), and using a smooth function generator to
sionality of the search with a task-space Voronoi bias can create position splines with zero start and end velocities.
significantly improve search efficiency and, if done care- This approach worked best using time intervals of 0.1–0.15
fully, does not have an impact on the completeness of the seconds, but the resulting RRT was not able to find a com-
algorithm. plete bounding motion plan in a reasonable amount of time
As described in the previous section, our action space (hours). By implementing reachability-guidance and task-
involves a half-bound (half a period of a complete bound). space sampling, the planner was able to find a double-bound
At the start and end of an action (e.g. the state at any trajectory on flat terrain, after 2–3 minutes of search on
given node on the tree), the robot is approximately touching average. Some of these plans were successfully executed
the ground with both feet, and joint velocities are approxi- on flat terrain by the actual robot. In addition to being rela-
mately zero. Samples are therefore similarly constrained. In tively slow to plan motions, the trajectories returned by the
addition, samples are chosen such that they are not in col- RRT had high-frequency components which were difficult
lision, and respect joint bounds, with some minimum and for the robot to execute without saturating motors. The long
maximum stance width. The region in actuated joint space planning time and sub-optimal gaits were not ideal for this
can be mapped to a region in Cartesian space for the back task. Plans might be smoothed and locally optimized, but
and front foot, corresponding to a four-dimensional mani- this is a time-consuming process given the complexity of
fold. A sample is drawn by first choosing a horizontal coor- the dynamics.
dinate, x, of the robot’s back foot, and then selecting four When we utilized the half-bound motion primitive, the
joint angles from the manifold, while checking to ensure trajectories produced were smooth by the way the primitive
that collision constraints are validated. is defined, with one hip movement and one knee move-
Given a sample described above, the y-coordinate of the ment per each half-bound motion. Furthermore, the half
robot foot is set to the ground position at x, and the pas- bound has a relatively long duration, typically 0.5 seconds
sive joint is computed using the constraint that both feet are in length, which shortens the search time. The complete
on the ground. Thus, sampling the five-dimensional space planner described in this section is able to plan a double-
maps to a point qs in the seven-dimensional configuration bound on flat terrain in a few seconds. A continuous bound
space of the robot. The five-dimensional sampling space, over 2 m is typically found in less than a minute. The com-
which neglects velocities, is significantly smaller than the plete planner was also able to plan over intermittent terrain,
complete 16-dimensional state space, and produces a task- where foot holds were not allowed in given regions. In addi-
space Voronoi Bias. When a sample is drawn, the closest tion, the planner was successful in handling a series of
node is found by minimizing the Euclidean distance from 7-cm steps, and was also successful in planning bounds to
the sample to the tree, as well as to the reachable region of get up on top of a terrain with artificial logs with a maxi-
the tree. The sampling is repeated until a sample closest to mum height difference of 8 cm. The simulated log terrain
the reachable region is found. An action, u, is then created corresponds to a laser scan of real terrain board used in
by selecting the four actuated joint angles from the sample, the Learning Locomotion program to test LittleDog perfor-
qs . An action time interval Tbound is chosen by uniformly mance. An example of a bounding trajectory is shown in
sampling from T ∈ [0.3, 0.7] seconds. Figures 12 and 13, and a video of this trajectory is included
in Extension 1. The bottom link in the robot leg is 10 cm,
and the top link is 7.5 cm; given that the bottom of the body
is 3 cm below the hip, variations of 7 cm in terrain height
4.6. Simulation Results represents approximately 50% of maximum leg travel of the
In this section, we have presented three modifications to robot.
the standard implementation of the RRT to plan bound- Planning was also attempted using the motion primi-
ing motions with LittleDog: (1) a simple motion primitive; tive, but without reachability guidance. To do so, samples
(2) reachability guidance; and (3) task-space biasing. Each were drawn in each iteration of the RRT algorithm, and the
of these components could be implemented separately, but nearest-neighbor function returned the closest node in the
they worked particularly well when combined. To show tree. Because the closest node in the tree and the sample
this, we qualitatively compare results by applying various often looked similar, it did not make sense to try to expand
combinations of these modifications. towards the sample. Instead, once a sample was chosen, a
First, a standard RRT was implemented, without any motion primitive action was chosen at random. Using ran-
modifications, with the task of finding bounding motions dom actions is an accepted approach for RRTs, and the sam-
on flat terrain. The state space was sampled uniformly in pling itself produces a Voronoi bias to encourage the tree to
the regions near states that have already been explored. We expand into unexplored regions of space, on average. In this
experimented with several types of action spaces, includ- case, however, the motion primitive-based RRT was never
ing a zero-order hold on accelerations in joint space, or a able to find feasible plans over rough terrain without reach-
zero-order hold on velocities in joint space. Reasonable ability guidance, even when given over 12 hours to do so.

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


14 The International Journal of Robotics Research 00(000)

Fig. 12. Bounding trajectory over logs (Part 1).

Fig. 13. Bounding trajectory over logs (Part 2).

The combination of motion primitives, task-space biasing motion primitives described in this paper, these motion
and reachability guidance, however, is able to plan trajec- primitives were used to achieve a short bound sequence over
tories over the log terrain repeatedly, within 10–15 minutes the log terrain. Unlike in the model, the actual logs are not
of planning time required on average. Of course, on simpler planar. To address this, tracks were laid on the logs corre-
terrain, the planning time is much faster. sponding to the stance width of the robot along the terrain.
The planner was constrained to only allow foot holds where
the points on adjacent tracks were of the same height. With
4.7. Experimental Results some tuning, the motion primitives produced bounding over
In experiments with the real robot, open-loop execution of the logs on the real robot, shown in Figure 14. This trajec-
the motion plan found by the RRT quickly diverged with tory was successful approximately 20% of the time, even
time from the model predictions, even on flat terrain. Tra- though care was taken to have the same initial position for
jectories are unstable in the sense that given the same initial each trial.
conditions on the robot and a tape of position commands Feedback stabilization is required to enable execution of
to execute, the robot trajectories often diverge, sometimes long bounding trajectories. Feedback can compensate for
catastrophically. To demonstrate the feasibility of using the model errors and the inherent instability of the system itself.

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


Shkolnik et al. 15

Fig. 14. Bounding over logs with LittleDog.

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


16 The International Journal of Robotics Research 00(000)

is often the unactuated coordinate (e.g. the ankle angle), or


some function of all coordinates (e.g. angle from foot con-
tact point to hip), which is monotonic over each step and
can thus be considered a reparametrization of time. The pro-
jection (step 2) is done by computing the current value of
the phase coordinate and finding the unique point on the
planned trajectory with the same value. The positions of all
actuated joints are then synchronized to this joint via high-
gain feedback. This method has been successful in biped
control (Chevallereau et al. 2003) but poses challenges for
quadruped bounding due to the lack of a configuration vari-
able that evolves monotonically over each half-bound and
Fig. 15. The sensing and control environment for LittleDog. can be used as a phase variable.
An alternative approach is the transverse linearization,
in which the projection can be a more general function.
Trajectory optimization may also help to improve perfor- Transverse linearizations are a classical tool by which to
mance. Care was taken to ensure that the simulation is study stability of periodic systems (Hahn 1967; Hale 1980;
smooth and continuous, so gradients can be used to help Hauser and Chung 1994) and more recently have been used
in these tasks. The next section describes a feedback con- for constructive controller design (Song and Zefran 2006;
trol strategy, and discusses the feasibility for this strategy to Shiriaev et al. 2008; Manchester et al. 2009; Manchester
stabilize bounding gaits on rough terrain. 2010). The n-dimensional dynamics around the target tra-
jectory are decomposed into two parts: (1) a scalar “phase
variable” τ which represents the position along the trajec-
5. Feedback Stabilization tory and can be considered a reparametrization of time; and
Figure 15 shows the sensing and control environment for (2) an ( n−1)-dimensional vector x⊥ representing dynamics
LittleDog. For most feedback control methods, one selects transverse to the target trajectory.
a state space suitable for the task at hand, and uses an In some region around the target orbit, the continuous-
observer to construct an estimate of the current state using time dynamics in the new coordinate system are well
available measurements. An accurate observer requires a defined and have the form
good dynamical model of the robot and knowledge of the
characteristics (e.g. delays and noise) of each measurement. ẋ⊥ = A( τ ) x⊥ + B( τ ) δu + h( x⊥ , τ ) , (3)
Having estimated the state, the controller must select an τ̇ = 1 + g( x⊥ , τ ) . (4)
action to take, which stabilizes the bounding motion. This
is a non-trivial task because of the highly non-linear nature Here δu = u − u ( τ ), h( ·) contains terms second order and
of the dynamics and the limited control authority available higher in x⊥ , and g( ·) contains terms first order and higher
for the unactuated coordinate. Several recent efforts in con- in x⊥ and τ .
trol of underactuated systems have found success in taking The first-order approximation to the transverse compo-
a three-stage approach (Westervelt et al. 2003; Song and nent is known as the transverse linearization:
Zefran 2006; Westervelt et al. 2007; Shiriaev et al. 2008;
Manchester et al. 2009): ẋ⊥ = A( τ ) x⊥ + B( τ ) δu. (5)
1. From encoder, inertial measurement unit (IMU), and
motion-capture measurements, estimate the current Exponential stabilization of the transverse linearization is
state of the system in terms of generalized coordinates equivalent to orbital exponential stabilization of the orig-
ˆ t) ).
x̂( t) = ( q̂( t) , q̇( inal non-linear system to the target motion. A construc-
2. Based on the current state estimate x̂( t), find the loca- tion based on a Lagrangian model structure (Shiriaev et al.
tion on the planned trajectory which is “closest” in some 2008) has been used to stabilize non-periodic motions of
reasonable sense. That is, perform a projection to a an underactuated biped over rough terrain, and validated in
point in the planned motion x ( τ ) where τ is computed experiments (Manchester et al. 2009). Our model of Lit-
as some function of x̂( t). tleDog includes highly non-linear compliance and ground-
3. From the deviation x̂( t) −x ( τ ), and some precomputed contact interactions, and does not fit in the Lagrangian
local control strategy, compute control actions which framework. To derive the controller we used a more gen-
bring the system closer to the desired trajectory. eral construction of the transverse linearization which will
be presented in more detail elsewhere (Manchester 2010).
For example, in the method of virtual constraints for To stabilize LittleDog, we discretized the transverse lin-
biped control (Westervelt et al. 2003, 2007), the planned tra- earization (5) to a zero-order-hold equivalent, and com-
jectory is parametrized in terms of a phase variable which puted controller gains using a finite-time LQR optimal
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 17

control on the transverse linearization, i.e. the controller lighter weighting on θ̇ , and much lighter weighting on the
minimizing the following cost function: remaining coordinates.
Projecting in this way resulted in significantly improved

T
  robustness compared with projection in the original coordi-
J = x⊥ ( k) Q( k) x⊥ ( k) +δu( k) R( k) δu( k) nates under a standard Euclidean metric. One can explain
k=0
this intuitively in terms of controllability: the distance to a
+x( T) QF x( T) point on the target trajectory should be heavily weighted in
terms of the states which are most difficult to control. The
where Q( ·) , R( ·) , and QF are weighting matrices. This angles of the actuated joints and the position of the swing
gives a sequence of optimal controller gains: foot can be directly controlled at high speed. Therefore,
deviations from the nominal trajectories of these coordi-
δu( k) = −K( k) x⊥ ( k)
nates are easy to correct and these coordinates are weighted
which are computed via the standard Riccati difference lightly. In contrast, the angle of the COM and its velocity
equation. are not directly controlled and must be influenced indi-
From the combination of a projection P from x to ( τ , x⊥ ) rectly via the actuated links, which is more difficult. There-
and the LQR controller, one can compute a controller for fore, deviations in these coordinates have a much heavier
the full non-linear system: weighting.
For choosing the optimization weighting matrices Q and
( τ , x⊥ ) = P( x) , (6) R we have a similar situation: the dynamics of Little-
Dog are dominated on θ and θ̇ . Roughly speaking, these
u = u ( τ ) −K( τ ) x⊥ , (7)
states represent its “inverted pendulum” state, which is a
where K( τ ) is an interpolation of the discrete-time control common reduced model for walking and running robots.
gain sequence coming from the LQR. We refer to this as the Although we derive a controller in the full state space,
“transverse-LQR” controller. the optimization is heavily weighted towards regulating
Note that the above strategy can be applied with a θ and θ̇ .
receding-horizon and particular final-time conditions on
the optimization in order to ensure stability (Manchester
et al. 2009). We found with LittleDog, however, that relax- 5.2. Simulation Results
ing these conditions somewhat by performing a finite- We have simulated the transverse-linearization controller
horizon optimization over each half-bound, with QF = 0, with a number of RRT-generated bounding trajectories. For
was sufficient, even though it is not theoretically guaran- some trajectories over flat terrain in simulation, we were
teed to give a stabilizing controller. In this framework, able to stabilize these trajectories even when adding a com-
the transverse-LQR controller is applied until just before bination of: (1) slight Gaussian noise to the state estimate
impact is expected. During the impact, the tape is executed (σ = 0.01 for angles, σ = 0.08 for velocities), (2) signifi-
open loop, and the next half-bound controller takes over cant velocity perturbations up to 1 rad s−1 after each ground
shortly after impact. impact, (3) model parameter estimation error of up to 1 cm
in the main body COM position, and (4) delays of up to
5.1. Selection of Projection and Optimization 0.04 seconds. In this section we will show some results of
stabilization on two example terrains: bounding up stairs
Weights and bounding over logs.
Using the transverse-LQR controller we were able to Since the impact map is the most sensitive and difficult-
achieve stable bounding in simulations with several types of to-model phase of the dynamics, we can demonstrate the
perturbations. The success of the method is heavily depen- effectiveness of the controller by adding normally dis-
dent on a careful choice of both the projection P( x) and the tributed random angular velocity perturbation to the passive
weighting matrices Q and R. (stance–ankle) joint after each impact. Figure 16 shows
For the projection, LittleDog was represented as a float- example trajectories of LittleDog bounding up stairs (per-
ing five-link chain but in a new coordinate system: the first turbations with standard deviation of 0.2 rad s−1 ), and Fig-
coordinate was derived by considering the vector from the ure 17 shows it bounding over logs (perturbations with
point of contact of the stance foot and the COM of the standard deviation of 0.1 rad s−1 ).
robot. The angle θ of this line with respect to the hori- In each figure, the trajectory of the COM is plotted for
zontal was the first generalized coordinate. The remaining three cases: (1) the nominal (unperturbed) motion, as com-
coordinates were the angles of the four actuated joints, and puted by the RRT planner, (2) running the nominal control
the ( x, y) position of the non-stance (swing) foot. In this inputs open-loop with passive-joint velocity perturbations,
coordinate system, the projection on to the target trajec- and (3) the transverse-LQR stabilized robot with the same
tory was taken to be the closest point under a weighted perturbations. One can see that for both terrains, after the
Euclidean distance, with heaviest weighting on θ , somewhat first perturbation, the open-loop robot deviates wildly and
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
18 The International Journal of Robotics Research 00(000)

Fig. 16. Illustration of bounding up steps with center-of-mass trajectories indicated for nominal, open loop with perturbations, and
stabilized with perturbations.

Fig. 17. Illustration of bounding over logs with center-of-mass trajectories indicated for nominal, open loop with perturbations, and
stabilized with perturbations.

falls over. This shows the inherent instability of the motion. 5.3. Experimental Results
In contrast, the stabilized version is able to remain close to We are currently working to implement the transverse-LQR
the nominal trajectory despite the perturbations. Videos of system on the real robot. At present, the main difficulty
these trajectories can be found in Multimedia Extension 1. stems from the measurement system on the experimental
We analyze this behavior in more detail in Figure 18. platform, seen in Figure 15. In order to implement our con-
Here we depict the phase portrait of the COM angle θ troller, we must have knowledge of the state of the robot,
and its derivative θ̇ during a single bound for the same and in experiments this is derived from a combination of
three cases. This coordinate is not directly actuated, and an on-board IMU, motor encoders, and the motion-capture
can only be controlled indirectly via the actuated joints. system.
Note that the nominal trajectory comes quite close to the The motion-capture system has quite large delays, of the
state θ = π/2, θ̇ = 0. This state corresponds to the robot order of 0.02–0.03 s, and the IMU is noisy. Furthermore, the
being balanced upright like an inverted pendulum, and is joint encoders measure position of the motors, which can be
an unstable equilibrium (cf. the separation of trajectories substantially different to the true joint positions due to back-
in Figure 5). One can see that when running open-loop, a lash. When we implement our controller on the real robot,
small perturbation in velocity pushes the robot to the wrong we typically see large, destabilizing oscillations. We can
side of this equilibrium, and the robot falls over. In contrast, reproduce similar oscillations in our simulation by includ-
the transverse-LQR stabilized robot moves back towards the ing significant noise and delay in the state estimate used for
nominal trajectory. feedback control (see Figure 19).

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


Shkolnik et al. 19

.8 rad/sec perturbation in passive joint velocity


with a 16-dimensional state space. A physics-based simula-
3
tion was developed and identified using data from the real
robot. An efficient RRT-based planner was implemented.
2
This planner used motion primitives to reduce the size of the
1
action space, and also to produce smooth trajectories with-
out an optimization step. Task-space guidance was imple-
com angular velocity

0
mented by sampling from a five-dimensional subset of the
state space. The reduction in sampling dimension ignores
−1 pitch, vertical position, and velocities which significantly
improves search efficiency. To handle challenging dynamic
−2 constraints associated with bounding, we used reachability
guidance, so that random samples were chosen such that
open loop
−3 open loop perturbed they were closer to reachable regions of the tree, rather than
TL−LQR perturbed to the tree itself. Doing so increases the likelihood that the
−4 expansion step of the RRT algorithm can make progress in
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8
the direction of the sample, which dramatically improves
com angle
RRT performance. The RRT motion planner was demon-
Fig. 18. Phase portrait of center-of-mass angle θ versus its deriva- strated to achieve bounding over steps and over logs, with
tive θ̇ for (1) a nominal half-bound trajectory, (2) open-loop exe- terrain height differences corresponding to approximately
cution with perturbation of +0.8 rad s−1 in the initial condition, 50% of the leg length. Without reachability guidance,
and (3) transverse-LQR stabilized trajectory with perturbed initial 12 hours was not sufficient to plan in the otherwise iden-
state. tical conditions. Using task-space biasing and reachability
guidance, we were able to plan within minutes. A primary
reason for this is that a standard RRT implementation does
8
not look ahead. Many of the nodes on the outer edges of
6 the tree (the exact nodes which the Voronoi bias selects
most often, because they are near the unexplored regions
4
of state space) correspond to nodes that are either incapable
Body angle velocity (rad/s)

2 of bounding on the next step because not enough energy


0 is carried over, or have too much energy and no possible
action could be applied on the next step to keep the robot
−2
from falling over. Expanding on these nodes is futile, but the
−4 standard RRT will try to expand them repeatedly causing
−6
the algorithm to fail to find a plan.
The motion primitives used for the motion planner were
−8 tested on the robot, and shown to be capable of bounding
robot
−10 simulation on rough terrain. The trajectories implemented on the robot
using the motion primitives are naturally smooth, and our
10 20 30 40 50 60 70 80
model captures the resulting dynamics quite well for short
Timesteps (0.01 s)
trajectories. However, the forward simulation and the actual
Fig. 19. Oscillations induced by measurement noise and delays: behavior of the robot tended to diverge after about 1 second,
simulation and experiment. despite having a relatively elaborate physics model of the
robot that was identified on data collected from the robot.
This is not surprising since the robot trajectories from sim-
We believe that if we can obtain better state estimates ilar initial conditions can diverge when executing the same
with less noise and reduced delay, we will be able to achieve open-loop trajectories. To address this, a transverse-LQR
stable bounding on the real robot, but this remains to be feedback controller was developed to stabilize the motion
tried in future work. plan returned by the RRT. Unlike standard time-varying
LQR methods, the transverse LQR makes no attempt to
force the robot to converge to a trajectory in time, but rather
forces it to converge to the trajectory as a path through
6. Concluding Remarks and state space, i.e. it is orbitally stabilizing. This makes it sub-
Future Directions stantially more robust than regular LQR for underactuated
In this paper, we have demonstrated motion planning for systems. This type of control was shown to handle moderate
the LittleDog quadruped robot to achieve bounding on very disturbances as well as small delays in simulation. Signifi-
rough terrain. The robot was modeled as a planar system cant delay and noise in the state estimate were shown to be

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


20 The International Journal of Robotics Research 00(000)

destabilizing in simulation, producing similar behavior to humanoid. In Proceedings of the IEEE/RAS International Con-
what is observed on the real robot. In the future, we plan to ference on Robotics and Automation (ICRA 05), Barcelona.
address these issues by implementing better state estimation Chevallereau C, Abba G, Aoustin Y, Plestan F, Westervelt
techniques, including a model-based observer, and continue ER, Canudas-De-Wit C, et al. (2003). Rabbit: a testbed for
the development of improved feedback control strategies. advanced control theory. IEEE Control Systems Magazine 23:
57–79.
In this work, the robot was assumed to be planar in order
Collins SH, Ruina A, Tedrake R and Wisse M (2005) Efficient
to reduce the state dimension for planning. As future work,
bipedal robots based on passive-dynamic walkers. Science 307:
the constraint that left and right feet move as one may be 1082–1085.
relaxed in order to extend this work to the 3D world where Fraichard T (2007) A short paper about motion safety. In Proceed-
terrain heights are not even on both feet. The feet may be ings of the IEEE/RAS International Conference on Robotics
commanded different heights in order to conform to the and Automation (ICRA 07), Rome, Italy, pp. 1140–1145.
terrain, while the planner assumes the foot position to be Fraichard T and Asama H (2003) Inevitable collision states. a step
the average of the left and right foot positions. It would towards safer robots? In Proceedings of the IEEE/RSJ Inter-
also be interesting to try to stabilize yaw and roll by asym- national Conference on Intelligent Robots and Systems (IROS
metrically changing the foot height on the right and left 03), Volume 1, Las Vegas, NV, pp. 388–393.
feet while bounding. Another direction for improving the Frazzoli E, Dahleh M and Feron E (2005) Maneuver-based motion
planning for nonlinear systems with symmetries. IEEE Trans-
planner would be to better characterize more of the reach-
actions on Robotics 21: 1077–1091.
able set, perhaps by incorporating knowledge about nearby
Frazzoli E, Dahleh MA and Feron E (2002) Real-time motion
terrain. Lastly, we believe the proposed motion planning planning for agile autonomous vehicles. Journal of Guidance,
approach can be generalized, and applied to a variety of Control, and Dynamics 25: 116–129.
other systems. In the near future, we expect to try a similar Gilardi G and Sharf I (2002) Literature survey of contact dynam-
planning algorithm on a dynamic biped to achieve walking ics modelling. Mechanism and Machine Theory 37:1213–1239.
over rough terrain, and on a forklift operating in a highly Hahn W (1967) Stability of Motion. Berlin: Springer-Verlag.
constrained environment. Hale JK (1980) Ordinary Differential Equations. New York:
Robert E. Krieger Publishing Company.
Harada K, Hattori S, Hirukawa H, Morisawa M, Kajita S
Acknowledgements and Yoshida E (2007) Motion planning for walking pattern
This work was supported by the DARPA Learning Locomo- generation of humanoid. In Proceedings of the IEEE/RSJ
tion program (AFRL contract # FA8650-05-C-7262). The International Conference on Intelligent Robots and Systems
authors would also like to thank Sara Itani for her help dur- (IROS 07), pp. 4227–4233.
Hauser J and Chung CC (1994) Converse Lyapunov functions
ing Phase III, especially in tuning an open-loop bounding
for exponentially stable periodic orbits. Systems and Control
gate. Letters 23: 27–34.
Hauser K (2008) Motion Planning for Legged and Humanoid
Robots. Ph.D. thesis, Stanford.
References Hauser K, Bretl T, Latombe J-C, Harada K and Wilcox B (2008)
Altendorfer R, Moore N, Komsuoglu H, Buehler M, Brown HB, Motion planning for legged robots on varied terrain. The
McMordie D, et al. (2001) RHex: A biologically inspired International Journal of Robotics Research 27: 1325–1349.
hexapod runner. Autonomous Robots 11: 207–213. Hirose M and Ogawa K (2007) Honda humanoid robots devel-
Berges P and Bowling A (2006) Rebound, slip, and compliance opment. Philosophical Transactions of the Royal Society 365:
in the modeling and analysis of discrete impacts in legged 11–19.
locomotion. Journal of Vibration and Control 12: 1407–1430. Hunt KH and Crossley FRE (1975) Coefficient of restitution
Byl K, Shkolnik A, Prentice S, Roy N and Tedrake R (2008) Reli- interpreted as damping in vibroimpact. Journal of Applied
able dynamic motions for a stiff quadruped. In Proceedings of Mechanics Series E 42: 440–445.
the 11th International Symposium on Experimental Robotics Kajita S, Kanehiro F, Kaneko K, Fujiware K, Harada K, Yokoi
(ISER), Athens, Greece. K, et al. (2003) Biped walking pattern generation by using
Byl K and Tedrake R (2009) Dynamically diverse legged locomo- preview control of zero-moment point. In Proceedings of
tion for rough terrain. In Proceedings of the IEEE/RAS Inter- the IEEE/RAS International Conference on Robotics and
national Conference on Robotics and Automation (ICRA 09), Automation (ICRA 03), Taipei, Taiwan, pp. 1620–1626.
Kobe, Japan. Kaneko K, Kanehiro F, Kajita S, Hirukawa H, Kawasaki T, Hirata
Chan N, Kuffner J and Zucker M (2008) Improved motion plan- M, et al. (2004) Humanoid robot HRP-2. In Proceedings
ning speed and safety using regions of inevitable collision. In of the IEEE/RAS International Conference on Robotics and
17th CISM-IFToMM Symposium on Robot Design, Dynamics, Automation (ICRA 04), New Orleans, LA, pp. 1083–1090.
and Control (RoManSy’08), Tokyo, Japan. Kavraki L, Svestka P, Latombe J and Overmars M (1996)
Cheng P (2005) Sampling-based Motion Planning with Differen- Probabilistic roadmaps for path planning in high-dimensional
tial Constraints. Ph.D. thesis, University of Illinois at Urbana- configuration spaces. IEEE Transactions on Robotics and
Champaign, IL. Automation 12(4): 566–580.
Chestnutt J, Lau M, Cheung KM, Kuffner J, Hodgins JK and Kolter JZ, Rodgers MP and Ng AY (2008) A control architecture
Kanade T (2005) Footstep planning for the honda asimo for quadruped locomotion over rough terrain. In Proceedings

Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011


Shkolnik et al. 21

of the IEEE/RAS International Conference on Robotics and Shkolnik A and Tedrake R (2009) Path planning in 1000+
Automation (ICRA 08), Pasadena, CA, pp. 811–818. dimensions using a task-space Voronoi bias. In Proceedings
Kuffner JJ, Kagami S, Nishiwaki K, Inaba M and Inoue H (2002) of the IEEE/RAS International Conference on Robotics and
Dynamically-stable motion planning for humanoid robots. Automation (ICRA 09), Kobe, Japan.
Autonomous Robots 12: 105–118. Shkolnik A, Walter M and Tedrake R (2009) Reachability-
Kuffner JJ, Nishiwaki K, Kagami S, Inaba M and Inoue H guided sampling for planning under differential constraints.
(2003) Motion planning for humanoid robots. In Dario P and In Proceedings of the IEEE/RAS International Conference
Chatila R (Eds), ISRR (Springer Tracts in Advanced Robotics, on Robotics and Automation (ICRA 09), Kobe, Japan, pp.
volume 15). Berlin: Springer, pp. 365–374. 2859–2865.
LaValle S and Branicky M (2002) On the relationship between Song G and Zefran M (2006) Stabilization of hybrid periodic
classical grid search and probabilistic roadmaps. In Pro- orbits with application to bipedal walking. In Proceedings of
ceedings of the Workshop on the Algorithmic Foundations of the 2006 American Control Conference, Minneapolis, MN.
Robotics, Nice, France. Sugihara T (2004) Mobility Enhancement Control of Humanoid
LaValle SM (2006) Planning Algorithms. Cambridge: Cambridge Robot based on Reaction Force Manipulation via Whole Body
University Press. Motion. Ph.D. thesis, University of Tokyo.
LaValle SM and Kuffner JJ (2001) Randomized kinodynamic Westervelt ER, Grizzle JW, Chevallereau C, Choi JH and Mor-
planning. The International Journal of Robotics Research 20: ris B (2007) Feedback Control of Dynamic Bipedal Robot
378–400. Locomotion. Boca Raton, FL: CRC Press.
Manchester, IR (2010) Transverse dynamics and regions of Westervelt ER, Grizzle JW and Koditschek DE (2003) Hybrid
stability for nonlinear hybrid limit cycles. arXiv:1010.2241 zero dynamics of planar biped walkers. IEEE Transactions on
[math.OC]. Automatic Control 48: 42–56.
Manchester IR, Mettin U, Iida F and Tedrake R (2009) Stable Yershova A, Jaillet L, Simeon T and LaValle S (2005) Dynamic-
dynamic walking over rough terrain: Theory and experiment. domain RRTs: Efficient exploration by controlling the
In Proceedings of the International Symposium on Robotics sampling domain. Proceedings of the IEEE/RAS International
Research (ISRR), Lucerne, Switzerland. Conference on Robotics and Automation (ICRA 05), pp.
Marhefka D and Orin D (1996) Simulation of contact using a 3856–3861.
nonlinear damping model. In Proceedings of the IEEE/RAS Zucker M (2009) A Data-Driven Approach to High Level Plan-
International Conference on Robotics and Automation (ICRA ning. PhD thesis, Carnegie Mellon University, available at
96), Vol. 2, Minneapolis, MN, pp. 1662–1668. http://www.cs.cmu.edu/ mzucker/mz-thesis-proposal.pdf.
Pongas D, Mistry M and Schaal S (2007) A robust quadruped
walking gait for traversing rough terrain. In Proceedings of
the IEEE/RAS International Conference on Robotics and
Automation (ICRA 07). Appendix A: Ground Interaction Model
Pratt JE and Tedrake R (2005) Velocity based stability margins for
fast bipedal walking. In Proceedings of the First Ruperto Car-
This appendix gives on overview of the computation of
ola Symposium on Fast Motions in Biomechanics and Robotics: ground contact forces for the LittleDog model.
Optimization and Feedback Control, Vol. 340, pp. 299–324. Ground contact models can be discrete or continuous (see
Raibert M, Blankespoor K, Nelson G, Playter R and the Big- Gilardi and Sharf (2002) for an overview). Discrete colli-
Dog Team (2008) Bigdog, the rough-terrain quadruped robot. sion modeling can range from using a constant coefficient
Proceedings of the 17th World Congress of the International of elasticity to more advanced approaches that can predict
Federation of Automatic Control. slipping behavior and the presence or absence of bounce
Raibert M, Chepponis M and Brown H (1986) Running on four (Berges and Bowling 2006). Discrete modeling is advan-
legs as though they were one. IEEE Journal of Robotics and tageous because of its simplicity, but is not well suited
Automation 2: 70–82. to LittleDog, because it assumes an instantaneous change
Raibert MH (1986) Legged Robots That Balance. Cambridge,
in momentum, whereas on the robot compression of shin
MA: The MIT Press.
Ratliff N, Zucker M, Bagnell JAD and Srinivasa S (2009) Chomp:
springs extends the collision duration to a time scale com-
Gradient optimization techniques for efficient motion planning. parable with the rest of LittleDog dynamics. Continuous
In Proceedings of the IEEE/RAS International Conference on impact modeling is more suited for LittleDog and can be
Robotics and Automation (ICRA 09), Kobe, Japan. subdivided into modeling the forces normal and tangential
Rebula J, Neuhaus P, Bonnlander B, Johnson M and Pratt J to the surface.
(2007) A controller for the littledog quadruped walking on The continuous ground contact model presented here
rough terrain. Proceedings of the IEEE/RAS International carefully computes the interaction of LittleDog feet with
Conference on Robotics and Automation (ICRA 07). rough terrain, allowing it to predict shin-spring displace-
Sakagami Y, Watanabe R, Aoyama C, Matsunaga S and Fujimura ment, foot roll, foot slip, compliance and energy dissipation
NHK (2002) The intelligent ASIMO: system overview and during ground collision, and bounce when too little energy
integration. In Proceedings of the IEEE/RSJ International
is dissipated.
Conference on Intelligent Robots and Systems (IROS 02),
Vol. 3, Switzerland, pp. 2478–2483.
Shiriaev AS, Freidovich LB and Manchester IR (2008) Can we A.1. Terrain Model and Foot Roll
make a robot ballerina perform a pirouette? Orbital stabiliza-
tion of periodic motions of underactuated mechanical systems. The feet on LittleDog are small rubber balls about 2 cm in
Annual Reviews in Control 32: 200–211. diameter. When the angle of the leg to the terrain changes,
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
22 The International Journal of Robotics Research 00(000)

the ball rolls, producing a noticeable displacement. This is 0.8


equivalent to a movement of the ball’s center along a dif- 0.6
ferent terrain, which is offset from the original by the foot
radius. To account for this effect, given a terrain height map, 0.4
γ ( x), a new terrain height map, γ ∗ ( x), is computed such
0.2
that every point on it is exactly one ball radius, rb , away

Ff/N
from the original terrain: 0
 
rb = minz ( γ ∗ ( x) −γ ( z) )2 + ( x − z)2 , ∀x −0.2
(8)
γ ∗ ( x) > γ ( x) , ∀x
−0.4
Both height maps can be seen in Figure 3, where the bottom
−0.6
light blue horizontal line is γ ( x), the original terrain, and
the black line above it is γ ∗ ( x), the terrain computed by −0.8
Equation (8). In this case, γ ∗ ( x) = γ ( x) +rb , but this is not −0.1 −0.05 0 0.05 0.1 0.15
generally true for non-flat terrain. Ground Contact Velocity (m/s)
When a LittleDog foot rolls, the velocity of the foot cen- Fig. 20. Friction coefficient versus steady-state velocity on an
ter is different from the velocity at the ground contact by inclined plane.
rb θ̇ , where θ̇ is the absolute angular velocity of the foot.
The new height map and the adjustment to ground contact
velocity completely capture the foot roll behavior.
All ground contact computations use the new height map,
γ ∗ ( x), referred to as “the terrain” below. A function for the
slope of the terrain, α( x), is computed from γ ∗ ( x).

A.2. Ground Friction Model


The friction force between the ground and the feet is
assumed to be a smooth function of velocity and to be
tangent to the ground surface:
Ff
= Kf arctan( Kd ṡ) . (9)
N
Here, Ff is the friction force, N is the surface normal force,
ṡ is the ground contact velocity, and Kf and Kd are model
parameters.
To fit the data, the robot was commanded to hold its
legs straight and placed on an inclined surface. The steady- Fig. 21. Ground contact model. A foot with a center at [px , py ] is
state velocity as well as the normal and tangential forces attached to a shin spring of length l at an angle of θ to the terrain.
were measured for a variety of slopes and are shown in The foot penetrates a distance h into the ground, which is mod-
Figure 20, along with a fit of the friction model. eled as a compressible plane. A velocity-dependent friction force
For high magnitudes of velocity, the friction force equa- is applied at the point of ground contact. The terrain angle at the
tion resembles that of Coulomb friction. As the magnitude ground contact point is equal to α, and is not shown in the figure.
decreases, the force drops off to smoothly change direc-
tion at zero velocity. The smoothness of the function is
important for integration purposes and seems to be a good the top right, connected to the shin spring of length l, which
approximation except for extremely small velocities. At is modeled as a non-linear spring damper. Connected to the
small velocities, the friction coefficient is small and might shin spring is the foot, the center of which is shown below
produce drift, but it is negligible in typical timescales of the ground in the figure at position [px , py ]. The center of
a simulation run (less than 1 minute). An arc-tangent was each foot is computed from the current state of the robot.
selected for the functional form because it fits the available Whenever a foot center [px , py ] is below the terrain, py <
data well, but other sigmoids could be used. γ ∗ ( px ), the foot is considered to be in collision.
The foot below the shin spring is assumed to be massless.
A.3. Ground Forces Computation Therefore, the sum of forces acting on it is zero and is given
by
A diagram of a foot in collision with flat terrain is shown in
Figure 21. The figure shows a portion of the robot shin in F  +M
f + N  +P
 = 0, (10)
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
Shkolnik et al. 23

where Ff is the friction force, N is the normal force, M is the between the spring and the ground is θ −α, and substituting
shin-spring force, and P is the perpendicular force applied (9) into (10) gives
on the foot by the spring housing.
0 = Ff cos( θ − α) + N( x, l̇ ) sin( θ − α) −M( x, l̇ )
The normal force model uses a non-linear spring damper
of the form introduced by Hunt and Crossley (1975): = [Kf arctan ( Kd ṡ( x, l̇ ) cos( θ − α) ) + sin( θ − α) ]
N( x, l̇ ) −M( x, l̇ ) , (15)
N = Kh h( 1 − ζh ḣ) , (11)
which is just a function of the robot state and l̇, and where
where h is the penetration depth, ḣ is the rate of change N( x, l̇)and M( x, l̇) are given by Equations (11) and (14),
of the penetration, and Kh and ζh are constants. Compared respectively.
with linear damping, it has the advantage of being continu- By approximating the arc-tangent function, Equation
ous across the ground contact and avoiding sticking forces (15) is used to find l̇, which is then used to find the normal
between surfaces for almost all cases (Marhefka and Orin and friction forces using Equations (11) and (9). The forces
1996). The penetration depth, h, is computed as the shortest are then applied to the appropriate point of the rigid five-
distance between the foot center and the height map, γ ∗ ( x), link model and l̇ is used to update the shin-spring length.
and is perpendicular to the height map. Since on the actual Although for some parameters and system states, Equa-
robot the foot cannot overlap the terrain, it is assumed that tion (15) might have multiple solutions for l̇, in practice,
any overlap is due to compliance in the leg, the rubber foot, the l̇ with the lower magnitude can be chosen as the physi-
or the ground. cally plausible solution. All of the ground interaction forces
Note that ḣ is an algebraic function of [p˙x , p˙y ], the foot are dissipative, so the dynamics are guaranteed to remain
center velocity, which in turn is an algebraic function of the stable.
known robot state and l̇. Unlike l, l̇ is not a part of the state, For a foot not in collision, no forces are applied on the
so ḣ cannot be computed directly. The normal force is affine foot, so Ff = N = 0 → M = 0. The rate of change of the
in l̇, so, for a robot in state x, Equation (11) can be rewritten shin length is then, from Equation (14),
as Ml ( x)
l̇ = − , (16)
N = Nx ( x) +Nl ( x) l̇, (12) Mx ( x)

where Nx ( x) an Nl ( x) are non-linear functions of the state. which is a fast, stable non-linear system that drives l to l0
The actual shin spring on the robot is limited in its range quickly after the foot leaves the ground.
of travel. During a bounding motion, it is typical for the
spring to reach the limits of motion, where it hits a hard Appendix B: Model Parameters
stop. The spring is modeled as linear in its normal range
The motor model was assumed to be independent of the
and to have a hard collision at the travel limits of the same
other parameters and fit to real joint trajectory data. The
functional form as the normal force. Assuming a rest length
fit accurately predicts the behavior of the joints, as seen in
of l0 , the displacement from rest is δl = l −l0 , and the range
Figure 4, which shows the model performance on a dif-
of travel for δl is between 0 and lmax , the force is given by
ferent trajectory. The friction coefficients in the ground
⎧ force model were fit to steady-state sliding as described

⎨Ks δl + bs l̇ + Kc δl( 1 + ζl l̇ ) , δl < 0
in Appendix A. The rest of the ground contact model was
M = Ks δl + bs l̇, 0 ≤ δl < lmax identified by commanding the robot to hold its legs straight


Ks δl + bs l̇ + Kc ( δl − lmax ) ( 1 − ζl l̇ ) , lmax ≤ δl, down, parallel to each other, dropping it vertically onto
(13) flat terrain, and fitting the parameters to the resulting body
where Ks and Kc
Ks are stiffness parameters, and bs and trajectory.
ζl are damping parameters. Similarly to the normal force, The total mass of the robot, the lengths of each link, and
the spring force is affine in l̇ and can be written as the maximum shin spring travel were measured directly.
The remaining parameters, including inertias of the links,
M = Mx ( x) +Ml ( x) l̇ (14) the mass distribution between the links, and COM loca-
tions, were fit to a large number of short bounding trajecto-
for some non-linear functions of the state Mx ( x) and Ml ( x). ries. The cost function for the fit was a quadratic form on the
The friction force is given by Equation (9), where ṡ is the distance between actual and simulated feet positions, which
velocity of the foot center along the height map γ ∗ ( x) and captures the effect of the three unactuated variables (x, y,
can be computed from the state of the robot x, the slope of and body pitch), neglecting the unactuated shin springs that
the terrain at the ground contact α, and l̇. are not considered to be a part of the configuration.
The force applied to the foot by the spring housing, P, All of the fits were computed with non-linear func-
is unknown, but can be eliminated from the force balance tion optimization (using MATLAB’s fminsearch). In total,
by only considering the component of Equation (10) that 34 parameters were fit for the model. Table 1 lists the
is orthogonal to P. Noting that M ⊥ P, Ff ⊥ N, the angle parameters and their values.
Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011
24 The International Journal of Robotics Research 00(000)

Table 1. LittleDog model parameters. For the rigid body model, the parameters are heavily cou-
SymbolValue Units Description pled and some of the individual values might not be accu-
rate. This is true of the inertias and to some degree of the
Rigid-body model centers of masses. Because the planar model lumps two Lit-
Shin
tleDog legs into a single leg, the leg masses and inertias,
as well as some of the stiffnesses and damping values, are
m1 0.13 kg Mass of shin twice as large as their physical counterparts for a single leg.
l1 9.2 cm Length of shin The length of the shin is given from the knee joint to
I1 7 × 10−5 kg m2 Inertia of shin the foot center, assuming full extension of the shin spring.
cx1 5.7 cm Center of mass in x for shin Centers of masses are given in the reference frames of their
cy1 0.3 cm Center of mass in y for shin links, with x pointing along its link and y perpendicular to
Upper leg it in a right-handed convention. The origin of the back shin
is at the back foot and x̂ points toward the knee joint, the
m2 0.24 kg Mass of upper leg origin of the back upper leg is at the knee joint and its x̂
l2 7.5 cm Length of upper leg points toward the hip joint, and the origin of the body is at
I2 6 × 10−6 kg m2 Inertia of upper leg the back hip joint, with its x̂ pointing toward the front hip
cx2 4.8 cm Center of mass in x for upper leg joint. The front links have mirror symmetry with the back
cy2 −0.9 cm Center of mass in y for upper leg
legs.
Main body

m3 2.3 kg Mass of body Appendix C: Index to Multimedia Extensions


l3 20.2 cm Length of body The multimedia extension page is found at http://www.
I3 2.2 × 10−3kg m2 Inertia of body ijrr.org
cx3 8.7 cm Center of mass in x for body
cy3 −0.2 cm Center of mass in y for body Table 2. Table of Multimedia Extensions
Motor model Extension Type Description

Hip joint 1 Video Simulation results: demonstration


of motion plans over rough
khip 3,800 s−2 Hip gain terrain, and feedback stabilization
bhip 98 s−1 Hip damping
v̄hip 7.9 rad s−1 Hip velocity saturation
āhip 200 rad s−2 Hip acceleration saturation

Knee joint

kknee 9,700 s−2 Knee gain


bknee 148 s−1 Knee damping
v̄knee 12 rad s−1 Knee velocity saturation
āknee 430 rad s−2 Knee acceleration saturation

Ground contact model

Friction

Kf 0.5 — Friction gain


Kd 1,000 s m−1 Friction velocity slope

Normal spring

Kh 1.4 × 105 N m−1 Ground stiffness


ζh 1.4 s m−1 Ground damping

Shin spring

Ks 7,500 N m−1 Spring linear stiffness


bs 180 Ns m−1Spring linear damping
Kc 7,500 N m−1 Spring limit stiffness
ζl 180 s m−1 Spring limit damping
lmax 0.88 cm Maximum spring travel
(spring limit)
β 0.29 rad Angle between spring and shin joint
rb 1.0 cm Foot radius Downloaded from ijr.sagepub.com at MASSACHUSETTS INST OF TECH on January 25, 2011

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy