Teller 10
operations, we have developed a voice-commandable autonomous forklift capable of executing a limited set of commands to approach, engage, transport and place palletized cargo in a minimally-structured outdoor setting.
Rather than carefully preparing the environment to make it [...] such as military Supply Support Activities (outdoor warehouses). The robot has to operate safely outdoors on uneven [...]
real-world environments, outdoors on uneven terrain without reliance on precision GPS, and in close proximity to people;
• Speech understanding in noisy environments;
• Indication of robot state and imminent actions to by- [...] mon to human and robot; and
• Robust, closed-loop pallet manipulation using only local [...]
In addition to converting the vehicle to drive-by-wire operation, we have added proprioceptive and exteroceptive sensors, and audible and visible “annunciators” with which the robot can signal nearby humans. The system’s interface, perception, planning, control, message publish-subscribe, and self-monitoring software (Fig. 2) runs as several dozen modules hosted on on-board laptop computers communicating via message-passing over a standard network. A commodity wireless access point provides network connectivity with the human supervisor’s handheld tablet computer.
A. Proprioception
We equipped the forklift with an integrated GPS/IMU unit together with encoders mounted to the two (non-steering) front wheels. The system relies mainly upon dead-reckoning for navigation, using the encoders and IMU to estimate short-term 6-DOF vehicle motion. Our smoothly-varying proprioceptive strategy [21] incorporates coarse GPS estimates largely for georeferenced topological localization. The fork pose is determined from a tilt-angle sensor publishing to the Controller Area Network (CAN) bus and encoders measuring tine height and lateral shift.
B. Exteroception
For situational awareness and collision avoidance, we attached five lidars to the chassis in a “skirt” configuration, facing forward-left and -right, left, right, and rearward, each angled slightly downward so that the absence of a ground return would be meaningful. We also attached five lidars in a “pushbroom” configuration high up on the robot, oriented downward and looking forward, forward-left and -right, and rearward-left and -right. We attached a lidar to each fork tine, each scanning a half-disk parallel to and slightly above that tine for pallet detection. We attached a lidar under the chassis, scanning underneath the tines, allowing the forklift to detect obstacles when cargo obscures the forward-facing skirts. We attached two vertically-scanning lidars outboard of the carriage in order to see around a carried load. We attached beam-forming microphones oriented forward, left, right, and rearward to sense shouted warnings. Finally, we mounted cameras looking forward, left, right, and rearward in order to publish a 360° view of the forklift’s surround to the supervisor’s tablet.
For each lidar and camera, we estimate the 6-DOF rigid-body transformation relating that sensor’s frame to the body frame (the “extrinsic calibration”) through a chain of transformations including all intervening actuatable degrees of freedom. For each lidar and camera mounted on the forklift body, this chain contains exactly one transform; for lidars mounted on the mast, carriage, or tines, the chain has as many as four transformations (e.g., sensor-to-tine, tine-to-mast, mast-to-carriage, and carriage-to-body).
C. Annunciation and Reflection
We added LED signage, marquee lights, and audio speakers to the exterior of the chassis and carriage, enabling the forklift to “annunciate” its intended actions before carrying them out (§ V-A). The marquee lights also provide a “reflective display,” informing people nearby that the robot is aware of their presence (§ V-B), and using color coding to report other robot states.
D. Computation
Each proprioceptive and exteroceptive sensor is connected to one of four networked quad-core laptops. Three laptops (along with the network switch, power supplies and relays) are mounted in an equipment cabinet affixed to the roof, and one is mounted behind the forklift carriage. A fifth laptop located in the operator cabin provides a diagnostic display. The supervisor’s tablet constitutes a distinct computational resource, maintaining a wireless connection to the forklift, interpreting the supervisor’s spoken commands and stylus gestures, and displaying diagnostic information (§ IV-A).
E. Software
We use a codebase originating in MIT’s DARPA Urban Challenge effort [22]. A low-level message-passing protocol [23] provides publish-subscribe inter-process communication among sensor handlers, the perception module, planner, controller, interface handler, and system monitoring
Fig. 3. A notional military warehouse layout.
Fig. 5. An approaching pedestrian causes the robot to pause. Lights skirting the robot indicate distance to obstacles (green: far to red: close). Verbal annunciators and signage indicate the induced pause.
environmental properties from failed returns (e.g., from absorptive material). The consequence of the downward orientation is a shorter maximum range, around 15 meters. Since the vehicle’s speed does not exceed 2 m/s, this still provides 7-8 seconds of sensing horizon for collision avoidance.
To reject false positives from the ground (at distances greater than the worst case ground slope), we require that consistent returns be observed from more than one lidar. Missing lidar returns are filled in at a reduced range to satisfy the conservative assumption that they arise from a human (assumed to be 30 cm wide).
Pedestrian safety is central to our design choices. Though lidar-based people detectors exist [25]–[27], we opted to avoid the risk of misclassification by treating all objects of suitable size as potential humans. The robot proceeds slowly around stationary objects. Pedestrians who approach too closely cause the robot to pause (Fig. 5), indicating as such to the pedestrian.
C. Lidar-Based Servoing
Picking up a pallet requires that the forklift accurately insert its tines into the pallet slots, a challenge for a 2700 kg forklift when the pallet’s pose and insert locations are not known a priori and when pallet structure and geometry vary. Additionally, when the pallet is to be picked up from or placed on a truck bed, the forklift must account for the unknown pose of the truck (distance from the forklift, orientation, and height), on which the pallet may be recessed. Complicating these requirements is the fact that we have only coarse extrinsic calibration for the mast lidars due to the unobservable compliance of the mast, carriage, and tines. We address these challenges with a closed-loop perception and control strategy that regulates the position and orientation of the tines based directly on lidar observations of the pallet and truck bed.
A. Annunciation of Intent
The LED signage displays short text messages describing current state (e.g., “paused” or “fault”) and any imminent actions (e.g., forward motion or mast lifting). The marquee lights encode forklift state as colors, and imminent motion as moving patterns. Open-source software converts the text messages to spoken English for broadcast through the audio speakers. Text announcements are also exported to the tablet for display to the supervisor.
B. Awareness Display
The forklift also uses its annunciators to inform bystanders that it is aware of their presence. Whenever a human is detected in the vicinity, the marquee lights, consisting of strings of individually addressable LEDs, display a bright region oriented in the direction of the detection (Fig. 5). If the estimated motion track is converging with the forklift, the LED signage and speakers announce “Human approaching.”
C. Autonomy Handoff
When a human closely approaches the robot, it pauses for safety. (A speech recognizer runs on the forklift to enable detection of shouted phrases such as “Forklift stop moving,” which also cause the robot to pause.) When a human (presumably a human operator) enters the cabin and sits down, the robot detects his/her presence in the cabin through the report of a seat-occupancy sensor, or any uncommanded press of the brake pedal, turn of the steering wheel, or touch of the mast or transmission levers. In this event, the robot reverts to behaving as a manned forklift, ceding autonomy.
VI. DEPLOYMENT AND RESULTS
We deployed our system in two test environments configured as military Supply Support Activities (SSAs), in the general form shown in Fig. 3. These outdoor warehouses included receiving, bulk yard, and issuing areas connected by a simple road network. The bulk yards contained a number of alphanumerically-labeled pallet storage bays.
An Army staff sergeant, knowledgeable in military logistics and an expert forklift operator, acted as the robot supervisor. In a brief training session, she learned how to provide speech and gesture input to the tablet computer, and use its PAUSE and RUN buttons.
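The handoff conditions listed under Autonomy Handoff (a seat-occupancy report, or any uncommanded brake press, steering-wheel turn, or lever touch) can be sketched as a simple predicate. This is an illustrative sketch only: the class names, field names, and the idea of comparing observed inputs against what the controller itself commanded are assumptions, not the system's actual interface.

```python
from dataclasses import dataclass

@dataclass
class CabinInputs:
    """Hypothetical snapshot of what the cabin sensors currently observe."""
    seat_occupied: bool = False
    brake_pressed: bool = False
    steering_turned: bool = False
    lever_touched: bool = False  # mast or transmission lever

@dataclass
class Commanded:
    """Hypothetical record of what the autonomy software itself commanded,
    so robot-initiated actuation is not mistaken for a human."""
    brake: bool = False
    steering: bool = False
    lever: bool = False

def should_cede_autonomy(inputs: CabinInputs, commanded: Commanded) -> bool:
    """Return True if the robot should revert to manned operation."""
    uncommanded_input = (
        (inputs.brake_pressed and not commanded.brake)
        or (inputs.steering_turned and not commanded.steering)
        or (inputs.lever_touched and not commanded.lever)
    )
    return inputs.seat_occupied or uncommanded_input

# The robot braking on its own does not trigger a handoff,
# but a brake press it did not command does.
robot_braking = should_cede_autonomy(CabinInputs(brake_pressed=True),
                                     Commanded(brake=True))
human_braking = should_cede_autonomy(CabinInputs(brake_pressed=True),
                                     Commanded())
```

Treating any single condition as sufficient errs toward ceding control, which matches the conservative stance the text takes toward people near or inside the vehicle.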