Paper | Supplementary | Talk | Poster | Slides
This repository contains the code for the PAMI 2023 paper TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving. This work is a journal extension of the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. The code for the CVPR 2021 paper is available in the cvpr2021 branch.
If you find our code or papers useful, please cite:
@article{Chitta2023PAMI,
author = {Chitta, Kashyap and
Prakash, Aditya and
Jaeger, Bernhard and
Yu, Zehao and
Renz, Katrin and
Geiger, Andreas},
title = {TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving},
journal = {Pattern Analysis and Machine Intelligence (PAMI)},
year = {2023},
}
@inproceedings{Prakash2021CVPR,
author = {Prakash, Aditya and
Chitta, Kashyap and
Geiger, Andreas},
title = {Multi-Modal Fusion Transformer for End-to-End Autonomous Driving},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021}
}
Also, check out the code for other recent work on CARLA from our group:
- Jaeger et al., Hidden Biases of End-to-End Driving Models (ICCV 2023)
- Renz et al., PlanT: Explainable Planning Transformers via Object-Level Representations (CoRL 2022)
- Hanselmann et al., KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients (ECCV 2022)
- Chitta et al., NEAT: Neural Attention Fields for End-to-End Autonomous Driving (ICCV 2021)
Clone the repo, setup CARLA 0.9.10.1, and build the conda environment:
git clone https://github.com/autonomousvision/transfuser.git
cd transfuser
git checkout 2022
chmod +x setup_carla.sh
./setup_carla.sh
conda env create -f environment.yml
conda activate tfuse
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu102.html
pip install mmcv-full==1.5.3 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.11.0/index.html
Our dataset is generated via a privileged agent which we call the autopilot (/team_code_autopilot/autopilot.py
) in 8 CARLA towns using the routes and scenario files provided in this folder. See the tools/dataset folder for detailed documentation regarding the training routes and scenarios. You can download the dataset (210GB) by running:
chmod +x download_data.sh
./download_data.sh
The dataset is structured as follows:
- Scenario
- Town
- Route
- rgb: camera images
- depth: corresponding depth images
- semantics: corresponding segmentation images
- lidar: 3d point cloud in .npy format
- topdown: topdown segmentation maps
- label_raw: 3d bounding boxes for vehicles
- measurements: contains ego-agent's position, velocity and other metadata
In addition to the dataset itself, we have provided the scripts for data generation with our autopilot agent. To generate data, the first step is to launch a CARLA server:
./CarlaUE4.sh --world-port=2000 -opengl
For more information on running CARLA servers (e.g. on a machine without a display), see the official documentation. Once the server is running, use the script below for generating training data:
./leaderboard/scripts/datagen.sh <carla root> <working directory of this repo (*/transfuser/)>
The main variables to set for this script are SCENARIOS
and ROUTES
.
The code for training via imitation learning is provided in train.py.
A minimal example of running the training script on a single machine:
cd team_code_transfuser
python train.py --batch_size 10 --logdir /path/to/logdir --root_dir /path/to/dataset_root/ --parallel_training 0
The training script has many more useful features documented at the start of the main function. One of them is parallel training. The script has to be started differently when training on a multi-gpu node:
cd team_code_transfuser
CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=16 OPENBLAS_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node=2 --max_restarts=0 --rdzv_id=1234576890 --rdzv_backend=c10d train.py --logdir /path/to/logdir --root_dir /path/to/dataset_root/ --parallel_training 1
Enumerate the GPUs you want to train on with CUDA_VISIBLE_DEVICES. Set the variable OMP_NUM_THREADS to the number of cpus available on your system. Set OPENBLAS_NUM_THREADS=1 if you want to avoid threads spawning other threads. Set --nproc_per_node to the number of available GPUs on your node.
The evaluation agent file is build to evaluate models trained with multiple GPUs. If you want to evaluate a model trained with a single GPU you need to remove this line.
We make some minor modifications to the CARLA leaderboard code for the Longest6 benchmark, which are documented here. See the leaderboard/data/longest6 folder for a description of Longest6 and how to evaluate on it.
Pre-trained agent files for all 4 methods can be downloaded from AWS:
mkdir model_ckpt
wget https://s3.eu-central-1.amazonaws.com/avg-projects/transfuser/models_2022.zip -P model_ckpt
unzip model_ckpt/models_2022.zip -d model_ckpt/
rm model_ckpt/models_2022.zip
To evaluate a model, we first launch a CARLA server:
./CarlaUE4.sh --world-port=2000 -opengl
Once the CARLA server is running, evaluate an agent with the script:
./leaderboard/scripts/local_evaluation.sh <carla root> <working directory of this repo (*/transfuser/)> <v2v fusioin mode (none or raw_fusion)>
By editing the arguments in local_evaluation.sh
, we can benchmark performance on the Longest6 routes. You can evaluate both privileged agents (such as [autopilot.py]) and sensor-based models. To evaluate the sensor-based models use submission_agent.py as the TEAM_AGENT
and point to the folder you downloaded the model weights into for the TEAM_CONFIG
. The code is automatically configured to use the correct method based on the args.txt file in the model folder.
By switching the third argument between none
and raw_fusion
, we can evaluate the performance of different lidar fusion mode.
You can look at qualitative examples of the expected driving behavior of TransFuser on the Longest6 routes here.
To compute additional statistics from the results of evaluation runs we provide a parser script tools/result_parser.py.
${WORK_DIR}/tools/result_parser.py --xml ${WORK_DIR}/leaderboard/data/longest6/longest6.xml --results /path/to/folder/with/json_results/ --save_dir /path/to/output --town_maps ${WORK_DIR}/leaderboard/data/town_maps_xodr
It will generate a results.csv file containing the average results of the run as well as additional statistics. It also generates town maps and marks the locations where infractions occurred.
To submit to the CARLA leaderboard you need docker installed on your system. Edit the paths at the start of make_docker.sh. Create the folder team_code_transfuser/model_ckpt/transfuser. Copy the model.pth files and args.txt that you want to evaluate to team_code_transfuser/model_ckpt/transfuser. If you want to evaluate an ensemble simply copy multiple .pth files into the folder, the code will load all of them and ensemble the predictions.
cd leaderboard
cd scripts
./make_docker.sh
The script will create a docker image with the name transfuser-agent. Follow the instructions on the leaderboard to make an account and install alpha.
alpha login
alpha benchmark:submit --split 3 transfuser-agent:latest
The command will upload the docker image to the cloud and evaluate it.