vita-epfl/VoxDet

[ArXiv 25] VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection

📌 This is the official PyTorch implementation of the work:

VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection
Wuyang Li 1 , Zhu Yu 2 , Alexandre Alahi 1
1 École Polytechnique Fédérale de Lausanne (EPFL); 2 Zhejiang University

VoxDet overview

Code is coming soon! We’re currently cleaning up the code and unifying the camera- and LiDAR-based implementations into a single project, which serves as a powerful, clean, and extensible baseline model for the community. If you can’t wait for the official release, feel free to contact me for the individual implementations.

Contact: wuyang.li@epfl.ch

✨ Highlight

VoxDet addresses semantic occupancy prediction with an instance-centric formulation inspired by dense object detection. It relies on a Voxel-to-Instance (VoxNT) trick that freely transfers voxel-level class labels to instance-level offset labels.

  • Versatile: Adaptable to various voxel-based scenarios, such as camera and LiDAR settings.
  • Powerful: Achieves joint state-of-the-art results on both camera-based and LiDAR-based SSC benchmarks.
  • Efficient: Fast (~1.3× speed-up) and lightweight (~57.9% fewer parameters).
  • Leaderboard Topper: Achieves 63.0 IoU (single-frame model), securing 1st place on the SemanticKITTI leaderboard.

Note that VoxDet is a single-frame, single-model method that uses no extra data or labels.
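
To make the Voxel-to-Instance idea more concrete, here is a minimal sketch of one way voxel-level class labels could be turned into per-voxel offset labels without extra annotation. It is not the official implementation (the code is not released yet) and rests on an assumption: each voxel's "instance" extent along an axis is taken to be the contiguous run of voxels sharing its class label, and the offsets are the distances to the ends of that run in the six axis directions. The function names `run_offsets` and `voxel_offsets` are hypothetical.

```python
def run_offsets(labels):
    """For each position in a 1D label sequence, count how many steps can be
    taken backward / forward while staying on the same label."""
    n = len(labels)
    back = [0] * n
    fwd = [0] * n
    for i in range(1, n):
        if labels[i] == labels[i - 1]:
            back[i] = back[i - 1] + 1
    for i in range(n - 2, -1, -1):
        if labels[i] == labels[i + 1]:
            fwd[i] = fwd[i + 1] + 1
    return back, fwd


def voxel_offsets(grid):
    """grid[x][y][z] holds a class label.

    Returns out[x][y][z] = [-x, +x, -y, +y, -z, +z] offsets (in voxels) to the
    end of the same-class run along each axis. Assumption: a same-class run
    stands in for an instance; this is an illustration only.
    """
    X, Y, Z = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[[0] * 6 for _ in range(Z)] for _ in range(Y)] for _ in range(X)]
    # Scan along the x axis.
    for y in range(Y):
        for z in range(Z):
            back, fwd = run_offsets([grid[x][y][z] for x in range(X)])
            for x in range(X):
                out[x][y][z][0], out[x][y][z][1] = back[x], fwd[x]
    # Scan along the y axis.
    for x in range(X):
        for z in range(Z):
            back, fwd = run_offsets([grid[x][y][z] for y in range(Y)])
            for y in range(Y):
                out[x][y][z][2], out[x][y][z][3] = back[y], fwd[y]
    # Scan along the z axis.
    for x in range(X):
        for y in range(Y):
            back, fwd = run_offsets(grid[x][y])
            for z in range(Z):
                out[x][y][z][4], out[x][y][z][5] = back[z], fwd[z]
    return out


# A single row of voxels along z: two voxels of class 1, then one of class 2.
offsets = voxel_offsets([[[1, 1, 2]]])
print(offsets[0][0][0])  # [0, 0, 0, 0, 0, 1]
print(offsets[0][0][1])  # [0, 0, 0, 0, 1, 0]
```

Because the offsets are derived purely from the existing class labels, a sketch like this needs no instance annotations, which matches the "freely transferring" wording above. How the released code actually defines instances and offsets may differ.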

VoxDet overview

📈 Training logs

VoxDet (blue curve) is significantly more efficient and effective than the previous state-of-the-art method, CGFormer (gray curve).

VoxDet logs

🙏 Acknowledgement

We greatly appreciate the tremendous effort behind the following projects!

📋 TODO List

  • Release the arXiv paper
  • Release the unified codebase, including both camera-based and LiDAR-based implementations
  • Release all models

📚 Citation


If you think our work is helpful for your project, I would greatly appreciate it if you could consider citing our work:

@article{li2025voxdet,
  title={VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection},
  author={Li, Wuyang and Yu, Zhu and Alahi, Alexandre},
  journal={arXiv preprint arXiv:2506.04623},
  year={2025}
}
