Wei Zhan serves as a Co-Director of Berkeley DeepDrive, one of the leading research centers in the field of AI for autonomy and mobility, involving many Berkeley faculty and industrial partners. He is an Assistant Professional Researcher at UC Berkeley, leading a team of Ph.D. students and Postdocs. His research focuses on AI for autonomous systems, leveraging control, robotics, computer vision and machine learning techniques to tackle challenges with sophisticated dynamics, interactive human behavior and complex scenes in a scalable way. He also teaches AI for Autonomy at UC Berkeley.
He is also Chief Scientist of Applied Intuition, leading AI research efforts towards next-generation autonomy and its development toolchain. He is actively hiring Research Scientists, Research Engineers and Research Interns.
He received his Ph.D. degree from UC Berkeley. His publications received the Best Student Paper Award at IV’18, the Best Paper Award – Honorable Mention of IEEE Robotics and Automation Letters, and the Best Paper Award at ICRA’24. One of his publications was also selected as an ICLR’23 notable top 5% oral presentation. He led the construction of the INTERACTION dataset and the organization of its prediction challenges at NeurIPS’20 and ICCV’21.
Research and Projects
Current
Pretrained Policy Customization and Multi-Agent RL
Autonomous Racing – Learning to Plan and Control at the Limits
- Active exploration for modeling dynamics and racing behavior: IEEE Trans-CST ’24, arxiv
- Skill-Critic – refining learned skills for reinforcement learning: RA-Letters ’24, arxiv, Website
- BeTAIL – behavior transformer adversarial imitation learning: RA-Letters ’24 (accepted), arxiv, Website
- Double-iterative GP for model error compensation: IFAC’23, arxiv
- Outracing human racers with MPC: arxiv
3D Reconstruction, Localization and Novel View Synthesis
- Self-Supervised 3D Gaussian Splatting for Autonomous Driving: arxiv, Code
- Q-SLAM – quadric representations for monocular SLAM: CoRL’24, arxiv
- Quadric representations for LiDAR odometry, mapping and localization: RA-Letters ’23, arxiv
Behavior and Scenario Generation for Closed-Loop Simulation
- SAFE-SIM – Guided diffusion for traffic simulation with controllable criticality: ECCV’24, arxiv
- Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation: ECCV’24, arxiv
- Editing driver character with socially-controllable generation: RA-Letters ’23, arxiv
- Diverse Critical Interaction Generation: IROS’21, arxiv
- SceGene – bio-inspired scenario generation: IEEE Trans-ITS ’21
Self-Supervised Learning for Differentiable Autonomy Stack
- Cohere3D – temporal coherence for self-supervision of perception, prediction and planning: arxiv
- Prediction with synthetic data pretraining: IROS’24, arxiv
- PreTraM – self-supervision connecting trajectory and map: ECCV’22, arxiv, Code
- Image2Point – 2D pretraining for 3D understanding: ECCV’22, arxiv, Code
Cross-Embodiment and Generalization for Robot Learning
- Representation learning from general human demonstrations to robot manipulation: RSS’24 (accepted), arxiv, Website
- Open X-Embodiment – Robotic Learning Datasets and RT-X Models: ICRA’24 (Best Paper Award), arxiv, Blog, Dataset, Website, Code
- Sparse Diffusion Policy – A Sparse, Reusable, and Flexible Policy for Robot Learning: CoRL’24, arxiv
- PhyGrasp – generalizing grasping with physics-informed large multimodal models: arxiv, Website
Recent and Continued
3D Perception with Temporal, Multi-View, and Multi-Modal Fusion
- SOLOFusion – temporal multi-view 3D detection: ICLR’23 (notable top 5% oral presentation), arxiv, Code
- SparseFusion – fusing multi-modal sparse representations: ICCV’23, arxiv, Code
- NeRF-Det – learning geometry-aware volumetric representation: ICCV’23, arxiv, Code
- Fusing BEV point cloud and front-view image: IV’18, arxiv, Code
Efficient, Automated Data Engine and Training Pipeline
Generalizable Behavior Prediction and Representation
- Scenario-transferable semantic graph reasoning: IEEE Trans-ITS ’22, arxiv, Video summary
- Semantic intention representation: IV’18 (Best Student Paper Award), arxiv
- Transferable and adaptable prediction: NeurIPS’21 (ML4AD workshop spotlight), arxiv
- Causal-based time series domain generalization: ICRA’22, arxiv
- Generalizability analysis: IROS’22, arxiv
Multi-Agent, Interactive Prediction with Interpretability
Past
LiDAR-based Perception
- SqueezeSegV3 – spatially-adaptive convolution for segmentation: ECCV’20, arxiv, Code
- Labels Are Not Perfect – inferring spatial uncertainty in detection: IEEE Trans-ITS ’21, IROS’20, arxiv, Code
- Multi-task learning: IROS’21, Code
- YOGO – Efficient processing with token representation and relation inference: IROS’21, Code
2D Perception
- Sparse R-CNN: IEEE Trans-PAMI ’23, CVPR’21, arxiv
- Autoscale: IJCV ’22, arxiv
Vehicle Dynamics and Control
- Dual Extended Kalman Filter for state and parameter estimation: ITSC’21
- Remote control with slow sensor: Sensors ’19