Wei Zhan is a Co-Director of Berkeley DeepDrive, one of the leading research centers in AI for autonomy and mobility, which involves many Berkeley faculty and industrial partners. He is an Assistant Professional Researcher at UC Berkeley, where he leads a team of Ph.D. students and postdocs. His research focuses on AI for autonomous systems, leveraging control, robotics, computer vision and machine learning techniques to tackle challenges with sophisticated dynamics, interactive human behavior and complex scenes in a scalable way. He also teaches AI for Autonomy at UC Berkeley.

He is also Chief Scientist of Applied Intuition, leading AI research efforts towards next-generation autonomy and its development toolchain. He is actively hiring Research Scientists, Research Engineers and Research Interns.

He received his Ph.D. degree from UC Berkeley. His publications have received the Best Student Paper Award at IV’18, the Best Paper Award – Honorable Mention of IEEE Robotics and Automation Letters, and the Best Paper Award at ICRA’24. One of his publications was also selected as an ICLR’23 notable-top-5% oral presentation. He led the construction of the INTERACTION dataset and the organization of its prediction challenges at NeurIPS’20 and ICCV’21.

Research and Projects

Current

Pretrained Policy Customization and Multi-Agent RL

Autonomous Racing – Learning to Plan and Control at the Limits

  • Active exploration for modeling dynamics and racing behavior: IEEE Trans-CST ’24, arxiv
  • Skill-Critic – refining learned skills for reinforcement learning: RA-Letters ’24, arxiv, Website
  • BeTAIL – behavior transformer adversarial imitation learning: RA-Letters ’24 (accepted), arxiv, Website
  • Double-iterative GP for model error compensation: IFAC’23, arxiv
  • Outracing human racers with MPC: arxiv

3D Reconstruction, Localization and Novel View Synthesis

  • Self-Supervised 3D Gaussian Splatting for Autonomous Driving: arxiv, Code
  • Q-SLAM – quadric representations for monocular SLAM: CoRL’24, arxiv
  • Quadric representations for LiDAR odometry, mapping and localization: RA-Letters ’23, arxiv

Behavior and Scenario Generation for Closed-Loop Simulation

  • SAFE-SIM – Guided diffusion for traffic simulation with controllable criticality: ECCV’24, arxiv
  • Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation: ECCV’24, arxiv
  • Editing driver character with socially-controllable generation: RA-Letters ’23, arxiv
  • Diverse Critical Interaction Generation: IROS’21, arxiv
  • SceGene – bio-inspired scenario generation: IEEE Trans-ITS ’21

Language Modality and LLM for Autonomy

  • WOMD-Reasoning – A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning: arxiv, Website
  • Code diagnosis and repair of motion planners by LLM: RA-Letters ’24, arxiv
  • LanguageMPC – LLMs as decision makers: arxiv, Website

Self-Supervised Learning for Differentiable Autonomy Stack

  • Cohere3D – temporal coherence for self-supervision of perception, prediction and planning: arxiv
  • Prediction with synthetic data pretraining: IROS’24, arxiv
  • PreTraM – self-supervision connecting trajectory and map: ECCV’22, arxiv, Code
  • Image2Point – 2D pretraining for 3D understanding: ECCV’22, arxiv, Code

Cross-Embodiment and Generalization for Robot Learning

  • Representation learning from general human demonstrations to robot manipulation: RSS’24 (accepted), arxiv, Website
  • Open X-Embodiment – Robotic Learning Datasets and RT-X Models: ICRA’24 (Best Paper Award), arxiv, Blog, Dataset, Website, Code
  • Sparse Diffusion Policy – A Sparse, Reusable, and Flexible Policy for Robot Learning: CoRL’24, arxiv
  • PhyGrasp – generalizing grasping with physics-informed large multimodal models: arxiv, Website

Recent and Continued

3D Perception with Temporal, Multi-View, and Multi-Modal Fusion

Efficient, Automated Data Engine and Training Pipeline

Generalizable Behavior Prediction and Representation

Multi-Agent, Interactive Prediction with Interpretability

  • Multi-agent prediction combining egocentric and allocentric views: CoRL’21
  • Social posterior collapse in variational autoencoder: NeurIPS’21, arxiv
  • Interventional behavior prediction: IROS’22, arxiv
  • Interpretable goal-conditioned interactive prediction: IROS’22, arxiv

Past

INTERACTION Dataset and Benchmark

  • INTERACTION dataset with critical scenes and densely interactive behavior: Website, arxiv, IROS’19
  • INTERPRET challenge benchmarking conditional, multi-agent prediction: Website

LiDAR-based Perception

2D Perception

Vehicle Dynamics and Control

  • Dual Extended Kalman Filter for state and parameter estimation: ITSC’21
  • Remote control with slow sensor: Sensors ’19