Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver, The World's Most Experienced Driver™, to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo’s fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions of miles in simulation across 15+ U.S. states.
The Challenge
Waymo’s simulator is one of the most complex virtual environments ever built. It blends deterministic logic, physical dynamics, and state-of-the-art generative AI to create a training ground for the Waymo Driver. The Simulator Evaluation team faces the ultimate data challenge: how do you mathematically prove that a virtual world is "real"?
We are seeking visionary machine learning engineers and researchers to architect the scalable deep learning systems, novel data workflows, and evaluation tools that power our research roadmap. In this role, you will pioneer the machine learning and generative vision paradigms required to define and measure the realism of our multimodal world models. Your work will set the state of the art for autonomous simulation, directly steering our research trajectory and the capabilities of the Waymo Driver.
You will:
- Lead the design, development, and deployment of cutting-edge evaluation approaches to assess the realism of state-of-the-art multimodal world models and generative systems for simulation use cases at Waymo.
- Architect and implement robust, scalable machine learning pipelines for tuning, evaluating, and deploying large-scale discriminator models for simulator realism evaluation.
- Evaluate open-source and production-ready techniques for measuring the realism of generated video (e.g., temporal stability, multimodal consistency, geometric discrepancy, and condition following).
- Apply vision language models to evaluate semantic understanding and controllability across our world simulation products.
- Collaborate with research teams across Waymo and Alphabet to integrate advancements in 4D world modeling and generative AI into production systems.
- Mentor engineers on the team and provide technical guidance on architecture and execution.
You have:
- Bachelor's, Master's, or PhD in computer science, machine learning, robotics, or a related field.
- Five or more years of experience in machine learning engineering or applied deep learning, supported by a portfolio of shipped products or peer-reviewed publications.
- Proficiency in Python and hands-on experience with modern machine learning frameworks such as JAX, Flax, or PyTorch.
- Experience designing and implementing evaluation frameworks for complex systems or machine learning models.
We prefer:
- Track record of training large-scale generative models (diffusion models, flow matching, vision language models, etc.)
- A PhD and demonstrated success delivering machine learning products focused on 3D generative models, world models, or video generation.
- Experience simulating sensor data, including camera, lidar, and radar, or modeling semantic scenes.
- Experience developing autonomous systems, robotics software, or autonomous vehicle simulations.
- Experience training and optimizing large-scale models on GPU or TPU clusters for efficient production serving.
- Professional experience writing C++ for high-performance production systems.