Navigating bottlenecks: infrastructure lessons from AV ML systems

27 Aug 2025

14:00 - 14:25

Room 2

Developments in AI, architecture and software

Autonomous vehicle (AV) ML systems demand infrastructure that can handle real-time perception, high-throughput data and latency-critical workloads. While model optimization gets much of the attention, infrastructure bottlenecks often define system performance. This presentation shares lessons from scaling AV ML pipelines using Kubernetes-native tools. It will cover orchestration with Dagster, distributed execution via Ray, and dynamic GPU scaling with Kueue and KubeRay. From cloud-based fleet learning to edge-deployed perception, the presentation will explore how to balance performance, cost and developer velocity. If you’re building or maintaining AV ML systems, this session offers practical strategies to move fast without compromising safety or scalability.

Infrastructure-first strategies for scaling ML systems in autonomous vehicles – beyond just model compression
How to design resilient, cost-efficient ML pipelines using Kubernetes-native tools like Dagster, Ray, Kueue and KubeRay
Techniques to handle bursty inference, real-time perception and multistage workloads in both edge and cloud deployments
How to optimize GPU utilization without over-provisioning, through dynamic orchestration and auto-scaling
Lessons from real-world AV systems that balance performance, developer velocity and system reliability

Speakers

Yashovardhan Chaturvedi, machine learning engineer II - Torc Robotics

Autonomous Vehicle Technology Expo San Jose - Conference
