Computer Vision in AVs: Tesla FSD vs Waymo vs Mobileye (2026)

TL;DR: Self-driving cars run the same underlying computer-vision technology as autonomous drones (YOLO-class detectors, vision transformers, NVIDIA edge-AI chips) in a different operating context. The major divide is philosophical: Tesla uses a vision-only approach (cameras only, no LiDAR, end-to-end neural network); Waymo uses sensor fusion (29 cameras + multiple LiDARs + radar + centimeter-accurate HD maps). Mobileye, Wayve, Comma.ai, Cruise (now shut down), Aurora, and Zoox occupy positions between those poles. Which approach actually reaches Level 4–5 autonomy at scale is still shaping the market.
Why read: The vision-only vs sensor-fusion debate is the single most important open question in applied AI. Whoever wins reshapes a $10+ trillion category.
Best for: Anyone tracking Tesla, Waymo, or the broader autonomous-vehicle industry; AI engineers connecting CV research to operational deployment; investors evaluating the AV space.
Skip if: You only care about hobbyist drones.

The autonomous-vehicle industry is one of the largest deployments of computer vision in the physical world. Every Tesla running Full Self-Driving, every Waymo One robotaxi in Phoenix, San Francisco, Los Angeles, and Austin, and every car built on a Mobileye EyeQ chip runs the same family of CV models we covered in our drone computer-vision post. The model architectures and the edge-compute hardware overlap heavily. The operating context, the safety stakes, and the sensor philosophies do not.

Here’s how the major AV approaches actually work in 2026.

What are the SAE levels of autonomy?

SAE Level | What the human does | What the car does | 2026 examples
Level 0 | Drives the whole time | Warnings only | Older cars without ADAS
Level 1 | Drives, supervises one assist | One automated function at a time (adaptive cruise OR lane-keep) | Most pre-2020 cars with adaptive cruise control
Level 2 | Drives, supervises multiple assists | Multiple automated functions simultaneously (adaptive cruise + lane-keep) | Tesla Autopilot, Tesla FSD (Supervised), GM Super Cruise, Ford BlueCruise
Level 3 | Hands off in defined conditions; takes over when prompted | Drives in defined ODD (operational design domain), requests handover when limits reached | Mercedes Drive Pilot (highway, <40 mph, Germany / California / Nevada)
Level 4 | Nothing in the ODD | Drives without human in defined ODD; pulls over safely outside it | Waymo One (geofenced robotaxi in Phoenix, SF, LA, Austin)
Level 5 | Nothing, ever | Drives anywhere a human could | Does not yet exist

The levels are defined by SAE International standard J3016. The marketing language companies use for their products doesn’t always match the SAE level — Tesla’s “Full Self-Driving (Supervised)” is SAE Level 2 despite the name; Waymo’s service in Phoenix is genuine SAE Level 4 within its geofence; Mercedes Drive Pilot is the only consumer-buyable SAE Level 3 system in the US as of 2026.
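
One way to keep the levels straight is to encode the table as data. The sketch below is illustrative only: the enum and helper names are made up for this post, and the product-to-level mapping simply repeats the examples given above.

```python
from enum import IntEnum


class SAELevel(IntEnum):
    """SAE J3016 driving-automation levels, as summarized in the table above."""
    L0 = 0
    L1 = 1
    L2 = 2
    L3 = 3
    L4 = 4
    L5 = 5


def human_must_supervise(level: SAELevel) -> bool:
    """At Levels 0-2 the human is driving and must watch the road at all times."""
    return level <= SAELevel.L2


def human_is_fallback(level: SAELevel) -> bool:
    """Through Level 3 a human must be ready to take over when prompted."""
    return level <= SAELevel.L3


# Illustrative mapping, taken from the examples named in this post.
PRODUCT_LEVELS = {
    "Tesla FSD (Supervised)": SAELevel.L2,
    "Mercedes Drive Pilot": SAELevel.L3,
    "Waymo One": SAELevel.L4,
}

for product, level in PRODUCT_LEVELS.items():
    print(product,
          "| must supervise:", human_must_supervise(level),
          "| human is fallback:", human_is_fallback(level))
```

The useful boundary is between Level 2 and Level 3: below it the human is always responsible for watching the road, whatever the product is called.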

Tesla’s vision-only approach

Tesla’s bet, articulated repeatedly by CEO Elon Musk: humans drive with two eyes, so AI should be able to drive with cameras. Tesla’s vehicles use eight cameras as the primary perception sensors, no LiDAR, and (since the “Tesla Vision” transition completed in 2023) no radar. Beyond the cameras, the suite is inertial measurement plus GPS (for navigation only, not perception); ultrasonic sensors, present on older vehicles, were phased out of new builds starting in late 2022.

  • Hardware: “Hardware 4” (HW4) / “AI4” computers in current Tesla vehicles. The custom AI inference chip (Tesla’s own design, built on a Samsung 7nm process) reportedly delivers ~250 TOPS of AI compute. TOPS (trillions of operations per second) is the standard unit for how much AI math a chip can do; more TOPS allows larger, more capable models to run in the vehicle.
  • FSD v13: Released December 2024. First version that uses native camera resolution end-to-end. Available only on HW4-equipped vehicles. Updates have continued through 2025–2026.
  • Neural-network approach: Tesla replaced approximately 300,000 lines of hand-written C++ rules with a single end-to-end neural network that takes camera frames in and produces driving controls out. The training data is the behavior of 6+ million Tesla drivers.
  • Operational status: SAE Level 2 (driver supervised). Tesla has stated intent to move toward Level 4–5 but has not yet shipped a system that meets those standards.
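
To make the “rules vs. learned network” contrast concrete, here is a deliberately tiny behavior-cloning sketch. It is not Tesla’s method, whose training details are not public: it assumes a single made-up input (lane offset in metres) instead of camera frames, a one-parameter linear policy instead of a neural network, and invented demonstration data. The point is only the shape of the idea: the rule-based policy has thresholds an engineer chose, while the learned policy gets its behavior from recorded human driving.

```python
def hand_written_policy(lane_offset_m):
    """Rule-based style: an engineer picks the thresholds and outputs by hand."""
    if lane_offset_m > 0.2:
        return -0.1   # drifted right, steer left
    if lane_offset_m < -0.2:
        return 0.1    # drifted left, steer right
    return 0.0


def fit_steering_gain(demos):
    """Least-squares fit of steering = gain * lane_offset (no intercept term)."""
    numerator = sum(offset * steer for offset, steer in demos)
    denominator = sum(offset * offset for offset, _ in demos)
    return numerator / denominator


# Invented demonstrations: (lane offset in metres, human steering command).
demos = [(-1.0, 0.5), (-0.5, 0.25), (0.0, 0.0), (0.5, -0.25), (1.0, -0.5)]
gain = fit_steering_gain(demos)


def learned_policy(lane_offset_m):
    """Imitates the demonstrations; no thresholds were written by hand."""
    return gain * lane_offset_m


print("fitted gain:", gain)
print("rule-based at 0.4 m:", hand_written_policy(0.4))
print("learned at 0.4 m:", learned_policy(0.4))
```

Scaling this idea from one number to raw camera frames, and from five demonstrations to fleet-scale driving data, is where the real engineering difficulty lies.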

The Tesla argument: HD maps and LiDAR don’t scale. You can’t map every road in the world precisely enough, and LiDAR is expensive and degrades. If a vision-only approach works, it works everywhere. The Tesla risk: vision-only has not yet been shown to reach the safety margin required for Level 4 deployment.

Waymo’s multi-sensor fusion approach

Waymo (originally Google’s self-driving project, spun out as a subsidiary of Alphabet) takes the opposite philosophy: use every sensor available, pre-map the operational area precisely, and only operate where the system has been validated to safety standards.

  • Sensor suite: 29 cameras + multiple LiDARs (creating precise 3D point clouds at multiple ranges) + 6 radar units + centimeter-accurate HD maps + GPS + IMU.
  • Sensor fusion: The system matches LiDAR measurements to corresponding camera pixels, locating each object in three-dimensional space with high precision. Redundant sensing means the failure of any single sensor doesn’t blind the system.
  • HD maps: Every road Waymo operates on has been pre-mapped at centimeter accuracy — not just lane positions, but signs, signals, road geometry, drivable surfaces, drop-off zones.
  • Geofenced deployment: Waymo One robotaxi service operates only in mapped areas: Phoenix, San Francisco, Los Angeles, Austin (as of 2026) with expansion to additional cities ongoing.
  • Operational status: Genuine SAE Level 4 within the geofence. Customer-facing rides run with no safety driver (2024 onward).
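
The “match LiDAR measurements to camera pixels” step in the sensor-fusion bullet can be sketched with a pinhole camera model. This is a minimal illustration, not Waymo’s pipeline: it assumes the LiDAR points have already been transformed into the camera’s coordinate frame (x right, y down, z forward, in metres), ignores lens distortion and timing, and uses made-up intrinsics and points.

```python
import statistics


def project_lidar_point(point_xyz, focal_px, cx, cy):
    """Project a 3D point in the camera frame to pixel coordinates (u, v).

    Returns None for points at or behind the image plane.
    """
    x, y, z = point_xyz
    if z <= 0:
        return None
    u = focal_px * x / z + cx
    v = focal_px * y / z + cy
    return (u, v)


def depth_for_box(points, box, focal_px, cx, cy):
    """Median depth of the LiDAR points that project inside a 2D detection box."""
    u_min, v_min, u_max, v_max = box
    depths = []
    for point in points:
        pixel = project_lidar_point(point, focal_px, cx, cy)
        if pixel is None:
            continue
        u, v = pixel
        if u_min <= u <= u_max and v_min <= v <= v_max:
            depths.append(point[2])
    if not depths:
        return None
    return statistics.median(depths)


# Made-up intrinsics for a 1280x720 image and a handful of made-up points.
FOCAL_PX, CX, CY = 1000.0, 640.0, 360.0
lidar_points = [
    (0.0, 0.0, 20.0),   # lands at the image centre
    (0.5, 0.1, 20.0),   # same object, slightly right and below
    (0.2, 0.0, 21.0),   # same object, a little farther away
    (5.0, 0.0, 10.0),   # a different object off to the right
    (0.0, 0.0, -5.0),   # behind the camera, ignored
]
camera_box = (600.0, 300.0, 700.0, 420.0)  # 2D box from a camera detector

print(depth_for_box(lidar_points, camera_box, FOCAL_PX, CX, CY))
```

Using the median rather than the mean is a design choice here: it keeps a stray background point inside the box from dragging the depth estimate away from the object.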

The Waymo argument: full-stack autonomy is too hard for vision alone; redundant sensors plus pre-mapping is the only path to verifiable safety. The Waymo risk: the geofence-and-pre-map model requires huge per-city deployment cost and scales slowly.
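
At its simplest, a geofence is a question of whether a position falls inside a mapped polygon. The sketch below uses the standard ray-casting point-in-polygon test. It is an assumption-heavy illustration: coordinates are treated as flat local x/y values rather than latitude/longitude, the service area is an invented square, and a production system would also consult the HD map itself rather than a single outline.

```python
def inside_geofence(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a vertex list?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical 10 km x 10 km service area in local planar coordinates.
service_area = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

print(inside_geofence((5.0, 5.0), service_area))    # pickup inside the area
print(inside_geofence((15.0, 5.0), service_area))   # pickup outside the area
```

The per-city cost mentioned above comes from everything behind that polygon: the mapping, validation, and operations needed before an area can be drawn at all.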

How do Tesla’s and Waymo’s technologies compare?

Dimension | Tesla FSD | Waymo Driver
Cameras | 8 cameras (vision-only) | 29 cameras (multi-spectrum)
LiDAR | None | Multiple LiDAR units
Radar | None (removed in 2021–2023) | 6 radar units
HD maps | None; uses commercial nav maps only | Centimeter-accurate pre-mapping required
AI compute onboard | HW4 / AI4 (~250 TOPS Tesla custom) | NVIDIA DRIVE-class compute + custom silicon
SAE Level | Level 2 (Supervised) | Level 4 (within geofence)
Deployment footprint | ~6M+ vehicles worldwide | ~700 robotaxis in 4 US metros
Scaling model | Software update to existing fleet | City-by-city mapping + fleet deployment
Backing | Tesla (NASDAQ: TSLA) | Alphabet (Waymo subsidiary)

What is Mobileye?

Mobileye is an Israeli AV-technology company majority-owned by Intel (with a 2022 partial IPO). It supplies its EyeQ system-on-chip and software stack to dozens of automakers including BMW, Audi, Volkswagen, Ford, GM, Honda, Hyundai, and Nissan. Mobileye is the largest supplier of ADAS (Advanced Driver Assistance Systems) chips in the world.

  • Approach: Vision-first (uses cameras as primary) but with optional LiDAR and radar fusion for higher-autonomy products. More flexible than Tesla’s vision-only or Waymo’s full-fusion approaches.
  • EyeQ6 / EyeQ7: Current and next-generation chips supplied to OEMs. EyeQ6 delivers ~34 TOPS; EyeQ7 reportedly higher.
  • Roadmap: Mobileye Drive (Level 4 robotaxi platform), Mobileye SuperVision (consumer Level 2+ to Level 3), Mobileye Chauffeur (consumer Level 4).
  • Industry position: Powers Level 2 ADAS in roughly 100 million vehicles on the road today. Whether it can scale to consumer Level 4 in a way Tesla or Waymo haven’t will define its next decade.

What about Wayve, Comma.ai, Cruise, Zoox, Aurora?

Company | Approach | 2026 status
Wayve (UK) | End-to-end vision-based learning, similar in spirit to Tesla but originated independently in research labs. Backed by Microsoft, Nvidia, SoftBank. | Active; testing in London and other cities; partnering with major automakers.
Comma.ai (US) | Open-source / retrofit; sells the openpilot software and the comma 3X aftermarket device. Vision-only. | Active and growing community; supports 250+ vehicle models.
Cruise (GM subsidiary) | Multi-sensor fusion + pre-mapping, similar to Waymo. SF and Phoenix robotaxi operations. | Operations halted in late 2023 after a pedestrian-injury incident. GM reorganized the program multiple times and stopped funding the robotaxi effort in December 2024; no customer-facing rides as of 2026.
Zoox (Amazon subsidiary) | Custom-designed robotaxi vehicle (no steering wheel) with full sensor suite + pre-mapping. | Public testing in SF and Las Vegas; commercial launch ongoing.
Aurora Innovation (NASDAQ: AUR) | Trucking-first AV focus; LiDAR + radar + cameras + pre-mapping. | Commercial trucking deployments active on Texas routes; publicly traded.
Pony.ai, WeRide, Baidu Apollo (China) | Multi-sensor approaches; massive Chinese-market deployments. | Robotaxi operations in multiple Chinese cities; rapid scaling.

The honest read in 2026: Waymo is in the operational lead on Level 4 robotaxi, Tesla is in the lead on consumer-vehicle deployment scale (Level 2 supervised), Mobileye is the lead ADAS supplier to the rest of the industry, and the Chinese players are scaling fast in their domestic market. Cruise’s setback in 2023 reset the competitive landscape; the field has consolidated around fewer well-funded players than in 2020.

What edge compute runs autonomous driving?

Platform | AI compute (TOPS) | Used by
NVIDIA DRIVE Thor | 1,000+ | Next-gen AV platforms; many OEM design wins
NVIDIA DRIVE Orin | 254 | Current-gen AV platforms (Mercedes, Volvo, Polestar, others)
Tesla HW4 / AI4 | ~250 | Tesla vehicles since 2023
Tesla HW3 | ~144 | Tesla vehicles 2019–2023
Mobileye EyeQ6 High | 34 | Many ADAS deployments
Qualcomm Snapdragon Ride | Up to 700+ | BMW, GM, others
Custom in-house ASICs | Varies | Waymo, Cruise (historical), Zoox

TOPS means trillions of operations per second, a measure of how much AI computation a chip can run; more TOPS allows larger, more capable models in the car.

The trend is clear: AV compute has gone from tens of TOPS in 2018 to hundreds of TOPS in 2022 to 1,000+ TOPS in current designs. The compute now available in a car is comparable to what a datacenter GPU offered about five years ago.
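
To get a feel for what those TOPS figures buy, it helps to divide them by the number of camera frames the chip has to process each second. The arithmetic below is a rough sketch: the 8-camera count comes from this post, but the 30 frames-per-second rate is an assumed, illustrative value, and quoted TOPS are peak numbers that real workloads do not fully reach.

```python
def giga_ops_per_frame(tops, cameras, fps):
    """Peak operations available per camera frame, in billions of operations."""
    total_ops_per_second = tops * 1e12
    frames_per_second = cameras * fps
    return total_ops_per_second / frames_per_second / 1e9


ASSUMED_FPS = 30  # illustrative only, not a published figure

for name, tops in [("Tesla HW3", 144), ("Tesla HW4 / AI4", 250)]:
    budget = giga_ops_per_frame(tops, cameras=8, fps=ASSUMED_FPS)
    print(f"{name}: about {budget:.0f} billion ops per frame at peak")
```

Under these assumptions, moving from ~144 to ~250 TOPS raises the peak per-frame budget from about 600 to about 1,042 billion operations, which is the headroom that lets a larger model run on every frame.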

What benchmarks does AV CV research use?

Benchmark | Source | What it measures
KITTI | Karlsruhe Institute of Technology + Toyota Technological Institute | 2D and 3D object detection, depth, optical flow, odometry; the foundational AV benchmark
nuScenes | Motional (originally nuTonomy, later part of Aptiv) | Full multi-sensor benchmark with 1,000 scenes from Boston and Singapore
Waymo Open Dataset | Waymo | Largest open AV dataset; multi-city; multi-sensor
Cityscapes | Cityscapes Consortium | Urban semantic segmentation
Argoverse 2 | Argo AI (shut down in 2022; formerly backed by Ford and VW) | Detection, tracking, motion forecasting
Lyft Level 5 | Lyft (now part of Woven Planet) | Motion forecasting on Lyft AV data

For comparison with drone benchmarks see our Computer Vision in Modern Drones post. The benchmarks overlap less than you might expect — aerial perspective and ground-vehicle perspective produce genuinely different problems, even when the underlying neural-network architecture is identical.
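
Detection benchmarks like these typically score a prediction by how much its box overlaps the ground-truth box, measured as intersection over union (IoU). The sketch below shows the 2D case with boxes given as (x1, y1, x2, y2); the example boxes are made up, and each benchmark defines its own thresholds and 3D variants (KITTI’s car category, for instance, counts a detection as correct at 0.7 overlap).

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 10.0, 10.0)
prediction = (5.0, 0.0, 15.0, 10.0)   # shifted half a box to the right

score = iou(ground_truth, prediction)
print(f"IoU = {score:.3f}, counts as a match at 0.7: {score >= 0.7}")
```

A prediction that looks “half right” to the eye scores only one third here, which is part of why headline benchmark numbers are harder to earn than they sound.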

Why is this so much harder than drone CV?

  • Latency is critical. A car at 60 mph travels 88 feet per second. A perception-to-control loop that’s 100 ms too slow is the difference between “avoided the pedestrian” and “hit the pedestrian.” Drone autonomy has more time-budget per decision.
  • The safety bar is much higher. Drone crashes typically damage property and occasionally injure bystanders. AV crashes kill people. The acceptable error rate is several orders of magnitude lower.
  • Edge cases are endless. Children in costumes at Halloween. Couches on the highway. Reflective puddles. Construction signs upside-down. AV systems have to handle every weird-but-real situation, not just the common ones.
  • Sensor diversity is required. The safety bar is high enough that single-sensor failures cannot blind the system. This is why most non-Tesla AVs use camera + LiDAR + radar in combination.
  • The legal exposure is enormous. When an AV crash kills someone, liability flows to the manufacturer. Companies have to be defensible in court, not just in benchmarks.
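
The latency point is simple arithmetic, and it is worth running the numbers. The sketch below converts speed and delay into distance travelled; the function name is made up for this post, and the only inputs are the 60 mph and 100 ms figures used above.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def distance_during_latency_ft(speed_mph, latency_ms):
    """Feet travelled while the perception-to-control loop is still computing."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * latency_ms / 1000.0


for latency_ms in (50, 100, 200):
    travelled = distance_during_latency_ft(60, latency_ms)
    print(f"60 mph, {latency_ms} ms late: {travelled:.1f} ft further down the road")
```

At 60 mph, 100 ms of extra delay means the car is 8.8 feet further along before it reacts, roughly half a car length.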

Where this connects back to drones

The transfer between drone CV and AV CV is bidirectional:

  • Detection models and the backbones behind them (YOLO, RF-DETR, ViT, ResNet) move freely between the two domains.
  • Edge-AI chips — NVIDIA Jetson Orin for drones, NVIDIA DRIVE Orin for cars — share substantial architecture and tooling.
  • Sensor-fusion algorithms developed for cars (camera + LiDAR fusion) are increasingly applied to higher-end industrial drones.
  • End-to-end neural-network approaches (Tesla’s FSD v12+ replacing 300k lines of rules with one network) have direct analogues in drone-autonomy research at Anduril, Shield AI, and elsewhere.
  • Sim-to-real transfer (training in simulation, deploying in the physical world) matured in AV and robotics research and now powers drone autonomy too.

Watch one industry, you’re watching both. See our Computer Vision in Modern Drones post for the drone-side deep dive on the same underlying technology.

FAQ

Is Tesla’s FSD “Full Self-Driving” actually self-driving?

No, at least not at the SAE level the name suggests. Tesla’s product is officially labeled “Full Self-Driving (Supervised)” and is classified as SAE Level 2 — the driver must supervise at all times and remain ready to take over. The marketing name has been the subject of regulatory action by the California DMV and others.

Is Waymo profitable?

Not yet at the company level — Waymo continues to invest in expansion and the per-ride economics are still maturing. Per-city operations are reaching positive contribution margin in mature markets like Phoenix. Alphabet (Waymo’s parent) reports Waymo financials within the “Other Bets” segment of its 10-K filings.

Will Tesla’s vision-only approach work?

This is the single most debated question in applied AI. Vision alone is evidently sufficient for human-level driving, since humans drive mostly by sight; whether neural networks trained on driving footage can match or exceed the human safety bar is genuinely open. Tesla’s FSD v13 has improved substantially over FSD v11 / v12. Whether it can reach Level 4 without sensor diversity remains to be seen.

What happened to Cruise?

Cruise (GM’s AV subsidiary) had a pedestrian-injury incident in San Francisco in October 2023 in which a Cruise vehicle dragged a pedestrian who had been struck by a different vehicle. The California DMV suspended Cruise’s permits; the company halted operations nationwide. GM has subsequently restructured the program multiple times and substantially scaled back ambitions. As of 2026 Cruise is no longer operating customer-facing rides.

What are Mobileye SuperVision and Chauffeur?

SuperVision is Mobileye’s Level 2+ to Level 3 consumer product (hands-off-eyes-on capability in defined conditions). Chauffeur is the Level 4 consumer product (hands-off-eyes-off in defined conditions). Both are designed for OEM customers (BMW, Audi, others) to integrate into their vehicles.

Are autonomous trucks here yet?

They are starting to arrive. Aurora Innovation (NASDAQ: AUR) launched commercial driverless trucking operations on Texas routes in 2025 and has continued to expand. Kodiak Robotics, Plus, and Daimler-backed Torc are also active in the AV trucking category. The economics of autonomous freight (predictable routes, valuable cargo, a shortage of paid drivers) make it potentially easier to deploy than passenger AVs.

Can I read the original FSD or Waymo research papers?

Tesla publishes relatively little research publicly — their approach is mostly behind closed doors. Waymo has published a substantial research catalog at waymo.com/research. The Waymo Open Dataset (free to download) and associated challenges are the closest thing to an open look at industrial AV CV. arXiv (arxiv.org/list/cs.CV) hosts a large body of academic AV research from Waymo, Aurora, Wayve, and university labs.

The bottom line

Computer vision in autonomous vehicles uses the same underlying technology as computer vision in drones — same model families, same edge-AI chips, same training pipelines — but it operates under harder constraints (latency, safety bar, edge-case diversity, legal exposure) that have produced different philosophical bets on what the right approach is.

Tesla bets on a vision-only, end-to-end neural network at fleet scale. Waymo bets on full sensor fusion plus pre-mapping at geofenced scale. Mobileye bets on flexible, vision-first ADAS for the rest of the industry. Wayve, Comma.ai, and the Chinese players occupy different points on the same spectrum. Whoever reaches Level 4–5 at scale first reshapes the global $10+ trillion vehicle industry.

For broader context: Computer Vision in Modern Drones (the drone-side companion to this post), Every AI Model Worth Knowing in 2026, What Is a Large Language Model?, Anduril Industries Explained (defense parallels to the same technology).
