TL;DR: Self-driving cars run the same underlying computer-vision technology as autonomous drones (YOLO-class detectors, vision transformers, NVIDIA edge-AI chips) in a different operating context. The major divide is philosophical: Tesla uses a vision-only approach (cameras only, no LiDAR, end-to-end neural network); Waymo uses sensor fusion (29 cameras + multiple LiDARs + radar + centimeter-accurate HD maps). Mobileye, Wayve, Comma.ai, Cruise (now shut down), Aurora, and Zoox occupy positions between those poles. The market is still being shaped by which approach actually reaches Level 4–5 autonomy at scale.
Why read: The vision-only vs sensor-fusion debate is the single most important open question in applied AI. Whoever wins reshapes a $10+ trillion category.
Best for: Anyone tracking Tesla, Waymo, or the broader autonomous-vehicle industry; AI engineers connecting CV research to operational deployment; investors evaluating the AV space.
Skip if: You only care about hobbyist drones.
The autonomous-vehicle industry is the largest deployment of computer vision in the physical world. Every Tesla running Full Self-Driving, every Waymo One robotaxi in Phoenix, San Francisco, Los Angeles, and Austin, and every car built on Mobileye’s EyeQ chips runs the same family of CV models we covered in our drone computer-vision post. The model architectures and the edge-compute hardware overlap heavily. The operating context, the safety stakes, and the sensor philosophies do not.
Here’s how the major AV approaches actually work in 2026.
What are the SAE levels of autonomy?
| SAE Level | What the human does | What the car does | 2026 examples |
|---|---|---|---|
| Level 0 | Drives the whole time | Warnings only | Older cars without ADAS |
| Level 1 | Drives, supervises one assist | One automated function at a time (cruise OR lane-keep) | Most pre-2020 cars with cruise control |
| Level 2 | Drives, supervises multiple assists | Multiple automated functions simultaneously (cruise + lane-keep) | Tesla Autopilot, Tesla FSD (Supervised), GM Super Cruise, Ford BlueCruise |
| Level 3 | Hands off in defined conditions; takes over when prompted | Drives in defined ODD (operational design domain), requests handover when limits reached | Mercedes Drive Pilot (highway, <40 mph, Germany / California / Nevada) |
| Level 4 | Nothing in the ODD | Drives without human in defined ODD; pulls over safely outside it | Waymo One (geofenced robotaxi in Phoenix, SF, LA, Austin) |
| Level 5 | Nothing, ever | Drives anywhere a human could | Does not yet exist |
The levels are defined by SAE International standard J3016. The marketing language companies use for their products doesn’t always match the SAE level — Tesla’s “Full Self-Driving (Supervised)” is SAE Level 2 despite the name; Waymo’s service in Phoenix is genuine SAE Level 4 within its geofence; Mercedes Drive Pilot is the only consumer-buyable SAE Level 3 system in the US as of 2026.
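The operational design domain (ODD) idea behind Levels 3 and 4 can be sketched as a simple gate. The snippet below is a minimal illustration, not any vendor's actual logic: the `Odd` class, its fields, the `next_action` helper, and the example values (loosely modeled on the Drive Pilot row in the table) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Odd:
    """Hypothetical operational design domain: where a system may drive itself."""
    max_speed_mph: float
    road_types: frozenset
    regions: frozenset

    def allows(self, speed_mph: float, road_type: str, region: str) -> bool:
        # Every condition must hold; breaking any one of them exits the ODD.
        return (
            speed_mph < self.max_speed_mph
            and road_type in self.road_types
            and region in self.regions
        )


# Illustrative values only, loosely based on the Level 3 row in the table.
highway_l3 = Odd(
    max_speed_mph=40.0,
    road_types=frozenset({"highway"}),
    regions=frozenset({"Germany", "California", "Nevada"}),
)


def next_action(odd: Odd, speed_mph: float, road_type: str, region: str) -> str:
    # Inside the ODD the system drives; outside it, a Level 3 system
    # asks the human to take over.
    if odd.allows(speed_mph, road_type, region):
        return "system_drives"
    return "request_handover"


print(next_action(highway_l3, 35.0, "highway", "Nevada"))  # system_drives
print(next_action(highway_l3, 55.0, "highway", "Nevada"))  # request_handover
```

A Level 4 system would differ mainly in the fallback: instead of requesting a handover, it would bring the vehicle to a safe stop on its own.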
Tesla’s vision-only approach
Tesla’s bet, articulated repeatedly by CEO Elon Musk: humans drive with two eyes, so AI should be able to drive with cameras. Tesla’s vehicles use eight cameras for primary perception, no LiDAR, and (since the “Tesla Vision” transition completed in 2023) no radar. Ultrasonic sensors, present on older vehicles, were also phased out of new production starting in late 2022. The remaining sensor suite is cameras plus inertial measurement (and GPS for navigation only, not perception).
- Hardware: “Hardware 4” (HW4) / “AI4” computers in current Tesla vehicles. The custom AI inference chip (Tesla’s own design, built on a Samsung 7nm process) reportedly delivers ~250 TOPS of AI compute. TOPS stands for tera (trillion) operations per second, the standard unit for how much AI math a chip can do; more TOPS allows larger, more capable models to run in the vehicle.
- FSD v13: Released December 2024. First version that uses native camera resolution end-to-end. Available only on HW4-equipped vehicles. Has continued to update through 2025–2026.
- Neural-network approach: Tesla replaced approximately 300,000 lines of hand-written C++ rules with a single end-to-end neural network that takes camera frames in and produces driving controls out. The training data is the behavior of 6+ million Tesla drivers.
- Operational status: SAE Level 2 (driver supervised). Tesla has stated intent to move toward Level 4–5 but has not yet shipped a system that meets those standards.
The Tesla argument: HD maps and LiDAR don’t scale — you can’t map every road in the world precisely enough, and LiDAR is expensive and degrades. If a vision-only approach works, it works everywhere. The Tesla risk: vision-only has not yet been proven to reach the safety-margin required for Level 4 deployment.
Waymo’s multi-sensor fusion approach
Waymo (originally Google’s self-driving project, spun out as a subsidiary of Alphabet) takes the opposite philosophy: use every sensor available, pre-map the operational area precisely, and only operate where the system has been validated to safety standards.
- Sensor suite: 29 cameras + multiple LiDARs (creating precise 3D point clouds at multiple ranges) + 6 radar units + centimeter-accurate HD maps + GPS + IMU.
- Sensor fusion: The system matches LiDAR measurements to corresponding camera pixels, locating each object in three-dimensional space with high precision. Redundant sensing means the failure of any single sensor doesn’t blind the system.
- HD maps: Every road Waymo operates on has been pre-mapped at centimeter accuracy — not just lane positions, but signs, signals, road geometry, drivable surfaces, drop-off zones.
- Geofenced deployment: Waymo One robotaxi service operates only in mapped areas: Phoenix, San Francisco, Los Angeles, Austin (as of 2026) with expansion to additional cities ongoing.
- Operational status: Genuine SAE Level 4 within the geofence. Customer-facing rides run without a safety driver from 2024 onward.
The Waymo argument: full-stack autonomy is too hard for vision alone; redundant sensors plus pre-mapping is the only path to verifiable safety. The Waymo risk: the geofence-and-pre-map model requires huge per-city deployment cost and scales slowly.
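The camera-LiDAR matching described in the sensor-fusion bullet can be illustrated with a pinhole-camera projection: take a 3D LiDAR point, express it in the camera's coordinate frame, and compute which pixel it lands on. This is a minimal sketch under stated assumptions, not Waymo's implementation. The intrinsics, the mounting offset, the axis conventions, and the function name `project_lidar_point` are all hypothetical; production systems use fully calibrated rotation matrices, lens-distortion models, and time synchronization between sensors.

```python
from typing import Optional, Tuple

# Hypothetical pinhole-camera intrinsics, in pixels.
FX, FY = 1000.0, 1000.0      # focal lengths
CX, CY = 960.0, 540.0        # principal point (image center)
IMAGE_W, IMAGE_H = 1920, 1080

# Hypothetical mounting: the LiDAR sits 0.5 m directly above the camera.
# This is the LiDAR origin expressed in the camera frame
# (x right, y down, z forward), so "above" is negative y.
LIDAR_IN_CAMERA = (0.0, -0.5, 0.0)


def project_lidar_point(
    point_lidar: Tuple[float, float, float],
) -> Optional[Tuple[float, float, float]]:
    """Map a LiDAR point (x forward, y left, z up; meters) to (u, v, depth).

    Returns None when the point is behind the camera or outside the image.
    """
    x_l, y_l, z_l = point_lidar

    # Re-express the point in camera axes, then shift by the mounting offset.
    x_c = -y_l + LIDAR_IN_CAMERA[0]   # LiDAR "left" is negative camera x
    y_c = -z_l + LIDAR_IN_CAMERA[1]   # LiDAR "up" is negative camera y
    z_c = x_l + LIDAR_IN_CAMERA[2]    # LiDAR "forward" is camera depth

    if z_c <= 0:
        return None

    # Pinhole projection: scale by focal length over depth, shift to image center.
    u = FX * x_c / z_c + CX
    v = FY * y_c / z_c + CY

    if not (0 <= u < IMAGE_W and 0 <= v < IMAGE_H):
        return None
    return (u, v, z_c)


print(project_lidar_point((10.0, 0.0, -0.5)))   # (960.0, 540.0, 10.0)
print(project_lidar_point((10.0, 1.0, -0.5)))   # (860.0, 540.0, 10.0)
print(project_lidar_point((-5.0, 0.0, 0.0)))    # None
```

Once a LiDAR return is tied to a pixel, the depth measured by LiDAR can be attached to whatever the camera model detected at that pixel, which is the core of locating an object in 3D.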
How does Tesla’s and Waymo’s technology compare?
| Dimension | Tesla FSD | Waymo Driver |
|---|---|---|
| Cameras | 8 cameras (vision-only) | 29 cameras (multi-spectrum) |
| LiDAR | None | Multiple LiDAR units |
| Radar | None (removed in 2021–2023) | 6 radar units |
| HD maps | None — uses commercial nav maps only | Centimeter-accurate pre-mapping required |
| AI compute onboard | HW4 / AI4 (~250 TOPS Tesla custom) | NVIDIA DRIVE-class compute + custom silicon |
| SAE Level | Level 2 (Supervised) | Level 4 (within geofence) |
| Deployment footprint | ~6M+ vehicles worldwide | ~700 robotaxis in 4 US metros |
| Scaling model | Software update to existing fleet | City-by-city mapping + fleet deployment |
| Backing | Tesla (NASDAQ: TSLA) | Alphabet (Waymo subsidiary) |
What is Mobileye?
Mobileye is an Israeli AV-technology company majority-owned by Intel (with a 2022 partial IPO). It supplies its EyeQ system-on-chip and software stack to dozens of automakers including BMW, Audi, Volkswagen, Ford, GM, Honda, Hyundai, and Nissan. Mobileye is the largest supplier of ADAS (Advanced Driver Assistance Systems) chips in the world.
- Approach: Vision-first (uses cameras as primary) but with optional LiDAR and radar fusion for higher-autonomy products. More flexible than Tesla’s vision-only or Waymo’s full-fusion approaches.
- EyeQ6 / EyeQ7: Current and next-generation chips supplied to OEMs. EyeQ6 delivers ~34 TOPS; EyeQ7 reportedly higher.
- Roadmap: Mobileye Drive (Level 4 robotaxi platform), Mobileye SuperVision (consumer Level 2+ to Level 3), Mobileye Chauffeur (consumer Level 4).
- Industry position: Powers Level 2 ADAS in roughly 100 million vehicles on the road today. Whether it can scale to consumer Level 4 in a way Tesla or Waymo haven’t will define its next decade.
What about Wayve, Comma.ai, Cruise, Zoox, Aurora?
| Company | Approach | 2026 status |
|---|---|---|
| Wayve (UK) | End-to-end vision-based learning, similar in spirit to Tesla but originated independently in research labs. Backed by Microsoft, Nvidia, and SoftBank. | Active; testing in London and other cities; partnering with major automakers. |
| Comma.ai (US) | Open-source / retrofit; sells the openpilot software and the comma 3X aftermarket device. Vision-only. | Active and growing community; supports 250+ vehicle models. |
| Cruise (GM subsidiary) | Multi-sensor fusion + pre-mapping, similar to Waymo. SF and Phoenix robotaxi operations. | Operations halted late 2023 after a pedestrian-injury incident; substantial scale-back. GM has reorganized the program multiple times. |
| Zoox (Amazon subsidiary) | Custom-designed robotaxi vehicle (no steering wheel) with full sensor suite + pre-mapping. | Public testing in SF and Las Vegas; commercial launch ongoing. |
| Aurora Innovation (NASDAQ: AUR) | Trucking-first AV focus; LiDAR + radar + cameras + pre-mapping. | Commercial trucking deployments active on Texas routes; publicly traded. |
| Pony.ai, WeRide, Baidu Apollo (China) | Multi-sensor approaches; massive Chinese-market deployments. | Robotaxi operations in multiple Chinese cities; rapid scaling. |
The honest read in 2026: Waymo is in the operational lead on Level 4 robotaxi, Tesla is in the lead on consumer-vehicle deployment scale (Level 2 supervised), Mobileye is the lead ADAS supplier to the rest of the industry, and the Chinese players are scaling fast in their domestic market. Cruise’s setback in 2023 reset the competitive landscape; the field has consolidated around fewer well-funded players than in 2020.
What edge compute runs autonomous driving?
| Platform | AI compute (TOPS) | Used by |
|---|---|---|
| NVIDIA DRIVE Thor | 1,000+ TOPS | Next-gen AV platforms; many OEM design wins |
| NVIDIA DRIVE Orin | 254 TOPS | Current-gen AV platforms (Mercedes, Volvo, Polestar, others) |
| Tesla HW4 / AI4 | ~250 TOPS | Tesla vehicles since 2023 |
| Tesla HW3 | ~144 TOPS | Tesla vehicles 2019–2023 |
| Mobileye EyeQ6 High | 34 TOPS | Many ADAS deployments |
| Qualcomm Snapdragon Ride | Up to 700+ TOPS | BMW, GM, others |
| Custom in-house ASICs | Varies | Waymo, Cruise (historical), Zoox |
The trend is unambiguous: AV compute has gone from tens of TOPS in 2018 to hundreds of TOPS in 2022 to 1,000 or more TOPS in the newest designs. The compute now available in a car is comparable to what datacenter GPUs offered about five years ago.
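To put the table's figures side by side, the short script below expresses each quoted TOPS number as a multiple of the smallest one. It is a rough sketch only: the dictionary simply copies the numbers from the table, the helper name `relative_to` is made up for illustration, and peak TOPS from different vendors are not strictly comparable because they can assume different numeric precisions.

```python
# Peak TOPS as quoted in the table above.
# "1,000+" is entered as its lower bound; rows without a single figure are omitted.
QUOTED_TOPS = {
    "NVIDIA DRIVE Thor": 1000,
    "NVIDIA DRIVE Orin": 254,
    "Tesla HW4 / AI4": 250,
    "Tesla HW3": 144,
    "Mobileye EyeQ6 High": 34,
}


def relative_to(baseline_name: str, tops: dict) -> dict:
    """Return each platform's TOPS as a multiple of the baseline platform's."""
    baseline = tops[baseline_name]
    return {name: value / baseline for name, value in tops.items()}


ratios = relative_to("Mobileye EyeQ6 High", QUOTED_TOPS)
# Print from largest to smallest multiple.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.1f}x")
```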
What benchmarks does AV CV research use?
| Benchmark | Source | What it measures |
|---|---|---|
| KITTI | Karlsruhe Institute of Technology + Toyota Technological Institute | 2D and 3D object detection, depth, optical flow, odometry — the foundational AV benchmark |
| nuScenes | Motional (formerly nuTonomy; a Hyundai-Aptiv joint venture) | Full multi-sensor benchmark with 1,000 scenes from Boston and Singapore |
| Waymo Open Dataset | Waymo | Largest open AV dataset; multi-city; multi-sensor |
| Cityscapes | Cityscapes Consortium | Urban semantic segmentation |
| Argoverse 2 | Argo AI (shut down in 2022; formerly backed by Ford and VW) | Detection, tracking, motion forecasting |
| Lyft Level 5 | Lyft (now part of Woven Planet) | Motion forecasting on Lyft AV data |
For comparison with drone benchmarks see our Computer Vision in Modern Drones post. The benchmarks overlap less than you might expect — aerial perspective and ground-vehicle perspective produce genuinely different problems, even when the underlying neural-network architecture is identical.
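Detection benchmarks such as KITTI score a predicted bounding box by how much it overlaps the ground-truth box, measured as intersection over union (IoU). The sketch below covers the simplest case, axis-aligned 2D boxes. The function name, the `(x_min, y_min, x_max, y_max)` box format, and the 0.7 threshold in the usage lines are illustrative choices; each benchmark defines its own thresholds and matching rules.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes produce no intersection area.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 10.0, 10.0)
prediction = (5.0, 0.0, 15.0, 10.0)

score = iou(ground_truth, prediction)
print(round(score, 3))   # 0.333
print(score >= 0.7)      # False: this prediction would not count as a match
```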
Why is this so much harder than drone CV?
- Latency is critical. A car at 60 mph travels 88 feet per second. A perception-to-control loop that’s 100 ms too slow is the difference between “avoided the pedestrian” and “hit the pedestrian.” Drone autonomy has more time-budget per decision.
- The safety bar is much higher. Drone crashes typically damage property and occasionally injure bystanders. AV crashes kill people. The acceptable error rate is several orders of magnitude lower.
- Edge cases are endless. Children in costumes at Halloween. Couches on the highway. Reflective puddles. Construction signs upside-down. AV systems have to handle every weird-but-real situation, not just the common ones.
- Sensor diversity is required. The safety bar is high enough that single-sensor failures cannot blind the system. This is why most non-Tesla AVs use camera + LiDAR + radar in combination.
- The legal exposure is enormous. When an AV crash kills someone, liability flows to the manufacturer. Companies have to be defensible in court, not just in benchmarks.
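The latency point in the list above is simple arithmetic, worked through here. The function names are illustrative; the only inputs are the unit conversions (5,280 feet per mile, 3,600 seconds per hour) and the 60 mph and 100 ms figures from the text.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def feet_per_second(speed_mph: float) -> float:
    """Convert a speed in miles per hour to feet per second."""
    return speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR


def distance_during_latency_ft(speed_mph: float, latency_ms: float) -> float:
    """Distance the vehicle covers while the perception-to-control loop is still working."""
    return feet_per_second(speed_mph) * latency_ms / 1000.0


print(feet_per_second(60))                   # 88.0
print(distance_during_latency_ft(60, 100))   # 8.8
```

At 60 mph, every extra 100 ms of delay costs about 8.8 feet of travel before the car begins to react.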
Where this connects back to drones
The transfer between drone CV and AV CV is bidirectional:
- Vision model families (YOLO and RF-DETR detectors, ViT and ResNet backbones) move freely between the two domains.
- Edge-AI chips — NVIDIA Jetson Orin for drones, NVIDIA DRIVE Orin for cars — share substantial architecture and tooling.
- Sensor-fusion algorithms developed for cars (camera + LiDAR fusion) are increasingly applied to higher-end industrial drones.
- End-to-end neural-network approaches (Tesla’s FSD v12+ replacing 300k lines of rules with one network) have direct analogues in drone-autonomy research at Anduril, Shield AI, and elsewhere.
- Sim-to-real transfer (training in simulation, deploying in the physical world) matured in AV and robotics research and now powers drone autonomy too.
Watch one industry, you’re watching both. See our Computer Vision in Modern Drones post for the drone-side deep dive on the same underlying technology.
FAQ
Is Tesla’s FSD “Full Self-Driving” actually self-driving?
No, at least not at the SAE level the name suggests. Tesla’s product is officially labeled “Full Self-Driving (Supervised)” and is classified as SAE Level 2 — the driver must supervise at all times and remain ready to take over. The marketing name has been the subject of regulatory action by the California DMV and others.
Is Waymo profitable?
Not yet at the company level — Waymo continues to invest in expansion and the per-ride economics are still maturing. Per-city operations are reaching positive contribution margin in mature markets like Phoenix. Alphabet (Waymo’s parent) reports Waymo financials within the “Other Bets” segment of its 10-K filings.
Will Tesla’s vision-only approach work?
This is the single most debated question in applied AI. Human drivers show that driving primarily by vision is possible; whether neural networks trained on driving footage can match the human safety bar is genuinely open. Tesla’s FSD v13 has improved substantially from FSD v11 / v12. Whether it can reach Level 4 without sensor diversity remains to be seen.
What happened to Cruise?
Cruise (GM’s AV subsidiary) had a pedestrian-injury incident in San Francisco in October 2023 in which a Cruise vehicle dragged a pedestrian who had been struck by a different vehicle. The California DMV suspended Cruise’s permits; the company halted operations nationwide. GM has subsequently restructured the program multiple times and substantially scaled back ambitions. As of 2026 Cruise is no longer operating customer-facing rides.
What are Mobileye SuperVision and Chauffeur?
SuperVision is Mobileye’s Level 2+ to Level 3 consumer product (hands-off-eyes-on capability in defined conditions). Chauffeur is the Level 4 consumer product (hands-off-eyes-off in defined conditions). Both are designed for OEM customers (BMW, Audi, others) to integrate into their vehicles.
Are autonomous trucks here yet?
Just beginning, yes. Aurora Innovation (NASDAQ: AUR) launched commercial driverless trucking operations on Texas routes in 2024–2025 with continued expansion. Kodiak Robotics, Plus, and Daimler-backed Torc are also active in the AV trucking category. The economics of autonomous freight (predictable routes, valuable cargo, a shortage of drivers) make it potentially easier to deploy than passenger AVs.
Can I read the original FSD or Waymo research papers?
Tesla publishes relatively little research publicly — their approach is mostly behind closed doors. Waymo has published a substantial research catalog at waymo.com/research. The Waymo Open Dataset (free to download) and associated challenges are the closest thing to an open look at industrial AV CV. arXiv (arxiv.org/list/cs.CV) hosts a large body of academic AV research from Waymo, Aurora, Wayve, and university labs.
The bottom line
Computer vision in autonomous vehicles uses the same underlying technology as computer vision in drones — same model families, same edge-AI chips, same training pipelines — but it operates under harder constraints (latency, safety bar, edge-case diversity, legal exposure) that have produced different philosophical bets on what the right approach is.
Tesla bets vision-only end-to-end neural network at fleet scale. Waymo bets full sensor fusion plus pre-mapping at geofenced scale. Mobileye bets flexible vision-first ADAS for the rest of the industry. Wayve and Comma.ai and the Chinese players occupy different points on the same spectrum. Whoever reaches Level 4–5 at scale first reshapes the global $10+ trillion vehicle industry.
For broader context: Computer Vision in Modern Drones (the drone-side companion to this post), Every AI Model Worth Knowing in 2026, What Is a Large Language Model?, Anduril Industries Explained (defense parallels to the same technology).
Sources
- SAE International, SAE J3016 Levels of Driving Automation — the primary industry-standard definition of Levels 0–5.
- Tesla, Full Self-Driving (Supervised) Support page — primary reference for FSD versions, hardware requirements, and capability descriptions.
- Waymo, Waymo Research — primary source for Waymo’s technology papers, sensor approach, and operational data.
- Waymo Open Dataset, waymo.com/open — the largest open AV dataset, free to download.
- NHTSA, Automated Driving Systems — primary US federal regulatory reference.
- NVIDIA, DRIVE platform overview — primary reference for DRIVE Thor and DRIVE Orin AI compute platforms.
- Mobileye Investor Relations, ir.mobileye.com — primary source for Mobileye product line, EyeQ chip specifications, and OEM partnerships.
- Wayve, wayve.ai — UK end-to-end AV research company; published papers on neural-network-based driving.
- Comma.ai openpilot, github.com/commaai/openpilot — open-source self-driving software; primary reference for the retrofit / open approach.
- Aurora Innovation SEC filings (AUR) — aurora.tech/investor-relations — primary reference for the trucking-AV approach.
- nuScenes, nuscenes.org — the multi-sensor AV benchmark.
- KITTI Vision Benchmark Suite, cvlibs.net/datasets/kitti — foundational AV benchmark.
- Argoverse 2, argoverse.org/av2 — modern AV motion-forecasting benchmark.
- California DMV, Autonomous Vehicles program — primary source for California testing permits and disengagement reports.
- arXiv computer-vision section, arxiv.org/list/cs.CV — primary source for autonomous-driving research.