
May 12, 2026, 8:26 a.m. ET | ⏱️10–12 minutes
By Ethan Carter
Over the past few years, a clear trend has emerged across the automotive industry: more carmakers are designing their own autonomous driving chips.
Tesla’s Full Self-Driving chip has already reached its fifth generation, with the latest AI5 platform reportedly entering mass production in mid-2025. Chinese automakers have accelerated similar efforts. NIO introduced its 5nm Shenji NX9031, Xpeng unveiled its Turing chip, and Li Auto announced the Mach M100. Companies including BYD, Geely, and Momenta have also been linked to custom silicon projects through public reporting and supply-chain disclosures.
At first glance, this may look like a straightforward attempt to reduce dependence on suppliers such as NVIDIA. But the deeper reason is technological rather than purely commercial.
The AI models used in autonomous driving are changing rapidly. Traditional perception architectures are gradually giving way to Transformer-based systems, diffusion models, and world models. As these model architectures evolve, the assumptions behind older chip designs are starting to break down. The industry is discovering that peak computing power alone no longer guarantees real-world performance.
Why Carmakers Want Their Own Chips
Developing an automotive-grade chip is not simply about lowering component costs.
In reality, building a custom chip is a long-term strategic decision. Industry estimates suggest that the full cycle — from defining requirements to deployment in production vehicles — can take three to six years, sometimes longer. That means chip architects must make decisions today based on what autonomous driving workloads may look like years into the future.
If that prediction is wrong, the chip risks becoming outdated before it reaches meaningful scale.
The Shift From Hardware Difficulty to Software Difficulty
Advanced chip development remains extremely expensive. Public semiconductor reports indicate that tape-out costs at advanced nodes such as 5nm can reach tens of millions of dollars before factoring in engineering resources, software development, and IP licensing fees.
Yet despite these costs, more automakers continue moving into custom silicon.
One reason is that the barrier to hardware design itself has gradually fallen. The semiconductor ecosystem now offers:
· mature IP blocks,
· better EDA automation tools,
· foundry support services,
· and specialized design firms.
As a result, the hardest problem is no longer necessarily physical chip implementation.
Instead, the real competitive challenge increasingly lies in:
· software toolchains,
· compilers,
· model optimization,
· long-term AI adaptability,
· and ecosystem integration.
These are precisely the areas where a general-purpose chip supplier may struggle to perfectly match the needs of a specific automaker.
For many companies, owning the chip roadmap also strengthens their technology branding and gives them tighter control over future AI deployment strategies.

Why TOPS Alone No Longer Explains Performance
For years, the autonomous driving industry relied heavily on a single number: TOPS, or trillions of operations per second.
Higher TOPS was often treated as a shorthand for a stronger autonomous driving system.
That assumption is becoming less reliable.
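As a rough illustration of where a headline TOPS figure comes from, the sketch below multiplies the number of multiply-accumulate (MAC) units by the clock rate and counts each MAC as two operations. The function name and the example figures (16,384 MAC units at 1.5 GHz) are assumptions chosen for illustration, not the specifications of any real chip.

```python
def peak_tops(mac_units: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Theoretical peak throughput in trillions of operations per second."""
    # Each MAC unit is assumed to complete one multiply and one add per cycle.
    return mac_units * clock_hz * ops_per_mac / 1e12


# Hypothetical accelerator: 16,384 MAC units clocked at 1.5 GHz.
example = peak_tops(mac_units=16_384, clock_hz=1.5e9)
print(f"{example:.1f} TOPS")
```

The result is a ceiling: it assumes every MAC unit has data to work on in every cycle, which says nothing about how often that actually happens.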
The Three Main AI Model Directions
Today’s autonomous driving models are splitting into several architectural paths, each with very different computing characteristics.
1. Modular End-to-End Systems
This remains the most common production approach today.
Perception, prediction, and planning are still separated into structured stages with relatively clear outputs. These systems are computationally manageable and easier to validate for safety certification.
2. Vision-Language-Action (VLA) Models
These systems combine:
· visual understanding,
· language reasoning,
· and action generation.
Many VLA systems use Mixture-of-Experts (MoE) architectures and can scale into tens of billions of parameters.
Their advantage is improved contextual understanding. The downside is dramatically higher demands on:
· memory bandwidth,
· memory capacity,
· and scheduling efficiency.
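To give a sense of scale, here is a back-of-the-envelope sketch of the memory demands of an MoE model. It assumes that all expert weights must be resident in memory, that only the active experts' weights are read per generated token, and that nothing is reused from on-chip cache. The parameter counts, INT8 precision, and token rate are hypothetical values, not figures from any production system.

```python
def moe_memory_estimate(total_params: float, active_params: float,
                        bytes_per_param: float, tokens_per_second: float):
    """Return (capacity in GB, weight-read bandwidth in GB/s) for an MoE model."""
    # Capacity: every expert must be stored, even if rarely selected.
    capacity_gb = total_params * bytes_per_param / 1e9
    # Bandwidth: only the experts chosen for each token are read from memory.
    bandwidth_gb_s = active_params * bytes_per_param * tokens_per_second / 1e9
    return capacity_gb, bandwidth_gb_s


# Hypothetical model: 20B total parameters, 4B active per token, INT8, 10 tokens/s.
capacity, bandwidth = moe_memory_estimate(20e9, 4e9, 1, 10)
print(f"capacity: {capacity:.0f} GB, weight traffic: {bandwidth:.0f} GB/s")
```

Under these assumptions the capacity requirement follows the total parameter count while the bandwidth requirement follows the active parameter count, so the two have to be sized separately.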
3. World Models + Diffusion Systems
World models attempt to simulate how the physical world evolves over time.
Rather than only recognizing current surroundings, they try to predict how scenes may change based on future actions.
This approach has generated significant excitement in research circles because it may eventually enable more generalized driving intelligence. However, large-scale commercial deployment remains limited.
The important point is this:
Different model architectures stress completely different parts of the hardware stack.
As a result, two chips with similar TOPS numbers may behave very differently in practice.

The Hidden Bottleneck: Memory Bandwidth
One of the biggest shifts in AI accelerator design is the growing importance of memory systems.
In many next-generation workloads, memory bandwidth matters as much as — or even more than — raw compute.
Why High TOPS Can Still Underperform
Traditional CNN-based workloads consisted mostly of dense, regular convolutions that map cleanly onto matrix multiplication hardware. In that environment, adding more compute units often translated directly into better performance.
Modern autonomous driving models behave differently.
Diffusion-based architectures and Transformer systems involve:
· irregular memory access,
· dynamic workloads,
· sparse computation,
· temporal reasoning,
· and frequent data movement.
This creates a major engineering problem.
A chip may advertise extremely high theoretical TOPS, but if memory bandwidth cannot feed data into the compute units fast enough, much of that compute sits idle.
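This relationship is often summarized with a roofline-style estimate, in which usable throughput is the lower of the compute peak and what the memory system can feed. The sketch below is a simplified version of that idea. The two chips, their bandwidth figures, and the operations-per-byte values are hypothetical.

```python
def attainable_tops(peak_tops: float, bandwidth_gb_s: float, ops_per_byte: float) -> float:
    """Roofline estimate: throughput is capped by compute or by memory, whichever is lower."""
    # GB/s * ops/byte gives giga-ops/s; divide by 1e3 to express it in tera-ops/s.
    memory_bound_tops = bandwidth_gb_s * ops_per_byte / 1e3
    return min(peak_tops, memory_bound_tops)


# Two hypothetical chips with the same advertised peak but different memory bandwidth.
chips = {"chip_a": (500.0, 200.0), "chip_b": (500.0, 400.0)}

for ops_per_byte in (50, 500, 5000):
    for name, (peak, bandwidth) in chips.items():
        usable = attainable_tops(peak, bandwidth, ops_per_byte)
        idle = 1 - usable / peak
        print(f"{name} @ {ops_per_byte} ops/byte: {usable:.0f} TOPS usable, {idle:.0%} idle")
```

With these assumed numbers, a workload that performs only 50 operations per byte leaves most of either chip idle, and the chip with twice the bandwidth delivers twice the usable throughput despite the identical peak figure.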
Industry engineers increasingly focus on factors such as:
· hierarchical memory design,
· task scheduling,
· cache efficiency,
· vector processing flexibility,
· and interconnect latency.
These characteristics often determine actual usable performance far more than marketing specifications.
Why Diffusion Models Are Especially Difficult
Diffusion-based inference creates additional complications.
Like most neural networks, these models need reasonably large batch sizes to keep matrix multiplication hardware fully utilized, and they also run many sequential denoising steps for each output. But autonomous driving systems operate under strict real-time latency constraints, so batch sizes are usually very small.
The result is lower compute utilization.
In practice, a chip’s effective throughput may fall far below its advertised theoretical peak.
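One way to see the effect of small batches is to count operations per byte of memory traffic for a single fully connected layer. The sketch below assumes INT8 data, a hypothetical 4096 x 4096 layer, and weights that are read from memory once per batch with no reuse across calls.

```python
def linear_layer_intensity(batch: int, d_in: int, d_out: int,
                           bytes_per_element: int = 1) -> float:
    """Operations per byte moved for Y = X @ W, with X of shape (batch, d_in)."""
    ops = 2 * batch * d_in * d_out  # one multiply and one add per weight per sample
    # Traffic: read the inputs and the weights, write the outputs.
    elements_moved = batch * d_in + d_in * d_out + batch * d_out
    return ops / (elements_moved * bytes_per_element)


for batch in (1, 8, 64):
    print(f"batch {batch:>2}: {linear_layer_intensity(batch, 4096, 4096):.1f} ops/byte")
```

At batch size 1 the layer performs roughly two operations for every byte it moves, because each weight is fetched and used once. Larger batches reuse each fetched weight across samples, which is the reuse that latency limits take away.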
This is one reason why some automakers are exploring custom architectures optimized specifically for future AI model behavior rather than current benchmark scores.

Three Competing AI Chip Design Philosophies
As workloads become more diverse, chipmakers are increasingly splitting into three broad architectural approaches.
1. The “Big Core” Strategy
This approach prioritizes maximum compute efficiency.
Large matrix multiplication arrays deliver excellent performance for dense, structured workloads and can achieve very strong energy efficiency under ideal conditions.
However, these architectures are often rigid.
If workloads become sparse or dynamic, utilization can collapse quickly. Extracting peak performance usually requires highly sophisticated compiler optimization and large software teams.
2. The “Small Core” Strategy
This design philosophy focuses on flexibility.
Instead of relying on a few giant compute arrays, it distributes work across many smaller processing elements.
Advantages include:
· better low-batch utilization,
· improved support for sparse workloads,
· and more flexible scheduling.
The downside is cost.
Small-core designs generally consume more silicon area and may become significantly more expensive at equivalent nominal compute levels.
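The utilization gap between large and small compute arrays can be sketched with a simple padding model: an output matrix is split into square tiles matching the array size, and any partially filled tile still occupies the whole array. This is a deliberately simplified assumption, since real accelerators use a variety of dataflows, and the matrix shape and array sizes below are hypothetical.

```python
import math


def tile_utilization(rows: int, cols: int, tile: int) -> float:
    """Fraction of array slots doing useful work when a rows x cols output
    is padded up to whole tile x tile blocks."""
    padded_rows = math.ceil(rows / tile) * tile
    padded_cols = math.ceil(cols / tile) * tile
    return (rows * cols) / (padded_rows * padded_cols)


# Hypothetical small, irregular output (40 x 200) on three array sizes.
for tile in (128, 32, 16):
    print(f"{tile:>3} x {tile:<3} array: {tile_utilization(40, 200, tile):.1%} utilized")
```

In this model the 128 x 128 array spends about three quarters of its slots on padding, while the 16 x 16 array wastes about a fifth. The model does not capture the extra silicon area and control logic that many small arrays require.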
3. The “Medium Core” Balance
A growing number of companies appear to favor a middle-ground approach.
These chips combine:
· medium-sized matrix units,
· scalar processing elements,
· and more adaptive scheduling systems.
Rather than optimizing for one extreme metric, they aim for a balance between:
· efficiency,
· programmability,
· flexibility,
· and ecosystem compatibility.
This balanced approach may be especially attractive for automotive applications, where power consumption, thermal limits, reliability, and cost all matter simultaneously.
The Industry Is Shifting From Peak Performance to Real Efficiency
The autonomous driving chip market is becoming structurally more diverse.
Established global suppliers still hold major advantages through mature ecosystems, software stacks, and developer tools. Meanwhile, Chinese semiconductor firms continue expanding in automotive AI through aggressive cost-performance positioning and domestic supply-chain integration.
At the same time, automakers themselves are gradually becoming chip developers.
The competitive focus is no longer simply about who can claim the highest TOPS number.
The real competition increasingly centers on:
· real-world compute efficiency,
· memory architecture,
· software-hardware co-optimization,
· thermal efficiency,
· and total system cost.
This shift may ultimately benefit consumers.
If automakers and chip vendors can improve practical efficiency rather than merely scaling raw compute, advanced driver-assistance systems could become more affordable and accessible across a wider range of vehicles.
Conclusion
Autonomous driving chips are entering a new phase.
The earlier era focused heavily on peak compute numbers and headline specifications. But as AI models become more complex and heterogeneous, the industry is discovering that efficiency, flexibility, and software integration matter just as much as raw silicon power.
The next winners may not necessarily be the companies with the largest TOPS figures.
Instead, success will likely depend on who can best predict the future direction of autonomous driving AI — and design chips that remain useful when those future models finally arrive.
About the Author
Ethan Carter focuses on AI chips, semiconductor technology, and computing infrastructure. His work covers GPUs, AI accelerators, edge AI processors, and the hardware systems that power modern artificial intelligence. He writes analytical articles that connect technical developments with industry trends and practical applications.
Editor’s Note
This article is based on publicly available industry reports, company disclosures, semiconductor engineering discussions, and market analysis available as of May 2026. Some technical directions discussed — especially world models and diffusion-based autonomous driving systems — remain under active research and may evolve significantly over time.
Disclaimer
This article is intended for informational and educational purposes only and does not constitute investment advice, engineering certification, or commercial recommendations. Technical trends, performance estimates, and market projections discussed here are based on publicly available information and industry analysis that may change over time. Autonomous driving technologies remain subject to regulatory, engineering, and safety uncertainties.