Following news of Etched’s $500 million funding round, AI chipmaker Cerebras Systems is in talks to raise nearly $10 billion in fresh capital as it looks to compete with NVIDIA. The round would reportedly value the company at around $22 billion.
The company previously raised $1.1 billion in a Series G round at an $8.1 billion valuation.
A wafer-sized bet on inference
Andrew Feldman and Gary Lauterbach, Cerebras’ founders, bring years of experience in high-performance computing and chip design, having previously worked together on SeaMicro before its acquisition by AMD.
At the heart of Cerebras’ strategy is a radical departure from how most AI chips are built. Instead of stitching together many smaller processors, the company uses a wafer-scale engine that places an entire AI model on one massive chip. This design reduces data movement, increases memory bandwidth, and cuts energy waste, factors that matter most during inference, when models are deployed at scale.
Industry analysis suggests this approach delivers higher performance density than conventional GPUs or other AI accelerators in large language model inference. Cerebras has focused squarely on this segment, where demand is growing rapidly and cost efficiency is becoming just as important as raw power.
Performance claims that reshape the cost equation
Cerebras’ latest CS-3 system, powered by its WSE-3 chip, has drawn attention with its public benchmark results. In inference tasks such as running Llama 3 70B, the system is reported to outperform NVIDIA’s Blackwell B200-based setups by a wide margin while consuming less power overall. For customers running large models continuously, such gains would translate into lower total costs.
The company doesn’t just sell hardware. It also offers remote AI computing services, with customers including Meta Platforms Inc., IBM, and Mistral AI. This hybrid model allows enterprises to access high-performance inference without building their own infrastructure, widening Cerebras’ reach beyond data centre buyers.
Momentum from demand
AI inference workloads are growing at a blistering pace, roughly doubling every six months, and the surge has drawn attention from every major player. Google, for example, is promoting its latest TPUs as inference-first accelerators, while reports suggest OpenAI is turning to cloud-based TPUs to rein in operating costs.
Under CEO Andrew Feldman, Cerebras is positioning itself as a focused challenger rather than a generalist. While NVIDIA still leads in training, general-purpose computing, and ecosystem depth, inference is becoming a battleground of its own. Whether Cerebras’ wafer-scale vision can sustain its momentum will shape how competitive and how costly the AI inference market becomes in the years ahead.