After $230M raise, Positron becomes unicorn to target Nvidia’s Rubin in inference race

Reno-based Positron AI has raised $230 million in a Series B round, valuing the energy-efficient AI inference hardware company at more than $1 billion.

The oversubscribed round was co-led by ARENA Private Wealth, Jump Trading, and Unless, with strategic backing from Qatar Investment Authority, Arm, and Helena. Existing investors, including Valor Equity Partners, Atreides Management, DFJ Growth, and 1517, also joined the round.

The new funding will accelerate development of Asimov, Positron’s next-generation custom silicon, with tape-out planned for late 2026 and production expected in early 2027. Asimov is designed to support up to two terabytes of memory per accelerator and significantly larger memory capacity at the system and rack level.

Positron expects strong revenue growth in 2026 and says it is on track to become one of the fastest-growing silicon companies, reaching large-scale commercial traction within about 2.5 years of launch.

Tackling AI’s growing energy problem

Led by Mitesh Agrawal, Positron AI is focused on one of the biggest challenges facing AI today: the rising cost and power demand of running large models.

While most attention has gone to training AI systems, inference (the process of running models in real-world applications) is quickly becoming a major bottleneck for energy use and infrastructure.

“We’re grateful for this investor enthusiasm, which itself is a reflection of what the market is demanding,” said Mitesh Agrawal, CEO of Positron AI.

“Energy availability has emerged as a key bottleneck for AI deployment. And our next-generation chip will deliver 5x more tokens per watt in our core workloads versus Nvidia’s upcoming Rubin GPU. Memory is the other giant bottleneck in inference, and our next-generation Asimov custom silicon will ship with over 2304 GB of RAM per device next year, versus just 384 GB for Rubin.”

From Atlas systems to Asimov silicon

Positron is building the infrastructure layer that makes AI usable at scale by lowering the cost and power required to run modern models.

The company is already shipping its current product, dubbed Atlas, an inference system designed for fast deployment and scaling. The company says Atlas is fully fabricated and manufactured in the US, allowing customers to ramp capacity quickly with a dependable supply chain.

“Memory bandwidth and capacity are two of the key limiters for scaling AI inference workloads for next-generation models,” said Dylan Patel, founder and CEO of SemiAnalysis. “Positron is taking a unique approach to the memory scaling problem, and with its next-generation Asimov chip, can deliver more than an order of magnitude greater high-speed memory capacity per chip than incumbent or upstart silicon providers.”

Positron’s roadmap includes not just Asimov but also Titan, its next-generation system designed for memory-intensive AI workloads such as long-context language models, video, and agent-based systems.

“To us, development speed is an essential competitive advantage,” said Agrawal. “Competing with Nvidia means matching their shipping frequency, and we have designed our organisation around that goal.”

Positron is building this platform with an ecosystem of industry leaders, including Arm, Supermicro and other key technology and supply-chain partners.

“As AI inference scales, efficiency and system design matter more than raw benchmarks,” said Eddie Ramirez, Vice President of Go-to-Market, Cloud AI Business Unit at Arm. “Positron’s memory-centric approach, built on Arm technology, reflects how tightly coupled systems and a broad ecosystem come together to deliver scalable, performance-per-watt gains in next-generation AI infrastructure.”

“Positron is solving one of the most important bottlenecks in AI: delivering inference at scale within real-world power and cost constraints,” said Ari Schottenstein. “The combination of shipping traction today with Atlas, plus a credible path to Asimov, creates a rare opportunity to define a new category in AI infrastructure.”

“For the workloads we care about, the bottlenecks are increasingly memory and power—not theoretical compute,” said Alex Davies, Chief Technology Officer of Jump Trading. “In our testing, Positron Atlas delivered roughly 3x lower end-to-end latency than a comparable H100-based system on the inference workloads we evaluated, in an air-cooled, production-ready footprint with a supply chain we can plan around.”
