This Nvidia Challenger Says Its AI Chip Is 10x Faster Than A GPU

CNBC

d-Matrix, a California-based chip startup, has announced that its Corsair AI inference chip is now in full production, claiming it is 10x faster at token generation and 3x cheaper than standalone GPUs. The chip uses on-chip SRAM instead of high-bandwidth memory to bypass the memory bottleneck that the company says limits GPU performance. The company has raised $275 million, with Microsoft among its investors, and has partnered with Arista, Broadcom, and Supermicro.

Summary

d-Matrix, founded in 2019 by Sid Sheth, has entered full production with its Corsair AI inference chip, manufactured at TSMC in Taiwan on a six-nanometer node. The chip is designed specifically for AI inference workloads and is being positioned as a direct challenger to Nvidia's GPU dominance in that segment. Corsair is a card-based product housing four chips that can be plugged into a standard server, making it relatively easy to integrate into existing data center infrastructure.

The core technical differentiation of the Corsair chip lies in its memory architecture. Rather than relying on external high-bandwidth DRAM, which is in short supply from manufacturers like SK Hynix, Samsung, and Micron, d-Matrix integrates SRAM directly onto the chip. The company argues that this removes the memory bottleneck that plagues GPUs and other accelerators, enabling faster and more energy-efficient data access. d-Matrix claims its solution produces tokens ten times faster than GPUs alone, costs one-third as much, and is up to five times more energy efficient, making it particularly well-suited for inference tasks like chatbots, video generation, and agentic AI.
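To make the memory-bottleneck argument concrete, here is a minimal back-of-the-envelope sketch. It assumes a simplified model in which generating each token requires reading every model weight from memory once, so single-stream token rate is capped at memory bandwidth divided by model size. The function name, the model size, and both bandwidth figures are hypothetical placeholders chosen for easy arithmetic; they are not d-Matrix, Nvidia, or any vendor's specifications, and the result is not a measurement.

```python
def decode_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream token rate under the simplifying assumption
    that each generated token reads all model weights from memory exactly once."""
    if model_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("model size and bandwidth must be positive")
    return bandwidth_bytes_per_s / model_bytes


# Hypothetical, illustrative values only (not vendor specifications).
MODEL_BYTES = 8e9 * 1       # assumed: 8 billion weights stored at 1 byte each
SLOWER_MEMORY_BW = 2e12     # assumed: 2 TB/s placeholder for off-chip memory
FASTER_MEMORY_BW = 20e12    # assumed: 20 TB/s placeholder for on-chip memory

slow = decode_tokens_per_second(MODEL_BYTES, SLOWER_MEMORY_BW)
fast = decode_tokens_per_second(MODEL_BYTES, FASTER_MEMORY_BW)
print(f"{slow:.0f} vs {fast:.0f} tokens/s (ratio {fast / slow:.1f}x)")
```

Under this simplified model the token rate scales linearly with memory bandwidth, which is the intuition behind placing memory on the chip. Real systems also depend on batching, caching, compute limits, and how much of the model fits in on-chip memory, none of which this sketch captures.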

d-Matrix's approach mirrors that of two other memory-first chip companies that have seen significant success: Cerebras, which went public in May at a market capitalization of nearly $100 billion, and Groq, whose assets were acquired by Nvidia for $20 billion. Nvidia subsequently unveiled a new line of chips based on Groq's technology, called Language Processing Units (LPUs), at its GTC conference in March. These moves are presented as validation of the SRAM-on-chip approach that d-Matrix is also pursuing.

On the business side, d-Matrix has not yet disclosed specific customers but says it has commitments from hyperscalers, neoclouds, and leading AI labs. The company raised $275 million in November at a $2 billion valuation, with Microsoft investing through its M12 venture arm. d-Matrix has also formed a partnership with Arista, Broadcom, and Supermicro to deliver a full rack-scale deployment system for AI data centers. The company is also exploring the possibility of selling chips into China, noting that its inference-only design does not enable model training, which could help it navigate U.S. export control restrictions.

Key Insights

  • d-Matrix's CEO Sid Sheth claims Corsair solves a memory bottleneck that GPUs and Amazon's Trainium chips cannot, because it does not rely on DRAM and instead uses SRAM integrated directly onto the chip to accelerate memory access.
  • d-Matrix claims its Corsair chip, when paired with GPUs in a server rack, produces tokens ten times faster than GPUs alone, while also being three times cheaper and up to five times more energy efficient for inference workloads.
  • Nvidia acquired Groq's assets for $20 billion — its largest acquisition to date — and then unveiled a new line of Language Processing Units (LPUs) based on Groq's SRAM-on-chip technology at GTC in March, validating the memory-first inference chip approach.
  • d-Matrix's CEO suggests the company could eventually ship chips to China, arguing that because Corsair is inference-only and cannot be used to train AI models, it may be positioned to navigate U.S. export control restrictions.
  • Microsoft invested in d-Matrix through its M12 venture arm as part of a $275 million funding round that valued the company at $2 billion, which is notable given Microsoft's parallel development of its own Maia inference chips and other in-house silicon.

Topics

  • d-Matrix Corsair AI inference chip
  • SRAM vs. high-bandwidth DRAM memory architecture
  • Nvidia competition and the inference chip market
  • AI chip funding and industry partnerships
  • China market opportunity for inference chips
