Nemotron 3: NVIDIA's new open-source AI model explained | Jensen Huang and Lex Fridman
Jensen Huang explains NVIDIA's vision for open-source AI, discussing the release of Nemotron 3, a 120-billion-parameter model that combines transformer and SSM architectures. He outlines three key reasons for NVIDIA's open-source strategy: co-design research, democratizing AI access, and enabling AI development across diverse industries and modalities.
Summary
In this conversation, Jensen Huang discusses NVIDIA's approach to open-source AI, highlighting the release of Nemotron 3, a 120-billion-parameter model available through platforms like Perplexity. Huang presents three reasons for NVIDIA's commitment to open-source AI development.
First, building AI models helps NVIDIA understand how AI is evolving, which is crucial for its role as an AI computing company. He notes that Nemotron 3 is not a pure transformer model but combines transformer and SSM architectures, and he points to NVIDIA's early research in conditional GANs and progressive GANs that led to diffusion models. This research gives the company visibility into what computing systems future AI models will need, as part of its "extreme co-design" strategy.
Second, Huang acknowledges the balance between proprietary world-class models as products and the need for AI to diffuse into every industry, country, and research institution. He argues that if everything remains proprietary, researchers and innovators cannot easily build on existing work, which makes open source fundamentally necessary for widespread AI adoption.
Third, he recognizes that AI extends far beyond language and will likely incorporate tools, models, and sub-agents trained on diverse modalities including biology, chemistry, physics, and thermodynamics. NVIDIA's role is to ensure that specialized AI applications in weather prediction, biology, physical AI, and other domains can be pushed to their limits. While NVIDIA does not build cars or discover drugs itself, it wants car companies and pharmaceutical companies like Lilly to have access to the best AI systems for their specific needs.
Huang concludes by noting NVIDIA's comprehensive approach to open sourcing, which covers not just the models and weights but also the data and methodology used to create them.
Key Insights
- Jensen Huang explains that Nemotron 3 is not a pure transformer model but combines transformer and SSM architectures, building on NVIDIA's early research in conditional GANs and progressive GANs
- Huang argues that building AI models is essential for NVIDIA as an AI computing company because it gives them visibility into what kind of computing systems will be needed for future models as part of their extreme co-design strategy
- Jensen Huang states that while NVIDIA wants world-class proprietary models as products, they also recognize that if everything is proprietary, it becomes hard for researchers to innovate and build upon existing work
- Huang emphasizes that NVIDIA has the scale, skills, and motivation to continue building AI models indefinitely, which gives them the capability to activate every industry, researcher, and country to join the AI revolution
- Jensen Huang argues that AI extends beyond language and will likely use tools and models trained on diverse modalities like biology, chemistry, and physics, requiring specialized AI development for different domains like weather prediction and drug discovery