At GTC, NVIDIA launched its openly available Llama Nemotron family of reasoning models, designed for developers and enterprises building advanced AI agents. The new models aim to provide a business-ready foundation for AI systems that can work independently or collaborate with other AI agents on complex tasks. As the name suggests, Llama Nemotron is built on Meta's Llama models.

To create Llama Nemotron, the base Llama models were enhanced through NVIDIA's post-training process, which targets multistep math, coding, reasoning, and decision-making tasks. NVIDIA reports that after post-training, the Llama Nemotron family delivers up to 20% higher accuracy than the base models and 5x faster inference than other leading open reasoning models. This optimization lets the models handle more complex reasoning tasks while reducing operational costs.

The Llama Nemotron family is available in three sizes to accommodate different deployment scenarios:

  • Nano: Highest accuracy for PCs and edge devices
  • Super: Best balance of accuracy and throughput on a single GPU
  • Ultra: Maximum agentic accuracy on multi-GPU servers

Major tech players are already integrating these models into their platforms. Microsoft is adding them to Azure AI Foundry, SAP is enhancing its Business AI solutions and Joule copilot, and ServiceNow is using them to build more accurate AI agents. Other collaborators include Accenture, Amdocs, Atlassian, Box, Cadence, CrowdStrike, Deloitte, and IQVIA.

To support enterprise adoption, NVIDIA is also releasing new agentic AI tools as part of its AI Enterprise software platform, including the AI-Q Blueprint for connecting knowledge to AI agents, the AI Data Platform for enterprise infrastructure, new NIM microservices for inference optimization, and NeMo microservices for continuous learning.

The Nano and Super models are available now through NVIDIA's developer website and Hugging Face, with free access for development, testing, and research for NVIDIA Developer Program members. Enterprise production use is supported through NVIDIA AI Enterprise on accelerated infrastructure.
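For developers who want to experiment before moving to NIM microservices or NVIDIA AI Enterprise, a minimal sketch of pulling the Nano model from Hugging Face with the transformers library might look like the following. The model ID and the "detailed thinking" system prompt are assumptions drawn from the checkpoint's public listing, not details confirmed in this announcement, so check the model card before relying on them.

```python
# Minimal sketch: loading a Llama Nemotron Nano checkpoint from Hugging Face.
# The model ID below is an assumption based on the public listing; adjust it
# to whichever Llama Nemotron checkpoint you are licensed to use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Nemotron-Nano-8B-v1"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The Nemotron model cards describe a system prompt that toggles step-by-step
# reasoning ("detailed thinking on"); treat this as an assumption and confirm
# the exact wording against the card.
messages = [
    {"role": "system", "content": "detailed thinking on"},
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For production workloads, the announcement points to deployment through NVIDIA AI Enterprise and the new NIM microservices rather than loading checkpoints directly.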