Build More Accurate and Efficient AI Agents with the New NVIDIA Llama Nemotron Super v1.5

July 26, 2025
5 min read
Udi Karpas

Discover NVIDIA Llama Nemotron Super v1.5, delivering top accuracy and efficiency for reasoning and agentic AI tasks.

The NVIDIA Nemotron family builds on the strongest open models in the ecosystem by enhancing them with greater accuracy, efficiency, and transparency using NVIDIA open synthetic datasets, advanced techniques, and tools. Today, we’re introducing NVIDIA Llama Nemotron Super v1.5, which brings significant improvements across core reasoning and agentic tasks like math, science, coding, function calling, instruction following, and chat, while maintaining strong throughput and compute efficiency.

Built for reasoning and agentic workloads

Llama Nemotron Super v1.5 builds on the same efficient reasoning foundation as Llama Nemotron Ultra. However, the model has been refined through post-training using a new dataset focused specifically on high-signal reasoning tasks. Across a wide range of benchmarks, Llama Nemotron Super v1.5 outperforms other open models in its weight class, particularly in tasks that require multi-step reasoning and structured tool use.
Figure 1. Llama Nemotron Super v1.5 delivers the highest accuracy for reasoning and agentic tasks.
To boost throughput and deployment efficiency, compression techniques such as neural architecture search were applied. Higher throughput means the model can reason faster and explore more complex problem spaces within the same compute and time budget, delivering stronger reasoning at lower inference cost. The model also fits on a single GPU, further reducing compute overhead.
Figure 2. Llama Nemotron Super v1.5 provides the highest accuracy and throughput for agentic tasks, lowering the cost of inference.

Try the model now

Experience Llama Nemotron Super v1.5 at build.nvidia.com, or download the model directly from Hugging Face.
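As a minimal sketch of what a request to the hosted model might look like: build.nvidia.com exposes models through an OpenAI-compatible chat-completions endpoint, so a client only needs to assemble a standard payload. The model identifier, the `detailed thinking on/off` system-prompt toggle for reasoning mode, and the sampling parameters below are assumptions for illustration, not confirmed by this article; check the model card on build.nvidia.com or Hugging Face for the exact values.

```python
import json

# Assumed OpenAI-compatible endpoint for NVIDIA-hosted models.
ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"


def build_request(prompt: str, reasoning: bool = True) -> dict:
    """Assemble a chat-completions payload for Llama Nemotron Super v1.5.

    The model ID and the 'detailed thinking on/off' system toggle are
    assumed conventions for enabling Nemotron's reasoning mode; verify
    them against the official model card before use.
    """
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    return {
        "model": "nvidia/llama-3.3-nemotron-super-49b-v1.5",  # assumed ID
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.6,   # illustrative sampling settings
        "max_tokens": 1024,
    }


# Inspect the payload that would be POSTed (with an API key) to ENDPOINT.
payload = build_request("Write a Python function that merges two sorted lists.")
print(json.dumps(payload, indent=2))
```

Sending this payload requires an API key from build.nvidia.com in an `Authorization: Bearer` header; the same payload shape works with any OpenAI-compatible client pointed at the endpoint above.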
