Showing 1–16 of 16 results
Date / Name
Feb 26, 2024 / Nemotron-4 15B Technical Report
Nov 20, 2025 / Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs
Apr 15, 2025 / Minitron-SSM: Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
Apr 4, 2025 / Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models
Nov 20, 2024 / Hymba: A Hybrid-head Architecture for Small Language Models
Jun 17, 2024 / Nemotron-4 340B Technical Report
Apr 14, 2026 / Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Apr 26, 2025 / When2Call: When (not) to Call Tools
Nov 6, 2025 / NVIDIA Nemotron Nano V2 VL
May 2, 2025 / Llama-Nemotron: Efficient Reasoning Models
Jan 27, 2026 / Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery
Jun 4, 2025 / Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
Aug 21, 2024 / LLM Pruning and Distillation in Practice: The Minitron Approach
Aug 20, 2025 / NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Dec 24, 2025 / NVIDIA Nemotron 3: Efficient and Open Intelligence
Dec 23, 2025 / Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning