arXiv2
"au:"Mostofa Patwary"" — arXiv2 Search
Showing 1–7 of 7 results
Apr 14, 2026
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Nov 6, 2025
NVIDIA Nemotron Nano V2 VL
Oct 27, 2025
Multi-Agent Evolve: LLM Self-Improve through Co-evolution
Dec 3, 2024
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
Feb 29, 2024
StarCoder 2 and The Stack v2: The Next Generation
Sep 25, 2019
DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
Sep 17, 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism