Showing 21–40 of 80 results
Date | Name
Mar 16, 2022 | Multi-Stage Prompting for Knowledgeable Dialogue Generation
Oct 25, 2022 | Evaluating Parameter Efficient Learning for Generation
Feb 9, 2023 | Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Apr 13, 2023 | Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
Oct 11, 2023 | InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Jul 2, 2024 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
Jul 19, 2024 | ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Feb 26, 2024 | Nemotron-4 15B Technical Report
Dec 3, 2024 | Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
Nov 4, 2024 | MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Apr 8, 2025 | From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models
Jan 26, 2026 | LatentMoE: Toward Optimal Accuracy per FLOP and Parameter in Mixture of Experts
Aug 20, 2025 | Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset
Nov 9, 2022 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Jun 17, 2024 | Nemotron-4 340B Technical Report
Apr 14, 2026 | Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Jul 9, 2024 | Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models
Aug 15, 2023 | RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models
Jun 12, 2024 | An Empirical Study of Mamba-based Language Models
May 22, 2025 | AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning