Showing 21–40 of 80 results
Date | Name
Mar 16, 2022 | Multi-Stage Prompting for Knowledgeable Dialogue Generation
Oct 25, 2022 | Evaluating Parameter Efficient Learning for Generation
Feb 9, 2023 | Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Apr 13, 2023 | Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
Oct 11, 2023 | InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Jul 2, 2024 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
Jul 19, 2024 | ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Feb 26, 2024 | Nemotron-4 15B Technical Report
Dec 3, 2024 | Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
Nov 4, 2024 | MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Apr 8, 2025 | From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models
Jan 26, 2026 | LatentMoE: Toward Optimal Accuracy per FLOP and Parameter in Mixture of Experts
Aug 20, 2025 | Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset
Nov 9, 2022 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Jun 17, 2024 | Nemotron-4 340B Technical Report
Apr 14, 2026 | Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Jul 9, 2024 | Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models
Aug 15, 2023 | RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models
Jun 12, 2024 | An Empirical Study of Mamba-based Language Models
May 22, 2025 | AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning