"au:"Niklas Muennighoff"" — arXiv2 Search

Showing 21–40 of 65 results

Date  Name

Feb 15, 2024  Generative Representational Instruction Tuning
Nov 7, 2024  Scaling Laws for Precision
Feb 18, 2024  KMMLU: Measuring Massive Multitask Language Understanding in Korean
Jul 1, 2024  RegMix: Data Mixture as Regression for Language Model Pre-training
Mar 31, 2025  A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?
Aug 25, 2025  UQ: Assessing Language Models on Unsolved Questions
Oct 11, 2025  HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks
Dec 6, 2021  NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Feb 17, 2026  MAEB: Massive Audio Embedding Benchmark
Nov 9, 2022  BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
May 23, 2024  Lessons from the Trenches on Reproducible Evaluation of Language Models
Oct 27, 2022  What Language Model to Train if You Have One Million GPU Hours?
Jun 17, 2024  DataComp-LM: In search of the next generation of training sets for language models
Jul 23, 2024  OpenHands: An Open Platform for AI Software Developers as Generalist Agents
Jun 4, 2024  The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
Jun 22, 2024  BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Oct 24, 2025  ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
Mar 25, 2026  Composer 2 Technical Report
Jan 24, 2025  Humanity's Last Exam
Oct 4, 2024  SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
