"au:"Roberta Raileanu"" — arXiv2 SearchShowing 1–7 of 7 results
Date / Name
Feb 6, 2026 / AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents
Nov 17, 2025 / Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Jul 3, 2025 / AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench
Jun 27, 2025 / The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
Feb 20, 2025 / MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Jul 31, 2024 / The Llama 3 Herd of Models
Feb 13, 2024 / GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements