"au:"Amar Budhiraja"" — arXiv2 Search

Showing 1–6 of 6 results

Date / Name

Feb 12, 2026  Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments
Feb 6, 2026  AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents
Nov 19, 2025  What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity
Nov 17, 2025  Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Sep 21, 2025  ARE: Scaling Up Agent Environments and Evaluations
Feb 20, 2025  MLGym: A New Framework and Benchmark for Advancing AI Research Agents