arXiv2
"au:"Harsh Trivedi"" — arXiv2 Search
Showing 1–5 of 5 results
Jan 17, 2026
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
Dec 15, 2025
Olmo 3
Oct 5, 2022
Decomposed Prompting: A Modular Approach for Solving Complex Tasks
May 2, 2020
Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning
Apr 20, 2019
Repurposing Entailment for Multi-Hop Question Answering Tasks