"au:"Sherry Shi"" — arXiv2 Search
Showing 1–8 of 8 results
Oct 3, 2025
Abstain and Validate: A Dual-LLM Policy for Reducing Noise in Agentic Program Repair
Nov 14, 2025
Towards a Human-in-the-Loop Framework for Reliable Patch Evaluation Using an LLM-as-a-Judge
Jan 27, 2026
Dynamic Cogeneration of Bug Reproduction Test in Agentic Program Repair
Dec 29, 2025
From Correctness to Collaboration: Toward a Human-Centered Framework for Evaluating AI Agent Behavior in Software Engineering
May 13, 2024
MetaReflection: Learning Instructions for Language Agents using Past Reflections
Jun 9, 2022
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
May 2, 2023
From Words to Code: Harnessing Data for Program Synthesis from Natural Language
Dec 24, 2022
Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text