Date          Name
Apr 27, 2022  Generating Examples From CLI Usage: Can Transformers Help?
May 23, 2023  Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?
Mar 8, 2022   Learning to Reduce False Positives in Analytic Bug Detectors
Jul 11, 2025  SetupBench: Assessing Software Engineering Agents' Ability to Bootstrap Development Environments
Feb 22, 2024  Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
Jun 27, 2022  DeepPERF: A Deep Learning-Based Approach For Improving Software Performance
Mar 13, 2024  AutoDev: Automated AI-Driven Development
Jun 14, 2025  The SWE-Bench Illusion: When State-of-the-Art LLMs Remember Instead of Reason
Jun 29, 2023  RAPGen: An Approach for Fixing Code Inefficiencies in Zero-Shot
Jul 12, 2025  When Developer Aid Becomes Security Debt: A Systematic Analysis of Insecure Behaviors in LLM Coding Agents
Jul 21, 2025  FaultLine: Automated Proof-of-Vulnerability Generation Using LLM Agents
Dec 18, 2024  Closing the Gap: A User Study on the Real-world Usefulness of AI-powered Vulnerability Detection & Repair in the IDE
Mar 10, 2025  RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
Sep 28, 2025  PerfBench: Can Agents Resolve Real-World Performance Bugs?