Date | Name
Nov 24, 2025 | SLMFix: Leveraging Small Language Models for Error Fixing with Reinforcement Learning
Nov 2, 2025 | Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems
Oct 24, 2025 | AgentBound: Securing Execution Boundaries of AI Agents
Oct 21, 2025 | CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent
Oct 20, 2025 | What Makes AI Research Replicable? Executable Knowledge Graphs as Scientific Knowledge Representations
Oct 15, 2025 | OpenDerisk: An Industrial Framework for AI-Driven SRE, with Design, Implementation, and Case Studies
Oct 6, 2025 | FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration
Oct 6, 2025 | DynamiQ: Unlocking the Potential of Dynamic Task Allocation in Parallel Fuzzing
Sep 30, 2025 | CWM: An Open-Weights LLM for Research on Code Generation with World Models
Sep 26, 2025 | SecureVibeBench: Benchmarking Secure Vibe Coding of AI Agents via Reconstructing Vulnerability-Introducing Scenarios
Sep 15, 2025 | A Practical Adversarial Attack against Sequence-based Deep Learning Malware Classifiers
Sep 12, 2025 | Enhancing LLM-based Specification Generation via Program Slicing and Logical Deletion
Aug 20, 2025 | Trace-Based Reconstruction of Quantum Circuit Dataflow in Surface Codes
Aug 4, 2025 | Flow Sensitivity without Control Flow Graph: An Efficient Andersen-Style Flow-Sensitive Pointer Analysis
Jul 30, 2025 | From Articles to Code: On-Demand Generation of Core Algorithms from Scientific Publications
Jul 16, 2025 | GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
Jul 12, 2025 | Enhancing Interpretability in Software Change Management with Chain-of-Thought Reasoning
Jun 24, 2025 | Towards an Oracle for Binary Decomposition Under Compilation Variance
Jun 11, 2025 | Reasoning as a Resource: Optimizing Fast and Slow Thinking in Code Generation Models
Jun 4, 2025 | LogSage: An LLM-Based Framework for CI/CD Failure Detection and Remediation with Industrial Validation