Date          Name
Apr 23, 2026  CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents
Feb 19, 2026  Computer-Using World Model
Jan 19, 2026  A Benchmark for Language Models in Real-World System Building
Nov 2, 2025   Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems
Jan 19, 2025  Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms
Jan 12, 2025  AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds
Aug 1, 2024   AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
Jul 16, 2024  Building AI Agents for Autonomous Clouds: Challenges and Design Principles
May 24, 2024  Large Language Models can Deliver Accurate and Interpretable Time Series Anomaly Detection
Feb 8, 2024   UFO: A UI-Focused Agent for Windows OS Interaction
Feb 5, 2024   Revisiting VAE for Unsupervised Time Series Anomaly Detection: A Frequency Perspective
Jan 24, 2024  Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4
Dec 19, 2023  Xpert: Empowering Incident Management with Query Recommendations via Large Language Models
Nov 29, 2023  TaskWeaver: A Code-First Agent Framework
Nov 27, 2023  Rethinking Privacy in Machine Learning Pipelines from an Information Flow Control Perspective
Nov 7, 2023   Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
Oct 28, 2023  TraceDiag: Adaptive, Interpretable, and Efficient Root Cause Analysis on Large-Scale Microservice Systems
Jul 3, 2023   ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection
May 29, 2023  Assess and Summarize: Improve Outage Understanding with Large Language Models
May 25, 2023  Automatic Root Cause Analysis via Large Language Models for Cloud Incidents