Date / Name
Jan 19, 2026 / A Benchmark for Language Models in Real-World System Building
Nov 2, 2025 / Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems
Jul 12, 2025 / Enhancing Interpretability in Software Change Management with Chain-of-Thought Reasoning
Jul 9, 2024 / A Scenario-Oriented Benchmark for Assessing AIOps Algorithms in Microservice Management
Jun 27, 2024 / Failure Diagnosis in Microservice Systems: A Comprehensive Survey and Analysis
Feb 16, 2024 / TimeSeriesBench: An Industrial-Grade Benchmark for Time Series Anomaly Detection Models
Oct 11, 2023 / OpsEval: A Comprehensive IT Operations Benchmark Suite for Large Language Models
Aug 1, 2023 / A Survey of Time Series Anomaly Detection Methods in the AIOps Domain
May 29, 2023 / Assess and Summarize: Improve Outage Understanding with Large Language Models
Feb 21, 2023 / Robust Failure Diagnosis of Microservice System through Multimodal Data
Aug 8, 2022 / Constructing Large-Scale Real-World Benchmark Datasets for AIOps