Showing 1–20 of 21 results
Date / Name
Nov 11, 2024 / ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving
Oct 19, 2022 / Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Feb 14, 2024 / LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset
Sep 4, 2024 / MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
May 29, 2025 / Probing Association Biases in LLM Moderation Over-Sensitivity
Dec 17, 2025 / Evaluating Large Language Models in Scientific Discovery
Aug 30, 2022 / MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks
Sep 5, 2021 / Knowing False Negatives: An Adversarial Training Method for Distantly Supervised Relation Extraction
Nov 27, 2023 / MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Dec 25, 2025 / Accelerating Scientific Discovery with Autonomous Goal-evolving Agents
Oct 13, 2025 / Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation
May 31, 2023 / MuseCoco: Generating Symbolic Music from Text
Dec 31, 2024 / Achieving Carbon Neutrality for I/O Devices
Jun 26, 2025 / Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
Jun 9, 2025 / AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists
Jul 3, 2023 / EmoGen: Eliminating Subjective Bias in Emotional Music Generation
Aug 16, 2025 / LARC: Towards Human-level Constrained Retrosynthesis Planning through an Agentic Framework
May 7, 2026 / A Versatile AI Agent for Rare Disease Diagnosis and Risk Gene Prioritization
May 8, 2026 / ARMOR: An Agentic Framework for Reaction Feasibility Prediction via Adaptive Utility-aware Multi-tool Reasoning
Oct 7, 2024 / ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery