arXiv2 Search results for au:"Shi Wang"
Showing 1–4 of 4 results
Feb 20, 2025
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Jun 21, 2024
GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models
Jun 18, 2023
MARBLE: Music Audio Representation Benchmark for Universal Evaluation
Dec 9, 2022
HieNet: Bidirectional Hierarchy Framework for Automated ICD Coding