"au:"Wenhao Huang"" — arXiv2 Search

Showing 1–20 of 26 results

Mar 27, 2026  Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation
Jan 9, 2026  The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
Nov 18, 2025  First measurement of reactor neutrino oscillations at JUNO
Nov 18, 2025  Initial performance results of the JUNO detector
Nov 14, 2025  DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains
Sep 30, 2025  Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
Sep 4, 2025  Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?
May 29, 2025  ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding
May 20, 2025  KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation
May 11, 2025  Seed1.5-VL Technical Report
Apr 10, 2025  Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning
Feb 20, 2025  SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Jun 21, 2024  GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models
May 28, 2024  Potential to identify neutrino mass ordering with reactor antineutrinos at JUNO
May 28, 2024  Prediction of Energy Resolution in the JUNO Experiment
May 28, 2024  JUNO Sensitivity to Invisible Decay Modes of Neutrons
Oct 1, 2023  TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks
Sep 13, 2023  Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
Mar 9, 2023  The JUNO experiment Top Tracker
Dec 16, 2022  JUNO Sensitivity on Proton Decay $p \to \bar{\nu} K^+$ Searches