"au:"Wenxuan Huang"" — arXiv2 Search

Showing 1–20 of 42 results

Jan 29, 2026 Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
Sep 8, 2025 Interleaving Reasoning for Better Text-to-Image Generation
Mar 9, 2025 Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
Mar 19, 2026 Confidential Databases Without Cryptographic Mappings
Mar 31, 2024 A General and Efficient Training for Transformer via Token Expansion
Jul 16, 2018 An L$_0$L$_1$-norm compressive sensing paradigm for the construction of sparse predictive lattice models using mixed integer quadratic programming
Mar 10, 2025 LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition?
Dec 1, 2024 Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
Jun 23, 2016 Constructing and proving the ground state of a generalized Ising model by the cluster tree optimization algorithm
Apr 22, 2016 Finding and proving the exact ground state of a generalized Ising model by convex optimization and MAX-SAT
Jul 1, 2023 Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler
Mar 2, 2026 HarmonyCell: Automating Single-Cell Perturbation Modeling under Semantic and Distribution Shifts
Feb 2, 2026 Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models
Oct 7, 2025 Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
Jun 12, 2025 Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
Jun 18, 2025 AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need
May 8, 2025 ReactDance: Hierarchical Representation for High-Fidelity and Coherent Long-Form Reactive Dance Generation
Mar 1, 2026 GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
Feb 13, 2026 VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
Apr 19, 2026 SkillFlow:Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents