Date / Name
Oct 31, 2023 / InstructCoder: Instruction Tuning Large Language Models for Code Editing
Apr 15, 2024 / MMCode: Benchmarking Multimodal Large Language Models for Code Generation with Visually Rich Programming Problems
Dec 8, 2021 / Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost
Feb 18, 2025 / MCTS-Judge: Test-Time Scaling in LLM-as-a-Judge for Code Correctness Evaluation
Nov 26, 2025 / Qwen3-VL Technical Report
Oct 14, 2025 / HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities
Sep 30, 2024 / Robi Butler: Multimodal Remote Interaction with a Household Robot Assistant
Feb 27, 2025 / Thermodynamic speed limit for non-adiabatic work and its classical-quantum decomposition
Aug 5, 2025 / Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning
Nov 12, 2025 / MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
Mar 2, 2026 / Towards Principled Dataset Distillation: A Spectral Distribution Perspective
Apr 22, 2024 / Towards Better Text-to-Image Generation Alignment via Attention Modulation
May 18, 2025 / Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
Oct 31, 2025 / MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models
Apr 4, 2025 / ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use
Sep 15, 2025 / SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching
Aug 13, 2025 / AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries?
Feb 28, 2026 / Qwen3-Coder-Next Technical Report
Jul 2, 2025 / AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness
May 18, 2025 / Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning