Date | Name
Aug 5, 2025 | Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?
Jun 16, 2024 | GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
Nov 26, 2024 | Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment
Oct 17, 2025 | Paper2Web: Let's Make Your Paper Alive!
Mar 3, 2025 | CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Sep 1, 2025 | Reinforced Visual Perception with Tools
Dec 17, 2025 | Are We on the Right Way to Assessing LLM-as-a-Judge?
Nov 12, 2023 | Aggregate, Decompose, and Fine-Tune: A Simple Yet Effective Factor-Tuning Method for Vision Transformer
Jan 11, 2024 | LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?
Jul 1, 2024 | Self-Cognition in Large Language Models: An Exploratory Study
Mar 21, 2025 | Judge Anything: MLLM as a Judge Across Any Modality
Feb 7, 2024 | MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Apr 7, 2025 | Seeking and Updating with Live Visual Knowledge
Dec 4, 2024 | Perception Tokens Enhance Visual Reasoning in Multimodal Language Models
Jun 27, 2024 | DataGen: Unified Synthetic Dataset Generation via Large Language Models
Jun 17, 2025 | Optimizing Length Compression in Large Reasoning Models
Oct 10, 2016 | Controllable deposition of titanium dioxides onto carbon nanotubes in aqueous solutions
Oct 3, 2024 | Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Jun 1, 2024 | HonestLLM: Toward an Honest and Helpful Large Language Model
Jun 19, 2024 | Jailbreaking Large Language Models Through Alignment Vulnerabilities in Out-of-Distribution Settings