arXiv2
arXiv2 Search: au:"Le Thien Phuc Nguyen"
Showing 1–5 of 5 results
May 28, 2025
UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios
Jan 21, 2025
LASER: Lip Landmark Assisted Speaker Detection for Robustness
Aug 2, 2025
GMAT: Grounded Multi-Agent Clinical Description Generation for Text Encoder in Vision-Language MIL for Whole Slide Image Classification
Dec 1, 2025
See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models
Jul 16, 2025
Describe Anything Model for Visual Question Answering on Text-rich Images