Showing 61–80 of 92 results
Date          Name
Oct 7, 2022   C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval
Oct 3, 2022   Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
Sep 18, 2022  Overcoming Language Priors in Visual Question Answering via Distinguishing Superficially Similar Instances
Sep 2, 2022   Geometry Aligned Variational Transformer for Image-conditioned Layout Generation
Jul 11, 2022  Patch-level instance-group discrimination with pretext-invariant learning for colitis scoring
May 9, 2022   SwinIQA: Learned Swin Distance for Compressed Image Quality Assessment
May 4, 2022   Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion
Mar 30, 2022  Span Classification with Structured Information for Disfluency Detection in Spoken Utterances
Mar 4, 2022   Voice-Face Homogeneity Tells Deepfake
Jan 16, 2022  Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels
Dec 3, 2021   Malakai: Music That Adapts to the Shape of Emotions
Oct 27, 2021  LSTM-RPA: A Simple but Effective Long Sequence Prediction Algorithm for Music Popularity Prediction
Oct 13, 2021  Singer separation for karaoke content generation
Sep 20, 2021  TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method
Jul 27, 2021  The CORSMAL benchmark for the prediction of the properties of containers
Jul 6, 2021   Self-Adversarial Training incorporating Forgery Attention for Image Forgery Localization
Jul 1, 2021   Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data
Mar 10, 2021  Cross-modal Image Retrieval with Deep Mutual Information Maximization
Dec 16, 2020  UAV-Assisted Image Acquisition: 3D UAV Trajectory Design and Camera Control
Dec 20, 2019  From Patches to Pictures (PaQ-2-PiQ): Mapping the Perceptual Space of Picture Quality