Showing 1–20 of 29 results
/ Date/ Name
Jun 16, 2023Learning Space-Time Semantic CorrespondencesNov 20, 2015Deep End2End Voxel2Voxel PredictionDec 2, 2014Learning Spatiotemporal Features with 3D Convolutional NetworksNov 30, 2017A Closer Look at Spatiotemporal Convolutions for Action RecognitionApr 4, 2019Video Classification with Channel-Separated Convolutional NetworksAug 16, 2017ConvNet Architecture Search for Spatiotemporal Feature LearningDec 20, 2013EXMOVES: Classifier-based Features for Scalable Action RecognitionJun 23, 2016VideoMCC: a New Benchmark for Video ComprehensionNov 28, 2019Self-Supervised Learning by Cross-Modal Audio-Video ClusteringJun 6, 2019Learning Temporal Pose Estimation from Sparsely-Labeled VideosNov 25, 2025Layer-Aware Video Composition via Split-then-MergeJun 7, 2019Video Modeling with Correlation NetworksJun 10, 2019UniDual: A Unified Model for Image and Video UnderstandingApr 8, 2019SCSampler: Sampling Salient Clips from Video for Efficient Action RecognitionJan 29, 2017Transformation-Based Models of Video SequencesJun 30, 2018Cooperative Learning of Audio and Video Models from Self-Supervised SynchronizationJan 26, 2019DistInit: Learning Video Representations Without a Single Labeled VideoMay 2, 2019Large-scale weakly-supervised pre-training for video action recognitionJun 17, 2021Long-Short Temporal Contrastive Learning of Video TransformersApr 12, 2022Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity