Date          Name
Jun 17, 2021  DeepLab2: A TensorFlow Library for Deep Labeling
Feb 8, 2021   TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
May 30, 2022  TubeFormer-DeepLab: Video Mask Transformer
Jun 12, 2023  Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation
Mar 30, 2023  A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision
Apr 12, 2024  COCONut: Modernizing COCO Segmentation
Feb 26, 2025  Dictionary-based Framework for Interpretable and Consistent Object Parsing
Apr 30, 2025  ReVision: Refining Video Diffusion with Explicit 3D Motion Modeling
Nov 25, 2020  Can Temporal Information Help with Contrastive Self-Supervised Learning?
Oct 4, 2022   MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Feb 19, 2020  When Radiology Report Generation Meets Knowledge Graph
Feb 4, 2025   COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation
Feb 27, 2025  Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Jan 13, 2025  Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Apr 6, 2026   A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens
Apr 16, 2026  Frequency-Aware Flow Matching for High-Quality Image Generation
Dec 2, 2021   PartImageNet: A Large, High-Quality Dataset of Parts
Mar 13, 2025  FlowTok: Flowing Seamlessly Across Text and Image Tokens
Jun 4, 2024   Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting
Jun 29, 2023  ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation