Date          Name
Dec 10, 2020  What Makes for End-to-End Object Detection?
Jun 10, 2024  Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
May 18, 2023  Going Denser with Open-Vocabulary Part Segmentation
Nov 25, 2020  Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Nov 29, 2021  DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion
Dec 31, 2020  TransTrack: Multiple Object Tracking with Transformer
Jul 21, 2023  Enhancing Your Trained DETRs with Box Refinement
Jul 14, 2022  Towards Grand Unification of Object Tracking
Mar 27, 2023  ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box
Sep 16, 2019  TextSR: Content-Aware Text Super-Resolution Guided by Recognition
Jan 3, 2022   Language as Queries for Referring Video Object Segmentation
Feb 25, 2024  RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Apr 24, 2025  Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Jul 7, 2023   GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Nov 20, 2025  SAM 3: Segment Anything with Concepts
Sep 29, 2019  PolarMask: Single Shot Instance Segmentation with Polar Representation
Oct 13, 2021  ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Feb 9, 2021   DetCo: Unsupervised Contrastive Learning for Object Detection
Nov 17, 2022  DiffusionDet: Diffusion Model for Object Detection
Jul 10, 2024  IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model