Showing 1–20 of 39 results
Date / Name
Apr 3, 2019 / A Comprehensive Overhaul of Feature Distillation
Feb 6, 2026 / MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model
Mar 30, 2021 / Rethinking Spatial Dimensions of Vision Transformers
Dec 8, 2021 / Joint Global and Local Hierarchical Priors for Learned Image Compression
Mar 20, 2024 / Rotary Position Embedding for Vision Transformer
Nov 8, 2018 / Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons
May 15, 2018 / Knowledge Distillation with Adversarial Samples Supporting Decision Boundary
Jun 20, 2023 / Masking meets Supervision: A Strong Learning Alliance
Jun 15, 2020 / AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights
Jul 2, 2020 / Rethinking Channel Dimensions for Efficient Model Design
Mar 20, 2023 / SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage
Dec 15, 2023 / SeiT++: Masked Token Modeling Improves Storage-efficient Training
Dec 30, 2023 / Morphing Tokens Draw Strong Masked Image Models
Jul 9, 2025 / Token Bottleneck: One Token to Remember Dynamics
Oct 8, 2021 / ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Oct 17, 2025 / Exploring Conditions for Diffusion Models in Robotic Control
Dec 7, 2020 / VideoMix: Rethinking Data Augmentation for Video Classification
Apr 1, 2024 / Lipsum-FT: Robust Fine-Tuning of Zero-Shot Models Using Random Text Guidance
May 1, 2023 / What Do Self-Supervised Vision Transformers Learn?
Apr 17, 2022 / An Extendable, Efficient and Effective Transformer-based Object Detector