Date / Name
Oct 15, 2020 / Semantic Label Smoothing for Sequence to Sequence Problems
Jan 18, 2019 / Cold-start Playlist Recommendation with Multitask Learning
Apr 27, 2022 / ELM: Embedding and Logit Margins for Long-Tail Learning
Aug 31, 2018 / On the Minimal Supervision for Training Any Binary Classifier from Only Unlabeled Data
Jan 24, 2019 / Fairness risk measures
Aug 17, 2017 / Revisiting revisits in trajectory recommendation
Oct 28, 2022 / When does mixup promote local linearity in learned representations?
May 29, 2024 / Cascade-Aware Training of Language Models
Mar 5, 2020 / Does label smoothing mitigate label noise?
Sep 20, 2019 / Online Hierarchical Clustering Approximations
May 21, 2020 / Why distillation helps: a statistical perspective
Apr 23, 2020 / Doubly-stochastic mining for heterogeneous retrieval
Oct 19, 2021 / When in Doubt, Summon the Titans: Efficient Inference with Large Models
Feb 13, 2021 / Distilling Double Descent
Jul 9, 2021 / Training Over-parameterized Models with Non-decomposable Objectives
Jan 30, 2019 / Noise-tolerant fair classification
Jun 8, 2018 / Monge blunts Bayes: Hardness Results for Adversarial Training
Jul 19, 2023 / The importance of feature preprocessing for differentially private linear optimization
Jan 30, 2023 / On student-teacher deviations in distillation: does it pay to disobey?
Feb 3, 2023 / ResMem: Learn what you can and memorize the rest