Date / Name
Feb 2, 2024 / Need a Small Specialized Language Model? Plan Early!
May 16, 2018 / QuaterNet: A Quaternion-based Recurrent Model for Human Motion
Sep 5, 2024 / The AdEMAMix Optimizer: Better, Faster, Older
Oct 3, 2024 / Dynamic Gradient Alignment for Online Data Mixing
Sep 15, 2021 / On the Complementarity of Data Selection and Fine Tuning for Domain Adaptation
Mar 24, 2016 / Neural Text Generation from Structured Data with Application to the Biography Domain
Dec 15, 2015 / Strategies for Training Large Vocabulary Neural Language Models
Nov 14, 2022 / High-Resource Methodological Bias in Low-Resource Investigations
Oct 20, 2020 / Human-Paraphrased References Improve Neural Machine Translation
Nov 14, 2017 / Classical Structured Prediction Losses for Sequence to Sequence Learning
Jun 1, 2018 / Scaling Neural Machine Translation
Apr 29, 2021 / Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation
Sep 21, 2021 / The Trade-offs of Domain Adaptation for Neural Language Models
Sep 16, 2014 / ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Nov 20, 2023 / Adaptive Training Distributions with Scalable Online Bilevel Optimization
Nov 14, 2017 / Controllable Abstractive Summarization
Nov 13, 2017 / QuickEdit: Editing Text & Translations by Crossing Words Out
Feb 3, 2025 / Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Sep 30, 2024 / Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
Nov 10, 2023 / Transfer Learning for Structured Pruning under Limited Task Data