Date / Name
Feb 3, 2022 / Learning strides in convolutional neural networks
Nov 28, 2018 / 3D human pose estimation in video with temporal convolutions and semi-supervised training
Mar 12, 2020 / Efficient Content-Based Sparse Attention with Routing Transformers
Feb 20, 2020 / Wavesplit: End-to-End Speech Separation by Speaker Clustering
Jul 22, 2019 / ELI5: Long Form Question Answering
May 29, 2019 / Unsupervised Paraphrasing without Translation
Dec 23, 2016 / Language Modeling with Gated Convolutional Networks
Jan 29, 2024 / Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Jun 15, 2025 / Assessing the Role of Data Quality in Training Bilingual Language Models
Nov 20, 2024 / Training Bilingual LMs with Data Constraints in the Targeted Language
Oct 1, 2025 / The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
Sep 29, 2025 / Pretraining with hierarchical memories: separating long-tail and common knowledge
Apr 13, 2020 / BLEU might be Guilty but References are not Innocent
Jun 15, 2019 / Tagged Back-Translation
May 11, 2020 / Toward Better Storylines with Sentence-Level Language Models
Nov 17, 2021 / High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metrics
Oct 20, 2016 / Iterative Refinement for Machine Translation
Sep 26, 2025 / Partial Parameter Updates for Efficient Distributed Training
Mar 6, 2026 / Which Data Matter? Embedding-Based Data Selection for Speech Recognition
Oct 21, 2020 / Contrastive Learning of General-Purpose Audio Representations