Date          Name
Sep 8, 2025   The Majority is not always right: RL training for solution aggregation
Jan 29, 2026  Self-Improving Pretraining: using post-trained models to pretrain better models
Feb 14, 2025  Post-training an LLM for RAG? Train on Self-Generated Demonstrations
Aug 5, 2024   Self-Taught Evaluators
Jun 4, 2024   Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
Jul 2, 2025   NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks
Dec 23, 2025  Safety Alignment of LMs via Non-cooperative Games
Mar 23, 2024  Controllable bipolaron formation unveiling structural features of trap states in organic charge transport
Feb 6, 2020   Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Apr 1, 2022   Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models
Nov 10, 2019  Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training
Nov 14, 2024  Adaptive Decoding via Latent Preference Optimization
Oct 8, 2025   Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
Feb 20, 2025  Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation
Aug 12, 2019  Neural Text Generation with Unlikelihood Training
Oct 18, 2022  Simple and Effective Unsupervised Speech Translation
Mar 19, 2024  MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Dec 8, 2023   Seamless: Multilingual Expressive and Streaming Speech Translation
Aug 7, 2025   Learning to Reason for Factuality
Jul 31, 2025  CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks