Showing 1–20 of 24 results
Dec 31, 2020 - XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Feb 18, 2023 - How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation
Jun 25, 2021 - DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Apr 28, 2023 - ResiDual: Transformer with Dual Residual Connections
Nov 3, 2015 - Detecting Interrogative Utterances with Recurrent Neural Networks
Oct 24, 2023 - Dissecting In-Context Learning of Translations in GPTs
May 26, 2023 - Do GPTs Produce Less Literal Translations?
Jun 30, 2022 - Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations
Sep 22, 2021 - Scalable and Efficient MoE Training for Multitask Multilingual Models
Nov 3, 2021 - Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
Oct 3, 2023 - Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness
Aug 16, 2023 - FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs
Aug 21, 2022 - Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
May 28, 2022 - Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
Mar 1, 2025 - Efficiently Editing Mixture-of-Experts Models with Compressed Experts
Aug 30, 2023 - Task-Based MoE for Multitask Multilingual Machine Translation
Nov 18, 2022 - Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production
Sep 20, 2023 - A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Oct 26, 2020 - FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
Oct 6, 2020 - Multi-task Learning for Multilingual Neural Machine Translation