Showing 1–20 of 39 results
Dec 23, 2021 · ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Dec 31, 2020 · ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora
Jul 5, 2021 · ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Dec 31, 2020 · ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Oct 7, 2020 · Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models
Feb 9, 2023 · ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models
Dec 7, 2024 · Mixture of Hidden-Dimensions Transformer
Dec 3, 2025 · V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention
Sep 26, 2025 · Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
Nov 27, 2022 · X-PuDu at SemEval-2022 Task 7: A Replaced Token Detection Task Pre-trained Model with Pattern-aware Ensembling for Identifying Plausible Clarifications
Nov 7, 2022 · ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech
Aug 7, 2024 · NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Oct 3, 2024 · MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Mar 25, 2026 · Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping
Jul 29, 2019 · ERNIE 2.0: A Continual Pre-training Framework for Language Understanding
Nov 30, 2022 · X-PuDu at SemEval-2022 Task 6: Multilingual Learning for English and Arabic Sarcasm Detection
Dec 13, 2022 · ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
Apr 29, 2024 · HFT: Half Fine-Tuning for Large Language Models
Oct 2, 2024 · Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
Apr 16, 2024 · Autoregressive Pre-Training on Pixels and Texts