Date          Name
Jun 2, 2019   Pre-training of Graph Augmented Transformers for Medication Recommendation
Jun 3, 2024   DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Jan 9, 2026   VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction
Nov 27, 2022  X-PuDu at SemEval-2022 Task 7: A Replaced Token Detection Task Pre-trained Model with Pattern-aware Ensembling for Identifying Plausible Clarifications
Aug 7, 2024   NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Sep 6, 2018   GAMENet: Graph Augmented MEmory Networks for Recommending Medication Combination
Feb 19, 2025  Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking
Mar 5, 2026   Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation
Dec 23, 2021  ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Sep 29, 2024  BiPC: Bidirectional Probability Calibration for Unsupervised Domain Adaption
Dec 7, 2024   Mixture of Hidden-Dimensions Transformer
Mar 25, 2026  Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping
Oct 4, 2019   GENN: Predicting Correlated Drug-drug Interactions with Graph Energy Neural Networks
Jul 5, 2021   ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Dec 31, 2020  ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Aug 9, 2019   K-margin-based Residual-Convolution-Recurrent Neural Network for Atrial Fibrillation Detection
Dec 28, 2019  Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review
Oct 17, 2024  MoR: Mixture of Ranks for Low-Rank Adaptation Tuning
Feb 4, 2026   ERNIE 5.0 Technical Report