"au:"Junyuan Shang"" — arXiv2 Search

Showing 1–19 of 19 results

Jun 2, 2019  Pre-training of Graph Augmented Transformers for Medication Recommendation
Jun 3, 2024  DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Jan 9, 2026  VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction
Nov 27, 2022  X-PuDu at SemEval-2022 Task 7: A Replaced Token Detection Task Pre-trained Model with Pattern-aware Ensembling for Identifying Plausible Clarifications
Aug 7, 2024  NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Sep 6, 2018  GAMENet: Graph Augmented MEmory Networks for Recommending Medication Combination
Feb 19, 2025  Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking
Mar 5, 2026  Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation
Dec 23, 2021  ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Sep 29, 2024  BiPC: Bidirectional Probability Calibration for Unsupervised Domain Adaption
Dec 7, 2024  Mixture of Hidden-Dimensions Transformer
Mar 25, 2026  Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping
Oct 4, 2019  GENN: Predicting Correlated Drug-drug Interactions with Graph Energy Neural Networks
Jul 5, 2021  ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Dec 31, 2020  ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Aug 9, 2019  K-margin-based Residual-Convolution-Recurrent Neural Network for Atrial Fibrillation Detection
Dec 28, 2019  Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review
Oct 17, 2024  MoR: Mixture of Ranks for Low-Rank Adaptation Tuning
Feb 4, 2026  ERNIE 5.0 Technical Report