arXiv2 Search: au:"Longyun Wu"
Showing 1–2 of 2 results
Feb 24, 2025
LongAttn: Selecting Long-context Training Data via Token-level Attention
Aug 31, 2025
Router Upcycling: Leveraging Mixture-of-Routers in Mixture-of-Experts Upcycling