"au:"Baixi Sun"" — arXiv2 Search
Showing 1–7 of 7 results
Nov 1, 2022
SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates
Jul 1, 2024
FastCLIP: A Suite of Optimization Techniques to Accelerate CLIP Training with Limited Resources
Sep 1, 2025
STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data
Oct 20, 2024
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Apr 14, 2023
HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs
Sep 29, 2023
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors
Dec 19, 2024
GFormer: Accelerating Large Language Models with Optimized Transformers on Gaudi Processors