SuperGen: An Efficient Ultra-high-resolution Video Generation System with Sketching and Tiling
cs.LG
/ Authors
Fanjiang Ye, Zepeng Zhao, Yi Mu, Jucheng Shen, Renjie Li, Kaijian Wang, Saurabh Agarwal, Myungjin Lee, Triston Cao, Aditya Akella, Arvind Krishnamurthy, T. S. Eugene Ng, Zhengzhong Tu, Yuke Wang
/ Abstract
Diffusion models have recently achieved remarkable success in generative tasks such as image and video generation, and demand for high-quality content (e.g., 2K/4K videos) is growing rapidly across domains. However, generating ultra-high-resolution videos on existing standard-resolution (e.g., 720p) platforms remains challenging: it requires extensive re-training and incurs prohibitively high computational and memory costs. To this end, we introduce SUPERGEN, an efficient tile-based framework for ultra-high-resolution video generation. SUPERGEN features a novel training-free tiling algorithm that supports a wide range of resolutions without additional training while significantly reducing both memory footprint and computational complexity. Moreover, SUPERGEN incorporates a tile-tailored, adaptive, region-aware caching strategy that accelerates video generation by exploiting redundancy across denoising steps and spatial regions. SUPERGEN also integrates cache-guided, communication-minimized tile parallelism to increase throughput and reduce latency. Evaluations show that SUPERGEN delivers substantial performance gains while maintaining high output quality across various benchmarks.
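To make the tiling and caching ideas in the abstract concrete, the following is a minimal illustrative sketch and not SUPERGEN's actual algorithm. It splits a latent of shape (..., H, W) into overlapping spatial tiles, applies a per-tile function (standing in for one denoising call), and averages the overlaps. An optional cache reuses a tile's previous output when its input has barely changed since the last step. The names `tile_starts` and `tiled_apply`, the uniform overlap averaging, and the mean-absolute-difference threshold `tol` are all assumptions made for illustration.

```python
import numpy as np

def tile_starts(size, tile, stride):
    """Start offsets of tiles of length `tile` that together cover [0, size)."""
    if tile >= size:
        return [0]
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # last tile sits flush with the far edge
    return starts

def tiled_apply(latent, fn, tile=64, overlap=16, cache=None, tol=1e-3):
    """Apply `fn` to overlapping tiles of `latent` and average the overlaps.

    If `cache` (a dict) is given, a tile whose input differs from its cached
    input by less than `tol` (mean absolute difference) reuses the cached
    output instead of calling `fn`. This reuse rule is a hypothetical
    stand-in for a region-aware caching policy.
    """
    h, w = latent.shape[-2:]
    stride = tile - overlap
    assert stride > 0, "overlap must be smaller than the tile size"
    out = np.zeros(latent.shape, dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            ys = slice(y, min(y + tile, h))
            xs = slice(x, min(x + tile, w))
            patch = latent[..., ys, xs]
            hit = cache.get((y, x)) if cache is not None else None
            if hit is not None and np.abs(patch - hit[0]).mean() < tol:
                result = hit[1]  # input nearly unchanged: reuse cached output
            else:
                result = fn(patch)
                if cache is not None:
                    cache[(y, x)] = (patch.copy(), result)
            out[..., ys, xs] += result
            weight[ys, xs] += 1.0  # count how many tiles cover each pixel
    return out / weight
```

Because each call to `fn` sees only one tile, peak memory in this sketch scales with the tile size rather than the full resolution. Passing the same `cache` dict across successive denoising steps lets unchanged regions skip recomputation.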