To Compress or Not To Compress: Energy Trade-Offs and Benefits of Lossy Compressed I/O
cs.DC
/ Authors
/ Abstract
Modern scientific simulations generate massive volumes of data, creating significant challenges for I/O and storage systems. Error-bounded lossy compression (EBLC) offers a solution by reducing data set sizes while preserving data quality within user-specified limits. This study provides the first comprehensive energy characterization of state-of-the-art EBLC algorithms--SZ2, SZ3, ZFP, QoZ, and SZx--across various scientific data sets, CPU generations, and parallel/serial modes. We analyze the energy consumption patterns of compression and decompression operations, as well as the energy trade-offs in data I/O scenarios. Our work demonstrates the relationships between compression ratios, runtime, energy efficiency, and data quality, highlighting the importance of considering compressors and error bounds for specific use cases. We demonstrate that EBLC can significantly reduce I/O energy consumption, with savings of up to two orders of magnitude compared to uncompressed I/O for large data sets. In multi-node HPC environments, we observe energy reductions of approximately 25% when using EBLC. We also show that EBLC can achieve compression ratios of 10-100x, potentially reducing storage device requirements by nearly two orders of magnitude. This work provides a framework for system operators and computational scientists to make informed decisions about implementing EBLC for energy-efficient data management in HPC environments.