Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training — arXiv2