DG-RePlAce: A Dataflow-Driven GPU-Accelerated Analytical Global Placement Framework for Machine Learning Accelerators
/ Authors
/ Abstract
Global placement (GP) is a fundamental step in VLSI physical design. The wide use of 2-D processing element (PE) arrays in machine learning accelerators poses new challenges of scalability and quality of results (QoR) for state-of-the-art academic global placers. In this work, we develop DG-RePlAce, a new and fast GPU-accelerated GP framework built on top of the OpenROAD infrastructure, which exploits the inherent dataflow and datapath structures of machine learning accelerators. Experimental results with a variety of machine learning accelerators using a commercial 12-nm enablement show that, compared with RePlAce (DREAMPlace), our approach achieves an average reduction in routed wirelength by 10% (7%) and total negative slack (TNS) by 31% (34%), with faster GP and on-par total runtimes relative to DREAMPlace. Empirical studies on the TILOS MacroPlacement Benchmarks further demonstrate that post-route improvements over RePlAce and DREAMPlace may reach beyond the motivating application to machine learning accelerators.
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems