Policy Iteration for Exploratory Hamilton--Jacobi--Bellman Equations — arXiv2