Information-Theoretic Local Minima Characterization and Regularization — arXiv2