Stochastic Training is Not Necessary for Generalization — arXiv2