Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers — arXiv2