Initializing Models with Larger Ones — arXiv2