Are Large-scale Datasets Necessary for Self-Supervised Pre-training? — arXiv2