Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering — arXiv2