Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning — arXiv2