On the Use of Self-Supervised Representation Learning for Speaker Diarization and Separation — arXiv2