Statistical methods for clustered competing risk data when the event types are only available in a training dataset — arXiv2