How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN — arXiv2