arXiv2
"au:"Mats Leon Richter"" — arXiv2 Search
Showing 1–3 of 3 results
Dec 5, 2024
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Sep 29, 2025
MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources
Dec 13, 2024
Too Big to Fool: Resisting Deception in Language Models