arXiv2 Search: au:"Jillian Bommarito"
Showing 1–5 of 5 results
Mar 21, 2025
KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications
Apr 10, 2025
The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models
Jan 11, 2023
GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities
Jan 14, 2025
Towards Best Practices for Open Datasets for LLM Training
Apr 5, 2025
Precise Legal Sentence Boundary Detection for Retrieval at Scale: NUPunkt and CharBoundary