"au:"Lester Miranda"" — arXiv2 Search

Showing 1–20 of 21 results

Jul 20, 2024: Consent in Crisis: The Rapid Decline of the AI Data Commons
Dec 19, 2022: Multi hash embeddings in spaCy
Aug 5, 2025: FilBench: Can LLMs Understand and Generate Filipino?
May 26, 2025: The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project
Oct 20, 2024: M-RewardBench: Evaluating Reward Models in Multilingual Settings
Nov 22, 2024: Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Apr 13, 2026: Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation
Nov 13, 2023: calamanCy: A Tagalog Natural Language Processing Toolkit
Apr 23, 2026: Multilinguality at the Edge: Developing Language Models for the Global South
Oct 24, 2024: Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
Dec 31, 2024: 2 OLMo 2 Furious
Oct 12, 2019: Geomancer: An Open-Source Framework for Geospatial Feature Engineering
Nov 13, 2023: Developing a Named Entity Recognition Dataset for Tagalog
Dec 19, 2024: Bridging the Data Provenance Gap Across Text, Speech and Video
Feb 19, 2025: MMTEB: Massive Multilingual Text Embedding Benchmark
Nov 15, 2023: Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark
Mar 20, 2024: RewardBench: Evaluating Reward Models for Language Modeling
Mar 10, 2025: Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
Dec 15, 2025: Olmo 3
May 19, 2025: R3: Robust Rubric-Agnostic Reward Models