arXiv2
"au:"Doug Downey"" — arXiv2 Search
Showing 1–6 of 6 results
Mar 6, 2026
Deep Research, Shallow Evaluation: A Case Study in Meta-Evaluation for Long-Form QA Benchmarks
Aug 26, 2025
Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning
Jun 10, 2024
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
Dec 19, 2022
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
May 23, 2022
Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions
Sep 15, 2021
"It doesn't look good for a date": Transforming Critiques into Preferences for Conversational Recommendation Systems