Showing 1–20 of 23 results
Apr 29, 2022 - Training Language Models with Language Feedback
Feb 28, 2023 - EvoPrompting: Language Models for Code-Level Neural Architecture Search
Oct 29, 2024 - Generalists vs. Specialists: Evaluating LLMs on Highly-Constrained Biophysical Sequence Optimization Tasks
Mar 28, 2023 - Improving Code Generation by Training with Natural Language Feedback
May 23, 2022 - SQuALITY: Building a Long-Document Summarization Dataset the Hard Way
May 23, 2023 - Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs
Mar 28, 2023 - Training Language Models with Language Feedback at Scale
Aug 26, 2022 - What Do NLP Researchers Believe? Results of the NLP Community Metasurvey
Feb 16, 2023 - Pretraining Language Models with Human Preferences
May 21, 2019 - Generating Logical Forms from Graph Representations of Text and Entities
Oct 15, 2021 - BBQ: A Hand-Built Bias Benchmark for Question Answering
May 2, 2022 - Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection
Apr 11, 2022 - Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions
May 29, 2024 - Preference Learning Algorithms Do Not Learn Preference Rankings
Dec 8, 2023 - Playing Large Games with Oracles and AI Debate
Aug 18, 2023 - Latent State Models of Training Dynamics
Sep 13, 2023 - Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
Nov 16, 2021 - Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair
Jun 26, 2025 - Bridging Offline and Online Reinforcement Learning for LLMs
Nov 17, 2025 - Generalist Foundation Models Are Not Clinical Enough for Hospital Operations