Showing 1–20 of 46 results
Date / Name

Aug 20, 2018 / Adversarial Removal of Demographic Attributes from Text Data
Feb 1, 2021 / Measuring and Improving Consistency in Pretrained Language Models
Nov 16, 2023 / Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals
Jul 31, 2024 / Data Contamination Report from the 2024 CONDA Shared Task
Oct 24, 2024 / Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
Apr 16, 2025 / On Linear Representations and Pretraining Data Frequency in Language Models
Jan 31, 2024 / Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Jul 10, 2018 / Privacy and Fairness in Recommender Systems via Adversarial Training of User Representations
Jun 4, 2019 / How Large Are Lions? Inducing Distributions over Quantitative Attributes
Jun 1, 2020 / Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals
Apr 16, 2021 / Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema
Sep 24, 2021 / Text-based NP Enrichment
Jun 2, 2024 / Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation
Jan 19, 2026 / LLM-Generated or Human-Written? Comparing Review and Non-Review Papers on ArXiv
Mar 7, 2023 / At Your Fingertips: Extracting Piano Fingering Instructions from Videos
Jul 28, 2022 / Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions
Feb 1, 2024 / OLMo: Accelerating the Science of Language Models
May 26, 2023 / Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation
Dec 31, 2019 / oLMpics -- On what Language Model Pre-training Captures
Oct 11, 2020 / Do Language Embeddings Capture Scales?