Date / Name
May 4, 2020 / Exploring Content Selection in Summarization of Novel Chapters
Apr 21, 2022 / Spurious Correlations in Reference-Free Evaluation of Text Generation
Dec 21, 2022 / Contrastive Error Attribution for Finetuned Language Models
Oct 7, 2020 / WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization
Aug 31, 2021 / Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization
Apr 11, 2025 / SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling
May 25, 2022 / ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection
Nov 16, 2022 / Holistic Evaluation of Language Models
Apr 16, 2021 / Segmenting Subtitles for Correcting ASR Segmentation Errors
Jan 31, 2023 / Benchmarking Large Language Models for News Summarization
Feb 2, 2021 / The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Jul 9, 2024 / STORYSUMM: Evaluating Faithfulness in Story Summarization
Nov 9, 2022 / Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection
May 28, 2023 / Generating EDU Extracts for Plan-Guided Summary Re-Ranking
Apr 6, 2020 / The Role of Pragmatic and Discourse Context in Determining Argument Impact
Oct 27, 2020 / To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
Aug 16, 2021 / On the Opportunities and Risks of Foundation Models
Mar 28, 2025 / L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
Mar 30, 2023 / Whose Opinions Do Language Models Reflect?
Nov 10, 2022 / CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing