Date / Name
May 9, 2018 / Automatic Article Commenting: the Task and Dataset
Feb 23, 2022 / COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
Sep 9, 2019 / Counterfactual Story Reasoning and Generation
Aug 15, 2024 / Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors
Dec 15, 2021 / Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts
Jun 8, 2021 / TIMEDIAL: Temporal Commonsense Reasoning in Dialog
Dec 16, 2021 / NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
Oct 12, 2020 / Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning
Apr 4, 2024 / Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra
Jun 6, 2019 / Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading
Sep 4, 2018 / Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Mar 14, 2022 / Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts
Jun 17, 2024 / WeatherQA: Can Multimodal Language Models Reason about Severe Weather?
Nov 16, 2023 / MacGyver: Are Large Language Models Creative Problem Solvers?
May 24, 2022 / Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Nov 10, 2019 / Social Bias Frames: Reasoning about Social and Power Implications of Language
Nov 16, 2023 / Structured Chemistry Reasoning with Large Language Models
Jul 1, 2025 / Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs
Dec 22, 2025 / DeliveryBench: Can Agents Earn Profit in Real World?
Jun 9, 2024 / Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples