Date          Name
Nov 13, 2020  Rebounding Bandits for Modeling Satiation Effects
Jun 27, 2022  Supervised Learning with General Risk Functionals
Jul 1, 2021   When Curation Becomes Creation: Algorithms, Microcontent, and the Vanishing Distinction between Platforms and Creators
Sep 20, 2024  A Unified Causal Framework for Auditing Recommender Systems for Ethical Concerns
Mar 25, 2022  Modeling Attrition in Recommender Systems with Departing Bandits
Mar 2, 2021   Median Optimal Treatment Regimes
Feb 6, 2024   Personalized Language Modeling from Personalized Human Feedback
May 31, 2025  Linear Representation Transferability Hypothesis: Leveraging Small Models to Steer Large Models
Mar 13, 2024  Prompting Fairness: Integrating Causality to Debias Large Language Models
Apr 13, 2026  Filtered Reasoning Score: Evaluating Reasoning Quality on a Model's Most-Confident Traces
Sep 21, 2022  Off-Policy Risk Assessment in Markov Decision Processes
Oct 12, 2021  Action-Sufficient State Representation Learning for Control with Structural Constraints
Oct 17, 2024  A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
May 28, 2025  Learning Composable Chains-of-Thought
Dec 12, 2019  Game Design for Eliciting Distinguishable Behavior
Apr 18, 2021  Off-Policy Risk Assessment in Contextual Bandits
Mar 4, 2021   On the Convergence and Optimality of Policy Gradient for Markov Coherent Risk
Apr 22, 2022  A Taxonomy of Human and ML Strengths in Decision-Making to Investigate Human-ML Complementarity
Apr 16, 2023  A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits
Jun 17, 2025  AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes