"au:"Pierre-Luc Bacon"" — arXiv2 Search

Showing 1–20 of 54 results

Oct 15, 2019 Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Oct 25, 2020 XLVIN: eXecuted Latent Value Iteration Nets
Mar 10, 2021 An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning
Sep 16, 2016 The Option-Critic Architecture
Jul 6, 2020 TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?
Feb 26, 2020 Policy Evaluation Networks
Jan 1, 2020 Options of Interest: Temporal Abstraction with Interest Functions
Jun 6, 2021 Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation
Dec 22, 2021 Direct Behavior Specification via Constrained Reinforcement Learning
Sep 26, 2020 Graph neural induction of value iteration
Oct 11, 2021 Neural Algorithmic Reasoners are Implicit Planners
Dec 3, 2016 A Matrix Splitting Perspective on Planning with Options
Sep 14, 2017 When Waiting is not an Option: Learning Options with a Deliberation Cost
Oct 23, 2023 Course Correcting Koopman Representations
Jun 7, 2023 Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design
Dec 11, 2024 MaestroMotif: Skill Design from Artificial Intelligence Feedback
Feb 8, 2025 Mol-MoE: Training Preference-Guided Routers for Molecule Generation
Jun 8, 2025 State Entropy Regularization for Robust Reinforcement Learning
Oct 1, 2025 The Three Regimes of Offline-to-Online Reinforcement Learning
May 16, 2022 The Primacy Bias in Deep Reinforcement Learning