"au:"Yaodong Yang"" — arXiv2 Search
Showing 1–6 of 6 results
Mar 23, 2025
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
Oct 2, 2024
Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games
Jun 20, 2024
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
Mar 1, 2024
Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games
Jun 5, 2021
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Feb 10, 2020
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning