DM$^2$: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching
Published in AAAI 2023
Recommended citation: Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone. (2023). "DM2: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching". AAAI, 2023. https://arxiv.org/abs/2206.00233
Estimating and minimizing the Wasserstein distance to an idealized target distribution in order to learn a goal-conditioned policy. Introducing the time-step metric as a way to measure the work of transporting measure in MDPs, and using it to estimate the Wasserstein distance.
Recommended citation: Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone. (2022). "DM2: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching". arXiv, 2022.