AAAI 2023 · 2023
Decentralized multi-agent RL approach that uses distribution matching to coordinate cooperative agents
Deep RL Workshop, NeurIPS 2021 · 2021
Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process
NeurIPS 2021 · 2021
Estimating and minimizing the Wasserstein distance to an idealized target distribution to learn a goal-conditioned policy
NeurIPS 2020 · 2020
Using imitation-from-observation techniques to speed up policy transfer between environments with mismatched dynamics
IJCAI 2020 · 2020
A mixing scheme that balances individual agent preferences with shared objectives, with a study of the resulting learning behavior
ICML 2020 · 2020
Reducing sampling error in batch TD learning
ICLR 2018 · 2018
Using policy gradient to learn navigation in knowledge bases
Deep RL Symposium, NeurIPS 2017 · 2017
Constraining TD updates to improve stability
ICLR 2017 · 2017
Improving GAN training stability and quality with multiple discriminators
IAAI 2017 · 2017
Predicting nonstationarity in changing real-world scenarios for off-policy evaluation
arXiv 2016 · 2016
Mapping VAE decoder samples back to latent space for improved generative accuracy
arXiv 2016 · 2016
Extending DQN with temporally extended macro-actions for improved exploration and learning
IEEE SMC 2013 · 2013
Swarm-inspired technique for convex optimization using cohort-based self-supervised learning