Publications
Extending ad hoc teamwork to settings with multiple unknown agents, enabling cooperative behavior without prior coordination
A unified framework for goal-conditioned RL that generalizes policy gradient methods using f-divergences
A decentralized MARL approach that uses distribution matching for cooperative multi-agent coordination
A lightweight real-time object detection pipeline for robot soccer, nominated for best paper at RoboCup 2022
Estimating and minimizing the Wasserstein distance to an idealized target distribution to learn a goal-conditioned policy
Using Imitation from Observation techniques to speed up transfer of policies between environments with dynamics mismatch
A mixing scheme that balances individual agent preferences with shared objectives, with a study of the resulting learning behavior
Reducing the error due to sampling in batch TD learning
Using policy gradient to learn navigation in knowledge bases
Using deep generative models to separate spectral signals from nuisance variables in hyperspectral unmixing
Improving GAN training stability and quality with multiple discriminators
Predicting nonstationarity in changing real-world scenarios for off-policy evaluation
Swarm-inspired technique for convex optimization using cohort-based self-supervised learning
Workshops
An adversarial approach to offline imitation learning that seeks modes of the expert distribution rather than averaging over them
Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process
Analysis of sampling error sources in batch TD learning and their effect on value prediction accuracy
An actor-critic method that balances multiple reward preferences for multi-objective reinforcement learning
Constraining TD updates to improve stability
Preprints
Reformulates world modeling as a visual question answering problem, leveraging vision-language models for robotic planning with improved generalization over reconstruction-based methods
Applying sequence modeling to enable agents to quickly adapt and cooperate with N unknown teammates in ad hoc teamwork settings
Mapping VAE decoder samples back to latent space for improved generative accuracy
Extending DQN with temporally extended macro-actions for improved exploration and learning