My research is centered on Reinforcement Learning (RL) — how agents learn to act in complex environments through interaction and feedback. Below are the main directions I work on.


Visitation Distributions in RL

RL algorithms are deeply shaped by the distribution of states and transitions an agent encounters during training. My doctoral work studied how to estimate and control these visitation distributions to improve learning — from shaping exploration to correcting for off-policy bias. Key results include intrinsic motivation methods that explicitly target distributional objectives and new theoretical frameworks connecting visitation geometry to policy optimization.
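For concreteness, the central object here is the discounted state-visitation distribution. The notation below is standard rather than the formulation of any single paper, and state-entropy maximization is shown only as one example of a distributional objective.

```latex
% Discounted state-visitation distribution induced by a policy \pi
% (\gamma: discount factor, s_t: state at time t; standard definition)
\[
  d^{\pi}(s) = (1 - \gamma) \sum_{t=0}^{\infty} \gamma^{t} \, \Pr(s_t = s \mid \pi)
\]
% Example of a distributional objective: maximize the entropy of d^{\pi},
% which corresponds to an intrinsic reward on the order of -\log d^{\pi}(s)
\[
  \max_{\pi} \; H\bigl(d^{\pi}\bigr) = - \sum_{s} d^{\pi}(s) \log d^{\pi}(s)
\]
```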

Multiagent Reinforcement Learning

In settings with multiple agents, individual learning is complicated by non-stationarity, coordination challenges, and emergent social dynamics. I have worked on preference-aware multiagent settings (where agents have heterogeneous objectives), decentralized distribution-matching for cooperative MARL, and cooperative play in competitive game environments like Gran Turismo.
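A rough sketch of the distribution-matching idea in this setting (generic notation, not the exact objective of any one paper): each agent keeps a decentralized policy conditioned on its local observations, and the team is trained so that the visitation distribution it induces stays close to a target distribution, for instance one derived from demonstrations.

```latex
% Decentralized distribution matching, written generically: N agents with
% policies \pi^1, ..., \pi^N (each conditioned only on local observations)
% minimize a divergence between the induced joint visitation distribution
% and a target distribution d^{*} (e.g., derived from demonstrations).
\[
  \min_{\pi^{1}, \dots, \pi^{N}} \;
  D_{\mathrm{KL}}\!\left( d^{\,\pi^{1}, \dots, \pi^{N}} \,\middle\|\, d^{*} \right)
\]
```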

Sim-to-Real Transfer & Robotics

Deploying RL policies trained in simulation to real robots requires bridging the "reality gap." I have worked on adversarial techniques for transfer, grounding policy learning in real robot experience, and multi-robot coordination. My hands-on robotics work includes the UT Austin Villa team in the RoboCup Standard Platform League.
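At a high level, grounding can be thought of as adjusting the simulator toward data collected on the real robot before retraining the policy. The objective below is only a generic sketch of that idea, with \phi standing for whatever simulator parameters or learned corrections are being tuned, not the formulation of a specific paper.

```latex
% Generic grounding objective: fit the simulator's transition model
% \hat{P}^{\mathrm{sim}}_{\phi} to transitions (s, a, s') collected on the
% real robot, then retrain the policy in the adjusted simulator.
% \phi is a placeholder for the simulator parameters or corrections tuned.
\[
  \min_{\phi} \;
  \mathbb{E}_{(s, a, s') \sim \mathcal{D}_{\mathrm{real}}}
  \bigl[ -\log \hat{P}^{\mathrm{sim}}_{\phi}(s' \mid s, a) \bigr]
\]
```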

RL for Games & Sequential Decision Making

Games provide rich testbeds for RL research: they are high-dimensional, competitive or cooperative, and demand long-horizon planning. At Sony AI I work on applying RL to Gran Turismo, pushing the boundary of what learned agents can achieve in complex real-time environments. Earlier work includes policy learning for knowledge-base navigation and macro-action discovery.


See my publications page for a full list of papers, or Google Scholar for citation metrics.