Posts by Collection


Cohort intelligence: a self supervised learning behavior

Published in 2013 IEEE International Conference on Systems, Man, and Cybernetics, 2013

Swarm technique for convex optimization

Recommended citation: Anand J Kulkarni, Ishan P Durugkar, Mrinal Kumar. (2013). "Cohort intelligence: a self supervised learning behavior". 2013 IEEE International Conference on Systems, Man, and Cybernetics.

Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing

Published in Twenty-Ninth IAAI Conference, 2017

Predicting nonstationarity in changing real-world scenarios for off-policy evaluation

Recommended citation: Philip S Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, Emma Brunskill. (2017). "Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing". Twenty-Ninth IAAI Conference.

Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning

Published in International Conference on Learning Representations, 2018

Using policy gradients to learn navigation in knowledge bases

Recommended citation: Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum. (2018). "Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning". International Conference on Learning Representations, 2018.

Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning

Published in International Joint Conference on Artificial Intelligence, 2020

This paper focuses on a scenario in which agents have individual preferences regarding how to accomplish a shared task

Recommended citation: Ishan Durugkar, Elad Liebman, Peter Stone. (2020). "Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning". International Joint Conference on Artificial Intelligence, 2020.

Wasserstein Distance Maximizing Intrinsic Control

Published in Deep RL Workshop, NeurIPS 2021, 2021

Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process

Recommended citation: Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih. (2021). "Wasserstein Distance Maximizing Intrinsic Control". Deep RL Workshop, NeurIPS, 2021.