Publications

Wasserstein Distance Maximizing Intrinsic Control

Published in Deep RL Workshop, NeurIPS, 2021

Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process

Recommended citation: Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih. (2021). "Wasserstein Distance Maximizing Intrinsic Control". Deep RL Workshop, NeurIPS, 2021. https://arxiv.org/abs/2110.15331

Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning

Published in International Joint Conference on Artificial Intelligence, 2020

This paper focuses on a scenario in which agents have individual preferences regarding how to accomplish a shared task

Recommended citation: Ishan Durugkar, Elad Liebman, Peter Stone. (2020). "Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning". International Joint Conference on Artificial Intelligence, 2020. https://idurugkar.github.com/files/IJCAI20-ishand.pdf

Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning

Published in International Conference on Learning Representations, 2018

Using Policy Gradient to learn navigation in knowledge bases

Recommended citation: Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum. (2018). "Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning". International Conference on Learning Representations, 2018. https://idurugkar.github.com/files/MINERVA_ICLR2018.pdf

Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing

Published in Twenty-Ninth IAAI Conference, 2017

Predicting nonstationarity in changing real world scenarios for off-policy evaluation

Recommended citation: Philip S Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, Emma Brunskill. (2017). "Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing". Twenty-Ninth IAAI Conference. https://idurugkar.github.com/files/predictive_off_policy_evaluation_IAAI2017.pdf

Cohort intelligence: a self supervised learning behavior

Published in IEEE International Conference on Systems, Man, and Cybernetics, 2013

Swarm technique for convex optimization

Recommended citation: Anand J Kulkarni, Ishan P Durugkar, Mrinal Kumar. (2013). "Cohort intelligence: a self supervised learning behavior". 2013 IEEE International Conference on Systems, Man, and Cybernetics. https://www.researchgate.net/profile/Anand_Kulkarni4/publication/262285087_Cohort_Intelligence_A_Self_Supervised_Learning_Behavior/links/00b49538dd90cb260e000000.pdf