A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
This is a page not in the main menu.
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
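For reference, the corresponding setting lives in Jekyll's standard _config.yml:

```yaml
# _config.yml (Jekyll site configuration)
# When false, posts dated in the future are excluded from the build.
future: false
```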
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Published in 2013 IEEE International Conference on Systems, Man, and Cybernetics, 2013
Swarm technique for convex optimization
Recommended citation: Anand J Kulkarni, Ishan P Durugkar, Mrinal Kumar. (2013). "Cohort intelligence: a self supervised learning behavior" 2013 IEEE international conference on systems, man, and cybernetics. https://www.researchgate.net/profile/Anand_Kulkarni4/publication/262285087_Cohort_Intelligence_A_Self_Supervised_Learning_Behavior/links/00b49538dd90cb260e000000.pdf
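A loose sketch of the follow-and-resample behavior behind cohort intelligence, for a minimization problem (function and parameter names are illustrative, and the inverse-objective selection assumes strictly positive objective values):

```python
import numpy as np

def cohort_step(candidates, objective, width, shrink=0.95):
    """One iteration of a cohort-style update: each candidate picks a
    peer to follow with probability inversely proportional to the
    peer's objective value (better peers are followed more often),
    then resamples its variables near the followed peer.

    candidates: array of shape (n, d); objective: callable on a row.
    Returns the new cohort and the shrunken sampling width.
    """
    values = np.array([objective(c) for c in candidates])
    inv = 1.0 / values                      # assumes values > 0
    probs = inv / inv.sum()
    n = len(candidates)
    followed = candidates[np.random.choice(n, size=n, p=probs)]
    new = followed + np.random.uniform(-width, width, size=candidates.shape)
    return new, width * shrink
```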
Published in arXiv preprint arXiv:1606.04615, 2016
DQN with macro-actions
Recommended citation: Ishan P Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan. (2016). "Deep reinforcement learning with macro-actions". arXiv preprint arXiv:1606.04615. https://idurugkar.github.com/files/macro_actions.pdf
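A minimal sketch of how a macro-action changes the Q-learning bootstrap, using the simplest form of macro (a primitive action repeated k times) and the classic Gym env.step API; this illustrates the idea rather than the paper's exact architecture:

```python
def run_macro(env, action, repeat, gamma=0.99):
    """Execute a repeated-action macro and return what Q-learning needs:
    the final observation, the discounted reward accumulated inside the
    macro, and the effective discount for bootstrapping.

    Target becomes: macro_reward + discount * max_a Q(obs, a) if not done.
    """
    macro_reward, discount, done, obs = 0.0, 1.0, False, None
    for _ in range(repeat):
        obs, reward, done, _ = env.step(action)
        macro_reward += discount * reward
        discount *= gamma
        if done:
            break
    return obs, macro_reward, discount, done
```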
Published in arXiv preprint arXiv:1608.05983, 2016
Mapping VAE decoder samples back to the latent space for improved generative accuracy
Recommended citation: Ian Gemp, Ishan Durugkar, Mario Parente, M Darby Dyar, Sridhar Mahadevan. (2016). "Inverting Variational Autoencoders for Improved Generative Accuracy". arXiv preprint arXiv:1608.05983. https://idurugkar.github.com/files/inverting_variational_ae.pdf
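One generic way to realize such an inversion is gradient descent on reconstruction error in latent space; a minimal PyTorch sketch (the paper's actual procedure may differ in objective and initialization):

```python
import torch

def invert_decoder(decoder, x, latent_dim, steps=500, lr=1e-2):
    """Recover a latent code for sample x by minimizing
    ||decoder(z) - x||^2 with respect to z.

    decoder: any differentiable torch module mapping (1, latent_dim)
    to x's shape; x: a single sample tensor.
    """
    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((decoder(z) - x) ** 2).mean()
        loss.backward()
        opt.step()
    return z.detach()
```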
Published in Twenty-Ninth IAAI Conference, 2017
Predicting nonstationarity in changing real-world scenarios for off-policy evaluation
Recommended citation: Philip S Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, Emma Brunskill. (2017). "Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing". Twenty-Ninth IAAI Conference. https://idurugkar.github.com/files/predictive_off_policy_evaluation_IAAI2017.pdf
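The core move, estimating past performance and then forecasting it forward rather than assuming stationarity, can be sketched with per-period importance-sampling estimates and a simple trend fit (the paper's forecasting and confidence machinery are far more careful than this):

```python
import numpy as np

def forecast_value(is_estimates, horizon):
    """Fit a linear trend to a time series of importance-sampling
    estimates of a policy's value (one per day/episode batch) and
    extrapolate `horizon` periods past the end of the data.
    """
    t = np.arange(len(is_estimates))
    slope, intercept = np.polyfit(t, is_estimates, deg=1)
    return intercept + slope * (len(is_estimates) - 1 + horizon)
```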
Published in International Conference on Learning Representations, 2017
Improving GAN training with multiple discriminators
Recommended citation: Ishan Durugkar, Ian Gemp, Sridhar Mahadevan. (2017). "Generative Multi-Adversarial Networks". International Conference on Learning Representations, 2017. https://idurugkar.github.com/files/GMAN_ICLR2017.pdf
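A sketch of the generator-side aggregation over several discriminators, softmax-weighted toward the harshest critic in the GMAN style (lam -> infinity recovers the max over discriminators, lam = 0 the mean):

```python
import torch

def multi_discriminator_loss(d_losses, lam=1.0):
    """Aggregate the generator's loss under each discriminator into a
    single training signal, with higher-loss (harsher) discriminators
    weighted more heavily.

    d_losses: 1-D tensor, one generator loss per discriminator.
    """
    weights = torch.softmax(lam * d_losses, dim=0)
    return (weights * d_losses).sum()
```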
Published in Deep Reinforcement Learning Symposium, NeurIPS, 2017
Constraining the TD update to improve stability
Recommended citation: Ishan Durugkar, Peter Stone. (2017). "TD Learning with Constrained Gradients". Deep Reinforcement Learning Symposium, NeurIPS 2017. https://idurugkar.github.com/files/constrained_td_NeurIPS2017_workshop.pdf
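A sketch of the constraint for TD(0) with a differentiable value function: take the usual semi-gradient step, but first project out the component that would move the bootstrap target V(s'), so the target stays fixed to first order; variable names are illustrative:

```python
import numpy as np

def constrained_td_step(theta, grad_v_s, grad_v_s_next, td_error, alpha=0.1):
    """grad_v_s, grad_v_s_next: gradients of V(s) and V(s') w.r.t. theta;
    td_error: r + gamma * V(s') - V(s).
    """
    update = td_error * grad_v_s                 # standard semi-gradient direction
    denom = grad_v_s_next @ grad_v_s_next
    if denom > 1e-12:
        # remove the component along grad V(s') so the target doesn't move
        update -= ((update @ grad_v_s_next) / denom) * grad_v_s_next
    return theta + alpha * update
```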
Published in International Conference on Learning Representations, 2018
Using Policy Gradient to learn navigation in knowledge bases
Recommended citation: Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum. (2018). "Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning". International Conference on Learning Representations, 2018. https://idurugkar.github.com/files/MINERVA_ICLR2018.pdf
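A toy rollout of the navigation-as-RL formulation: walk the knowledge graph by sampling outgoing edges from a policy, receive reward 1 only if the final entity answers the query, then apply REINFORCE; the data structures below are hypothetical stand-ins for MINERVA's LSTM policy over entity and relation embeddings:

```python
import numpy as np

def rollout(graph, policy_probs, start, answer, max_steps=3):
    """graph: dict entity -> list of (relation, next_entity) edges;
    policy_probs(entity): probabilities over that entity's outgoing edges.
    Returns the (entity, edge-index) trajectory and terminal reward,
    the ingredients of a REINFORCE update on the policy.
    """
    node, path = start, []
    for _ in range(max_steps):
        edges = graph[node]
        i = np.random.choice(len(edges), p=policy_probs(node))
        path.append((node, i))
        node = edges[i][1]
    reward = 1.0 if node == answer else 0.0
    return path, reward
```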
Published in arXiv preprint, 2019
Using multiple auxiliary tasks as soft preferences on the policy while learning
Recommended citation: Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine. (2019). "Multi-Preference Actor Critic". arXiv preprint, 2019. https://idurugkar.github.com/files/MultiPref_ActorCritic.pdf
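A generic sketch of the soft-preference idea in policy-gradient form: auxiliary-task advantages tilt the update rather than hard-constraining it (the paper's actual combination rule is more involved; names here are illustrative):

```python
def preference_tilted_grad(grad_log_pi, main_adv, aux_advs, weights):
    """Scale the score function by the main advantage plus a weighted
    sum of auxiliary advantages, so auxiliary tasks act as preferences
    rather than objectives in their own right.
    """
    preference = sum(w * a for w, a in zip(weights, aux_advs))
    return grad_log_pi * (main_adv + preference)
```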
Published in International Conference on Machine Learning, 2020
Reducing Sampling Error in Batch Temporal Difference Learning
Recommended citation: Brahma S. Pavse, Ishan Durugkar, Josiah P. Hanna, Peter Stone. (2020). "Reducing Sampling Error in Batch Temporal Difference Learning". International Conference on Machine Learning, 2020. https://idurugkar.github.com/files/PSEC-TD.pdf
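The correction can be sketched in the tabular case: estimate the batch's maximum-likelihood policy from action counts, then reweight each TD(0) update by pi(a|s) / pi_hat(a|s), so updates reflect the policy's true action probabilities rather than whichever actions happened to be sampled (a sketch consistent with the paper's motivation; policy(s, a) is an assumed callable returning the true probability):

```python
from collections import Counter, defaultdict

def psec_weights(transitions, policy):
    """transitions: list of (s, a, r, s_next) from the fixed batch.
    Returns one multiplicative correction weight per transition.
    """
    counts = defaultdict(Counter)
    for s, a, _, _ in transitions:
        counts[s][a] += 1
    weights = []
    for s, a, _, _ in transitions:
        pi_hat = counts[s][a] / sum(counts[s].values())  # MLE policy from data
        weights.append(policy(s, a) / pi_hat)
    return weights
```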
Published in International Joint Conference on Artificial Intelligence, 2020
A multiagent scenario in which agents have individual preferences regarding how to accomplish a shared task
Recommended citation: Ishan Durugkar, Elad Liebman, Peter Stone. (2020). "Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning". International Joint Conference on Artificial Intelligence, 2020. https://idurugkar.github.com/files/IJCAI20-ishand.pdf
Published in NeurIPS, 2020
Using Imitation from Observation techniques to speed up transfer of policies between environments with dynamics mismatch
Recommended citation: Siddarth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone. (2020). "An Imitation from Observation Approach to Sim-to-Real Transfer". NeurIPS, 2020. https://arxiv.org/pdf/2008.01594.pdf
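One way to set up such a transfer, sketched below: wrap the simulator in a learned action transformation g(s, a) and train g with an imitation-from-observation objective so that transformed-simulator transitions track the real system's (the Gym-style API and the name g are assumptions for illustration):

```python
class ActionTransformedEnv:
    """Simulator wrapper that routes every policy action through a
    learned transformation before stepping, grounding the simulator's
    dynamics toward the real environment's.
    """
    def __init__(self, sim_env, g):
        self.env, self.g = sim_env, g
        self._obs = None

    def reset(self):
        self._obs = self.env.reset()
        return self._obs

    def step(self, action):
        self._obs, reward, done, info = self.env.step(self.g(self._obs, action))
        return self._obs, reward, done, info
```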
Published in NeurIPS, 2021
Estimating and minimizing the Wasserstein distance to an idealized target distribution to learn a goal-conditioned policy
Recommended citation: Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone. (2021). "Adversarial Intrinsic Motivation for Reinforcement Learning". NeurIPS, 2021. https://arxiv.org/abs/2105.13345
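A sketch of the Wasserstein-1 estimate in its dual form: train a potential f to separate goal states from the agent's visited states under a WGAN-GP-style gradient penalty, after which f can supply an intrinsic reward pulling the state distribution toward the goal (the paper's regularizer and reward shaping differ in detail; equal batch sizes and 2-D state tensors are assumed):

```python
import torch

def potential_loss(f, agent_states, goal_states, penalty_coef=10.0):
    """Loss whose minimization maximizes E_goal[f] - E_agent[f] while a
    gradient penalty pushes f toward being 1-Lipschitz.

    f: torch module mapping (B, d) states to (B, 1) potentials.
    """
    gap = f(goal_states).mean() - f(agent_states).mean()
    # gradient penalty on interpolated states, pushing |grad f| toward 1
    eps = torch.rand(agent_states.size(0), 1)
    mid = (eps * goal_states + (1 - eps) * agent_states).requires_grad_(True)
    grad = torch.autograd.grad(f(mid).sum(), mid, create_graph=True)[0]
    penalty = ((grad.norm(dim=1) - 1.0) ** 2).mean()
    return -gap + penalty_coef * penalty
```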
Published in Deep RL Workshop, NeurIPS, 2021
Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process
Recommended citation: Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih. (2021). "Wasserstein Distance Maximizing Intrinsic Control". Deep RL Workshop, NeurIPS, 2021. https://arxiv.org/abs/2110.15331
Published in AAAI, 2023
Decentralized multiagent reinforcement learning in which each agent independently matches a target distribution to coordinate on a shared task
Recommended citation: Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone. (2023). "DM2: Decentralized Multi-Agent Reinforcement Learning via Distribution Matching". AAAI, 2023. https://arxiv.org/abs/2206.00233
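One standard way to reward distribution matching, sketched here in GAIL style: mix the environment reward with a bonus that is high where a discriminator judges the target distribution likelier than the agent's (the beta-mixing and the discriminator form are illustrative, not the paper's exact construction):

```python
import torch

def matching_reward(discriminator, state, env_reward, beta=0.5):
    """state: a single state tensor of shape (1, d); discriminator maps
    it to one logit and is trained to output high values on target states.
    """
    with torch.no_grad():
        d = torch.sigmoid(discriminator(state))
        bonus = torch.log(d + 1e-8) - torch.log(1.0 - d + 1e-8)
    return env_reward + beta * bonus.item()
```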