A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
This is a page not in the main menu.
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
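For reference, the corresponding setting lives in Jekyll's standard _config.yml:

```yaml
# _config.yml (Jekyll site configuration)
# When false, posts dated in the future are excluded from the build.
future: false
```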
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Published in 2013 IEEE International Conference on Systems, Man, and Cybernetics, 2013
Swarm technique for convex optimization
Recommended citation: Anand J Kulkarni, Ishan P Durugkar, Mrinal Kumar. (2013). "Cohort intelligence: a self supervised learning behavior" 2013 IEEE international conference on systems, man, and cybernetics. https://www.researchgate.net/profile/Anand_Kulkarni4/publication/262285087_Cohort_Intelligence_A_Self_Supervised_Learning_Behavior/links/00b49538dd90cb260e000000.pdf
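A loose sketch of the follow-and-resample behavior behind cohort intelligence, for a minimization problem (function and parameter names are illustrative, and the inverse-objective selection assumes strictly positive objective values):

```python
import numpy as np

def cohort_step(candidates, objective, width, shrink=0.95):
    """One iteration of a cohort-style update: each candidate picks a
    peer to follow with probability inversely proportional to the
    peer's objective value (better peers are followed more often),
    then resamples its variables near the followed peer.

    candidates: array of shape (n, d); objective: callable on a row.
    Returns the new cohort and the shrunken sampling width.
    """
    values = np.array([objective(c) for c in candidates])
    inv = 1.0 / values                      # assumes values > 0
    probs = inv / inv.sum()
    n = len(candidates)
    followed = candidates[np.random.choice(n, size=n, p=probs)]
    new = followed + np.random.uniform(-width, width, size=candidates.shape)
    return new, width * shrink
```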
Published in arXiv preprint arXiv:1606.04615, 2016
DQN with macro-actions
Recommended citation: Ishan P Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan. (2016). "Deep reinforcement learning with macro-actions". arXiv preprint arXiv:1606.04615. https://idurugkar.github.com/files/macro_actions.pdf
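A minimal sketch of how a macro-action changes the Q-learning bootstrap, using the simplest form of macro (a primitive action repeated k times) and the classic Gym env.step API; this illustrates the idea rather than the paper's exact architecture:

```python
def run_macro(env, action, repeat, gamma=0.99):
    """Execute a repeated-action macro and return what Q-learning needs:
    the final observation, the discounted reward accumulated inside the
    macro, and the effective discount for bootstrapping.

    Target becomes: macro_reward + discount * max_a Q(obs, a) if not done.
    """
    macro_reward, discount, done, obs = 0.0, 1.0, False, None
    for _ in range(repeat):
        obs, reward, done, _ = env.step(action)
        macro_reward += discount * reward
        discount *= gamma
        if done:
            break
    return obs, macro_reward, discount, done
```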
Published in arXiv preprint arXiv:1608.05983, 2016
Mapping VAE decoder samples back to the latent space for improved generative accuracy
Recommended citation: Ian Gemp, Ishan Durugkar, Mario Parente, M Darby Dyar, Sridhar Mahadevan. (2016). "Inverting Variational Autoencoders for Improved Generative Accuracy". arXiv preprint arXiv:1608.05983. https://idurugkar.github.com/files/inverting_variational_ae.pdf
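One generic way to realize such an inversion is gradient descent on reconstruction error in latent space; a minimal PyTorch sketch (the paper's actual procedure may differ in objective and initialization):

```python
import torch

def invert_decoder(decoder, x, latent_dim, steps=500, lr=1e-2):
    """Recover a latent code for sample x by minimizing
    ||decoder(z) - x||^2 with respect to z.

    decoder: any differentiable torch module mapping (1, latent_dim)
    to x's shape; x: a single sample tensor.
    """
    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((decoder(z) - x) ** 2).mean()
        loss.backward()
        opt.step()
    return z.detach()
```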
Published in Twenty-Ninth IAAI Conference, 2017
Predicting nonstationarity in changing real-world scenarios for off-policy evaluation
Recommended citation: Philip S Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, Emma Brunskill. (2017). "Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing". Twenty-Ninth IAAI Conference. https://idurugkar.github.com/files/predictive_off_policy_evaluation_IAAI2017.pdf
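The core move, estimating past performance and then forecasting it forward rather than assuming stationarity, can be sketched with per-period importance-sampling estimates and a simple trend fit (the paper's forecasting and confidence machinery are far more careful than this):

```python
import numpy as np

def forecast_value(is_estimates, horizon):
    """Fit a linear trend to a time series of importance-sampling
    estimates of a policy's value (one per day/episode batch) and
    extrapolate `horizon` periods past the end of the data.
    """
    t = np.arange(len(is_estimates))
    slope, intercept = np.polyfit(t, is_estimates, deg=1)
    return intercept + slope * (len(is_estimates) - 1 + horizon)
```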
Published in International Conference on Learning Representations, 2017
Improving GAN training with multiple discriminators
Recommended citation: Ishan Durugkar, Ian Gemp, Sridhar Mahadevan. (2017). "Generative Multi-Adversarial Networks". International Conference on Learning Representations, 2017. https://idurugkar.github.com/files/GMAN_ICLR2017.pdf
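A sketch of the generator-side aggregation over several discriminators, softmax-weighted toward the harshest critic in the GMAN style (lam -> infinity recovers the max over discriminators, lam = 0 the mean):

```python
import torch

def multi_discriminator_loss(d_losses, lam=1.0):
    """Aggregate the generator's loss under each discriminator into a
    single training signal, with higher-loss (harsher) discriminators
    weighted more heavily.

    d_losses: 1-D tensor, one generator loss per discriminator.
    """
    weights = torch.softmax(lam * d_losses, dim=0)
    return (weights * d_losses).sum()
```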
Published in Deep Reinforcement Learning Symposium, NeurIPS, 2017
Constraining the TD update to improve stability
Recommended citation: Ishan Durugkar, Peter Stone. (2017). "TD Learning with Constrained Gradients". Deep Reinforcement Learning Symposium, NeurIPS 2017. https://idurugkar.github.com/files/constrained_td_NeurIPS2017_workshop.pdf
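A sketch of the constraint for TD(0) with a differentiable value function: take the usual semi-gradient step, but first project out the component that would move the bootstrap target V(s'), so the target stays fixed to first order; variable names are illustrative:

```python
import numpy as np

def constrained_td_step(theta, grad_v_s, grad_v_s_next, td_error, alpha=0.1):
    """grad_v_s, grad_v_s_next: gradients of V(s) and V(s') w.r.t. theta;
    td_error: r + gamma * V(s') - V(s).
    """
    update = td_error * grad_v_s                 # standard semi-gradient direction
    denom = grad_v_s_next @ grad_v_s_next
    if denom > 1e-12:
        # remove the component along grad V(s') so the target doesn't move
        update -= ((update @ grad_v_s_next) / denom) * grad_v_s_next
    return theta + alpha * update
```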
Published in International Conference on Learning Representations, 2018
Using Policy Gradient to learn navigation in knowledge bases
Recommended citation: Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum. (2018). "Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning". International Conference on Learning Representations, 2018. https://idurugkar.github.com/files/MINERVA_ICLR2018.pdf
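A toy rollout of the navigation-as-RL formulation: walk the knowledge graph by sampling outgoing edges from a policy, receive reward 1 only if the final entity answers the query, then apply REINFORCE; the data structures below are hypothetical stand-ins for MINERVA's LSTM policy over entity and relation embeddings:

```python
import numpy as np

def rollout(graph, policy_probs, start, answer, max_steps=3):
    """graph: dict entity -> list of (relation, next_entity) edges;
    policy_probs(entity): probabilities over that entity's outgoing edges.
    Returns the (entity, edge-index) trajectory and terminal reward,
    the ingredients of a REINFORCE update on the policy.
    """
    node, path = start, []
    for _ in range(max_steps):
        edges = graph[node]
        i = np.random.choice(len(edges), p=policy_probs(node))
        path.append((node, i))
        node = edges[i][1]
    reward = 1.0 if node == answer else 0.0
    return path, reward
```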
Published in arXiv preprint, 2019
Using multiple auxiliary tasks as soft preferences on the policy while learning
Recommended citation: Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine. (2019). "Multi-Preference Actor Critic". arXiv preprint, 2019. https://idurugkar.github.com/files/MultiPref_ActorCritic.pdf
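A generic sketch of the soft-preference idea in policy-gradient form: auxiliary-task advantages tilt the update rather than hard-constraining it (the paper's actual combination rule is more involved; names here are illustrative):

```python
def preference_tilted_grad(grad_log_pi, main_adv, aux_advs, weights):
    """Scale the score function by the main advantage plus a weighted
    sum of auxiliary advantages, so auxiliary tasks act as preferences
    rather than objectives in their own right.
    """
    preference = sum(w * a for w, a in zip(weights, aux_advs))
    return grad_log_pi * (main_adv + preference)
```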
Published in International Conference on Machine Learning, 2020
Reducing Sampling Error in Batch Temporal Difference Learning
Recommended citation: Brahma S. Pavse, Ishan Durugkar, Josiah P. Hanna, Peter Stone. (2020). "Reducing Sampling Error in Batch Temporal Difference Learning". International Conference on Machine Learning, 2020. https://idurugkar.github.com/files/PSEC-TD.pdf
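The correction can be sketched in the tabular case: estimate the batch's maximum-likelihood policy from action counts, then reweight each TD(0) update by pi(a|s) / pi_hat(a|s), so updates reflect the policy's true action probabilities rather than whichever actions happened to be sampled (a sketch consistent with the paper's motivation; policy(s, a) is an assumed callable returning the true probability):

```python
from collections import Counter, defaultdict

def psec_weights(transitions, policy):
    """transitions: list of (s, a, r, s_next) from the fixed batch.
    Returns one multiplicative correction weight per transition.
    """
    counts = defaultdict(Counter)
    for s, a, _, _ in transitions:
        counts[s][a] += 1
    weights = []
    for s, a, _, _ in transitions:
        pi_hat = counts[s][a] / sum(counts[s].values())  # MLE policy from data
        weights.append(policy(s, a) / pi_hat)
    return weights
```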
Published in International Joint Conference on Artificial Intelligence, 2020
A multiagent scenario in which agents have individual preferences regarding how to accomplish a shared task
Recommended citation: Ishan Durugkar, Elad Liebman, Peter Stone. (2020). "Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning". International Joint Conference on Artificial Intelligence, 2020. https://idurugkar.github.com/files/IJCAI20-ishand.pdf
Published in NeurIPS, 2020
Using Imitation from Observation techniques to speed up transfer of policies between environments with dynamics mismatch
Recommended citation: Siddarth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone. (2020). "An Imitation from Observation Approach to Sim-to-Real Transfer". NeurIPS, 2020. https://arxiv.org/pdf/2008.01594.pdf
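One way to set up such a transfer, sketched below: wrap the simulator in a learned action transformation g(s, a) and train g with an imitation-from-observation objective so that transformed-simulator transitions track the real system's (the Gym-style API and the name g are assumptions for illustration):

```python
class ActionTransformedEnv:
    """Simulator wrapper that routes every policy action through a
    learned transformation before stepping, grounding the simulator's
    dynamics toward the real environment's.
    """
    def __init__(self, sim_env, g):
        self.env, self.g = sim_env, g
        self._obs = None

    def reset(self):
        self._obs = self.env.reset()
        return self._obs

    def step(self, action):
        self._obs, reward, done, info = self.env.step(self.g(self._obs, action))
        return self._obs, reward, done, info
```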
Published in NeurIPS, 2021
Estimating and minimizing the Wasserstein distance to an idealized target distribution to learn a goal-conditioned policy
Recommended citation: Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone. (2021). "Adversarial Intrinsic Motivation for Reinforcement Learning". NeurIPS, 2021. https://arxiv.org/abs/2105.13345
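A sketch of the Wasserstein-1 estimate in its dual form: train a potential f to separate goal states from the agent's visited states under a WGAN-GP-style gradient penalty, after which f can supply an intrinsic reward pulling the state distribution toward the goal (the paper's regularizer and reward shaping differ in detail; equal batch sizes and 2-D state tensors are assumed):

```python
import torch

def potential_loss(f, agent_states, goal_states, penalty_coef=10.0):
    """Loss whose minimization maximizes E_goal[f] - E_agent[f] while a
    gradient penalty pushes f toward being 1-Lipschitz.

    f: torch module mapping (B, d) states to (B, 1) potentials.
    """
    gap = f(goal_states).mean() - f(agent_states).mean()
    # gradient penalty on interpolated states, pushing |grad f| toward 1
    eps = torch.rand(agent_states.size(0), 1)
    mid = (eps * goal_states + (1 - eps) * agent_states).requires_grad_(True)
    grad = torch.autograd.grad(f(mid).sum(), mid, create_graph=True)[0]
    penalty = ((grad.norm(dim=1) - 1.0) ** 2).mean()
    return -gap + penalty_coef * penalty
```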
Published in Deep RL Workshop, NeurIPS, 2021
Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process
Recommended citation: Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih. (2021). "Wasserstein Distance Maximizing Intrinsic Control". Deep RL Workshop, NeurIPS, 2021. https://arxiv.org/abs/2110.15331
Published in AAAI, 2023
Decentralized multiagent reinforcement learning in which each agent independently matches a target distribution to coordinate on a shared task
Recommended citation: Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone. (2023). "DM2: Decentralized Multi-Agent Reinforcement Learning via Distribution Matching". AAAI, 2023. https://arxiv.org/abs/2206.00233
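One standard way to reward distribution matching, sketched here in GAIL style: mix the environment reward with a bonus that is high where a discriminator judges the target distribution likelier than the agent's (the beta-mixing and the discriminator form are illustrative, not the paper's exact construction):

```python
import torch

def matching_reward(discriminator, state, env_reward, beta=0.5):
    """state: a single state tensor of shape (1, d); discriminator maps
    it to one logit and is trained to output high values on target states.
    """
    with torch.no_grad():
        d = torch.sigmoid(discriminator(state))
        bonus = torch.log(d + 1e-8) - torch.log(1.0 - d + 1e-8)
    return env_reward + beta * bonus.item()
```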