Dhruva Tirumala: Using behavior priors for data efficiency in Reinforcement Learning

Abstract

While reinforcement learning has shown great promise as a viable solution to problems such as Go and Atari, its application to challenges like continuous control has arguably been less successful. Control and robotics impose real-world constraints in terms of data sparsity, and exploration in high-dimensional continuous action spaces is challenging. Faced with this, methods that allow us to inject prior knowledge about the structure of the world become increasingly important. In this talk, I will discuss one such method, which we dub behavior priors: probabilistic models that capture common movement and interaction patterns shared across a set of related tasks or contexts. For example, the day-to-day behavior of humans comprises distinctive locomotion and manipulation patterns that recur across many different situations and goals. I will discuss how these models can be integrated into reinforcement learning schemes to facilitate multi-task and transfer learning. I will then extend these ideas to latent variable models and consider a formulation for learning hierarchical priors that capture different aspects of behavior in reusable modules. I will discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL), thereby offering an alternative perspective on existing ideas. Finally, I will discuss some exciting prospects and potential directions for extending this line of research.

Date
February 11, 2021 16:00 — 17:00
Event
SML Seminar
Location
online

Bio

Dhruva Tirumala obtained his Master's degree in Electrical and Computer Engineering (ECE) at Carnegie Mellon University (CMU) in 2016. After this, he joined DeepMind in London as a research engineer in 2016 and joined the UCL-DeepMind PhD program in 2019. He is co-advised by Dr. Nicolas Heess at DeepMind and Prof. Danail Stoyanov at UCL. His research focuses on applying reinforcement learning techniques to problems in control. Specifically, his interests lie in incorporating data or other forms of prior knowledge within the reinforcement learning framework to solve real-world problems.
Marc Deisenroth
DeepMind Chair in Artificial Intelligence