Recent & Upcoming Talks

Dhruva Tirumala: Using behavior priors for data efficiency in Reinforcement Learning

While reinforcement learning has shown great promise on problems such as Go and Atari, its application to challenges like continuous control has arguably been less successful. Control and robotics impose real-world constraints in terms of data sparsity, and exploration in high-dimensional continuous action spaces is challenging. Faced with this, methods that allow us to inject prior knowledge about the structure of the world become increasingly important. In this talk, I will discuss one such method that we dub behavior priors: probabilistic models that capture common movement and interaction patterns shared across a set of related tasks or contexts. For example, the day-to-day behavior of humans comprises distinctive locomotion and manipulation patterns that recur across many different situations and goals. I will discuss how these models can be integrated into reinforcement learning schemes to facilitate multi-task and transfer learning. I will then extend these ideas to latent variable models and consider a formulation for learning hierarchical priors that capture different aspects of the behavior in reusable modules. I will discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL), thereby offering an alternative perspective on existing ideas. Finally, I will discuss some exciting prospects and potential directions for extending this line of research.