Learning compositional policies via hierarchical clustering
Description
Zoom link: https://mit.zoom.us/j/96471630432
Artificial agents have demonstrated tremendous success on a wide variety of tasks, at times even surpassing human performance. Yet they still show limited ability to generalise beyond the narrow settings in which they were trained, even when the new contexts are meaningfully similar to the old ones. Indeed, one of the hallmarks of natural intelligence is our ability to rapidly solve new tasks by leveraging prior knowledge. One way humans accomplish this is by abstracting out a latent structure from the task that can then be transferred to new contexts. Moreover, if the task has a compositional structure, humans will flexibly recombine familiar structural components in novel ways to quickly solve the new task. For instance, a musician learning the piano will recognise that fingerings are independent of songs. When she later learns another keyboard instrument, such as the organ or harpsichord, she can readily transfer her knowledge of piano fingerings while learning songs specific to the new instrument. However, the extent to which task components should be learnt independently or treated as a joint unit depends on task statistics: when components are highly correlated with each other, it may be more advantageous to learn them as a single joint unit, but this comes at the cost of reduced transferability. Previous work has explored the two extremes, in which components are either learnt jointly as a single unit or separately as independent components. Drawing on methods from hierarchical non-parametric Bayesian inference, we present work in which the agent learns not only about each component separately but also about the variety of correlational structures that may bind them together. We show that this agent is highly expressive and can readily adapt to a broad range of environment statistics with varying degrees of jointness or independence between structural components.
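To give a concrete flavour of the kind of non-parametric machinery the abstract alludes to, the short Python sketch below contrasts a Chinese Restaurant Process (CRP) prior over joint (fingering, song) structures with independent CRP priors over each component. It is purely illustrative and not the speaker's actual model; all counts, names and the concentration parameter alpha are made-up assumptions.

    import numpy as np

    # Illustrative sketch only (not the speaker's model): CRP predictive
    # probabilities for reusing an existing latent structure vs. spawning
    # a new one when a new task context is encountered.

    def crp_predictive(counts, alpha):
        """CRP predictive: existing clusters in proportion to their counts,
        a new cluster in proportion to the concentration parameter alpha."""
        counts = np.asarray(counts, dtype=float)
        probs = np.append(counts, alpha)
        return probs / probs.sum()

    # Toy task statistics (hypothetical): how often each structure has been
    # seen across previous contexts.
    joint_counts = {("piano_fingering", "piano_song"): 4,
                    ("piano_fingering", "organ_song"): 1}
    fingering_counts = {"piano_fingering": 5}
    song_counts = {"piano_song": 4, "organ_song": 1}

    alpha = 1.0

    # Joint clustering: each (fingering, song) pair is treated as one unit.
    p_joint = crp_predictive(list(joint_counts.values()), alpha)

    # Independent clustering: the prior factorises over the two components,
    # so familiar fingerings can be recombined with novel songs.
    p_fingering = crp_predictive(list(fingering_counts.values()), alpha)
    p_song = crp_predictive(list(song_counts.values()), alpha)

    print("joint CRP predictive:", p_joint)
    print("independent CRP predictive (fingering x song):",
          np.outer(p_fingering, p_song))

The joint prior concentrates mass on pairings seen before, whereas the factorised prior spreads mass over novel recombinations; a hierarchical agent of the kind described in the talk would additionally infer how much weight to place on each of these regimes from the task statistics.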
Speaker Bio
Rex Liu is a postdoc working under the supervision of Michael Frank and Thomas Serre at Brown University. Broadly speaking, he is interested in bridging the gap between natural and artificial intelligence. His specific interests are in model-based reinforcement learning, including how we are able to learn latent structures and hierarchical abstractions to support efficient planning and transfer in rich, high-dimensional environments. Before joining Brown, he was a postdoc with Alex Pouget at the University of Geneva, working on quantifying information-limiting noise correlations. He obtained both his BA in Natural Sciences and his PhD in Applied Mathematics and Theoretical Physics from the University of Cambridge, UK, where he pursued research in theoretical gravitation and cosmology.