Guiding Exploration through Generalization
Description
From foraging for food to learning complex games, many types of intelligent behavior can be framed as search problems, where an individual must explore a vast set of possible actions. Under finite search horizons, optimal solutions are generally unobtainable, yet humans face such problems on a daily basis. How do humans navigate vast state spaces, where the key question is not “when” but “where” to explore? One key ingredient of human intelligence is the ability to generalize from observed to unobserved outcomes, in order to form intuitions about where exploration seems promising. Using a variety of bandit tasks with up to 121 arms, we study how humans search for rewards under limited search horizons, where the spatial correlation of rewards (in both artificial and natural environments) provides traction for generalization. Across a diverse set of probabilistic and heuristic models, and using out-of-sample prediction accuracy, we find strong evidence that Gaussian Process function learning, combined with an optimistic Upper Confidence Bound sampling strategy, provides a robust model of how humans use generalization to guide search. Our modelling results and parameter estimates are highly recoverable and can be used to simulate human-like performance, while also suggesting a systematic (yet sometimes beneficial) tendency towards undergeneralization. These results have been replicated in a number of follow-up studies, where we use this paradigm to explain developmental differences in exploratory behavior in children and to model generalization across not only spatial but also abstract, conceptual features.
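To make the model concrete, here is a minimal sketch of GP-UCB search on a 121-arm spatially correlated bandit. It is illustrative only: it assumes an RBF kernel, a greedy argmax over UCB values of the form UCB(x) = μ(x) + β·σ(x) rather than a fitted stochastic choice rule, and hand-picked parameter values (length scale, exploration bonus β, noise level) that are not the fitted estimates from the studies described above.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale):
    """RBF kernel: nearby arms have correlated rewards; length_scale
    controls how broadly an observation generalizes in space."""
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return np.exp(-sq_dists / (2.0 * length_scale**2))

def gp_posterior(X_obs, y_obs, X_all, length_scale, noise):
    """GP posterior mean and standard deviation over every arm,
    given the noisy rewards observed so far."""
    K = rbf_kernel(X_obs, X_obs, length_scale) + noise**2 * np.eye(len(X_obs))
    K_s = rbf_kernel(X_obs, X_all, length_scale)
    K_ss = rbf_kernel(X_all, X_all, length_scale)
    mu = K_s.T @ np.linalg.solve(K, y_obs)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mu, sigma

rng = np.random.default_rng(0)

# 11 x 11 grid of arms; sample a smooth reward surface from the GP
# prior itself so that spatial correlation holds by construction.
grid = np.array([(i, j) for i in range(11) for j in range(11)], dtype=float)
K_env = rbf_kernel(grid, grid, length_scale=2.0) + 1e-6 * np.eye(len(grid))
true_rewards = rng.multivariate_normal(np.zeros(len(grid)), K_env)

# Short search horizon relative to the 121 arms.
beta, noise, horizon = 0.5, 0.1, 25
chosen = [int(rng.integers(len(grid)))]  # random first choice
observed = [true_rewards[chosen[0]] + rng.normal(0.0, noise)]

for _ in range(horizon - 1):
    mu, sigma = gp_posterior(grid[chosen], np.array(observed), grid,
                             length_scale=2.0, noise=noise)
    # UCB(x) = mu(x) + beta * sigma(x): optimism directs search toward
    # arms that are predicted to be good OR are still uncertain.
    pick = int(np.argmax(mu + beta * sigma))
    chosen.append(pick)
    observed.append(true_rewards[pick] + rng.normal(0.0, noise))

print(f"best arm: {true_rewards.max():.2f}, "
      f"best found in {horizon} trials: {max(observed):.2f}")
```

Setting the learner's length scale below the environment's own smoothness (e.g., 1.0 versus 2.0 in this sketch) gives a simple handle on the undergeneralization tendency mentioned above: the model then generalizes less broadly than the reward structure actually warrants.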
Additional Info
Upcoming Cog Lunches
- December 4, 2018 - Daniel Czegel
- December 11, 2018 - James Traer