Learning to act with objects, relations and physics
Description
Humans display an unparalleled degree of control over their environments. From a young age, people are able to represent the world in ways that allow them not just to make inferences about how the world works, but also to act and intervene on the world in order to accomplish their goals. This has been challenging to replicate in machines -- while children might be able to learn an actionable concept like “catapulting” from just a single demonstration or a few trials of experience, it might take a machine hundreds, thousands, or even millions of attempts to grasp such a skill. What can account for such glaring differences in flexibility and efficiency between human and machine action?
Across two environments introduced in this thesis, the Virtual Tools game and the Gluing Task, we propose that structured action spaces and mental simulation are crucial for explaining these differences between human and machine action, and we test this proposal through a suite of computational models and behavioral experiments. In the Gluing Task, we imbue unstructured machine learning agents with relational action spaces, substantially improving their ability to generalize to new problems and even allowing them to surpass human performance given enough experience. In the Virtual Tools game, we introduce a more general domain for studying the quintessentially human ability to use tools to solve physical problems by “catapulting”, “tipping”, “supporting”, or “launching” objects in a scene to accomplish a particular goal. We devise the “Sample, Simulate, Update” model, which combines structured action spaces with mental simulation and rapid strategy updating to capture not just the flexibility of human tool use, but also the efficiency with which people arrive at solutions to particular problems within just a handful of attempts. To capture how people learn rapidly across problems with different objects and tools, we extend “Sample, Simulate, Update” to learn strategies represented as relational programs that guide how mental simulation is used. This model accounts for both people’s successes and their failures to generalize across problems, and it predicts how people should compose physical knowledge, such as object mass, with learned strategies in new situations. Finally, we show how even more abstract “meta-strategies” can be learned from everyday embodied experience, changing the ways in which people use mental simulation to solve problems. Individuals born with one hand instead of two spend significantly more time thinking and less time acting when faced with a new physical puzzle, perhaps reflecting a higher cost of action learned from their everyday experience. We conclude by discussing the implications of this work for tool cognition more broadly, from tool synthesis to cumulative technological culture, as well as how these results might inform future work in artificial intelligence, machine learning, and robotics.
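For readers who want a concrete picture of the kind of loop sketched above, the following is a minimal, hypothetical Python illustration of a sample-simulate-update cycle. The function and parameter names here (simulate_fn, act_fn, n_sims) are assumptions made purely for exposition; they are not the actual models or experiments from the thesis.

    # Hypothetical sketch of a "Sample, Simulate, Update"-style loop, for
    # illustration only; interfaces are assumptions, not the thesis's code.
    import random

    def sample_simulate_update(actions, simulate_fn, act_fn,
                               max_attempts=10, n_sims=4):
        """Keep a weight over structured candidate actions; each attempt,
        sample candidates, vet them in noisy mental simulation, act on the
        most promising one, and update the weights from the real outcome."""
        weights = {a: 1.0 for a in actions}
        for _ in range(max_attempts):
            # Sample candidate actions in proportion to current belief.
            candidates = random.choices(list(weights),
                                        weights=list(weights.values()),
                                        k=n_sims)
            # Simulate each candidate with an internal, imperfect physics model
            # and pick the one that looks best in imagination.
            best = max(candidates, key=simulate_fn)
            # Act in the world; act_fn returns (solved, observed score in [0, 1]).
            solved, score = act_fn(best)
            if solved:
                return best
            # Update: down-weight actions that looked promising but fell short.
            weights[best] *= 0.25 + score
        return None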
Zoom link: https://mit.zoom.us/j/98075895390?pwd=VWUvOGFZdWZYbUVQdnFLQmJsZnFwUT09
Password if prompted: 985483