Scene understanding by bottom-up top-down visual routines

Speaker(s)

Shimon Ullman

April 8, 2021

8:00 pm - 9:00 pm

Location

Zoom

Contact

Catherine Nunziata

Host

TBA

Description

We will present a model in which meaningful understanding of scenes is obtained from the combined processing of a bottom-up (BU) stream and a top-down (TD) stream, which interact through bi-directional communication. The BU stream creates a partial visual representation in the higher-level parts of the model. The model then provides a top-down instruction to the TD stream, which guides the next cycle to extract selected information and expand the existing representation. By automatically selecting an appropriate sequence of TD instructions, the model successively extracts structures of interest from the scene in a goal-directed manner. We will show recent results on extracting complex scene structures, and the ability of the model to generalize broadly to novel scene configurations.
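As a rough illustration of the cycle described above, the following toy Python sketch may help. It is not the speaker's model: the scene dictionary, the function names (`bottom_up`, `next_instruction`, `understand`), and the chain-of-relations goal are all hypothetical, and a simple lookup stands in for the visual processing. It only shows the control flow: a first BU pass builds a partial representation, and each TD instruction then selects one piece of information to extract in the next pass.

```python
# Toy "scene": a hypothetical stand-in for an image, queried one item at a time.
SCENE = {
    "person": {"holding": "cup", "left_of": "table"},
    "cup": {"color": "red"},
    "table": {"color": "brown"},
}


def bottom_up(scene, instruction, representation):
    """One BU pass. Without an instruction it records only which objects
    exist; with an instruction it extracts the single selected item."""
    if instruction is None:
        for name in scene:
            representation.setdefault(name, {})
    else:
        obj, attr = instruction
        representation[obj][attr] = scene[obj][attr]
    return representation


def next_instruction(representation, start, chain):
    """Choose the next TD instruction from what is already represented.
    Follows the goal chain and returns the first missing (object, attribute)
    pair, or None once the goal is fully represented."""
    obj = start
    for attr in chain:
        known = representation.get(obj, {})
        if attr not in known:
            return (obj, attr)
        obj = known[attr]
    return None


def understand(scene, start, chain):
    """Alternate BU passes and TD instructions until the goal is met."""
    representation = bottom_up(scene, None, {})  # initial, partial representation
    trace = []
    while (instruction := next_instruction(representation, start, chain)) is not None:
        trace.append(instruction)
        representation = bottom_up(scene, instruction, representation)
    return representation, trace


# Goal (hypothetical): "what color is the thing the person is holding?"
representation, trace = understand(SCENE, "person", ["holding", "color"])
print(trace)
print(representation)
```

In this sketch the second instruction depends on the result of the first, and the table's properties are never extracted because the goal does not require them, which mirrors the goal-directed selection described in the abstract.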

Zoom link: https://mit.zoom.us/j/93324663073

Speaker Bio

Prof. Ullman is the Samy and Ruth Cohn Professor of Computer Science and Head of the Department of Computer Science and Applied Mathematics at the Weizmann Institute of Science, Rehovot, Israel. He is also an adjunct professor in the Brain and Cognitive Sciences Department at MIT.