Sarah Schwettmann Thesis Defense: Generalizable Representations for Vision in Biological and Artificial Neural Networks

August 30, 2021
1:00 pm - 2:00 pm
Contact: jugale@mit.edu
Description

Zoom link: https://mit.zoom.us/j/3639522280

Speaker: Sarah Schwettmann, Torralba & Tenenbaum Labs

Abstract:

This thesis makes empirical and methodological progress toward closing the representational gap between human perception and generative models.

Human vision is characteristically flexible and generalizable. One of the persistent challenges of vision science is understanding the underlying representations that allow us to recognize objects and scene attributes across a diversity of environments. A central framework for identifying such representations is inverse graphics, which hypothesizes that the brain achieves robust scene understanding from image data by inverting generative models to recover their latent parameters. I demonstrate that we can directly test the biological plausibility of generative models by uncovering relevant neural representations in the human brain. For instance, if physical reasoning were implemented in the brain as probabilistic simulations of a mental physics engine, we would expect neural representations of physical properties like object mass to be abstract and invariant, useful as inputs to a forward model of objects and their dynamics. I present the first evidence that this is indeed the case: fMRI decoding analyses in brain regions implicated in intuitive physics reveal mass representations that generalize across variations in physical scene, material, friction, and motion energy.
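
To make the generalization test concrete, the sketch below shows the basic logic of a cross-condition decoding analysis in Python. All of it is illustrative: the voxel patterns, scene conditions, and mass labels are synthetic placeholders, not the thesis's data or pipeline.

    # Sketch of a cross-condition generalization test for a neural decoder.
    # Data are synthetic stand-ins for fMRI voxel patterns; the question is
    # whether a mass decoder trained on one scene type transfers to another,
    # which would indicate an abstract, invariant representation.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    n_trials, n_voxels = 80, 500

    # Hypothetical trials-by-voxels response patterns for two scene conditions.
    X_scene_A = rng.standard_normal((n_trials, n_voxels))  # e.g. splash scenes
    X_scene_B = rng.standard_normal((n_trials, n_voxels))  # e.g. collision scenes
    y_A = rng.integers(0, 2, n_trials)  # binary mass label: light vs. heavy
    y_B = rng.integers(0, 2, n_trials)

    # Train a linear decoder on condition A, then test on held-out condition B.
    decoder = make_pipeline(StandardScaler(), LinearSVC())
    decoder.fit(X_scene_A, y_A)
    print(f"cross-condition accuracy: {decoder.score(X_scene_B, y_B):.2f}")

With random data this hovers at chance (about 0.50); reliably above-chance transfer on real recordings is what supports the invariance claim.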

We can describe real-world physical scene and object understanding as inverse graphics because we know how to formalize the forward graphics model in a meaningful way, e.g. as a physics engine, such that vision inverts it. However, this is not the case for other attributes of visual scenes, such as their style or mood, where the relationship between what is experienced and the underlying image data is difficult to formalize and is not sufficiently explained by optical principles or physical models that can be inverted. How do we begin to get traction on how humans experience higher-level aspects of visual scenes, or recognize and appreciate meaningful structure that may be difficult to articulate?

I argue that large and flexible generative models for computer vision, which learn structure entirely from data, offer a promising setting for probing computational representations of human-interpretable concepts at different levels of abstraction. Attempts to interpret deep networks have traditionally searched only for predetermined sets of concepts, limiting what representations they can discover. I introduce a more data-driven approach to the interpretation question: a framework for building shared vocabularies, represented by deep networks and salient to humans, from the ground up. I present a procedure that uses human annotations to discover an open-ended set of visual concepts, ranging from low-level features of individual objects to high-level attributes of visual scenes, in the same representational space. In a series of experiments with human participants, I show that concepts learned with this approach are reliable and freely composable: they generalize across scenes and observers, and enable fine-grained manipulation of image style and content. Next, I introduce a learned captioning model that maps patterns of neuron activation to natural language strings, making it possible to generate open-ended, compositional descriptions of neuron function. These approaches enable us to map between visual concepts in model representations and human perception, analyze models, and synthesize novel scenes that extrapolate dimensions of visual experience that are meaningful to observers.
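
As a rough illustration of the annotation-driven side of this framework, the sketch below fits a linear probe from a generator's latent codes to human concept ratings and treats the weight vector as a candidate concept direction. The data, the latent dimensionality, and the generator call are hypothetical placeholders, not the procedure from the thesis.

    # Sketch: recover a human-salient concept direction in a generative
    # model's latent space from annotation data. Latents and ratings are
    # synthetic; `generator` stands in for any pretrained image generator.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    n_images, latent_dim = 200, 512

    # Latent codes that produced images, plus human ratings (0-1) of how
    # strongly each image expresses one annotated concept, e.g. "cluttered".
    Z = rng.standard_normal((n_images, latent_dim))
    scores = rng.uniform(0.0, 1.0, n_images)

    # The probe's weight vector is a direction along which the concept varies.
    probe = Ridge(alpha=1.0).fit(Z, scores)
    direction = probe.coef_ / np.linalg.norm(probe.coef_)

    # Editing a scene: shift a latent along the direction and re-render.
    z = rng.standard_normal(latent_dim)
    z_edited = z + 3.0 * direction
    # image_before, image_after = generator(z), generator(z_edited)  # illustrative

Because the probe is linear, directions learned for different concepts can be combined by vector addition, which is one way such a vocabulary could support the free composition described above.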

     
