
Adversarial examples and human-ML alignment
Description
Machine learning models today achieve impressive performance on challenging benchmark tasks. Yet, these models remain remarkably brittle: small perturbations of natural inputs, known as adversarial examples, can cause them to make confidently wrong predictions.
Why is this the case?
In this tutorial, we take a closer look at this question, and demonstrate that the observed brittleness can be largely attributed to the fact that our models tend to solve classification tasks quite differently from humans. Specifically, viewing neural networks as feature extractors, we study how features extracted by neural networks may diverge from those used by humans, and how adversarially robust models can help to make progress towards bridging this gap.
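To make the notion of a small perturbation concrete, here is a minimal sketch of the fast gradient sign method (FGSM) from the first item in the suggested reading below. It assumes PyTorch and a pretrained torchvision classifier; the function names, epsilon value, and usage lines are illustrative and are not taken from the tutorial's materials.

import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained ImageNet classifier, used purely for illustration.
model = models.resnet18(pretrained=True).eval()

def fgsm_perturb(image, label, epsilon=0.01):
    """One-step FGSM: nudge each pixel by epsilon in the direction that increases the loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Usage sketch: x is a batch of images scaled to [0, 1], y the integer class labels.
# x_adv = fgsm_perturb(x, y)
# print(model(x).argmax(1), model(x_adv).argmax(1))  # predictions frequently differ

Even for small epsilon, such perturbations are typically imperceptible to humans yet change the model's prediction.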
Zoom meeting: https://mit.zoom.us/j/98984209390?pwd=M1Z4Mk02N0NwWEZ5Z3ZYTEo5TWlJUT09
Password: 107671
Additional tutorial info:
The tutorial will include demos; we will use Colab notebooks, so please bring a laptop. In these demos, we will explore the brittleness of standard ML models by crafting adversarial perturbations, and use them as a lens to inspect the features models rely on.
Github link for demos: https://github.com/MadryLab/AdvEx_Tutorial
Suggested reading (in order of importance):
Adversarial examples: https://arxiv.org/abs/1412.6572
Training robust models: https://arxiv.org/abs/1706.06083
ML models rely on imperceptible features: https://arxiv.org/abs/1905.02175
Robustness as a feature prior: https://arxiv.org/abs/1805.12152, https://arxiv.org/abs/1906.00945
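The "Training robust models" reading centers on adversarial training: minimizing the loss on worst-case (PGD-perturbed) inputs rather than clean ones. The sketch below illustrates that idea under the assumption of a PyTorch setup; the model, optimizer, and hyperparameters are placeholders, not the tutorial's actual code.

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step_size=2/255, steps=7):
    """Iterated gradient ascent on the loss, projected back into an L-infinity ball of radius eps."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project onto the eps-ball around x
        x_adv = x_adv.clamp(0, 1)                  # stay in the valid pixel range
    return x_adv.detach()

def robust_training_step(model, optimizer, x, y):
    """Single step of adversarial training: fit the model to worst-case perturbed inputs."""
    model.eval()                       # run the attack with fixed batch-norm statistics
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()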
Speaker Bio
Shibani Santurkar is a PhD student in the MIT EECS Department, advised by Aleksander Mądry and Nir Shavit. Her research focuses on two broad themes: developing a precise understanding of widely used deep learning techniques, and finding avenues to make machine learning methods robust and reliable. Prior to joining MIT, she received a bachelor's degree in electrical engineering from IIT Bombay. She is a recipient of the Google Fellowship in Machine Learning.
Additional Info
Upcoming computational tutorials (Thursdays from 1pm-2:30pm)
Aug 6 Noga Zaslavsky: Information bottleneck method and applications for modeling human cognition
Sep 3 Kim Scott and Maddie Pelz: Lookit platform for online behavioral studies
Sep 17 Christian Bueno: Dimensionality reduction of dynamical systems
We are always looking for more speakers to present in the series! If you would like to host a tutorial, please contact Jenelle Feather (jfeather@mit.edu) or Nhat Le (nmle@mit.edu).