
Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations
Description
Current state-of-the-art object recognition models are largely based on convolutional neural network (CNN) architectures, which are loosely inspired by the primate visual system. However, these CNNs can be fooled by imperceptibly small, explicitly crafted perturbations, and they struggle to recognize objects in corrupted images that humans recognize easily. Here, by making comparisons with primate neural data, we first observed that CNN models with a hidden layer that better matches primate primary visual cortex (V1) are also more robust to adversarial attacks. Inspired by this observation, we developed VOneNets, a new class of hybrid CNN vision models. Each VOneNet contains a fixed-weight neural network front-end that simulates primate V1, called the VOneBlock, followed by a neural network back-end adapted from current CNN vision models. The VOneBlock is based on a classical neuroscientific model of V1, the linear-nonlinear-Poisson model, and consists of a biologically constrained Gabor filter bank, simple- and complex-cell nonlinearities, and a V1 neuronal stochasticity generator. After training, VOneNets retain high ImageNet performance, but each is substantially more robust, outperforming the base CNNs and state-of-the-art methods by 18% and 3%, respectively, on a conglomerate benchmark of perturbations comprising white-box adversarial attacks and common image corruptions. Finally, we show that all components of the VOneBlock work in synergy to improve robustness. While current CNN architectures are arguably brain-inspired, the results presented here demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in ImageNet-level computer vision applications.
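To make the pipeline concrete, the sketch below illustrates the three VOneBlock stages in PyTorch: a fixed-weight Gabor filter bank, simple-cell (rectification) and complex-cell (quadrature-energy) nonlinearities, and Poisson-like stochasticity. This is a minimal illustration only; the filter counts, parameter ranges, and the `gabor_kernel`/`VOneBlockSketch` names are assumptions made for this sketch, not the authors' released implementation.

```python
# Minimal, illustrative sketch of a VOneBlock-style front-end.
# All names and parameter ranges below are hypothetical placeholders.
import numpy as np
import torch
import torch.nn as nn


def gabor_kernel(size, theta, sigma, frequency, phase):
    """Build one 2D Gabor filter (grayscale, size x size)."""
    coords = np.arange(size) - size // 2
    x, y = np.meshgrid(coords, coords)
    x_r = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * frequency * x_r + phase)
    return torch.tensor(envelope * carrier, dtype=torch.float32)


class VOneBlockSketch(nn.Module):
    """Fixed Gabor filter bank -> simple/complex nonlinearities
    -> Poisson-like stochasticity (train-time only)."""

    def __init__(self, n_units=32, ksize=25, stride=4):
        super().__init__()
        rng = np.random.default_rng(0)
        kernels = []
        for _ in range(n_units):
            theta = rng.uniform(0, np.pi)
            sigma = rng.uniform(2.0, 6.0)
            freq = rng.uniform(0.05, 0.25)
            phase = rng.uniform(0, 2 * np.pi)
            # Quadrature pair: same Gabor at phase and phase + pi/2.
            kernels.append(gabor_kernel(ksize, theta, sigma, freq, phase))
            kernels.append(gabor_kernel(ksize, theta, sigma, freq,
                                        phase + np.pi / 2))
        weight = torch.stack(kernels).unsqueeze(1)   # (2*n_units, 1, k, k)
        self.conv = nn.Conv2d(1, 2 * n_units, ksize, stride=stride,
                              padding=ksize // 2, bias=False)
        # The front-end stays fixed: no gradient updates to the filters.
        self.conv.weight = nn.Parameter(weight, requires_grad=False)

    def forward(self, x):
        x = x.mean(dim=1, keepdim=True)              # RGB -> grayscale
        r = self.conv(x)
        even, odd = r[:, 0::2], r[:, 1::2]           # quadrature pairs
        simple = torch.relu(even)                    # simple cells: rectify
        complex_ = torch.sqrt(even**2 + odd**2 + 1e-8)  # complex cells: energy
        rates = torch.cat([simple, complex_], dim=1)
        if self.training:
            # Poisson-like stochasticity: noise variance equals the mean rate.
            rates = rates + torch.randn_like(rates) * torch.sqrt(rates + 1e-8)
        return torch.relu(rates)                     # keep rates non-negative
```

For instance, `VOneBlockSketch()(torch.rand(1, 3, 224, 224))` produces a noisy, V1-like feature map that a trainable CNN back-end could then consume, mirroring the front-end/back-end split described above.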
Link to join Zoom webinar: https://mit.zoom.us/j/99647945989
Speaker Bio
How does hierarchical processing in neuronal networks in the brain give rise to sensory perception, and can we use this understanding to develop more human-like computer vision algorithms? Answering these questions has been the focus of my research over the past several years. I first encountered the problem of visual perception during my PhD at Champalimaud Research, where I studied visual cortical processing in the mouse. Under the supervision of Leopoldo Petreanu, I developed a head-fixed motion discrimination task for mice and established a causal link between activity in the primary visual cortex (V1) and motion perception. Following that project, I studied the functional organization of cortical feedback and showed that feedback inputs in V1 relay contextual information to matching retinotopic regions in a highly organized manner. In 2019, I joined the lab of Prof. James DiCarlo at MIT to continue my training. My current research consists of using artificial neural networks (ANNs) to study primate object recognition behavior. I have continued to focus on early visual processing and have implemented a set of benchmarks to evaluate how well different ANNs match primate V1 at the single-neuron level. More recently, I have started to develop new computer vision models, constrained by neurobiological data, that are more robust to image perturbations.
Additional Info
Upcoming Cog Lunch Talks:
June 30, 2020 - OPEN
July 7, 2020 - Arturo Deza
July 14, 2020 - Ashley Thomas
July 21, 2020 - Brandon Davis
July 28, 2020 - Christopher Kelly
August 4, 2020 - Stephan Meylan
August 11, 2020 - OPEN
August 18, 2020 - OPEN