About
Dr. Jim DiCarlo is the Peter de Florez Professor of Neuroscience in the Department of Brain and Cognitive Sciences, Director of the MIT Quest for Intelligence, and an Investigator at the McGovern Institute for Brain Research at MIT. He served as BCS department head from 2012 to 2021. He earned his Ph.D. in biomedical engineering and his M.D. from The Johns Hopkins University in 1998, and did his postdoctoral training in primate visual neurophysiology at Baylor College of Medicine. He joined the MIT faculty in 2002 and was awarded tenure in 2009. He is an Alfred Sloan Fellow, a Pew Scholar in the Biomedical Sciences, and a McKnight Scholar in Neuroscience. His group is currently using a combination of large-scale neurophysiology, brain imaging, optogenetic methods, and high-throughput computational simulations to understand the neuronal mechanisms and fundamental cortical computations that underlie the construction of the powerful image representations that support visual object recognition. They aim to use this understanding to inspire and develop new machine vision systems, to provide a basis for new neural prosthetics (brain-machine interfaces) to restore or augment lost senses, and to provide a foundation upon which the community can understand how high-level visual representation is altered in human conditions such as agnosia, autism, and dyslexia.
Research
Neuronal mechanisms underlying object recognition
The research goal of our laboratory is to understand the mechanisms underlying visual object recognition. Specifically, we seek to understand how sensory input is transformed by the brain from an initial representation (essentially a photograph on the retina) to a new, remarkably powerful form of representation -- one that can support our seemingly effortless ability to solve the computationally difficult problem of object recognition. We are particularly focused on patterns of neuronal activity in the highest levels of the ventral visual stream (primate inferior temporal cortex, IT) that likely directly underlie recognition. At these high levels, individual neurons can have the remarkable response property of being highly selective for object identity, even though each object's image on the retinal surface is highly variable -- for example, due to changes in object position, distance, pose, lighting, and background clutter. Understanding the creation of such neuronal responses by transformations carried out along the ventral visual processing stream is the key to understanding visual recognition. To approach these very difficult problems, the work of our laboratory is directed along three main lines: 1) characterize the computational usefulness of patterns of IT neuronal activity for supporting immediate visual object recognition, 2) test and develop computational theories of how visual input is transformed along the ventral processing stream from a pixel-wise representation to a powerful representation in IT, and 3) understand the spatial organization of this representation. Our primary research approaches are neurophysiology in awake, behaving non-human primates, functional brain imaging (fMRI), human psychophysics, and computational modeling. Across all of these endeavors we aim to develop innovative methods and tools to facilitate this work in our laboratory and others.
Our approaches are often synergistic with those of other MIT laboratories, and this has greatly enhanced our progress. Because recognition is critical to so much of behavior, the understanding we seek will fundamentally influence the way we think about how the brain processes sensory information, and, more generally, principles of cortical information processing. Our goal is to use this understanding to inspire artificial vision systems, to aid the development of visual prosthetics, to provide guidance to molecular approaches to repair lost brain function, and to obtain deep insight into how the brain represents sensory information in a way that is highly suited for cognition and action.
Teaching
9.02 Systems neuroscience laboratory
9.17 Systems neuroscience laboratory
Publications
* indicates papers arising from a supervised PhD thesis
* Cadieu CF, Hong H, Yamins DLK, Pinto N, Ardila D, Solomon EA, Majaj NJ, DiCarlo JJ. Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition. PLoS Comput. Biol. 2014;10(12):e1003963. PMCID: PMC4270441
* Yamins D, Hong H, Cadieu C, Solomon E, Seibert D, and DiCarlo JJ. Performance-Optimized Hierarchical Models Predict Neural Responses in Higher Visual Cortex. PNAS (2014). PMCID: PMC4060707
* Yamins DL, Hong H, Cadieu C, and DiCarlo JJ. Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream. Neural Information Processing Systems (2013).
Issa EB, Papanastassiou AM, DiCarlo JJ. Large-scale, high-resolution neurophysiological maps underlying FMRI of macaque temporal lobe. Journal of Neuroscience 33(38): 15207-19 (2013). PMCID: PMC3776064
Baldassi C, Alemi-Neissi A, Pagan M, DiCarlo JJ, Zecchina R, Zoccolan D. Shape similarity, better than semantic membership, accounts for the structure of visual object representations in a population of monkey inferotemporal neurons. PLoS Computational Biology 9(8): e1003167 (2013). PMCID: PMC3738466
* Cadieu CF, Hong H, Yamins DL, Pinto N, Majaj NJ, DiCarlo JJ. The Neural Representation Benchmark and its Evaluation on Brain and Machine. In: International Conference on Learning Representations (ICLR). Scottsdale, AZ; (2013)
Rust N and DiCarlo JJ. Balanced Increases in Selectivity and Tolerance Produce Constant Sparseness along the Ventral Visual Stream. Journal of Neuroscience 32(30): 10170-10182 (2012). PMCID: PMC3485085
Issa EB, DiCarlo JJ. Precedence of the eye region in neural processing of faces. Journal of Neuroscience 32(47): 16666-82 (2012). PMCID: PMC3542390
* Li N, DiCarlo JJ. Neuronal learning of invariant object representation in the ventral visual stream is not dependent on reward. Journal of Neuroscience 32(19): 6611-20 (2012). PMCID: PMC3367428
DiCarlo JJ, Zoccolan DD, and Rust N. How does the ventral visual stream solve object recognition? Refereed Perspective in Neuron 73(3): 415-34 (2012). PMCID: PMC3306444
Majaj N, Hong H, Solomon E, and DiCarlo JJ. A unified neuronal population code fully explains human object recognition. Accepted for oral presentation (top 3% of papers); Computation and Systems Neuroscience (COSYNE), Salt Lake City, UT (2012).
* Pinto N, Barhomi Y, Cox DD, and DiCarlo JJ. Comparing State-of-the-Art Visual Features on Invariant Object Recognition Tasks. IEEE Workshop on Applications of Computer Vision, Kona, HI (2011).
Rust N and DiCarlo JJ. Selectivity and tolerance ("invariance") both increase as visual information propagates from cortical area V4 to IT. Journal of Neuroscience 30: 12978 - 12995 (2010). PMCID: PMC2975390
* Li N and DiCarlo JJ. Unsupervised Natural Visual Experience Rapidly Reshapes Size-Invariant Object Representation in Inferior Temporal Cortex. Neuron 67(6): 1062 - 1075 (2010). PMCID: PMC2946943
Pinto N, Doukhan D, DiCarlo JJ, Cox DD. A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation. PLoS Comput Biol 5(11): e1000579. doi:10.1371/journal.pcbi.1000579 (2009)
Li N, Cox DD, Zoccolan D, DiCarlo JJ. What response properties do individual neurons need to underlie position and clutter "invariant" object recognition? J Neurophysiol. 2009 Jul;102(1):360-76.
Zoccolan D, Oertelt N, DiCarlo JJ, Cox DD. A rodent model for the study of invariant visual object recognition. Proc Natl Acad Sci U S A. 2009 May 26;106(21):8748-53.
Li N, DiCarlo JJ. Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science. 2008 Sep 12;321(5895):1502-7.