Primates are able to rapidly, accurately and effortlessly perform the computationally difficult visual task of invariant object recognition: the ability to discriminate between different objects in the face of high variation in object viewing parameters and background conditions. This ability is thought to rely on the ventral visual stream, a hierarchy of visual cortical areas culminating in inferior temporal (IT) cortex. In particular, decades of research strongly suggest that the population of neurons in IT supports invariant object recognition behavior. However, direct causal evidence for this decoding hypothesis has been equivocal to date, especially beyond the specific case of face-selective sub-regions of IT. This research aims to directly test the general causal role of IT in invariant object recognition. To do so, we first characterized human and macaque monkey behavior over a large behavioral domain consisting of binary discriminations between images of basic-level objects, establishing behavioral metrics and benchmarks for computational models of this behavior. This work suggests that, in the domain of basic-level core object recognition, humans and monkeys are remarkably similar in their behavioral responses, while leading models of the visual system significantly diverge from primate behavior. We then reversibly inactivated individual, millimeter-scale regions of IT via injection of muscimol while monkeys performed several interleaved binary object discrimination tasks. We found that inactivating different millimeter-scale regions of primate IT resulted in different patterns of object recognition deficits, each predicted by the local region's neuronal selectivity. Our results establish causal evidence that IT directly underlies primate object recognition behavior in a topographically organized manner. 
Taken together, these results establish quantitative experimental constraints for computational models of the ventral visual stream and object recognition behavior.