Efficient coding and the evolution of semantic systems
Description
The forces that govern how languages assign meanings to words have been debated for decades. Recently, it has been suggested that human semantic systems are adapted for efficient communication. However, a major question has remained largely unaddressed: how does the pressure for efficiency relate to language evolution?
In this talk, I will address this open question by grounding the notion of efficiency in a general information-theoretic principle, the Information Bottleneck (IB) principle. Specifically, I will present the hypothesis that languages efficiently encode meanings into words by optimizing the IB tradeoff between the complexity and accuracy of the lexicon.

In support of this hypothesis, I will first show that color naming across languages is near-optimally efficient in the IB sense. This finding further suggests (1) a theoretical explanation for why inconsistent naming and stochastic categories may be efficient; and (2) that languages may evolve under pressure for efficiency, through an annealing-like process that synthesizes the continuous and discrete aspects of previous accounts of color category evolution. This process generates quantitative predictions for how color naming systems may change over time, and these predictions are directly supported by an analysis of recent data documenting changes over time in the color naming system of a single language. Finally, I will show that this information-theoretic account generalizes to two qualitatively different semantic domains: names for household containers and animal taxonomies.

Taken together, these results suggest that efficient coding, a general principle that also applies to lower-level neural representations, may largely explain the structure and evolution of semantic representations across languages.
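To make the complexity/accuracy tradeoff concrete, here is a minimal sketch of the standard self-consistent IB iterations (Tishby, Pereira & Bialek) on a toy one-dimensional "meaning" space. The toy distribution, the number of words, and all variable names are illustrative assumptions for this sketch, not the actual model, data, or annealing schedule from the talk; the only point it demonstrates is that a low tradeoff parameter beta yields a simpler (lower-complexity) lexicon than a high beta.

```python
# Toy sketch of the iterative Information Bottleneck (IB) algorithm.
# The encoder q(w|m) maps meanings m to words w by optimizing the
# tradeoff I(M;W) - beta * I(W;U) between lexicon complexity and accuracy.
# The toy channel p(u|m) and all sizes below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_m, n_u, n_w = 8, 8, 3                       # meanings, referent features, words
p_u_given_m = np.eye(n_m) * 0.8 + 0.2 / n_u   # noisy identity channel p(u|m)
p_u_given_m /= p_u_given_m.sum(axis=1, keepdims=True)
p_m = np.full(n_m, 1.0 / n_m)                 # uniform prior over meanings

def ib_iterate(beta, n_iter=200):
    """Self-consistent IB updates for the encoder q(w|m) at tradeoff beta."""
    q_w_given_m = rng.dirichlet(np.ones(n_w), size=n_m)   # random init, rows sum to 1
    for _ in range(n_iter):
        q_w = p_m @ q_w_given_m                            # marginal q(w)
        # decoder: q(u|w) proportional to sum_m p(u|m) p(m) q(w|m)
        q_u_given_w = (q_w_given_m * p_m[:, None]).T @ p_u_given_m
        q_u_given_w /= q_u_given_w.sum(axis=1, keepdims=True) + 1e-300
        # encoder: q(w|m) proportional to q(w) * exp(-beta * KL(p(u|m) || q(u|w)))
        kl = np.array([[np.sum(p_u_given_m[m] *
                               np.log(p_u_given_m[m] / (q_u_given_w[w] + 1e-300)))
                        for w in range(n_w)] for m in range(n_m)])
        log_q = np.log(q_w + 1e-300)[None, :] - beta * kl
        log_q -= log_q.max(axis=1, keepdims=True)          # stabilize before exp
        q_w_given_m = np.exp(log_q)
        q_w_given_m /= q_w_given_m.sum(axis=1, keepdims=True)
    return q_w_given_m

def mutual_info(p_x, p_y_given_x):
    """I(X;Y) in nats from a prior p(x) and conditional p(y|x)."""
    p_xy = p_y_given_x * p_x[:, None]
    p_y = p_xy.sum(axis=0)
    ratio = p_xy / np.outer(p_x, p_y + 1e-300)
    return float(np.sum(p_xy * np.log(ratio + 1e-300)))

# Low beta favors simplicity (few effective words); high beta favors accuracy.
enc_lo = ib_iterate(beta=0.5)
enc_hi = ib_iterate(beta=10.0)
complexity_lo = mutual_info(p_m, enc_lo)      # I(M;W) at low beta
complexity_hi = mutual_info(p_m, enc_hi)      # I(M;W) at high beta
```

Sweeping beta upward, as in a deterministic-annealing schedule, traces out a sequence of increasingly fine-grained category systems, which is the kind of trajectory the annealing-based evolutionary account refers to.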
Speaker Bio
I’m a PhD candidate at the Center for Brain Sciences at the Hebrew University, advised by Naftali Tishby, and a visiting graduate student at UC Berkeley, hosted by Terry Regier. My research aims to understand language and cognition from first principles, building on ideas and methods from machine learning and information theory. I’m particularly interested in computational principles that can account for the ability to maintain efficient semantic representations for learning and communication in complex environments. I believe that such principles could advance our understanding of human cognition and guide the development of human-like artificial intelligence.