
Cog Lunch: Stephan Meylan "Word forms reflect trade-offs between speaker effort and robust listener recognition"
Description
What traces do cognitive processes leave in the structure of natural languages? In this talk I'll revisit one of the most robust empirical laws describing the lexicons of natural languages, Zipf's Law of Abbreviation: frequent words tend to be short. More recent work has explored a related regularity: more frequent words have higher-probability word forms (i.e., they are composed of more common sounds and sound sequences, even among words of the same length). This has been argued to support communication by making commonly used words easier both to produce and to understand. Here I'll propose and test an alternative theoretical interpretation, focusing on the understudied observation that languages could be composed of far higher-probability word forms than is actually observed. Using a Bayesian model of word recognition, I'll show that languages must use rare sounds and sound sequences to support efficient word recognition. I'll argue that this constitutes a more general, robust, and extensible formulation of Zipf's Law of Abbreviation and sets the stage for better understanding how linguistic universals may emerge from communicative pressures.