When we “know the meaning” of a word, what kind of knowledge do we have?
Understanding words seems to require both linguistic knowledge (stored form-meaning pairings and the means to combine them) and world knowledge (object properties, the plausibility of events, etc.). In this talk, I will pose some challenges for common distinctions between these two knowledge sources.

First, I will ask whether rich information about concrete objects could, in principle, be learned from the co-occurrence statistics of words alone, even in the absence of non-linguistic (e.g., perceptual) information. To this end, I will introduce a domain-general approach for leveraging such statistics (as captured by distributional semantic models, DSMs) to recover context-specific human judgments, such that, e.g., “dolphin” and “alligator” appear relatively similar when considering size or habitat, but different when considering aggressiveness.

Second, I will probe DSMs for “syntactic”, abstract compositional knowledge of verb-argument structure (e.g., “eat”, but not “devour”, can appear without an object). I will demonstrate that these syntactic properties of verbs can often be predicted from distributional information (i.e., without explicit access to “syntax”), indicating that DSMs capture those aspects of verb meaning that correlate with verb syntax. Nevertheless, only a small fraction of the distributional information is needed to predict verb argument structure; the rest appears to capture semantic properties that are relatively divorced from syntax. Indeed, the overall similarity structure across verbs in a DSM is independent of the similarity structure across verbs as determined by their syntax, and both kinds of similarity are needed to explain human judgments. Together, these two studies push up the estimated upper bound on the potential complexity of distributional word meanings.
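The context-specific similarity idea in the first study can be sketched numerically: project word vectors onto an axis defined by two pole words and compare the resulting scalar positions. The vectors below are made-up toy values chosen purely for illustration (real DSM vectors would come from a trained model), so only the mechanics, not the numbers, reflect the approach described above.

```python
import numpy as np

# Hypothetical 4-d "embeddings" -- toy values for illustration only;
# actual DSM vectors would be learned from corpus co-occurrence statistics.
vecs = {
    "dolphin":   np.array([0.9, 0.1, 0.8, 0.2]),
    "alligator": np.array([0.8, 0.2, 0.1, 0.9]),
    "small":     np.array([0.1, 0.9, 0.5, 0.5]),
    "large":     np.array([0.9, 0.1, 0.5, 0.5]),
    "gentle":    np.array([0.5, 0.5, 0.9, 0.1]),
    "fierce":    np.array([0.5, 0.5, 0.1, 0.9]),
}

def project(word, neg_pole, pos_pole):
    """Scalar position of `word` on the axis running from neg_pole to pos_pole."""
    axis = vecs[pos_pole] - vecs[neg_pole]
    return float(np.dot(vecs[word], axis) / np.linalg.norm(axis))

# On the size axis the two animals land close together...
size_gap = abs(project("dolphin", "small", "large")
               - project("alligator", "small", "large"))

# ...but on the aggressiveness axis they come apart.
aggr_gap = abs(project("dolphin", "gentle", "fierce")
               - project("alligator", "gentle", "fierce"))
```

With these toy vectors, `size_gap` is small while `aggr_gap` is large, mirroring the dolphin/alligator contrast in the abstract: the same pair of words is similar or dissimilar depending on which context axis is queried.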
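The second study's probing logic can likewise be sketched with synthetic data: fit a linear probe that predicts a binary argument-structure property (here, whether a verb allows object drop) from "distributional" vectors. Everything below is fabricated for illustration; a real probe would use actual DSM vectors and human-annotated verb labels, and the key finding above is that only a small part of the vector space carries this signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: one latent direction in the vector space weakly encodes
# whether a verb allows a dropped object ("eat" can; "devour" cannot).
n_train, n_test, dim = 200, 50, 20
syntax_axis = rng.normal(size=dim)
syntax_axis /= np.linalg.norm(syntax_axis)

def make_verbs(n):
    """Generate n toy verb vectors plus binary object-drop labels."""
    labels = rng.integers(0, 2, size=n)            # 1 = object drop allowed
    noise = rng.normal(size=(n, dim))              # syntax-irrelevant content
    X = noise + 2.0 * np.outer(labels - 0.5, syntax_axis)  # weak signal
    return X, labels

X_train, y_train = make_verbs(n_train)
X_test, y_test = make_verbs(n_test)

# Least-squares linear probe: w = argmin ||Xw - (y - 0.5)||^2
w, *_ = np.linalg.lstsq(X_train, y_train - 0.5, rcond=None)
pred = (X_test @ w > 0).astype(int)
accuracy = (pred == y_test).mean()
```

Because the syntactic signal lives along a single direction, the probe recovers the labels well above chance even though most of each vector is syntax-irrelevant noise, echoing the finding that a small fraction of distributional information suffices for predicting argument structure.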