Tiwalayo Eisape Thesis Defense: Language Comprehension, Production, and Reasoning in Humans and Neural Language Models
Description
Date/Time: Thursday, August 14th at 10AM
In-person Location: Singleton Auditorium, 46-3002
On Zoom: https://mit.zoom.us/j/95922088823?pwd=19SutL9gXt6DbLsbD0krvfj4WdEnqJ.1
Dissertation Defense Title: Language Comprehension, Production, and Reasoning in Humans and Neural Language Models
Abstract:
How closely do neural language models mirror human language processing, and what can this alignment teach us about cognition? This dissertation presents convergent evidence in comprehension, production, and reasoning that neural language models (LMs) can serve as productive instruments for understanding naturalistic human language use at scale.
Studies 1 and 2 examine comprehension with complementary methods. First, Cloze Distillation, a novel method for aligning models with human next-word predictions, improves both language modeling and reading time prediction, demonstrating that LMs and humans make distinct, complementary predictions. Second, new methods for identifying syntactic information in LM hidden states demonstrate that models learn to implicitly represent incremental syntactic state. These probes also enable targeted interventions, allowing us to manipulate representations to resolve (or induce) temporary misinterpretations, confirming our mechanistic understanding of these representations. While these studies demonstrate prediction's role in comprehension, a complete account requires examining whether these mechanisms also shape how humans produce language in real time. Study 3 analyzes a massive corpus of 2.3 million competitive typing events from TypeRacer.com, uncovering the first evidence of in-context predictability effects in this domain of production. Finally, Study 4 systematically compares human and LM reasoning: LMs achieve higher syllogistic reasoning accuracy than humans while still replicating several fine-grained, human-like error patterns that are orthogonal to logical accuracy, including premise ordering effects.
These converging findings reveal prediction as a fundamental mechanism in comprehension, production, and reasoning in both humans and LMs. Models achieve this through statistical learning rather than specialized cognitive architecture, often outperforming humans yet replicating their systematic biases; even so, this alignment supports predictive processing theories of cognition. This work establishes LMs as scalable cognitive laboratories that can complement traditional experiments, and contributes psycholinguistically principled methods for understanding and controlling LMs.