Essentials: The Neuroscience of Speech, Language & Music | Dr. Erich Jarvis
Dr. Erich Jarvis argues that speech and language are not governed by a separate 'language module' but are embedded within speech production and auditory perception pathways that evolved convergently in humans, songbirds, parrots, and hummingbirds. He discusses how vocal learning is rare across species, how the brain circuits and even specific genes controlling speech are strikingly similar across these distantly related animals, and how movement, dancing, and consistent practice can help maintain cognitive and speech-related brain circuits throughout life.
Summary
Dr. Erich Jarvis opens by challenging the conventional notion of a dedicated 'language module' in the brain. He argues instead that the algorithms for spoken language are built directly into the speech production pathway — which controls the larynx and jaw — and into the auditory perception pathway. These are not separate from some overarching language center. The speech production pathway is specialized in humans, parrots, and songbirds, while the auditory pathway is more broadly distributed across the animal kingdom, which is why dogs can understand hundreds of spoken words but cannot produce any.
Jarvis explains that the brain regions controlling hand gesturing are directly adjacent to those controlling speech, suggesting an evolutionary relationship. He proposes that speech pathways evolved out of body movement pathways, which is why people gesture with their hands even when speaking on the phone, when no one can see them. He contrasts this with great apes, which have motor pathways capable of rudimentary gestural language (as demonstrated by Koko the gorilla learning sign language) but lack the forebrain-to-brainstem circuitry needed for vocal learning.
On the topic of vocal learning, Jarvis distinguishes between innate vocalizations — present in most vertebrates and controlled by brainstem circuits — and learned vocalizations, which are rare and require forebrain circuits. Only humans, parrots, songbirds, hummingbirds, and a handful of other species have evolved forebrain control over the brainstem vocal machinery, allowing for imitation and learning of sounds. He notes that hummingbirds even coordinate wing-flapping sounds with their vocalizations rhythmically.
Jarvis addresses the evolutionary timeline of language, arguing that genomic evidence from Neanderthal and Denisovan fossils suggests that genes involved in speech circuits, including FOXP2, are shared with these ancient hominids, leading him to conclude that spoken language has existed for at least 500,000 to one million years. He also highlights remarkable convergent evolution: although humans and songbirds last shared a common ancestor 300 million years ago, the brain circuits, connectivity patterns, and even specific gene expression profiles in their speech and song regions are strikingly similar. Mutations in these shared genes produce analogous speech deficits across species.
The conversation covers critical periods for language learning, with Jarvis explaining that childhood is when the brain is most plastic across all domains — not just language. He argues that children who learn multiple languages don't maintain greater general plasticity into adulthood, but rather retain a broader repertoire of phonemes that makes learning additional languages easier. The brain narrows its phonemic range during development based on the languages encountered.
Jarvis distinguishes between semantic communication (meaning-based) and affective communication (emotion-based), noting that both use similar speech and song circuits but differ in lateralization — the left hemisphere is dominant for speech while the right is more active for singing and musical processing. He hypothesizes that spoken language evolved first for emotional, song-like communication (mate attraction, territory defense) before being co-opted for abstract semantic communication.
On stuttering, Jarvis describes how his lab accidentally discovered stuttering in songbirds after basal ganglia damage, and how the birds recovered through neurogenesis — a capacity human brains largely lack. He links disruption of the basal ganglia's speech circuits to both acquired and developmental stuttering in humans, and notes that sensory-motor integration therapies can help reduce stuttering.
Jarvis also addresses written language, explaining that reading silently still activates the speech production pathway — the brain 'speaks' what it reads — and that writing requires coordinating at least four brain circuits: visual, speech production, speech perception, and hand motor areas. On texting and digital communication, he argues it is not degrading language but repurposing brain circuits, potentially enlarging thumb motor representations while reducing use of richer linguistic expression.
Finally, Jarvis advocates for physical movement — particularly dance — as a way to keep speech and cognitive circuits healthy, arguing that because speech pathways are adjacent to and evolutionarily linked with body movement pathways, sustained physical activity supports cognitive and linguistic function into old age.
Key Insights
- Jarvis argues there is no good evidence for a separate 'language module' in the brain; instead, the algorithms for spoken language are embedded within the speech production and auditory perception pathways themselves.
- Jarvis claims that vocal learning — the ability to imitate and learn sounds — is extremely rare across the animal kingdom, found only in humans, parrots, songbirds, hummingbirds, and a few other species, and that this capacity requires forebrain circuits to take over brainstem vocal control.
- Jarvis argues that genomic evidence from Neanderthal and Denisovan fossils — including shared FOXP2 gene sequences — suggests spoken language has existed for at least 500,000 to one million years and that Neanderthals likely had some form of spoken language.
- Jarvis and his colleagues discovered that some genes controlling neural connectivity are turned off in speech circuits, which paradoxically allows new connections to form that would otherwise be repelled: a loss of gene function that produces a gain of function for speech.
- Jarvis explains that children who grow up bilingual do not retain greater general brain plasticity in adulthood, but rather maintain a broader phonemic repertoire, which is the actual mechanism that makes learning additional languages easier later in life.
- Jarvis proposes that spoken language evolved first for affective, emotionally driven communication — analogous to courtship singing — and was only later co-opted for abstract semantic communication, based on the observation that all vocal learning species use learned sounds affectively but only a few use them semantically.
- Jarvis's lab accidentally discovered stuttering in songbirds after basal ganglia damage, and found that birds recovered through neurogenesis; he links this to human stuttering, where basal ganglia disruption — whether developmental or acquired — is a common underlying cause.
- Jarvis argues that consistent physical movement, including dancing, helps maintain the health of speech and cognitive brain circuits because the speech pathways are evolutionarily and anatomically adjacent to body movement pathways, making motor activity a form of indirect cognitive exercise.