READING IN THE BRAIN
French scientist Stanislas Dehaene was trained as a mathematician and psychologist before becoming one of the world’s most active researchers on the cognitive neuroscience of language and number processing in the human brain. He is the director of the Cognitive Neuroimaging Unit in Saclay, France; professor of experimental cognitive psychology at the Collège de France; and a member of both the French Academy of Sciences and the Pontifical Academy of Sciences. He has published extensively in peer-reviewed scientific journals and is the author of several books, including The Number Sense.
Praise for Reading in the Brain
A Washington Post Best Science Book of 2009
A Library Journal Best Sci-Tech Book of 2009
“In his splendid Reading in the Brain, French neuroscientist Stanislas Dehaene reveals how decades of low-tech experiments and high-tech brain-imaging studies have unwrapped the mystery of reading and revealed its component parts. . . . A pleasure to read. [Dehaene] never oversimplifies; he takes the time to tell the whole story, and he tells it in a literate way.”
—The Wall Street Journal
“Fascinating . . . By studying the wet stuff inside our head, we can begin to understand why this sentence has this structure, and why this letter, this one right here, has its shape. . . . Eloquent . . . Provide[s] a wealth of evidence.”
“Dehaene’s masterful book is a delight to read and scientifically precise.”
“Combining research and narrative, Dehaene weaves a fascinating explanation of how the prefrontal cortex co-opted primeval neurological pathways to learn a uniquely human skill.”
“The transparent and automatic feat of reading comprehension disguises an intricate biological effort, ably analyzed in this fascinating study. . . . This lively, lucid treatise proves once again that Dehaene is one of our most gifted expositors of science; he makes the workings of the mind less mysterious, but no less miraculous.”
“[Dehaene] is that rare bird: a scientist who can write.”
—The Globe and Mail (Toronto)
“Inspire[s] a sense of wonder at the complexity of the task readers are performing just by scanning from page to page.”
“We are fortunate that Stanislas Dehaene, the leading authority on the neuroscience of language, is also a beautiful writer. His Reading in the Brain brings together the cognitive, the cultural, and the neurological in an elegant, compelling narrative. It is a revelatory work.”
—Oliver Sacks, M.D.
“In a moment when knowledge about the reading brain may be the key to its preservation, Stanislas Dehaene’s book provides the next critical rung of that knowledge. He does this through insights gained from his own prolific research, through his comprehensive grasp of the neurosciences, and through his unique combination of common sense and wisdom that shines through every chapter.”
—Maryanne Wolf, author of Proust and the Squid: The Story and Science of the Reading Brain
“Stanislas Dehaene takes us on a journey into the science of reading. We travel past firing neurons in monkeys, brain activation patterns in humans, people with brain damage, and culture as a whole. It’s a provocative and enjoyable synthesis of a tremendous amount of information, with just the right balance between getting the facts right and making them accessible to lay readers.”
—Joseph LeDoux, University Professor, New York University, and author of Synaptic Self and The Emotional Brain
“Reading in the Brain isn’t just about reading. It comes nearer than anything I have encountered to explaining how humans think, and does so with a simple elegance that can be grasped by scientists and nonscientists alike. Dehaene provides insight about the neurological underpinnings of the spectacular cognitive skills that characterize our species. Students of human evolution are not the only ones who will find Reading in the Brain fascinating. Parents, educators, and anyone else who nurtures the intellectual development of children cannot afford to ignore Dehaene’s observations about the best methods for teaching them to read!”
—Dean Falk, author of Finding Our Tongues: Mothers, Infants, and the Origins of Language
“The complicated partnership of eye and mind that transforms printed symbols into sound, music, and meaning, and gives rise to thought, is the subject of this intriguing study. It’s a wondrous journey: like that of stout Cortez, like H. M. Stanley’s search for Dr. David Livingstone, like the next stunning probe into outer space.”
—Howard Engel, coauthor of The Man Who Forgot How to Read
Reading in the Brain
The New Science of How We Read
The New Science of Reading
Withdrawn into the peace of this desert, along with some books, few but wise, I live in conversation with the deceased, and listen to the dead with my eyes
—FRANCISCO DE QUEVEDO
At this very moment, your brain is accomplishing an amazing feat—reading. Your eyes scan the page in short spasmodic movements. Four or five times per second, your gaze stops just long enough to recognize one or two words. You are, of course, unaware of this jerky intake of information. Only the sounds and meanings of the words reach your conscious mind. But how can a few black marks on white paper projected onto your retina evoke an entire universe, as Vladimir Nabokov does in the opening lines of Lolita:
Lolita, light of my life, fire of my loins. My sin, my soul. Lo-lee-ta: the tip of the tongue taking a trip of three steps down the palate to tap, at three, on the teeth. Lo. Lee. Ta.
The reader’s brain contains a complicated set of mechanisms admirably attuned to reading. For a great many centuries, this talent remained a mystery. Today, the brain’s black box is cracked open and a true science of reading is coming into being. Advances in psychology and neuroscience over the last twenty years have begun to unravel the principles underlying the brain’s reading circuits. Modern brain imaging methods now reveal, in just a matter of minutes, the brain areas that activate when we decipher written words. Scientists can track a printed word as it progresses from the retina through a chain of processing stages, each of which is marked by an elementary question: Are these letters? What do they look like? Are they a word? What does it sound like? How is it pronounced? What does it mean?
On this empirical ground, a theory of reading is materializing. It postulates that the brain circuitry inherited from our primate evolution can be co-opted to the task of recognizing printed words. According to this approach, our neuronal networks are literally “recycled” for reading. The insight into how literacy changes the brain is profoundly transforming our vision of education and learning disabilities. New remediation programs are being conceived that should, in time, cope with the debilitating incapacity to decipher words known as dyslexia.
My purpose in this book is to share my knowledge of recent and little-known advances in the science of reading. In the twenty-first century, the average person still has a better idea of how a car works than of the inner functioning of his own brain—a curious and shocking state of affairs. Decision makers in our education systems swing back and forth with the changing winds of pedagogical reform, often blatantly ignoring how the brain actually learns to read. Parents, educators, and politicians often recognize that there is a gap between educational programs and the most up-to-date findings in neuroscience. But too frequently their idea of how this field can contribute to advances in education is only grounded in a few color pictures of the brain at work. Unfortunately, the imaging techniques that allow us to visualize brain activity are subtle and occasionally misleading. The new science of reading is so young and fast-moving that it is still relatively unknown outside the scientific community. My goal is to provide a simple introduction to this exciting field, and to increase awareness of the amazing capacities of our reading brains.
From Neurons to Education
Reading acquisition is a major step in child development. Many children initially struggle with reading, and surveys indicate that about one adult in ten fails to master even the rudiments of text comprehension. Years of hard work are needed before the clockwork-like brain machinery that supports reading runs so smoothly that we forget it exists.
Why is reading so difficult to master? What profound alterations in brain circuitry accompany the acquisition of reading? Are some teaching strategies better adapted to the child’s brain than others? What scientific reasons, if any, explain why phonics—the systematic teaching of letter-to-sound correspondences—seems to work better than whole-word teaching? Although much still remains to be discovered, the new science of reading is now providing increasingly precise answers to all these questions. In particular, it underlines why early research on reading erroneously supported the whole-word approach—and how recent research on the brain’s reading networks proves it was wrong.
Understanding what goes into reading also sheds light on its pathologies. In our explorations of the reader’s mind and brain, you will be introduced to patients who suddenly lost the ability to read following a stroke. I will also analyze the causes of dyslexia, whose cerebral underpinnings are gradually coming to light. It is now clear that the dyslexic brain is subtly different from the brain of a normal reader. Several dyslexia susceptibility genes have been identified. But this is by no means a reason for discouragement or resignation. New intervention therapies are now being defined. Intensive retraining of language and reading circuits has brought about major improvements in children’s brains that can readily be tracked with brain imaging.
Putting Neurons into Culture
Our ability to read brings us face-to-face with the singularity of the human brain. Why is Homo sapiens the only species that actively teaches itself? Why is he unique in his ability to transmit a sophisticated culture? How does the biological world of synapses and neurons relate to the universe of human cultural inventions? Reading, but also writing, mathematics, art, religion, agriculture, and city life have dramatically increased the native capacities of our primate brains. Our species alone rises above its biological condition, creates an artificial cultural environment for itself, and teaches itself new skills like reading. This uniquely human competence is puzzling and calls for a theoretical explanation.
One of the basic techniques in the neurobiologist’s toolkit consists of “putting neurons in culture”—letting neurons grow in a petri dish. In this book, I call for a different “culture of neurons”—a new way of looking at human cultural activities, based on our understanding of how they map onto the brain networks that support them. Neuroscience’s avowed goal is to describe how the elementary components of the nervous system lead to the behavioral regularities that can be observed in children and adults (including advanced cognitive skills). Reading provides one of the most appropriate test beds for this “neurocultural” approach. We are increasingly aware of how writing systems as different as Chinese, Hebrew, or English get inscribed in our brain circuits. In the case of reading, we can clearly draw direct links between our native neuronal architecture and our acquired cultural abilities—but the hope is that this neuroscience approach will extend to other major domains of human cultural expression.
The Mystery of the Reading Ape
If we are to reconsider the relation between brain and culture, we must address an enigma, which I call the reading paradox: Why does our primate brain read? Why does it have an inclination for reading although this cultural activity was invented only a few thousand years ago?
There are good reasons why this deceptively simple question deserves to be called a paradox. We have discovered that the literate brain contains specialized cortical mechanisms that are exquisitely attuned to the recognition of written words. Even more surprisingly, the same mechanisms, in all humans, are systematically housed in identical brain regions, as though there were a cerebral organ for reading.
But writing was born only fifty-four hundred years ago in the Fertile Crescent, and the alphabet itself is only thirty-eight hundred years old. These time spans are a mere trifle in evolutionary terms. Evolution thus did not have the time to develop specialized reading circuits in Homo sapiens. Our brain is built on the genetic blueprint that allowed our hunter-gatherer ancestors to survive. We take delight in reading Nabokov and Shakespeare using a primate brain originally designed for life in the African savanna. Nothing in our evolution could have prepared us to absorb language through vision. Yet brain imaging demonstrates that the adult brain contains fixed circuitry exquisitely attuned to reading.
The reading paradox is reminiscent of the Reverend William Paley’s parable aimed at proving the existence of God. In his Natural Theology (1802), he imagined that on a deserted heath, a watch was found on the ground, complete with its intricate inner workings clearly designed to measure time. Wouldn’t it provide, he argued, clear proof that there is an intelligent clockmaker, a designer who purposely created the watch? Similarly, Paley maintained that the intricate devices that we find in living organisms, such as the astonishing mechanisms of the eye, prove that nature is the work of a divine watchmaker.
Charles Darwin famously refuted Paley by showing how blind natural selection can produce highly organized structures. Even if biological organisms at first glance seem designed for a specific purpose, closer examination reveals that their organization falls short of the perfection that one would expect from an omnipotent architect. All sorts of imperfections attest that evolution is not guided by an intelligent creator, but follows random paths in the struggle for survival. In the retina, for example, blood vessels and nerve cables are situated in front of the photoreceptors, thus partially blocking incoming light and creating a blind spot—very poor design indeed.
Following in Darwin’s footsteps, Stephen Jay Gould provided many examples of the imperfect outcome of natural selection, including the panda’s thumb.1 The British evolutionist Richard Dawkins also explained how the delicate mechanisms of the eye or of the wing could only have emerged through natural selection, acting as a “blind watchmaker.”2 Darwin’s evolutionism seems to be the only source of apparent “design” in nature.
When it comes to explaining reading, however, Paley’s parable is problematic in a subtly different way. The clockwork-like brain mechanisms that support reading are certainly comparable in complexity and sheer design to those of the watch abandoned on the heath. Their entire organization leans toward the single apparent goal of decoding written words as quickly and accurately as possible. Yet neither the hypothesis of an intelligent creator nor that of slow emergence through natural selection seems to provide a plausible explanation for the origins of reading. Time was simply too short for evolution to design specialized reading circuits. How, then, did our primate brain learn to read? Our cortex is the outcome of millions of years of evolution in a world without writing—why can it adapt to the specific challenges posed by written word recognition?
Biological Unity and Cultural Diversity
In the social sciences, the acquisition of cultural skills such as reading, mathematics, or the fine arts is rarely, if ever, posed in biological terms. Until recently, very few social scientists considered that brain biology and evolutionary theory were even relevant to their fields. Even today, most implicitly subscribe to a naïve model of the brain, tacitly viewing it as an infinitely plastic organ whose learning capacity is so broad that it places no constraints on the scope of human activity. This is not a new idea. It can be traced back to the theories of the British empiricists John Locke, David Hume, and George Berkeley, who claimed that the human brain should be compared to a blank slate that progressively absorbs the imprint of man’s natural and cultural environment through the five senses.
This view of mankind, which denies the very existence of a human nature, has often been adopted without question. It belongs to the default “standard social science model”3 shared by many anthropologists, sociologists, some psychologists, and even a few neuroscientists who view the cortical surface as “largely equipotent and free of domain-specific structure.”4 It holds that human nature is constructed, gradually and flexibly, through cultural impregnation. As a result, children born to the Inuit, to the hunter-gatherers of the Amazon, or to an Upper East Side New York family, according to this view, have little in common. Even color perception, musical appreciation, or the notion of right and wrong should vary from one culture to the next, simply because the human brain has few stable structures other than the capacity to learn.
Empiricists further maintain that the human brain, unhindered by the limitations of biology and unlike that of any other animal species, can absorb any form of culture. From this theoretical perspective, to talk about the cerebral bases of cultural inventions such as reading is thus downright irrelevant—much like analyzing the atomic composition of a Shakespeare play.
In this book, I refute this simplistic view of an infinite adaptability of the brain to culture. New evidence on the cerebral circuits of reading demonstrates that the hypothesis of an equipotent brain is wrong. To be sure, if the brain were not capable of learning, it could not adapt to the specific rules for writing English, Japanese, or Arabic. This learning, however, is tightly constrained, and its mechanisms themselves rigidly specified by our genes. The brain’s architecture is similar in all members of the Homo sapiens family, and differs only slightly from that of other primates. All over the world, the same brain regions activate to decode a written word. Whether in French or in Chinese, learning to read necessarily goes through a genetically constrained circuit.
On the basis of these data, I propose a novel theory of neurocultural interactions, radically opposed to cultural relativism, and capable of resolving the reading paradox. I call it the “neuronal recycling” hypothesis. According to this view, human brain architecture obeys strong genetic constraints, but some circuits have evolved to tolerate a fringe of variability. Part of our visual system, for instance, is not hardwired, but remains open to changes in the environment. Within an otherwise well-structured brain, visual plasticity gave the ancient scribes the opportunity to invent reading.
In general, a range of brain circuits, defined by our genes, provides “pre-representations”5 or hypotheses that our brain can entertain about future developments in its environment. During brain development, learning mechanisms select which pre-representations are best adapted to a given situation. Cultural acquisition rides on this fringe of brain plasticity. Far from being a blank slate that absorbs everything in its surroundings, our brain adapts to a given culture by minimally turning its predispositions to a different use. It is not a tabula rasa within which cultural constructions are amassed, but a very carefully structured device that manages to convert some of its parts to a new use. When we learn a new skill, we recycle some of our old primate brain circuits—insofar, of course, as those circuits can tolerate the change.
A Reader’s Guide
In forthcoming chapters, I will explain how neuronal recycling can account for literacy, its mechanisms in the brain, and even its history. In the first three chapters, I analyze the mechanisms of reading in expert adults. Chapter 1 sets the stage by looking at reading from a psychological angle: how fast do we read, and what are the main determinants of reading behavior? In chapter 2, I move to the brain areas at work when we read, and how they can be visualized using modern brain imaging techniques. Finally, in chapter 3, I come down to the level of single neurons and their organization into the circuits that recognize letters and words.
I tackle my analysis in a resolutely mechanical way. I propose to expose the cogwheels of the reader’s brain in much the same way as the Reverend Paley suggested we dismantle the watch abandoned on the heath. The reader’s brain will not, however, reveal any perfect clockwork mechanics designed by a divine watchmaker. Our reading circuits contain more than a few imperfections that betray our brain’s compromise between what is needed for reading and the available biological mechanisms. The peculiar characteristics of the primate visual system explain why reading does not operate like a fast and efficient scanner. As we move our eyes across the page, each word is slowly brought into the central region of our retina, only to be exploded into a myriad of fragments that our brain later pieces back together. It is only because these processes have become automatic and unconscious, thanks to years of practice, that we are under the illusion that reading is simple and effortless.
The reading paradox expresses the indisputable fact that our genes have not evolved in order to enable us to read. My reasoning in the face of this enigma is quite simple. If the brain did not evolve for reading, the opposite must be true: writing systems must have evolved within our brain’s constraints. Chapter 4 revisits the history of writing in this light, starting with the first prehistoric symbols and ending with the invention of the alphabet. At each step, there is evidence of constant cultural tinkering. Over many millennia, the scribes struggled to design words, signs, and alphabets that could fit the limits of our primate brain. To this day, the world’s writing systems still share a number of design features that can ultimately be traced back to the restrictions imposed by our brain circuits.
Continuing on the idea that our brain was not designed for reading, but recycles some of its circuits for this novel cultural activity, chapter 5 examines how children learn to read. Psychological research concludes that there are not many ways to convert a primate brain into that of an expert reader. This chapter explores in some detail the only developmental trajectory that appears to exist. Schools might be well advised to exploit this knowledge to optimize the teaching of reading and mitigate the dramatic effects of illiteracy and dyslexia.
I will also go on to show how a neuroscientific approach can shed light on the more mysterious features of reading acquisition. For instance, why do so many children write their first words from right to left? Contrary to the accepted idea, these mirror inversion errors are not the first signs of dyslexia, but a natural consequence of the organization of our visual brain. In a majority of children, dyslexia relates to another, quite distinct anomaly in processing speech sounds. The description of the symptoms of dyslexia, their cerebral bases, and the most recent discoveries concerning its genetic foundations are covered in chapter 6, while chapter 7 provides an insight into what mirror errors can tell us about normal visual recognition.
Finally, in chapter 8, I will return to the astonishing fact that only our species is capable of cultural inventions as sophisticated as reading—a unique feat, unmatched by any other primate. In total opposition to the standard social science model, where culture gets a free ride on a blank-slate brain, reading demonstrates how culture and brain organization are inextricably linked. Throughout their long cultural history, human beings progressively discovered that they could reuse their visual systems as surrogate language inputs, thus arriving at reading and writing. I will also briefly discuss how other major human cultural traits could be submitted to a similar analysis. Mathematics, art, music, and religion might also be looked on as evolved devices, shaped by centuries of cultural evolution, that have encroached on our primate brains.
One last enigma remains: if learning exists in all primates, why is Homo sapiens the only species with a sophisticated culture? Although the term is sometimes applied to chimpanzees, their “culture” barely goes beyond a few good tricks for splitting nuts, washing potatoes, or fishing ants with a stick—nothing comparable to the seemingly endless human production of interlocking conventions and symbol systems, including languages, religions, art forms, sports, mathematics, or medicine. Nonhuman primates can slowly learn to recognize novel symbols such as letters and digits—but they never think of inventing them. In my conclusion, I propose some tentative ideas on the singularity of the human brain. The uniqueness of our species may arise from a combination of two factors: a theory of mind (the ability to imagine the mind of others) and a conscious global workspace (an internal buffer where an infinite variety of ideas can be recombined). Both mechanisms, inscribed in our genes, conspire to make us the only cultural species. The seemingly infinite variety of human cultures is only an illusion, caused by the fact that we are locked in a cognitive vicious circle: how could we possibly imagine forms other than those our brains can conceive? Reading, although a recent invention, lay dormant for millennia within the envelope of potentialities inscribed in our brains. Behind the apparent diversity of human writing systems lies a core set of universal neuronal mechanisms that, like a watermark, reveal the constraints of human nature.
How Do We Read?
Written word processing starts in our eyes. Only the center of the retina, called the fovea, has a fine enough resolution to allow for the recognition of small print. Our gaze must therefore move around the page constantly. Whenever our eyes stop, we only recognize one or two words. Each of them is then split up into myriad fragments by retinal neurons and must be put back together before it can be recognized. Our visual system progressively extracts graphemes, syllables, prefixes, suffixes, and word roots. Two major parallel processing routes eventually come into play: the phonological route, which converts letters into speech sounds, and the lexical route, which gives access to a mental dictionary of word meanings.
The existence of the text is a silent existence, silent until the moment in which a reader reads it. Only when the able eye makes contact with the markings on the tablet does the text come to active life. All writing depends on the generosity of the reader.
—ALBERTO MANGUEL, A HISTORY OF READING
At first sight, reading seems close to magical: our gaze lands on a word, and our brain effortlessly gives us access to its meaning and pronunciation. But in spite of appearances, the process is far from simple. Upon entering the retina, a word is split up into a myriad of fragments, as each part of the visual image is recognized by a distinct photoreceptor. Starting from this input, the real challenge consists in putting the pieces back together in order to decode what letters are present, to figure out the order in which they appear, and finally to identify the word.
Over the past thirty years, cognitive psychology has worked on analyzing the mechanics of reading. Its goal is to crack the “algorithm” of visual word recognition—the series of processing steps that a proficient reader applies to the problem of identifying written words. Psychologists treat reading like a computer science problem. Every reader resembles a robot with two cameras—the two eyes and their retinas. The words we read are painted onto them. They first appear only as splotches of light and dark that are not directly interpretable as linguistic signs. Visual information must be recoded in an understandable format before we can access the appropriate sounds, words, and meanings. Thus we must have a deciphering algorithm, or a processing recipe akin to automatic character recognition software, which takes the pixels on a page as input and produces the identity of the words as output. To accomplish this feat, unbeknownst to us, our brain hosts a sophisticated set of decoding operations whose principles are only beginning to be understood.
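The comparison with character recognition software can be made concrete with a toy sketch. The Python below is purely illustrative and is not a model of what the brain does: the 3-by-5 pixel glyphs, the four-letter alphabet, the function names render and decode, and the nearest-template matching rule are all assumptions introduced here. It only shows what it means for a procedure to take splotches of light and dark as input and return the identity of a word as output.

```python
# Toy "deciphering algorithm": pixels in, word identity out.
# Everything here (glyph shapes, alphabet, matching rule) is a hypothetical
# illustration, not a description of the brain's reading circuits.

GLYPHS = {
    "c": ["###", "#..", "#..", "#..", "###"],
    "a": [".#.", "#.#", "###", "#.#", "#.#"],
    "t": ["###", ".#.", ".#.", ".#.", ".#."],
    "o": ["###", "#.#", "#.#", "#.#", "###"],
}
WIDTH, HEIGHT = 3, 5


def render(word):
    """Paint a word as rows of pixels (1 = dark, 0 = light).

    Each letter is followed by one blank column.
    """
    rows = [[] for _ in range(HEIGHT)]
    for ch in word:
        for r in range(HEIGHT):
            rows[r].extend(1 if px == "#" else 0 for px in GLYPHS[ch][r])
            rows[r].append(0)
    return rows


def decode(pixels):
    """Recover the letters, in order, by matching each patch to its closest glyph."""
    letters = []
    step = WIDTH + 1  # letter width plus the blank column
    for start in range(0, len(pixels[0]), step):
        patch = [row[start:start + WIDTH] for row in pixels]

        def mismatch(ch):
            # Count the pixels where the patch disagrees with the template.
            return sum(
                patch[r][c] != (GLYPHS[ch][r][c] == "#")
                for r in range(HEIGHT)
                for c in range(WIDTH)
            )

        letters.append(min(GLYPHS, key=mismatch))
    return "".join(letters)


if __name__ == "__main__":
    image = render("coat")
    image[0][1] ^= 1  # flip one pixel to mimic a noisy input
    print(decode(image))
```

Even this toy version has to solve the two problems named above: deciding which letters are present and keeping track of the order in which they appear. Because it picks the closest template rather than demanding an exact match, it tolerates a small amount of noise in the input.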
The Eye: A Poor Scanner
The tale of reading begins when the retina receives photons reflected off the written page. But the retina is not a homogeneous sensor. Only its central part, called the fovea, is dense in high-resolution cells sensitive to incoming light, while the rest of the retina has a coarser resolution. The fovea, which occupies about 15 degrees of the visual field, is the only part of the retina that is genuinely useful for reading. When foveal information is lacking, whether due to a retinal lesion, to a stroke having destroyed the central part of the visual cortex, or to an experimental trick that selectively blocks visual inputs to the fovea, reading becomes impossible.6
The need to bring words into the fovea explains why our eyes are in constant motion when we read. By orienting our gaze, we “scan” text with the most sensitive part of our vision, the only one that has the resolution needed to determine letters. However, our eyes do not travel continuously across the page.7 Quite the opposite: they move in small steps called saccades. At this very moment, you are making four or five of these jerky movements every second, in order to bring new information to your fovea.
Even within the fovea, visual information is not represented with the same precision at all points. In the retina as well as in the subsequent visual relays of the thalamus and of the cortex, the number of cells allocated to a given portion of the visual scene decreases progressively as one moves away from the center of gaze. This causes a gradual loss of visual precision. Visual accuracy is optimal at the center and smoothly decreases toward the periphery. We have the illusion of seeing the whole scene in front of us with the same fixed accuracy, as if it were filmed by a digital camera with a homogeneous array of pixels. However, unlike the camera, our eye sensor accurately perceives only the precise point where our gaze happens to land. The surroundings are lost in an increasingly hazy blurriness (figure 1.1).8
Figure 1.1 The retina stringently filters what we read. In this simulation, a page from Samuel Johnson’s The Adventurer (1754) was filtered using an algorithm that copies the decreasing acuity of human vision away from the center of the retina. Regardless of size, only letters close to fixation can be identified. This is why we constantly explore pages with jerky eye movements when we read. When our gaze stops, we can only identify one or two words.
One might think that, under these conditions, it is the absolute size of printed characters that determines the ease with which we can read: small letters should be harder to read than larger ones. Oddly enough, however, this is not the case. The reason is that the larger the characters, the more room they use on the retina. When a whole word is printed in larger letters, it moves into the periphery of the retina, where even large letters are hard to discern. The two factors compensate for each other almost exactly, so that an enormous word and a minuscule one are essentially equivalent from the point of view of retinal precision. Of course, this is only true provided that the size of the characters remains larger than an absolute minimum, which corresponds to the maximal precision attained at the center of our fovea. When visual acuity is diminished, for instance in aging patients, it is quite logical to recommend books in large print.
Because our eyes are organized in this way, our perceptual abilities depend exclusively on the number of letters in words, not on the space these words occupy on our retina.9 Indeed, our saccades when we read vary in absolute size, but are constant when measured in numbers of letters. When the brain prepares to move our eyes, it adapts the distance to be covered to the size of the characters, in order to ensure that our gaze always advances by about seven to nine letters. This value, which is amazingly small, thus corresponds approximately to the information that we can process in the course of a single eye fixation.
To prove that we see only a very small part of each page at a time, George W. McConkie and Keith Rayner developed an experimental method that I like to call the “Cartesian devil.” In his Metaphysical Meditations, René Descartes imagined that an evil genius was playing with our senses:
I shall then suppose, not that God who is supremely good and the fountain of truth, but some evil genius not less powerful than deceitful, has employed all his energy to deceive me; I shall consider that the heavens, the earth, colors, figures, sound, and all other external things are naught but the illusions and dreams of which this genius has availed himself in order to lay traps for my credulity. I shall consider myself as having no hands, no eyes, no flesh, no blood, nor any senses, yet falsely believing myself to possess all these things.
Much like the supercomputer in the Matrix movies, Descartes’ evil genius produces a pseudo-reality by bombarding our senses with signals carefully crafted to create an illusion of real life, a virtual scene whose true side remains forever hidden. More modestly, McConkie and Rayner designed a “moving window” that creates an illusion of text on a computer screen.10 The method consists in equipping a human volunteer with a special device that tracks eye movements and can change the visual display in real time. The device can be programmed to display only a few characters left and right of the center of gaze, while all of the remaining letters on the page are replaced with strings of x’s:
As soon as the eyes move, the computer discreetly refreshes the display. Its goal is to show the appropriate letters at the place where the person is looking, and strings of x’s everywhere else:
Using this device, McConkie and Rayner made a remarkable and paradoxical discovery. They found that the participants did not notice the manipulation. As long as enough letters are presented left and right of fixation, a reader fails to detect the trick and believes that he is looking at a perfectly normal page of text.
This surprising blindness occurs because the eye attains its maximum speed at the point when the letter change occurs. This trick makes the letter changes hard to detect, because at this very moment the whole retinal image is blurred by motion. Once gaze lands, everything looks normal: within the fovea, the expected letters are in place, and the rest of the visual field, on the periphery, cannot be read anyway. McConkie and Rayner’s experiment thus proves that we consciously process only a very small subset of our visual inputs. If the computer leaves four letters on the left of fixation, and fifteen letters on the right, reading speed remains normal.11 In brief, we extract very little information at a time from the written page. Descartes’ evil genius would only have to display twenty letters per fixation to make us believe that we were reading the Bible or the U.S. Constitution!
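The logic of the moving window is simple enough to sketch in a few lines of code. What follows is only an illustration, not McConkie and Rayner's actual software: the function name `moving_window`, the choice to leave spaces and punctuation visible, and the use of a character index in place of a real eye tracker are all assumptions made for the example. The default window of four letters to the left and fifteen to the right is taken from the figures quoted above.

```python
def moving_window(text: str, fixation: int, left: int = 4, right: int = 15,
                  mask: str = "x") -> str:
    """Return `text` with every letter outside the window replaced by `mask`.

    `fixation` is the index of the character under the center of gaze.
    Spaces and punctuation are left untouched (an assumption of this sketch).
    """
    start = max(0, fixation - left)
    end = min(len(text), fixation + right + 1)
    shown = []
    for i, ch in enumerate(text):
        inside = start <= i < end
        shown.append(ch if inside or not ch.isalpha() else mask)
    return "".join(shown)


line = "the quick brown fox jumps over the lazy dog"
# Gaze rests on the "b" of "brown" (index 10).
print(moving_window(line, 10))
# -> xxx xxick brown fox jumps xxxx xxx xxxx xxx
```

In a real experiment the display would be recomputed during each saccade, with `fixation` supplied by the eye tracker rather than chosen by hand.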
Twenty letters is, in fact, an overestimate. We identify only ten or twelve letters per saccade: three or four to the left of fixation, and seven or eight to the right. Beyond this point, we are largely insensitive to letter identity and merely encode the presence of the spaces between words. By providing cues about word length, the spaces allow us to prepare our eye movements and ensure that our gaze lands close to the center of the next word. Experts continue to debate about the extent to which we extract information from an upcoming word—perhaps only the first few letters. Everyone agrees, however, that the direction of reading imposes asymmetry on our span of vision. In the West, visual span is much greater toward the right side, but in readers of Arabic or Hebrew, where gaze scans the page from right to left, this asymmetry is reversed.12 In other writing systems such as Chinese, where character density is greater, saccades are shorter and visual span is reduced accordingly. Each reader thus adapts his visual exploration strategy to his language and script.
Using the same method, we can also estimate how much time is needed to encode the identity of words. A computer can be programmed so that, after a given duration, all of the letters are replaced by a string of x’s, even in the fovea. This experiment reveals that fifty milliseconds of presentation are enough for reading to proceed at an essentially normal pace. This does not mean that all of the mental operations involved in reading are completed in one-twentieth of a second. As we shall see, a whole pipeline of mental processes continues to operate for at least one-half second after the word has been presented. However, the initial intake of visual information can be very brief.
In summary, our eyes impose a lot of constraints on the act of reading. The structure of our visual sensors forces us to scan the page by jerking our eyes around every two or three tenths of a second. Reading is nothing but the word-by-word mental restitution of a text through a series of snapshots. While some small grammatical words like “the,” “it,” or “is” can sometimes be skipped, almost all content words such as nouns and verbs have to be fixated at least once.
These constraints are an integral part of our visual apparatus and cannot be lifted by training. One can certainly teach people to optimize their eye movement patterns, but most good readers, who read from four hundred to five hundred words per minute, are already close to optimal. Given the retinal sensor at our disposal, it is probably not possible to do much better. A simple demonstration proves that eye movements are the rate-limiting step in reading.13 If a full sentence is presented, word by word, at the precise point where gaze is focused, thus avoiding the need for eye movements, a good reader can read at staggering speed: a mean of eleven hundred words per minute, and up to sixteen hundred words per minute for the best readers, which is about one word every forty milliseconds and three to four times faster than normal reading! With this method, called rapid serial visual presentation, or RSVP, identification and comprehension remain satisfactory, thus suggesting that the duration of those central steps does not impose a strong constraint on normal reading. Perhaps this computerized presentation mode represents the future of reading in a world where screens progressively replace paper.
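The arithmetic behind these figures is easy to check. The short sketch below converts a reading rate into the time available for each word and builds a word-by-word presentation schedule. The function names `ms_per_word` and `rsvp_schedule` are invented for the illustration, and the fixed duration per word is a simplifying assumption; nothing in the text says that real presentation software gives every word the same time.

```python
def ms_per_word(words_per_minute: float) -> float:
    """Time available for each word, in milliseconds."""
    return 60_000 / words_per_minute


def rsvp_schedule(sentence: str, words_per_minute: float):
    """Return (onset in ms, word) pairs for a one-word-at-a-time display."""
    step = ms_per_word(words_per_minute)
    return [(round(i * step, 1), word)
            for i, word in enumerate(sentence.split())]


# Rates quoted in the text: normal reading versus word-by-word presentation.
for rate in (400, 500, 1100, 1600):
    print(f"{rate} words per minute: {ms_per_word(rate):.1f} ms per word")

print(rsvp_schedule("reading is fast", 1600))
```

At sixteen hundred words per minute each word gets 37.5 milliseconds, which is the "about forty milliseconds" mentioned above, while four hundred words per minute leaves 150 milliseconds per word.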
At any rate, as long as text is presented in pages and lines, acquisition through gaze will slow reading and impose an unavoidable limitation. Thus, fast reading methods that advertise gains in reading speed of up to one thousand words per minute or more must be viewed with skepticism.14 One can no doubt broaden one’s visual span somewhat, in order to reduce the number of saccades per line, and it is also possible to learn to avoid moments of regression, where gaze backtracks to the words it has just read. However, the physical limits of the eyes cannot be overcome, unless one is willing to skip words and thus run the risk of a misunderstanding. Woody Allen described this situation perfectly: “I took a speed-reading course and was able to read War and Peace in twenty minutes. It involves Russia.”
The Search for Invariants
Can you read, Lubin?
Yes, I can read printed letters, but I was never able to read handwriting.
—MOLIÈRE, GEORGES DANDIN
Reading poses a difficult perceptual problem. We must identify words regardless of how they appear, whether in print or handwritten, in upper- or lowercase, and regardless of their size. This is what psychologists call the invariance problem: we need to recognize which aspect of a word does not vary—the sequence of letters—in spite of the thousand and one possible shapes that the actual characters can take on.
If perceptual invariance is a problem, it is because words are not always in the same location, in the same font, or in the same size. If they were, just listing which of the cells on the retina are active and which are not would suffice to decode a word, much like a black-and-white computer image is defined by the list of its pixels. In fact, however, hundreds of different retinal images can stand for the same word, depending on the form in which it is written (figure 1.2). Thus one of the first steps in reading must be to correct for the immense variety of those surface forms.
Figure 1.2 Visual invariance is one of the prime features of the human reading system. Our word recognition device meets two seemingly contradictory requirements: it neglects irrelevant variations in character shape, even if they are huge, but amplifies relevant differences, even if they are tiny. Unbeknownst to us, our visual system automatically compensates for enormous variations in size or font. Yet it also attends to minuscule changes in shape. By turning an “s” into an “e,” and therefore “sight” into “eight,” a single mark drastically reorients the processing chain toward entirely distinct pronunciations and meanings.
Several cues suggest that our brain applies an efficient solution to this perceptual invariance problem. When we hold a newspaper at a reasonable distance, we can read both the headlines and the classified ads. Word size can vary fiftyfold without having much impact on our reading speed. This task is not very different from that of recognizing the same face or object from a distance of two feet or thirty yards—our visual system tolerates vast changes in scale.
A second form of invariance lets us disregard the location of words on the page. As our gaze scans a page, the center of our retina usually lands slightly left of the center of words. However, our targeting is far from perfect, and our eyes sometimes reach the first or last letter without this preventing us from recognizing the word. We can even read words presented on the periphery of our visual field, provided that letter size is increased to compensate for the loss of retinal resolution. Thus size constancy goes hand in hand with normalization for spatial location.
Finally, word recognition is also largely invariant for character shape. Now that word processing software is omnipresent, technology that was formerly reserved to a small elite of typographers has become broadly available. Everyone knows that there are many sets of characters called “fonts” (a term left over from the time when each character had to be cast in lead at a type foundry before going to the press). Each font also has two kinds of characters called “cases,” the UPPERCASE and the lowercase (originally the case was a flat box divided into many compartments where lead characters were sorted; the “upper case” was reserved for capital letters, and the “lower case” for the rest). Finally, one can choose the “weight” of a font (normal or bold characters), its inclination (italics, originally invented in Italy), whether it is underlined or not, and any combination of these options. These well-calibrated variations in fonts, however, are nothing compared to the enormous variety of writing styles. Manuscript handwriting obviously takes us to another level of variability and ambiguity.
In the face of all these variations, exactly how our visual system learns to categorize letter shapes remains somewhat mysterious. Part of this invariance problem can be solved using relatively simple means. The vowel “o,” for instance, can easily be recognized, regardless of size, case, or font, thanks to its unique closed shape. Thus, building a visual o detector isn’t particularly difficult. Other letters, however, pose specific problems. Consider the letter “r,” for instance. Although it seems obvious that shapes as different as r and R represent the same letter, careful examination shows that this association is entirely arbitrary: the shape e, for instance, might serve just as well as the lowercase version of the letter “R.” Only the accidents of history have left us this cultural oddity. As a result, when we learn to read, we must not only learn that letters map onto the sounds of language, but also that each letter can take on many unrelated shapes. As we shall see, our capacity to do this probably comes from the existence of abstract letter detectors, neurons that can recognize the identity of a letter in its various guises. Experiments show that very little training suffices to DeCoDe, At An EsSeNtIaLly NoRmAl SpEeD, EnTiRe SeNtEnCes WhOsE LeTtErS HaVe BeEn PrInTeD AlTeRnAtElY iN uPpErCaSe aNd In LoWeRcAsE.15 In the McConkie and Rayner “evil genius” computer, this letter-case alternation can be changed in between every eye saccade, totally unbeknownst to the reader!16 In our daily reading experience, we never see words presented in alternating case, but our letter normalization processes are so efficient that they easily resist such transformation.
In passing, these experiments demonstrate that global word shape does not play any role in reading. If we can immediately recognize the identity of “words,” “WORDS,” and “WoRdS,” it is because our visual system pays no attention to the contours of words or to the pattern of ascending and descending letters: it is only interested in the letters they contain. Obviously, our capacity to recognize words does not depend on an analysis of their overall shape.
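A toy program can make the idea of a case-invariant letter code concrete. The sketch below is in no way a model of the visual system; it merely generates alternating-case stimuli of the kind used in these experiments and shows that a code based on abstract letter identity treats “words,” “WORDS,” and “WoRdS” as the same string. The function names `alternate_case` and `abstract_letters` are assumptions introduced for the example.

```python
def alternate_case(text: str, start_upper: bool = True) -> str:
    """Print successive letters alternately in uppercase and lowercase."""
    out = []
    upper = start_upper
    for ch in text:
        if ch.isalpha():
            out.append(ch.upper() if upper else ch.lower())
            upper = not upper          # flip case after every letter
        else:
            out.append(ch)             # spaces and punctuation pass through
    return "".join(out)


def abstract_letters(text: str) -> str:
    """A stand-in for abstract letter identities: discard case entirely."""
    return text.casefold()


stimulus = alternate_case("words")
print(stimulus)                                                # WoRdS
print(abstract_letters(stimulus) == abstract_letters("WORDS"))  # True
```

The overall outline of “WoRdS” is unlike that of either “words” or “WORDS,” yet the abstract code is identical in all three cases.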
Although our visual system efficiently filters out visual differences that are irrelevant to reading, such as the distinction between “R” and “r,” it would be a mistake to think that it always discards information and simplifies shapes. On the contrary, in many cases it must preserve, and even amplify, the minuscule details that distinguish two very similar words from each other. Consider the words “eight” and “sight.” We immediately access their very distinct meanings and pronunciations, but it is only when we look more closely at them that we realize that the difference is only a few pixels. Our visual system is exquisitely sensitive to the minuscule difference between “eight” and “sight,” and it amplifies it in order to send the input to completely different regions of semantic space. At the same time, it pays very little attention to other much greater differences, such as the distinction between “eight” and “EIGHT.”
As with invariance for case, this capacity to attend to relevant details results from years of training. The same reader who immediately spots the difference between the letters “e” and “o,” and the lack of difference between two typefaces’ renderings of “a,” may not notice that two similar-looking Hebrew letters differ sharply, a fact that seems obvious to any Hebrew reader.
Every Word Is a Tree
Our visual system deals with the problem of invariant word recognition using a well-organized system. As we shall see in detail in chapter 2, the flow of neuronal activity that enters into the visual brain gets progressively sorted into meaningful categories. Shapes that appear very similar, such as “eight” and “sight,” are sifted through a series of increasingly refined filters that progressively separate them and attach them to distinct entries in a mental lexicon, a virtual dictionary of all the words we have ever encountered. Conversely, shapes like “eight” and “EIGHT,” which are composed of distinct visual features, are initially encoded by different neurons in the primary visual area, but are progressively recoded until they become virtually indistinguishable. Feature detectors recognize the similarity of the letters “i” and “I.” Other, slightly more abstract letter detectors classify “e” and “E” as two forms of the same letter. In spite of the initial differences, the reader’s visual system eventually encodes the very essence of the letter strings “eight” and “EIGHT,” regardless of their exact shape. It gives these two strings the same single mental address, an abstract code capable of orienting the rest of the brain toward the pronunciation and meaning of the word.
What does this address look like? According to some models, the brain uses a sort of unstructured list that merely provides the sequence of letters E-I-G-H-T. In others, it relies on a very abstract and conventional code, akin to a random cipher in which one arbitrary string of symbols would stand for the word “eight” and another for the word “sight.” Contemporary research, however, supports another hypothesis. Every written word is probably encoded by a hierarchical tree in which letters are grouped into larger-sized units, which are themselves grouped into syllables and words, much as a human body can be represented as an arrangement of legs, arms, torso, and head, each of which can be further broken down into simple parts.
A good example of the mental decomposition of words into relevant units can be found if we dissect the word “unbuttoning.” We must first strip off the prefix “un” and the familiar suffix or grammatical ending “ing.” Both frame the central element, the word inside the word: the root “button.” All three of these components are called “morphemes”—the smallest units that carry some meaning. Each word is characterized, at this level, by how its morphemes are put together. Breaking down a word into its morphemes even allows us to understand words that we have never seen before, such as “reunbutton” or “deglochization” (we understand that this is the undoing of the action of “gloching,” whatever that may be). In some languages, such as Turkish or Finnish, morphemes can be assembled into very large words that convey as much information as a full English sentence. In those languages, but also in ours, the decomposition of a word into its morphemes is an essential step on the path that leads from vision to meaning.
A lot of experimental data show that, very quickly and even downright unconsciously, our visual system snips out the morphemes of words. For instance, if I were to flash the word “departure” on a computer screen, you would later say the word “depart” slightly faster when confronted with it. The presentation of “departure” seems to preactivate the morpheme [depart], thus facilitating its access. Psychologists speak of a “priming” effect—the reading of a word primes the recognition of related words, much as one primes a pump. Importantly, this priming effect does not depend solely on visual similarity: words that look quite different but share a morpheme, such as “can” and “could,” can prime each other, whereas words that look alike but bear no intimate morphological relation, such as “aspire” and “aspirin,” do not. Priming also does not require any resemblance at the level of meaning; words such as “hard” and “hardly,” or “depart” and “department,” can prime each other, even though their meanings are essentially unrelated.17 Getting down to the morpheme level seems to be of such importance for our reading system that it is willing to make guesses about the decomposition of words. Our reading apparatus dissects the word “department” into [depart] + [ment] in the hope that this will be useful to the next operators computing its meaning.18 Never mind that this does not work all the time—a “listless” person is not one who is waiting for a grocery list, nor does sharing an “apartment” imply that you and your partner will soon live apart. Such parsing errors will have to be caught at other stages in the word dissection process.
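The blind, guess-first character of this dissection can be imitated by a very crude program. The sketch below strips any affix it recognizes, without asking whether the result makes sense. The short lists of prefixes and suffixes, the minimum stem length, and the function name `decompose` are all assumptions chosen to cover the examples in this section; real morphological parsing, in the brain or in software, is far richer.

```python
PREFIXES = ("un", "re")                        # toy list, for illustration only
SUFFIXES = ("ing", "ment", "less", "ly", "ure")
MIN_STEM = 3                                   # never strip down to a tiny stem


def decompose(word: str) -> list:
    """Greedily peel known prefixes and suffixes off a word."""
    stem = word.lower()
    prefixes, suffixes = [], []
    stripped = True
    while stripped:
        stripped = False
        for p in PREFIXES:
            if stem.startswith(p) and len(stem) - len(p) >= MIN_STEM:
                prefixes.append(p)
                stem = stem[len(p):]
                stripped = True
                break
        for s in SUFFIXES:
            if stem.endswith(s) and len(stem) - len(s) >= MIN_STEM:
                suffixes.insert(0, s)          # keep suffixes in reading order
                stem = stem[:-len(s)]
                stripped = True
                break
    return prefixes + [stem] + suffixes


for w in ("unbuttoning", "reunbutton", "department", "listless", "apartment"):
    print(w, "->", decompose(w))
```

Like the reader's visual system, the sketch happily parses “department” as [depart] + [ment] and “listless” as [list] + [less]; deciding that these guesses are misleading is left to later stages.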
If we continue to undress the word “unbuttoning,” the morpheme [button] itself is not an indivisible whole. It is made up of two syllables, [bʌ] and [ton], each of which can be broken down into individual consonants and vowels: [b] [ʌ] [t] [o] [n]. Here lies another essential unit in our reading system: the grapheme, a letter or series of letters that maps onto a phoneme in the target language. Note that in our example, the two letters “tt” map onto a single sound t.19 Indeed, the mapping of graphemes onto phonemes isn’t always a direct process. In many languages, graphemes can be constructed out of a group of letters. English has a particularly extensive collection of complex graphemes such as “ough,” “oi,” and “au.”
Our visual system has learned to treat these groups of letters as bona fide units, to the point where we no longer pay attention to their actual letter content. Let us do a simple experiment to prove this point. Examine the following list of words and mark those that contain the letter “a”:
Did you feel that you had to slow down, ever so slightly, for the last three words, “coat,” “please,” and “meat”? They all contain the letter “a,” but it is embedded in a complex grapheme that is not pronounced like an “a.” If we were to rely only on letter detectors in order to detect the letter “a,” the parsing of the word into its graphemes would not matter. However, actual measurement of response times clearly shows that our brain does not stop at the single-letter level. Our visual system automatically regroups letters into higher-level graphemes, thus making it harder for us to see that groups of letters such as “ea” actually contain the letter “a.”20
In turn, graphemes are automatically grouped into syllables. Here is another simple demonstration of this fact. You are going to see five-letter words. Some letters are in bold type, others in a normal font. Concentrate solely on the middle letter, and try to decide if it is printed in normal or in bold type:
List 1: HORNY RIDER GRAVY FILET
List 2: VODKA METRO HANDY SUPER
Did you feel that the first list was slightly more difficult than the second? In the first list, the bold characters do not respect syllable boundaries—in “RIDER,” for instance, the “D” is printed in bold type while the rest of the syllable is in normal type. Our mind tends to group together the letters that make up a syllable, thus creating a conflict with the bold type that leads to a measurable slowing down of responses.21 This effect shows that our visual system cannot avoid automatically carving up words into their elementary constituents even when it would be better not to do so.
The nature of these constituents remains a hot topic in research. It would appear that multiple levels of analysis can coexist: a single letter at the lowest level, then a pair of letters (or “bigram,” an important unit to which we will return later), the grapheme, the syllable, the morpheme, and finally the whole word. The final point in visual processing leaves the word parsed out into a hierarchical structure, a tree made up of branches of increasing sizes whose leaves are the letters.
Reduced to a skeleton, stripped of all its irrelevant features like font, case, and size, the letter string is thus broken down into the elementary components that will be used by the rest of the brain to compute sound and meaning.
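For readers who think in data structures, the tree for “unbuttoning” can be written out explicitly. The sketch below is hand-built and purely illustrative: the `Node` class, the helper names, and in particular the exact placement of syllable and grapheme boundaries (for instance, treating “ng” as one grapheme and attaching “tt” to the second syllable of “button”) are assumptions made for the example, not claims about the brain's actual code. The bigram level is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                                   # word, morpheme, syllable, grapheme, letter
    children: List["Node"] = field(default_factory=list)
    letter: str = ""                            # filled in only for leaves


def group(kind: str, *children: Node) -> Node:
    return Node(kind, list(children))


def grapheme(letters: str) -> Node:
    return group("grapheme", *(Node("letter", letter=ch) for ch in letters))


def spell(node: Node) -> str:
    """Read the leaves from left to right to recover the letter string."""
    if node.kind == "letter":
        return node.letter
    return "".join(spell(child) for child in node.children)


def count(node: Node, kind: str) -> int:
    """Count the units of a given kind anywhere in the tree."""
    own = 1 if node.kind == kind else 0
    return own + sum(count(child, kind) for child in node.children)


UNBUTTONING = group(
    "word",
    group("morpheme", group("syllable", grapheme("u"), grapheme("n"))),
    group("morpheme",
          group("syllable", grapheme("b"), grapheme("u")),
          group("syllable", grapheme("tt"), grapheme("o"), grapheme("n"))),
    group("morpheme", group("syllable", grapheme("i"), grapheme("ng"))),
)

print(spell(UNBUTTONING))
for kind in ("morpheme", "syllable", "grapheme", "letter"):
    print(kind, count(UNBUTTONING, kind))
```

Reading the leaves in order gives back the original eleven letters, while the inner nodes record how those letters are bundled at each level.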
The Silent Voice
Writing—this ingenious art to paint words and speech for the eyes.
—GEORGES DE BRÉBEUF (FRENCH POET, 1617–1661)
When he paid a visit to Ambrose, then bishop of Milan, Augustine observed a phenomenon that he judged strange enough to be worth noting in his memoirs:
When [Ambrose] read, his eyes scanned the page and his heart sought out the meaning, but his voice was silent and his tongue was still. Anyone could approach him freely and guests were not commonly announced, so that often, when we came to visit him, we found him reading like this in silence, for he never read aloud.22
In the middle of the seventh century, the theologian Isidore of Seville similarly marveled that “letters have the power to convey to us silently the sayings of those who are absent.” At the time, it was customary to read Latin aloud. To articulate sounds was a social convention, but also a true necessity: confronted with pages in which the words were glued together, without spaces, in a language that they did not know well, most readers had to mumble through the texts like young children. This is why Ambrose’s silent reading was so surprising, even if for us it has become a familiar experience: we can read without articulating sounds.
Whether our mind ever goes straight from the written word to its meaning without accessing pronunciation, or whether it unconsciously transforms letters into sound and then sound into meaning, has been the topic of considerable discussion. The organization of the mental pathways for reading fueled a debate that divided the psychological community for over thirty years. Some thought that the transformation from print to sound was essential: written language, they argued, is just a by-product of spoken language, and we therefore have to sound the words out, through a phonological route, before we have any hope of recovering their meaning. For others, however, phonological recoding was just a beginner’s trait characteristic of young readers. In more expert readers, reading efficiency was based on a direct lexical route straight from the letter string to its meaning.
Nowadays, a consensus has emerged: in adults, both reading routes exist, and both are simultaneously active. We all enjoy direct access to word meaning, which spares us from pronouncing words mentally before we can understand them. Nevertheless, even proficient readers continue to use the sounds of words, even if they are unaware of it. Not that we articulate words covertly—we do not have to move our lips, or even prepare an intention to do so. At a deeper level, however, information about the pronunciation of words is automatically retrieved. Both the lexical and phonological pathways operate in parallel and reinforce each other.
There is abundant proof that we automatically access speech sounds while we read. Imagine, for instance, that you are presented with a list of strings and have to decide whether each one is a real English word or not. Mind you, you only have to decide if the letters spell out an English word. Here you go:
You perhaps hesitated when the letters sounded like a real word—as in “demon,” “carpet,” or “knee.” This interference effect can easily be measured in terms of response times. It implies that each string is converted into a sequence of sounds that is evaluated like a real word, even though the process goes against the requested task.23
Mental conversion into sound plays an essential role when we read a word for the first time—say, the string “Kalashnikov.” Initially, we cannot possibly access its meaning directly, since we have never seen the word spelled out. All we can do is convert it into sound, find that the sound pattern is intelligible, and, through this indirect route, come to understand the new word. Thus, sounding is often the only solution when we encounter a new word. It is also indispensable when we read misspelled words. Consider the little-known story by Edgar Allan Poe called “The Angel of the Odd.” In it, a strange character mysteriously intrudes into the narrator’s apartment, “a personage nondescript, although not altogether indescribable,” and with a German accent as thick as British fog:
“Who are you, pray?” said I, with much dignity, although somewhat puzzled; “how did you get here? And what is it you are talking about?”
“Az vor ow I com’d ere,” replied the figure, “dat iz none of your pizzness; and as vor vat I be talking apout, I be talk apout vat I tink proper; and as vor who I be, vy dat is de very ting I com’d here for to let you zee for yourzelf. . . . Look at me! Zee! I am te Angel ov te Odd.”
“And odd enough, too,” I ventured to reply; “but I was always under the impression that an angel had wings.”
“Te wing!” he cried, highly incensed, “vat I pe do mit te wing? Mein Gott! Do you take me vor a shicken?”
In reading this passage we return to a style we had long forgotten, one that dates back to our childhood: the phonological route, or the slow transformation of totally novel strings of letters into sounds that miraculously become intelligible, as though someone were whispering them at us.
What about everyday words, however, that we have already met a thousand times? We do not get the impression that we slowly decode through mental enunciation. However, clever psychological tests show that we still activate their pronunciation at a nonconscious level. For instance, suppose that you are asked to indicate which of the following words refer to parts of the human body. These are all very familiar words, so you should be able to focus on their meaning and neglect their pronunciation. Try it:
Perhaps you felt the urge to respond to the word “hare,” which sounds like a body part. Experiments show that we slow down and make mistakes on words that sound like an item in the target category.24 It is not clear how we could recognize this homophony if we did not first mentally retrieve the word’s pronunciation. Only an internal conversion into speech sounds can explain this type of error. Our brain cannot help but transform the letters “h-a-r-e” into internal speech and then associate it with a meaning—a process that can go wrong in rare cases where the string sounds like another well-known word.
Of course, this imperfect design is also what grants us one of the great pleasures of life: puns, or the “joy of text,” as the humorist Richard Lederer puts it. Without the gift of letter-to-sound conversion, we would not be able to enjoy Mae West’s wisecrack (“She’s the kind of girl who climbed the ladder of success wrong by wrong”) or Conan Doyle’s brother-in-law’s quip (“there’s no police like Holmes”). Without Augustine’s “silent voice,” the pleasure of risqué double entendres would be denied us:
An admirer says to President Lincoln, “Permit me to introduce my family. My wife, Mrs. Bates. My daughter, Miss Bates. My son, Master Bates.”
“Oh dear!” replied the president.25
Further proof that our brain automatically accesses a word’s sound patterns derives from subliminal priming. Suppose that I flash the word “LATE” at you, immediately followed by the word “mate,” and ask you to read the second word as fast as you can. The words are shown in a different case in order to avoid any low-level visual resemblance. Nevertheless, when the first word sounds and spells like the second, as in this example, we would observe a massive acceleration of reading time, in comparison with a situation where two words are not particularly related to one another (“BOWL” followed by “mate”). Part of this facilitation clearly arises from similarity at the level of spelling alone. To flash “MATH” eases recognition of “mate,” even though the two strings sound quite different. However, crucially, even greater facilitation can be found when two words share the same pronunciation (“LATE” followed by “mate”), and this sound-based priming works even when spelling is completely different (“EIGHT” followed by “mate”). Thus, pronunciation seems to be automatically extracted. As one might expect, however, spelling and sound are not encoded at the same time. It takes only twenty or thirty milliseconds of word viewing for our brain to automatically activate a word’s spelling, but an additional forty milliseconds for its transformation into sound, as revealed by the emergence of sound-based priming.26
Simple experiments thus lead us to outline a whole stream of successive stages in the reader’s brain, from marks on the retina to their conversion into letters and sounds. Any expert reader quickly converts strings into speech sounds effortlessly and unconsciously.
The Limits of Sound
Covert access to the pronunciation of written words is an automatic step in reading, but this conversion may not be indispensable. Spelling-to-sound conversion is often slow and inefficient. Our brain thus often tries to retrieve a word’s meaning using a parallel and more direct pathway that leads straight from the letter string to the associated entry in the mind’s lexicon.
To boost our intuition about the direct lexical route, we have to consider the plight of a make-believe reader who would only be able to mentally enunciate written words. It would be impossible for him to discriminate between homophonic words like “maid” and “made,” “raise” and “raze,” “board” and “bored,” or “muscles” and “mussels.” Purely on the basis of sound, he might think that serial killers hate cornfields, and that one-carat diamonds are an odd shade of orange. The very fact that we readily discern the multiple meanings of such homophonic words shows that we are not obliged to pronounce them—another route is available that allows our brain to solve any ambiguity and go straight to their meaning.
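The plight of this make-believe reader can be made concrete with a toy model. The short Python sketch below is only an illustration: the six-word lexicon, the informal sound codes, and the function names `sound_route` and `lexical_route` are assumptions introduced here, not a model of the brain or of any experiment described in this chapter.

```python
# Toy lexicon: spelling -> (informal sound code, meaning).
# Entries and sound codes are illustrative assumptions, not phonetic data.
LEXICON = {
    "maid": ("meyd", "a female servant"),
    "made": ("meyd", "past tense of 'make'"),
    "serial": ("sirial", "occurring in a series"),
    "cereal": ("sirial", "a grain crop or breakfast food"),
    "carat": ("karut", "a unit of weight for gems"),
    "carrot": ("karut", "an orange root vegetable"),
}


def sound_route(spelling):
    """Convert the spelling to sound, then return every meaning sharing that sound."""
    sound, _ = LEXICON[spelling]
    return sorted(meaning for s, meaning in LEXICON.values() if s == sound)


def lexical_route(spelling):
    """Go straight from the letter string to its own entry in the lexicon."""
    return LEXICON[spelling][1]


print(sound_route("serial"))    # two candidate meanings: the sound alone is ambiguous
print(lexical_route("serial"))  # one meaning: the spelling settles the matter
```

In this sketch the sound-only reader is left with two candidate meanings for “serial,” whereas the direct route returns a single one, which is all that the argument above requires.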
A further problem exists for purely sound-based theories of reading: the route from spelling to sound is not a high-speed highway devoid of obstacles. To derive a word’s pronunciation from the sequence of its letters is often impossible in the absence of additional help. Consider the word “blood.” It seems obvious that it should be pronounced blud and that it rhymes with “bud” or “mud.” But how do we know this? Why shouldn’t “blood” rhyme with “food” or “good”? Why doesn’t it sound like “bloom” or “bloomer”? Even the same word root can be pronounced differently, as in “sign” and “signature.” Some words are so exceptional that it is hard to see how their pronunciation relates to their component letters (“colonel,” “yacht,” “though” . . .). In such cases, the word’s pronunciation cannot be computed without prior knowledge of the word.
English spelling bristles with irregularities. Indeed, the gap between written and spoken language is centuries old, as attested by William Shakespeare in Love’s Labour’s Lost, where the pedant Holofernes says:
I abhor such fanatical phantasimes, such insociable and point-devise companions; such rackers of orthography, as to speak dout, fine, when he should say doubt; det, when he should pronounce debt—d, e, b, t, not d, e, t: he clepeth a calf, cauf; half, hauf; neighbour vocatur nebor; neigh abbreviated ne. This is abhominable—which he would call abbominable: it insinuateth me of insanie.
English is an abominably irregular language. George Bernard Shaw pointed out that the word “fish” might be spelled ghoti: gh as in “enough,” o as in “women,” and ti as in “lotion”! Shaw hated the irregularities of English spelling so much that in his will he provided for a contest to design a new and fully rational alphabet called “Shavian.” Unfortunately, it never met with much success, probably because it departed too much from all other existent spelling systems.27
Of course, Shaw’s example is far-fetched: no one would ever read ghoti as “fish,” because the letter “g,” when placed at the beginning of a word, is always pronounced as a hard g or a j, never as an f. Likewise, Shakespeare notwithstanding, in present-day English the letters “alf” at the end of a word are always pronounced af, as in “calf” and “half.” If letters are taken in context, it is often possible to identify some higher-order regularities that simplify the mapping of letters onto sounds. Even then, however, exceptions remain numerous—“has” and “was,” “tough” and “dough,” “flour” and “tour,” “header” and “reader,” “choir” and “chair,” “friend” and “fiend.” For most irregular words, the recovery of pronunciation, far from being the source of word comprehension, seems to depend on its outcome: it is only after we have recognized the identity of the word “dough” that we can recover its sound pattern.
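Both points, the usefulness of context and the stubbornness of exceptions, can be sketched in a few lines of Python. Everything in the example is a simplifying assumption made for illustration: the function name `sound_of_gh`, the four-word exception list, and the informal sound labels are not a complete account of English “gh.”

```python
# Words whose "gh" sound must simply be known in advance (illustrative sample only).
KNOWN_WORDS = {
    "enough": "f",
    "tough": "f",
    "dough": "silent",
    "though": "silent",
}


def sound_of_gh(word):
    """Return an informal label for how 'gh' sounds in the given word."""
    if word in KNOWN_WORDS:
        # The word has been recognized: pronunciation follows from its identity.
        return KNOWN_WORDS[word]
    if word.startswith("gh"):
        # Contextual rule: at the beginning of a word, "gh" is a hard g, never f.
        return "g"
    # No rule applies and the word is unknown: the sound cannot be computed.
    return "unknown"


for word in ["ghoti", "ghost", "tough", "dough", "plough"]:
    print(word, "->", sound_of_gh(word))
```

Under these assumptions “ghoti” comes out with a hard g, exactly as the positional rule demands, while “tough” and “dough” are pronounced correctly only because they sit in the list of known words. An unlisted word such as “plough” is left undecided.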
The Hidden Logic of Our Spelling System
One may wonder why English sticks to such a complicated spelling system. Indeed, Italians do not meet with the same problems. Their spelling is transparent: every letter maps onto a single phoneme, with virtually no exceptions. As a result, it only takes a few months to learn to read. This gives Italians an enormous advantage: their children’s reading skills surpass ours by several years, and they do not need to spend hours of schooling a week on dictation and spelling out loud. Furthermore, as we shall discuss later, dyslexia is a much less serious problem for them. Perhaps we should follow Italy’s lead, burn all our dictionaries and desine a noo speling sistem dat eeven a θree-yia-old tchaild cood eezilee reed.
There is no doubt that English spelling could be simplified. The weight of history explains a lot of its peculiarities—today’s pupils should lament the loss of the battle of Hastings, because the mixture of French and English that ensued is responsible for many of our spelling headaches, such as the use of the letter “c” for the sound s (as in “cinder”). Centuries of academic conservatism, sometimes bordering on pedantry, have frozen our dictionary. Well-meaning academics even introduced spelling absurdities such as the “s” in the word “island,” a misguided Renaissance attempt to restore the etymology of the Latin word insula. Worst of all, English spelling failed to evolve in spite of the natural drift of oral language. The introduction of foreign words and spontaneous shifts in English articulation have created an immense gap between the way we write and the way we speak, which causes years of unnecessary suffering for our children. In brief, reason calls for a radical simplification of English spelling.
Nevertheless, before any revisions can be made, it is essential to fully understand the hidden logic of our spelling system. Spelling irregularities are not just a matter of convention. They also originate in the very structure of our language and of our brains. The two reading routes, either from spelling to sound or from spelling to meaning, place complex and often irreconcilable constraints on any writing system. The linguistic differences between English, Italian, French, and Chinese are such that no single spelling solution could ever suit them all. Thus the abominable irregularity of English spelling appears inevitable. Although spelling reform is badly needed, it will have to struggle with a great many restrictions.
First of all, it is not clear that English spelling, like Italian, could attribute a single letter to each sound, and a fixed sound to each letter. It would not be a simple thing to do because the English language contains many more speech sounds than Italian. The number of English phonemes ranges from forty to forty-five, depending on speakers and counting methods, while Italian has only thirty. Vowels and diphthongs are particularly abundant in English: there are six simple vowels (as in bat, bet, bit, but, good, and pot), but also five long vowels (as in beef, boot, bird, bard, and boat) and at least seven diphthongs (as in bay, boy, toe, buy, cow, beer, bear). If each of those sounds were granted its own written symbol, we would have to invent new letters, placing an additional burden on our children. We could consider the addition of accents to existing letters, such as ã, õ, or ü. However, it is entirely utopian to imagine a universal alphabet that could transcribe all of the world’s languages. Such a spelling system does exist: it is called the International Phonetic Alphabet and it plays an important role in technical publications by phonologists and linguists. However, this writing system is so complex that it would be ineffective in everyday life. The International Phonetic Alphabet has 170 signs, some of which are particularly complex. Even specialists find it very hard to read fluently without the help of a dictionary.
In order to avoid learning an excessive number of symbol shapes, languages with a great many phonemes, such as English and French, all resort to a compromise. They indicate certain vowels or consonants using either special characters such as ü, or groups of letters like “oo” or “oy.” Such peculiarities, which are unique to any given language, are far from being gratuitous embellishments: they play an essential role in the “mental economy” of reading, and have to find their place in any kind of spelling reform.
Although we cannot easily assign a single letter shape to each speech sound, we could perhaps try the opposite. Many spelling errors could be avoided if we systematically transcribed each sound with a fixed letter. For instance, if we were to avoid writing the sound f with both the letter “f” and with “ph,” life would be much simpler. There is little doubt that we could easily get rid of this and many other useless redundancies whose acquisition eats up many years of childhood. In fact, this is the timid direction that American spelling reform took when it simplified the irregular British spellings of “behaviour” or “analyse” into “behavior” and “analyze.” Many more steps could have been taken along the same lines. As expert readers, we cease to be aware of the absurdity of our spelling. Even a letter as simple as “x” is unnecessary, as it stands for two phonemes ks that already have their own spelling. In Turkey, one takes a “taksi.” That country, which in the space of one year (1928–29) adopted the Roman alphabet, drastically simplified its spelling, and taught three million people how to read, sets a beautiful example of the feasibility of spelling reform.
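The kind of simplification contemplated here amounts to a short list of substitution rules. The Python sketch below applies the two mentioned above, “ph” to “f” and “x” to “ks.” The rule table and the function name `simplify` are assumptions made for illustration, and the rules are applied blindly, letter by letter, with no knowledge of the words themselves.

```python
# Hypothetical one-sound-one-spelling rules, taken from the two examples in the text.
REPLACEMENTS = [
    ("ph", "f"),   # the sound f gets a single spelling
    ("x", "ks"),   # "x" is spelled out as its two phonemes
]


def simplify(word):
    """Apply each substitution rule in turn to a lowercase word."""
    for old, new in REPLACEMENTS:
        word = word.replace(old, new)
    return word


for word in ["taxi", "photograph", "uphill"]:
    print(word, "->", simplify(word))
```

“Taxi” becomes “taksi” and “photograph” becomes “fotograf,” as intended. The same blind rule, however, turns “uphill” into “ufill,” because it cannot see that the p and the h belong to two different word parts.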
Yet here again, great caution is needed. I suspect that any radical reform whose aim would be to ensure a clear, one-to-one transcription of English speech would be bound to fail, because the role of spelling is not just to provide a faithful transcription of speech sounds. Voltaire stated, elegantly but erroneously, that “writing is the painting of the voice: the more it bears a resemblance, the better.” A written text is not a high-fidelity recording. Its goal is not to reproduce speech as we pronounce it, but rather to code it at a level abstract enough to allow the reader to quickly retrieve its meaning.
For the sake of argument, we can try to imagine what a purely phonetic writing system would look like, one that Voltaire might have considered ideal. When we speak, we alter the pronunciation of words as a function of the sounds that surround them. It would be disastrous if spelling were to reflect the abstruse linguistic phenomena of so-called coarticulation, assimilation, and resyllabification, of which most speakers are usually unaware. The same word would end up being spelled differently depending on its context. Should we, for instance, use distinct marks for the various pronunciations of plurals? Should we spell “cap driver,” under the pretext that the sound b, when followed by a d, tends to be pronounced like a p? At one extreme, should we factor in the speaker’s accent (“Do you take me vor a shicken?”)? This would be apsurd (yes, we do pronounce this word with a p sound). The prime goal of writing is to transmit meaning as efficiently as possible. Any servile transcription of sounds would detract from this aim.
English spelling often privileges the transparency of word roots at the expense of the regularity of sounds. The words “insane” and “insanity,” for instance, are so closely related in meaning that it would be silly to spell them differently because of their slightly different pronunciation. Similarly, it is logical to maintain the silent n at the end of “column,” “autumn,” or “condemn,” given that these words give rise to “columnist,” “autumnal,” or “condemnation.”
Transcription of meaning also explains, at least in part, why English spells the same sounds in many different ways. English words tend to be compact and monosyllabic, and as a result, homophony is very frequent (for instance “eye” and “I,” “you” and “ewe”). If these words were transcribed phonetically, they could not be distinguished from each other. Spelling conventions have evolved with this constraint in mind. Distinctive spelling for the same sounds complicates dictation, but simplifies the task for the reader, who can quickly grasp the intended meaning. Students who complain about the countless forms of spelling the sound u as in “two,” “too,” “to,” or “stew” should understand that these embellishments are essential to the speed at which we read. Without them, any written text would become an opaque rebus. Thanks to spelling conventions, written English points straight at meaning. Any spelling reform would have to maintain this subtle equilibrium between sound and meaning, because this balance reflects a much deeper and more rigid phenomenon: our brain’s two reading routes.
The Impossible Dream of Transparent Spelling
Rivalry between reading for sound and reading for meaning is found the world over. All writing systems must somehow manage to address this problem. Which compromise is best depends on the language to be transcribed. Life would certainly be easier if English spelling were as easy to learn as Italian or German. These languages, however, benefit from a number of characteristics that make them easy to transcribe into writing. In Italian as in German, words tend to be long, often made up of several syllables. Grammatical agreement is well marked by resonant vowels. As a result, homonyms are quite rare. Thus, a purely regular transcription of sounds is feasible. Italian and German can afford a fairly transparent spelling system, where almost every letter corresponds to a unique sound.
At the other end of the continuum, there is the case of Mandarin Chinese. The vast majority of Chinese words consist of only one or two syllables, and because there are only 1,239 syllables (410 if one discounts tonal changes), each one can refer to dozens of different concepts (figure 1.3). Thus a purely phonetic writing system would be useless in Chinese: each of the rebuses could be understood in hundreds of different ways! This is why the thousands of characters in Mandarin script mostly transcribe words, or rather their morphemes, the basic elements of word meaning. Chinese writing also relies on several hundred phonetic markers that further specify how a given root should be pronounced and make it easier for the reader to figure out which word is intended. The character 媽, for instance, which means “mother” and is pronounced ma, consists of the morpheme 女 = “woman,” plus a phonetic marker 馬 = mǎ. Thus, contrary to common belief, even Chinese is not a purely ideographic script (whose symbols represent concepts), nor a logographic one (whose signs refer to single words), but a mixed “morphosyllabic” system where some signs refer to the morphemes of words and others to their pronunciation.28
Of course, it is more difficult to learn to read Chinese than to learn to read Italian. Several thousand signs have to be learned, as opposed to only a few tens. These two languages thus fall at the two extremities of a continuous scale of spelling transparency, where English and French occupy intermediate positions.29 In both English and French, words tend to be short, and homophones are therefore relatively frequent (“right,” “write,” “rite”). To accommodate these constraints, English and French spelling rules include a mixture of phonetic and lexical transcription—a source of difficulty for the writer, but simplicity for the reader.
Figure 1.3 Spelling irregularities are not as irrational as they seem. For instance, although Chinese writing uses up to twenty or thirty different characters for the same syllable, the redundancy is far from pointless. On the contrary, it is very helpful to Chinese readers because the Chinese language includes a great many homophones, words that sound alike but have different meanings, like the English “one” and “won.” Here, an entire Chinese story was written with the sound “shi”! Any Chinese reader can understand this text, which would clearly be impossible if it were transcribed phonetically as “shi shi shi . . .” Chinese characters disambiguate sounds by using distinct characters for distinct meanings. Similarly, homophony explains why English sticks to so many different spellings for the same sounds (“I scream for ice cream”).
In brief, we have only just begun to decipher the many constraints that shape the English spelling system. Will we ever be able to reform it? My own personal standpoint on this score is that drastic simplification is inevitable. We owe it to our children, who waste hundreds of hours at this cruel game. Furthermore, some of them may never recover, handicapped for life by dyslexia or simply because they were raised in underprivileged or multilingual families. These are the real victims of our archaic spelling system. My hope is that the next generation will have become so used to abridged spelling, thanks to cell phones and the Internet, that they will cease to treat this issue as taboo and will muster enough willpower to address it rationally. However, the problem will never be resolved by a simple decree that instates phonological spelling. English will never be as simple as Italian. The dream of regular spelling is something of an illusion, as has been pointed out in a pamphlet that has been circulating in Europe for some time:
The European Union commissioners have announced that agreement has been reached to adopt English as the preferred language for European communications, rather than German, which was the other possibility. As part of the negotiations, the British government conceded that English spelling had some room for improvement and has accepted a five-year phased plan for what will be known as Euro English (Euro for short).
In the first year, “s” will be used instead of the soft “c.” Sertainly, sivil servants will resieve this news with joy. Also, the hard “c” will be replaced with “k.” Not only will this klear up konfusion, but typewriters kan have one less letter. There will be growing publik enthusiasm in the sekond year, when the troublesome “ph” will be replaced by “f.” This will make words like “fotograf” 20 per sent shorter.