Origin of speech

Hypoglossal nerve (Latin: nervus hypoglossus), cervical plexus, and their branches

The origin of speech differs from the origin of language because language is not necessarily spoken; it could equally be written or signed. Speech is a fundamental aspect of human communication and plays a vital role in the everyday lives of humans. It allows them to convey thoughts, emotions, and ideas, and provides the ability to connect with others and shape collective reality.^[1]^[2]

Many attempts have been made to explain scientifically how speech emerged in humans, although to date no theory has generated agreement.

Non-human primates, like many other animals, have evolved specialized mechanisms for producing sounds for purposes of social communication.^[3] On the other hand, no monkey or ape uses its tongue for such purposes.^[4]^[5] The human species' unprecedented use of the tongue, lips and other moveable parts seems to place speech in a quite separate category, making its evolutionary emergence an intriguing theoretical challenge in the eyes of many scholars.^[6]

Modality-independence

The term modality means the chosen representational format for encoding and transmitting information. A striking feature of language is that it is modality-independent. Should an impaired child be prevented from hearing or producing sound, its innate capacity to master a language may equally find expression in signing. Sign languages of the deaf are independently invented and have all the major properties of spoken language except for the modality of transmission.^[7]^[8]^[9]^[10] From this it appears that the language centres of the human brain must have evolved to function optimally, irrespective of the selected modality.

"The detachment from modality-specific inputs may represent a substantial change in neural organization, one that affects not only imitation but also communication; only humans can lose one modality (e.g. hearing) and make up for this deficit by communicating with complete competence in a different modality (i.e. signing)."
— Marc Hauser, Noam Chomsky, and W. Tecumseh Fitch, 2002. The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?^[11]

Animal communication systems routinely combine visible with audible properties and effects, but none is modality-independent. For example, no vocally-impaired whale, dolphin, or songbird could express its song repertoire equally in visual display. Indeed, in the case of animal communication, message and modality are not capable of being disentangled. Whatever message is being conveyed stems from the intrinsic properties of the signal.

Modality independence should not be confused with the ordinary phenomenon of multimodality. Monkeys and apes rely on a repertoire of species-specific "gesture-calls" – emotionally-expressive vocalisations inseparable from the visual displays which accompany them.^[12]^[13] Humans also have species-specific gesture-calls – laughs, cries, sobs, etc. – together with involuntary gestures accompanying speech.^[14]^[15]^[16] Many animal displays are polymodal in that each appears designed to exploit multiple channels simultaneously.

The human linguistic property of modality independence is conceptually distinct from polymodality. It allows the speaker to encode the informational content of a message in a single channel whilst switching between channels as necessary. Modern city-dwellers switch effortlessly between the spoken word and writing in its various forms – handwriting, typing, email, etc. Whichever modality is chosen, it can reliably transmit the full message content without external assistance of any kind. When talking on the telephone, for example, any accompanying facial or manual gestures, however natural to the speaker, are not strictly necessary. When typing or manually signing, conversely, there is no need to add sounds. In many Australian Aboriginal cultures, a section of the population – perhaps women observing a ritual taboo – traditionally restrict themselves for extended periods to a silent (manually-signed) version of their language.^[17] Then, when released from the taboo, these same individuals resume narrating stories by the fireside or in the dark, switching to pure sound without sacrifice of informational content.

Evolution of the speech organs

Speaking is the default modality for language in all cultures. Humans' first recourse is to encode their thoughts in sound – a method which depends on sophisticated capacities for controlling the lips, tongue and other components of the vocal apparatus.

The speech organs evolved in the first instance not for speech but for more basic bodily functions such as feeding and breathing. Nonhuman primates have broadly similar organs, but with different neural controls.^[6] Nonhuman apes use their highly flexible, manoeuvrable tongues for eating but not for vocalising. When an ape is not eating, fine motor control over its tongue is deactivated.^[4]^[5] Either it is performing gymnastics with its tongue or it is vocalising; it cannot perform both activities simultaneously. Since this applies to mammals in general, Homo sapiens is exceptional in harnessing mechanisms designed for respiration and ingestion for the radically different requirements of articulate speech.^[18]

Possible semi-aquatic adaptations

Recent insights into human evolution – more specifically, human Pleistocene littoral evolution^[19] – may help explain how human speech evolved. One controversial suggestion is that certain pre-adaptations for spoken language evolved during a time when ancestral hominins lived close to river banks and lake shores rich in fatty acids and other brain-specific nutrients. Occasional wading or swimming may also have led to enhanced breath-control (breath-hold diving).

Independent lines of evidence suggest that "archaic" Homo spread intercontinentally along the Indian Ocean shores (they even reached overseas islands such as Flores) where they regularly dived for littoral foods such as shell- and crayfish,^[20] which are extremely rich in brain-specific nutrients, explaining Homo's brain enlargement.^[21] Shallow diving for seafoods requires voluntary airway control, a prerequisite for spoken language. Seafood such as shellfish generally does not require biting and chewing, but stone tool use and suction feeding. This finer control of the oral apparatus was arguably another biological pre-adaptation to human speech, especially for the production of consonants.^[22]

Tongue

The word "language" derives from the Latin lingua, "tongue". Phoneticians agree that the tongue is the most important speech articulator, followed by the lips. A natural language can be viewed as a particular way of using the tongue to express thought.

The human tongue has an unusual shape. In most mammals, it is a long, flat structure contained largely within the mouth. It is attached at the rear to the hyoid bone, situated below the oral level in the pharynx. In humans, the tongue has an almost circular sagittal (midline) contour, much of it lying vertically down an extended pharynx, where it is attached to a hyoid bone in a lowered position. Partly as a result of this, the horizontal (inside-the-mouth) and vertical (down-the-throat) tubes forming the supralaryngeal vocal tract (SVT) are almost equal in length (whereas in other species, the vertical section is shorter). As humans move their jaws up and down, the tongue can vary the cross-sectional area of each tube independently by about 10:1, altering formant frequencies accordingly. That the tubes are joined at a right angle permits pronunciation of the vowels [i], [u] and [a], which nonhuman primates cannot produce.^[23] Even when not performed particularly accurately, the articulatory gymnastics needed to distinguish these vowels yield consistent, distinctive acoustic results in humans, illustrating the quantal nature of human speech sounds.^[24] It may not be coincidental that [i], [u] and [a] are the most common vowels in the world's languages.^[25] Human tongues are considerably shorter and thinner than those of other mammals and are composed of a large number of muscles, which helps shape a variety of sounds within the oral cavity. The diversity of sound production is further increased by the human ability to open and close the airway, allowing varying amounts of air to exit through the nose. The fine motor movements associated with the tongue and the airway make humans more capable of producing a wide range of intricate shapes in order to produce sounds at different rates and intensities.^[26]

Lips

In humans, the lips are important for the production of stops and fricatives, in addition to vowels. Nothing, however, suggests that the lips evolved for those reasons. During primate evolution, a shift from nocturnal to diurnal activity in tarsiers, monkeys and apes (the haplorhines) brought with it an increased reliance on vision at the expense of olfaction. As a result, the snout became reduced and the rhinarium or "wet nose" was lost. The muscles of the face and lips consequently became less constrained, enabling their co-option to serve purposes of facial expression. The lips also became thicker, and the oral cavity hidden behind became smaller.^[26] Hence, according to Ann MacLarnon, "the evolution of mobile, muscular lips, so important to human speech, was the exaptive result of the evolution of diurnality and visual communication in the common ancestor of haplorhines".^[27] It is unclear whether human lips have undergone a more recent adaptation to the specific requirements of speech.

Respiratory control

Compared with nonhuman primates, humans have significantly enhanced control of breathing, enabling exhalations to be extended and inhalations shortened as we speak. Whilst we are speaking, intercostal and interior abdominal muscles are recruited to expand the thorax and draw air into the lungs, and subsequently to control the release of air as the lungs deflate. The muscles concerned are markedly more innervated in humans than in nonhuman primates.^[28] Evidence from fossil hominins suggests that the necessary enlargement of the vertebral canal, and therefore spinal cord dimensions, may not have occurred in Australopithecus or Homo erectus but was present in the Neanderthals and early modern humans.^[29]^[30]

Larynx

The larynx or voice box is an organ in the neck housing the vocal folds, which are responsible for phonation. In humans, the larynx is descended: it is positioned lower than in other primates. This is because the evolution of humans to an upright position shifted the head directly above the spinal cord, forcing everything else downward. The repositioning of the larynx resulted in a longer cavity called the pharynx, which is responsible for increasing the range and clarity of the sound being produced. Other primates have almost no pharynx; therefore, their vocal power is significantly lower.^[26] Humans are not unique in this respect: goats, dogs, pigs and tamarins lower the larynx temporarily, to emit loud calls.^[31] Several deer species have a permanently lowered larynx, which may be lowered still further by males during their roaring displays.^[32] Lions, jaguars, cheetahs and domestic cats also do this.^[33] However, laryngeal descent in nonhumans (according to Philip Lieberman) is not accompanied by descent of the hyoid; hence the tongue remains horizontal in the oral cavity, preventing it from acting as a pharyngeal articulator.^[34]

Despite all this, scholars remain divided as to how "special" the human vocal tract really is. It has been shown that the larynx does descend to some extent during development in chimpanzees, followed by hyoidal descent.^[35] As against this, Philip Lieberman points out that only humans have evolved permanent and substantial laryngeal descent in association with hyoidal descent, resulting in a curved tongue and two-tube vocal tract with 1:1 proportions. Uniquely in the human case, simple contact between the epiglottis and velum is no longer possible, disrupting the normal mammalian separation of the respiratory and digestive tracts during swallowing. Since this entails substantial costs – increasing the risk of choking whilst swallowing food – we are forced to ask what benefits might have outweighed those costs. Some claim the clear benefit must have been speech, but others contest this. One objection is that humans are in fact not seriously at risk of choking on food: medical statistics indicate that accidents of this kind are extremely rare.^[36] Another objection is that in the view of most scholars, speech as we know it emerged relatively late in human evolution, roughly contemporaneously with the emergence of Homo sapiens.^[37] A development as complex as the reconfiguration of the human vocal tract would have required much more time, implying an early date of origin. This discrepancy in timescales undermines the idea that human vocal flexibility was initially driven by selection pressures for speech.

At least one orangutan has demonstrated the ability to control the voice box.^[38]

The size exaggeration hypothesis

To lower the larynx is to increase the length of the vocal tract, in turn lowering formant frequencies so that the voice sounds "deeper" – giving an impression of greater size. John Ohala argued that the function of the lowered larynx in humans, especially males, is probably to enhance threat displays rather than speech itself.^[39] Ohala pointed out that if the lowered larynx were an adaptation for speech, we would expect adult human males to be better adapted in this respect than adult females, whose larynx is considerably less low. In fact, females invariably outperform males in verbal tests, falsifying this whole line of reasoning. William Tecumseh Fitch likewise argues that this was the original selective advantage of laryngeal lowering in humans. Although, according to Fitch, the initial lowering of the larynx in humans had nothing to do with speech, the increased range of possible formant patterns was subsequently co-opted for speech. Size exaggeration remains the sole function of the extreme laryngeal descent observed in male deer. Consistent with the size exaggeration hypothesis, a second descent of the larynx occurs at puberty in humans, although only in males. In response to the objection that the larynx is descended in human females, Fitch suggests that mothers vocalising to protect their infants would also have benefited from this ability.^[40]
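
The link between tract length and formant frequencies can be illustrated with a textbook idealisation that goes beyond what the sources cited here state: the vocal tract is treated as a uniform tube, closed at the glottis and open at the lips, whose resonances fall at odd multiples of a quarter wavelength. The speed of sound (350 m/s), the two tract lengths and the function name `tube_formants` are all illustrative assumptions; this is a rough sketch, not a model of any real speaker.

```python
# Idealised uniform-tube model (an assumption, not taken from the cited sources).
# A tube closed at one end and open at the other resonates at
#   F_n = (2n - 1) * c / (4 * L)
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air


def tube_formants(length_m, count=3, c=SPEED_OF_SOUND_M_S):
    """Return the first `count` resonances (Hz) of a uniform quarter-wave tube."""
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


if __name__ == "__main__":
    # Illustrative lengths only: a shorter and a longer vocal tract.
    for length in (0.150, 0.175):
        formants = tube_formants(length)
        print(f"{length * 100:.1f} cm:", [round(f) for f in formants])
```

In this idealisation every resonance scales inversely with tube length, so any lengthening of the tract lowers all formants together, which is the effect the size exaggeration hypothesis relies on.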

Neanderthal speech

Most specialists credit the Neanderthals with speech abilities not radically different from those of modern Homo sapiens. An indirect line of argument is that their toolmaking and hunting tactics would have been difficult to learn or execute without some kind of speech.^[41] A recent extraction of DNA from Neanderthal bones indicates that Neanderthals had the same version of the FOXP2 gene as modern humans. This gene, mistakenly described as the "grammar gene", plays a role in controlling the orofacial movements which (in modern humans) are involved in speech.^[42]

During the 1970s, it was widely believed that the Neanderthals lacked modern speech capacities.^[43] It was claimed that they possessed a hyoid bone so high up in the vocal tract as to preclude the possibility of producing certain vowel sounds.

The hyoid bone is present in many mammals. It allows a wide range of tongue, pharyngeal and laryngeal movements by bracing these structures alongside each other in order to produce variation.^[44] It is now realised that its lowered position is not unique to Homo sapiens, whilst its relevance to vocal flexibility may have been overstated: although men have a lower larynx, they do not produce a wider range of sounds than women or two-year-old babies. There is no evidence that the larynx position of the Neanderthals impeded the range of vowel sounds they could produce.^[45] The discovery of a modern-looking hyoid bone of a Neanderthal man in the Kebara Cave in Israel led its discoverers to argue that the Neanderthals had a descended larynx, and thus human-like speech capabilities.^[46]^[47] However, other researchers have claimed that the morphology of the hyoid is not indicative of the larynx's position.^[6] It is necessary to take into consideration the skull base, the mandible, the cervical vertebrae and a cranial reference plane.^[48]^[49]

The morphology of the outer and middle ear of Middle Pleistocene hominins from Atapuerca, Spain, believed to be proto-Neanderthal, suggests they had an auditory sensitivity similar to modern humans and very different from chimpanzees. They were probably able to differentiate between many different speech sounds.^[50]

Hypoglossal canal

The hypoglossal nerve plays an important role in controlling movements of the tongue. In 1998, a research team used the size of the hypoglossal canal in the base of fossil skulls in an attempt to estimate the relative number of nerve fibres, claiming on this basis that Middle Pleistocene hominins and Neanderthals had more fine-tuned tongue control than either Australopithecines or apes.^[51] Subsequently, however, it was demonstrated that hypoglossal canal size and nerve sizes are not correlated,^[52] and it is now accepted that such evidence is uninformative about the timing of human speech evolution.^[53]

Distinctive features theory

IPA: Vowels

	Front	Central	Back
Close	i y	ɨ ʉ	ɯ u
Near-close	ɪ ʏ		ʊ
Close-mid	e ø	ɘ ɵ	ɤ o
Mid	e̞ ø̞	ə	ɤ̞ o̞
Open-mid	ɛ œ	ɜ ɞ	ʌ ɔ
Near-open	æ	ɐ
Open	a ɶ	ä	ɑ ɒ

Legend: unrounded • rounded

According to one influential school,^[54]^[55] the human vocal apparatus is intrinsically digital on the model of a keyboard or digital computer (see below). Nothing about a chimpanzee's vocal apparatus suggests a digital keyboard, notwithstanding the anatomical and physiological similarities. This poses the question as to when and how, during the course of human evolution, the transition from analog to digital structure and function occurred.

The human supralaryngeal tract is said to be digital in the sense that it is an arrangement of moveable toggles or switches, each of which, at any one time, must be in one state or another. The vocal cords, for example, are either vibrating (producing a sound) or not vibrating (in silent mode). By virtue of simple physics, the corresponding distinctive feature – in this case, "voicing" – cannot be somewhere in between. The options are limited to "off" and "on". Equally digital is the feature known as "nasalisation". At any given moment the soft palate or velum either allows or does not allow sound to resonate in the nasal chamber. In the case of lip and tongue positions, more than two digital states may be allowed.

The theory that speech sounds are composite entities constituted by complexes of binary phonetic features was first advanced in 1938 by the Russian linguist Roman Jakobson.^[56] A prominent early supporter of this approach was Noam Chomsky, who went on to extend it from phonology to language more generally, in particular to the study of syntax and semantics.^[57]^[58]^[59] In his 1965 book, Aspects of the Theory of Syntax,^[60] Chomsky treated semantic concepts as combinations of binary-digital atomic elements explicitly on the model of distinctive features theory. The lexical item "bachelor", on this basis, would be expressed as [+ Human], [+ Male], [- Married].

Supporters of this approach view the vowels and consonants recognised by speakers of a particular language or dialect at a particular time as cultural entities of little scientific interest. From a natural science standpoint, the units which matter are those common to Homo sapiens by virtue of biological nature. By combining the atomic elements or "features" with which all humans are innately equipped, anyone may in principle generate the entire range of vowels and consonants to be found in any of the world's languages, whether past, present or future. The distinctive features are in this sense atomic components of a universal language.

Voicing contrast in English fricatives
Articulation	Voiceless	Voiced
Pronounced with the lower lip against the teeth:	[f] (fan)	[v] (van)
Pronounced with the tongue against the teeth:	[θ] (thin, thigh)	[ð] (then, thy)
Pronounced with the tongue near the gums:	[s] (sip)	[z] (zip)
Pronounced with the tongue bunched up:	[ʃ] (pressure)	[ʒ] (pleasure)
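
The idea of speech sounds as bundles of binary features can be sketched in a few lines of code. The example below encodes the eight fricatives from the table as a place label plus a single on/off voicing feature, then finds each sound's counterpart by flipping that one feature. The feature names, the place labels and the helper `toggle_voicing` are illustrative assumptions rather than any standard feature system.

```python
# Feature labels and values are illustrative assumptions, not a standard feature system.
FRICATIVES = {
    "f": {"place": "labiodental", "voiced": False},
    "v": {"place": "labiodental", "voiced": True},
    "θ": {"place": "dental", "voiced": False},
    "ð": {"place": "dental", "voiced": True},
    "s": {"place": "alveolar", "voiced": False},
    "z": {"place": "alveolar", "voiced": True},
    "ʃ": {"place": "postalveolar", "voiced": False},
    "ʒ": {"place": "postalveolar", "voiced": True},
}


def toggle_voicing(symbol, inventory=FRICATIVES):
    """Return the sound that differs from `symbol` only in the binary voicing feature."""
    target = dict(inventory[symbol])
    target["voiced"] = not target["voiced"]
    for other, features in inventory.items():
        if features == target:
            return other
    return None  # no counterpart in this inventory


if __name__ == "__main__":
    for voiceless in ("f", "θ", "s", "ʃ"):
        print(voiceless, "->", toggle_voicing(voiceless))
```

Because voicing is stored as a strict on/off value, there is no way to express a sound "between" [s] and [z] in this representation, which is the sense in which the theory calls the feature digital.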

Criticism

In recent years, the notion of an innate "universal grammar" underlying phonological variation has been called into question. The most comprehensive monograph ever written about speech sounds, The Sounds of the World's Languages, by Peter Ladefoged and Ian Maddieson,^[25] found virtually no basis for the postulation of some small number of fixed, discrete, universal phonetic features. Examining 305 languages, for example, they encountered vowels that were positioned basically everywhere along the articulatory and acoustic continuum. Ladefoged concluded that phonological features are not determined by human nature: "Phonological features are best regarded as artifacts that linguists have devised in order to describe linguistic systems".^[61]

Self-organisation theory

Self-organisation characterises systems where macroscopic structures are spontaneously formed out of local interactions between the many components of the system.^[62] In self-organised systems, global organisational properties are not to be found at the local level. In colloquial terms, self-organisation is roughly captured by the idea of "bottom-up" (as opposed to "top-down") organisation. Examples of self-organised systems range from ice crystals to galaxy spirals in the inorganic world.

A termite mound (Macrotermitinae) in the Okavango Delta just outside Maun, Botswana

According to many phoneticians, the sounds of language arrange and re-arrange themselves through self-organisation.^[62]^[63]^[64] Speech sounds have both perceptual (how one hears them) and articulatory (how one produces them) properties, all with continuous values. Speakers tend to minimise effort, favouring ease of articulation over clarity. Listeners do the opposite, favouring sounds that are easy to distinguish even if difficult to pronounce. Since speakers and listeners are constantly switching roles, the syllable systems actually found in the world's languages turn out to be a compromise between acoustic distinctiveness on the one hand, and articulatory ease on the other.

Agent-based computer models take the perspective of self-organisation at the level of the speech community or population. The two main paradigms are (1) the iterated learning model and (2) the language game model. Iterated learning focuses on transmission from generation to generation, typically with just one agent in each generation.^[65] In the language game model, a whole population of agents simultaneously produce, perceive and learn language, inventing novel forms when the need arises.^[66]^[67]

Several models have shown how relatively simple peer-to-peer vocal interactions, such as imitation, can spontaneously self-organise a system of sounds shared by the whole population, and different in different populations. For example, models elaborated by Berrah et al. (1996)^[68] and de Boer (2000),^[69] and recently reformulated using Bayesian theory,^[70] showed how a group of individuals playing imitation games can self-organise repertoires of vowel sounds which share substantial properties with human vowel systems. In de Boer's model, vowels are initially generated randomly, but agents learn from each other as they interact repeatedly over time. Agent A chooses a vowel from her repertoire and produces it, inevitably with some noise. Agent B hears this vowel and chooses the closest equivalent from her own repertoire. To check whether this truly matches the original, B produces the vowel she thinks she has heard, whereupon A refers once again to her own repertoire to find the closest equivalent. If this matches the one she initially selected, the game is successful; otherwise, it has failed. "Through repeated interactions", according to de Boer, "vowel systems emerge that are very much like the ones found in human languages".^[71]
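
The round structure of such an imitation game can be sketched as follows. This is a heavily simplified illustration, not de Boer's published model: vowels are points in a hypothetical two-dimensional space, each agent keeps a fixed number of prototypes, and the update rule (the listener nudging her prototype towards what she heard after a success) is an assumption introduced here. All function names and parameter values are likewise hypothetical.

```python
import math
import random


def nearest(repertoire, signal):
    """Index of the prototype closest to `signal` (Euclidean distance)."""
    return min(range(len(repertoire)), key=lambda i: math.dist(repertoire[i], signal))


def produce(prototype, noise, rng):
    """Articulate a prototype with Gaussian noise added to each dimension."""
    return tuple(x + rng.gauss(0.0, noise) for x in prototype)


def play_game(speaker, listener, noise, rng, step=0.1):
    """One imitation game; returns True if the speaker recognises her own vowel."""
    chosen = rng.randrange(len(speaker))
    heard_by_b = produce(speaker[chosen], noise, rng)   # A speaks, with noise
    guess = nearest(listener, heard_by_b)               # B picks her closest vowel
    heard_by_a = produce(listener[guess], noise, rng)   # B imitates, with noise
    success = nearest(speaker, heard_by_a) == chosen    # A checks the match
    if success:
        # Assumed update rule: B nudges her prototype towards what she heard.
        listener[guess] = tuple(
            p + step * (h - p) for p, h in zip(listener[guess], heard_by_b)
        )
    return success


def simulate(n_agents=5, n_vowels=3, rounds=2000, noise=0.02, seed=0):
    """Run repeated games in a small population; return the fraction that succeed."""
    rng = random.Random(seed)
    agents = [
        [(rng.random(), rng.random()) for _ in range(n_vowels)]
        for _ in range(n_agents)
    ]
    wins = 0
    for _ in range(rounds):
        a, b = rng.sample(range(n_agents), 2)
        wins += play_game(agents[a], agents[b], noise, rng)
    return wins / rounds


if __name__ == "__main__":
    print(f"success rate: {simulate():.2f}")
```

The sketch keeps only the produce, perceive, imitate and check loop described above; whether the resulting repertoires resemble human vowel systems depends on details of the published models that are not reproduced here.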

In a different model, the phonetician Björn Lindblom^[72] was able to predict, on self-organisational grounds, the favoured choices of vowel systems ranging from three to nine vowels on the basis of a principle of optimal perceptual differentiation.
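
A principle of optimal perceptual differentiation can be illustrated with a small search: from a set of candidate vowel positions, choose the subset whose members are as far apart from one another as possible. The sketch below assumes an inverse-squared-distance "energy" as the measure of crowding and a coarse grid in a hypothetical two-dimensional perceptual space; neither the cost function nor the grid is taken from Lindblom's work as cited here, and the function names are invented for illustration.

```python
import itertools
import math


def dispersion_energy(points):
    """Sum of inverse squared distances between all pairs; lower means better spread."""
    return sum(1.0 / math.dist(p, q) ** 2 for p, q in itertools.combinations(points, 2))


def best_vowel_system(candidates, size):
    """Exhaustively pick the `size` candidates with minimal dispersion energy."""
    return min(itertools.combinations(candidates, size), key=dispersion_energy)


if __name__ == "__main__":
    # Hypothetical 2-D perceptual space sampled on a coarse grid (not real formant data).
    grid = [(x / 2, y / 2) for x in range(3) for y in range(3)]
    for size in (2, 3, 4):
        print(size, best_vowel_system(grid, size))
```

Exhaustive search is only practical for small candidate sets, but it shows how the favoured system for each inventory size follows from the spacing criterion alone, with the selected points pushed towards the edges of the available space.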

Further models studied the role of self-organisation in the origins of phonemic coding and combinatoriality, that is, the existence of phonemes and their systematic reuse to build structured syllables. Pierre-Yves Oudeyer developed models which showed that basic neural equipment for adaptive holistic vocal imitation, directly coupling motor and perceptual representations in the brain, can spontaneously generate shared combinatorial systems of vocalisations, including phonotactic patterns, in a society of babbling individuals.^[62]^[73] These models also characterised how innate morphological and physiological constraints can interact with these self-organised mechanisms to account for both the formation of statistical regularities and diversity in vocalisation systems.

Gestural theory

The gestural theory states that speech was a relatively late development, evolving by degrees from a system that was originally gestural. Human ancestors were unable to control their vocalisation at the time when gestures were used to communicate; however, as they slowly began to control their vocalisations, spoken language began to evolve.

Three types of evidence support this theory:

Gestural language and vocal language depend on similar neural systems. The regions on the cortex that are responsible for mouth and hand movements border each other.
Nonhuman primates minimise vocal signals in favour of manual, facial and other visible gestures in order to express simple concepts and communicative intentions in the wild. Some of these gestures resemble those of humans, such as the "begging posture", with the hands stretched out, which humans share with chimpanzees.^[74]
Mirror neurons, which fire both when an action is performed and when the same action is observed, have been found in primates.

Research has found strong support for the idea that spoken language and signing depend on similar neural structures. Patients who used sign language, and who suffered from a left-hemisphere lesion, showed the same disorders with their sign language as vocal patients did with their oral language.^[75] Other researchers found that the same left-hemisphere brain regions were active during sign language as during the use of vocal or written language.^[76]

Humans spontaneously use hand and facial gestures when formulating ideas to be conveyed in speech.^[77]^[78] There are also, of course, many sign languages in existence, commonly associated with deaf communities; as noted above, these are equal in complexity, sophistication, and expressive power, to any oral language. The main difference is that the "phonemes" are produced on the outside of the body, articulated with hands, body, and facial expression, rather than inside the body articulated with tongue, teeth, lips, and breathing.

Many psychologists and other scientists have looked to the mirror system in the brain to test this theory, as well as other behavioural theories. Evidence cited for mirror neurons as a factor in the evolution of speech includes the presence of mirror neurons in primates, the success of teaching apes to communicate gesturally, and the use of pointing and gesturing to teach young children language. Fogassi and Ferrari (2014) monitored motor cortex activity in monkeys, specifically in area F5, where mirror neurons are located and which is often regarded as the monkey homologue of Broca's area. They observed changes in electrical activity in this area when the monkey performed different hand actions or observed them being performed by someone else. Broca's area is a region in the frontal lobe responsible for language production and processing. The discovery of mirror neurons in this region, which fire when an action is done or observed specifically with the hand, is taken to support the view that communication was once accomplished with gestures. The same is said to be true when teaching young children language: when one points at a specific object or location, mirror neurons in the child fire as though the child were performing the action, which results in long-term learning.^[79]

Criticism

Critics note that for mammals in general, sound turns out to be the best medium in which to encode information for transmission over distances at speed. Given the probability that this applied also to early humans, it is hard to see why they should have abandoned this efficient method in favour of more costly and cumbersome systems of visual gesturing – only to return to sound at a later stage.^[80]

By way of explanation, it has been proposed that at a relatively late stage in human evolution, hands became so much in demand for making and using tools that the competing demands of manual gesturing became a hindrance. The transition to spoken language is said to have occurred only at that point.^[81] Since humans throughout evolution have been making and using tools, however, most scholars remain unconvinced by this argument. (For a different approach to this issue – one setting out from considerations of signal reliability and trust – see "from pantomime to speech" below).

Timeline of speech evolution

Hominin timeline (figure): scale from 10 million years ago to the present, spanning the Miocene, Pliocene and Pleistocene, with markers including the earliest stone tools, dispersal beyond Africa, earliest language, earliest fire and cooking, earliest rock art, earliest clothes and modern humans.

Little is known about the timing of language's emergence in the human species. Unlike writing, speech leaves no material trace, making it archaeologically invisible. Lacking direct linguistic evidence, specialists in human origins have resorted to the study of anatomical features and genes arguably associated with speech production. Whilst such studies may provide information as to whether pre-modern Homo species had speech capacities, it is still unknown whether they actually spoke. Whilst they may have communicated vocally, the anatomical and genetic data lack the resolution necessary to differentiate proto-language from speech.

Using statistical methods to estimate the time required to achieve the current spread and diversity of modern languages, Johanna Nichols – a linguist at the University of California, Berkeley – argued in 1998 that vocal languages must have begun diversifying at least 100,000 years ago.^[82]

In 2012, anthropologists Charles Perreault and Sarah Mathew used phonemic diversity to suggest a date consistent with this.^[83] "Phonemic diversity" denotes the number of perceptually distinct units of sound – consonants, vowels and tones – in a language. The current worldwide pattern of phonemic diversity potentially contains the statistical signal of the expansion of modern Homo sapiens out of Africa, beginning around 60–70 thousand years ago. Some scholars argue that phonemic diversity evolves slowly and can be used as a clock to calculate how long the oldest African languages would have to have been around in order to accumulate the number of phonemes they possess today. As human populations left Africa and expanded into the rest of the world, they underwent a series of bottlenecks – points at which only a very small population survived to colonise a new continent or region. Allegedly, each such population crash led to a corresponding reduction in genetic, phenotypic and phonemic diversity. African languages today have some of the largest phonemic inventories in the world, whilst the smallest inventories are found in South America and Oceania, some of the last regions of the globe to be colonised. For example, Rotokas, a language of Papua New Guinea, and Pirahã, spoken in South America, both have just 11 phonemes,^[84]^[85] whilst !Xun, a language spoken in Southern Africa, has 141 phonemes. The authors use a natural experiment – the colonisation of mainland Southeast Asia on the one hand, the long-isolated Andaman Islands on the other – to estimate the rate at which phonemic diversity increases through time. Using this rate, they estimate that the world's languages date back to the Middle Stone Age in Africa, sometime between 350 thousand and 150 thousand years ago. This corresponds to the speciation event which gave rise to Homo sapiens.

These and similar studies have, however, been criticised by linguists, who argue that they rest on a flawed analogy between genes and phonemes, since phonemes, unlike genes, are frequently transferred laterally between languages. The critics also point to flawed sampling of the world's languages, since both Oceania and the Americas contain languages with very large phoneme inventories, and Africa contains languages with very few. They argue that the actual distribution of phonemic diversity in the world reflects recent language contact rather than deep language history, since it is well demonstrated that languages can lose or gain many phonemes over very short periods. In other words, there is no valid linguistic reason to expect genetic founder effects to influence phonemic diversity.^[86]^[87]

Notes

^ Oxford English Dictionary, Noam Chomsky.
^ Oxford English Dictionary
^ Kelemen, G. (1963). Comparative anatomy and performance of the vocal organ in vertebrates. In R. Busnel (ed.), Acoustic behavior of animals. Amsterdam: Elsevier, pp. 489–521.
^ Riede, T.; Bronson, E.; Hatzikirou, H.; Zuberbühler, K. (Jan 2005). "Vocal production mechanisms in a non-human primate: morphological data and a model" (PDF). J Hum Evol. 48 (1): 85–96. doi:10.1016/j.jhevol.2004.10.002. PMID 15656937.
^ Riede, T.; Bronson, E.; Hatzikirou, H.; Zuberbühler, K. (February 2006). "Multiple discontinuities in nonhuman vocal tracts – A reply". Journal of Human Evolution. 50 (2): 222–225. doi:10.1016/j.jhevol.2005.10.005.
^ Fitch, W.Tecumseh (July 2000). "The evolution of speech: a comparative review". Trends in Cognitive Sciences. 4 (7): 258–267. CiteSeerX 10.1.1.22.3754. doi:10.1016/S1364-6613(00)01494-7. PMID 10859570. S2CID 14706592.
^ Stokoe, W. C. 1960. Sign language structure: an outline of the communicative systems of the American deaf. Silver Spring, MD: Linstok Press.
^ Bellugi, U. and E. S. Klima 1975. Aspects of sign language and its structure. In J. F. Kavanagh and J. E. Cutting (eds), The Role of Speech in Language. Cambridge, Massachusetts: The MIT Press, pp. 171–203.
^ Hickok, G.; Bellugi, U.; Klima, ES. (Jun 1996). "The neurobiology of sign language and its implications for the neural basis of language". Nature. 381 (6584): 699–702. Bibcode:1996Natur.381..699H. doi:10.1038/381699a0. PMID 8649515. S2CID 27480040.
^ Kegl, Judy; Senghas, Ann; Coppola, Marie (1999). "Creation through Contact: Sign Language Emergence and Sign Language Change in Nicaragua". In Michel DeGraff (ed.). Language creation and language change: creolization, diachrony, and development. Cambridge, Massachusetts: MIT Press. ISBN 978-0-262-04168-3. OCLC 39508250.
^ Hauser, M. D.; Chomsky, N; Fitch, WT (22 November 2002). "The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?". Science. 298 (5598): 1569–1579. doi:10.1126/science.298.5598.1569. PMID 12446899.
^ Goodall, Jane (1986). The chimpanzees of Gombe : patterns of behavior. Cambridge, Massachusetts: Belknap Press of Harvard University Press. ISBN 978-0-674-11649-8. OCLC 12550961.
^ Burling, R (1993). "Primate calls, human language, and nonverbal communication". Current Anthropology. 34: 25–53. doi:10.1086/204132. S2CID 147082731.
^ Darwin, C. 1872. The Expression of the Emotions in Man and Animals. London: Murray.
^ Ekman, P. 1982. Emotion in the Human Face, 2nd Edition. Cambridge: Cambridge University Press.
^ McNeill, D. 1992. Hand and Mind. What gestures reveal about thought. Chicago and London: University of Chicago Press.
^ Kendon, A. 1988. Sign Languages of Aboriginal Australia. Cambridge: Cambridge University Press.
^ MacNeilage, Peter, 2008. The Origin of Speech. Oxford: Oxford University Press.
^ Rhys-Evans, Peter (29 July 2019). The Waterside Ape. CRC Press. ISBN 9780367145484.
^ Joordens, Josephine C. A. (3 December 2014). "Homo erectus at Trinil on Java used shells for tool production and engraving". Nature. 518 (7538): 228–231. Bibcode:2015Natur.518..228J. doi:10.1038/nature13962. PMID 25470048. S2CID 4461751. Retrieved 11 June 2021.
^ Cunnane, Stephen (2010). Human brain evolution: the influence of freshwater and marine food resources. Hoboken, N.J.: Wiley-Blackwell. ISBN 978-0470452684.
^ Verhaegen, M.; Munro, S. (January 2004). "Possible preadaptations to speech. A preliminary comparative". Human Evolution. 19 (1): 53–70. doi:10.1007/BF02438909. S2CID 86519517. Retrieved 11 June 2021.
^ Lieberman, Philip; Crelin, Edmund S.; Klatt, Dennis H. (June 1972). "Phonetic Ability and Related Anatomy of the Newborn and Adult Human, Neanderthal Man, and the Chimpanzee". American Anthropologist. 74 (3): 287–307. doi:10.1525/aa.1972.74.3.02a00020.
^ Stevens, K. N. 1972. The quantal nature of speech: Evidence from articulatory-acoustic data. In P. B. Denes and E. E. David, Jr. (eds.), Human Communication: A unified view. New York: McGraw-Hill, pp. 51–66.
^ Ladefoged, P. and Maddieson, I. 1996. The Sounds of the World's Languages. Oxford: Blackwell.
^ Yule, George (2014). The Study of Language (PDF). Cambridge University Press. ISBN 9781107044197 – via www.dsglynn.univ-paris8.fr.
^ MacLarnon, A. 2012. The anatomical and physiological basis of human speech production: adaptations and exaptations. In M. Tallerman and K. Gibson (eds.), The Oxford Handbook of Language Evolution. Oxford: Oxford University Press, pp. 224–235.
^ MacLarnon, A. M. (1993). The vertebral canal. In A. Walker and R. Leakey (eds.), The Nariokotome Homo erectus skeleton. Cambridge, Massachusetts: Harvard University Press, 359–90.
^ MacLarnon AM, Hewitt GP (July 1999). "The evolution of human speech: the role of enhanced breathing control". Am. J. Phys. Anthropol. 109 (3): 341–63. doi:10.1002/(SICI)1096-8644(199907)109:3<341::AID-AJPA5>3.0.CO;2-2. PMID 10407464.
^ MacLarnon, Ann; Hewitt, Gwen (2004). "Increased breathing control: Another factor in the evolution of human language". Evolutionary Anthropology: Issues, News, and Reviews. 13 (5): 181–197. doi:10.1002/evan.20032. S2CID 84625135.
^ Fitch, W.T. (2000). "The phonetic potential of nonhuman vocal tracts: comparative cineradiographic observations of vocalizing animals". Phonetica. 57 (2–4): 205–18. doi:10.1159/000028474. PMID 10992141. S2CID 202652500.
^ Fitch, W.T.; Reby, D. (Aug 2001). "The descended larynx is not uniquely human". Proc Biol Sci. 268 (1477): 1669–75. doi:10.1098/rspb.2001.1704. PMC 1088793. PMID 11506679.
^ Weissengruber, G.E.; Forstenpointner, G.; Peters, G.; Kübber-Heiss, A.; Fitch, W.T. (Sep 2002). "Hyoid apparatus and pharynx in the lion (Panthera leo), jaguar (Panthera onca), tiger (Panthera tigris), cheetah (Acinonyx jubatus) and domestic cat (Felis silvestris f. catus)". J Anat. 201 (3): 195–209. doi:10.1046/j.1469-7580.2002.00088.x. PMC 1570911. PMID 12363272.
^ Lieberman, Philip (2007). "The Evolution of Human Speech: Its Anatomical and Neural Bases" (PDF). Current Anthropology. 48 (1): 39–66. doi:10.1086/509092. S2CID 28651524. Archived from the original (PDF) on 2014-06-11. Retrieved 2012-07-22.
^ Nishimura, T.; Mikami, A.; Suzuki, J.; Matsuzawa, T. (Sep 2006). "Descent of the hyoid in chimpanzees: evolution of face flattening and speech". J Hum Evol. 51 (3): 244–54. doi:10.1016/j.jhevol.2006.03.005. PMID 16730049.
^ M. Clegg 2001. The Comparative Anatomy and Evolution of the Human Vocal Tract Unpublished thesis, University of London.
^ Perreault, C.; Mathew, S. (2012). "Dating the origin of language using phonemic diversity". PLOS ONE. 7 (4): e35289. Bibcode:2012PLoSO...735289P. doi:10.1371/journal.pone.0035289. PMC 3338724. PMID 22558135.
^ Jon Hamilton (14 March 2017). "Orangutan's Vocal Feats Hint At Deeper Roots of Human Speech". NPR.
^ John J. Ohala, 2000. The irrelevance of the lowered larynx in modern Man for the development of speech. Paris, ENST: The Evolution of Language, pp. 171-172.
^ Fitch, W. T. (2002). Comparative vocal production and the evolution of speech: Reinterpreting the descent of the larynx. In A. Wray (ed.), The Transition to Language. Oxford: Oxford University Press, pp. 21–45.
^ Wynn & Coolidge, p.27
^ Wade, Nicholas (19 October 2007). "Neanderthals Had Important Speech Gene, DNA Evidence Shows". The New York Times. Retrieved 18 May 2009.
^ Lieberman, Philip; Crelin, Edmund S. (Spring 1971). "On the Speech of Neanderthal Man" (PDF). Linguistic Inquiry. 2 (2): 203–222. JSTOR 4177625. Archived from the original (PDF) on 2016-03-06. Retrieved 2019-09-03.
^ Nishimura, T.; Mikami, A.; Suzuki, J.; Matsuzawa, T. (Jun 2003). "Descent of the larynx in chimpanzee infants". Proc Natl Acad Sci U S A. 100 (12): 6930–3. Bibcode:2003PNAS..100.6930N. doi:10.1073/pnas.1231107100. PMC 165807. PMID 12775758.
^ Boë, L.J.; et al. (2002). "The potential of Neandertal vowel space was as large as that of modern humans". Journal of Phonetics. 30 (3): 465–484. doi:10.1006/jpho.2002.0170.
^ Arensburg, B.; Schepartz, LA.; Tillier, AM.; Vandermeersch, B.; Rak, Y. (Oct 1990). "A reappraisal of the anatomical basis for speech in Middle Palaeolithic hominids". Am J Phys Anthropol. 83 (2): 137–146. doi:10.1002/ajpa.1330830202. PMID 2248373.
^ Arensburg B, Tillier AM, Vandermeersch B, Duday H, Schepartz LA, Rak Y (April 1989). "A Middle Palaeolithic human hyoid bone". Nature. 338 (6218): 758–60. Bibcode:1989Natur.338..758A. doi:10.1038/338758a0. PMID 2716823. S2CID 4309147.
^ Granat et al., Hyoid bone and larynx in Homo. Estimated position by biometrics, Biom. Hum. et Anthropol., 2006, 24, 3-4, 243–255.
^ Boë, L.J. et al., Variation and prediction of the hyoid bone position for modern Man and Neanderthal, Biom. Hum. et Anthropol., 2006, 24, 3-4, 257–271.
^ Martínez I.; Rosa M.; Arsuaga J.L.; Jarabo P.; Quam R.; Lorenzo C.; Gracia A.; Carretero J.M.; Bermúdez de Castro J.M.; Carbonell E. (July 2004). "Auditory capacities in Middle Pleistocene humans from the Sierra de Atapuerca in Spain". Proceedings of the National Academy of Sciences. 101 (27): 9976–81. Bibcode:2004PNAS..101.9976M. doi:10.1073/pnas.0403595101. PMC 454200. PMID 15213327.
^ Kay, R. F.; Cartmill, M.; Balow, M. (1998). "The hypoglossal canal and the origin of human vocal behavior". Proceedings of the National Academy of Sciences of the USA. 95 (9): 5417–5419. Bibcode:1998PNAS...95.5417K. doi:10.1073/pnas.95.9.5417. PMC 20276. PMID 9560291.
^ DeGusta, D.; Gilbert, W. H.; Turner, S. P. (1999). "Hypoglossal canal size and hominid speech". Proceedings of the National Academy of Sciences of the USA. 96 (4): 1800–1804. Bibcode:1999PNAS...96.1800D. doi:10.1073/pnas.96.4.1800. PMC 15600. PMID 9990105.
^ Jungers, W. L.; Pokempner, A. A.; Kay, R. F.; Cartmill, M. (2003). "Hypoglossal canal size in living hominoids and the evolution of human speech". Human Biology. 75 (4): 473–484. doi:10.1353/hub.2003.0057. PMID 14655872. S2CID 30777048.
^ Jakobson, R., Fant, C. G. M. and Halle, M. 1952. Preliminaries to Speech Analysis. Cambridge, Massachusetts: MIT Press.
^ Jakobson, R. and M. Halle 1956. Fundamentals of Language. The Hague: Mouton.
^ Jakobson, R. 1938. "Observations sur le classement phonologiques des consonnes", in Proceedings of the 3rd International Congress of Phonetic Sciences, Ghent.
^ Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
^ Chomsky, N. 1964 [1962]. The logical basis of linguistic theory. In H.G. Lunt (ed.), The Proceedings of the Ninth International Congress of Linguists. The Hague: Mouton, pp. 914-77.
^ Chomsky, N. and Halle, M. 1968. The Sound Pattern of English. New York: Harper and Row.
^ Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, Massachusetts: MIT Press, pp. 148-192.
^ Ladefoged, P. (2006). "Features and Parameters for Different Purposes" (PDF). Working Papers in Phonetics. 104: 1–13.
^ Oudeyer, Pierre-Yves (2006). Self-organization in the evolution of speech. Oxford; New York: Oxford University Press. ISBN 978-0-19-928915-8. OCLC 65203001.
^ Lindblom, B., MacNeilage, P., and Studdert-Kennedy, M. 1984. Self-organizing processes and the explanation of language universals. In M. Butterworth, B. Comrie and Ö Dahl (eds), Explanations for language universals. Berlin: Walter de Gruyter and Co., pp. 181-203.
^ de Boer, B. 2005b. Self-organization in language. In C. Hemelrijk (ed.), Self-organisation and evolution of social systems. Cambridge: Cambridge University Press, 123–139.
^ Hurford, J.R. (2000). "Social transmission favours linguistic generalization". In Chris Knight; Michael Studdert-Kennedy; James R Hurford (eds.). The Evolutionary emergence of language: social function and the origins of linguistic form. Cambridge; New York: Cambridge University Press. pp. 324–352. ISBN 978-0-521-78157-2. OCLC 807262339.
^ Steels, L (1995). "A self-organizing spatial vocabulary". Artificial Life. 2 (3): 319–332. doi:10.1162/artl.1995.2.319. hdl:10261/127969. PMID 8925502.
^ Steels, L. and Vogt, P. 1997. Grounding adaptive language games in robotic agents. In P. Harvey and P. Husbands (eds.), Proceedings of the 4th European Conference on Artificial Life. Cambridge, Massachusetts: MIT Press, 474–482.
^ Berrah A-R., Glotin H., Laboissière R., Bessière P., Boë L-J. 1996. From Form to Formation of Phonetic Structures: An Evolutionary Computing Perspective, in Proc. ICML 1996 Workshop on Evolutionary Computing and Machine Learning, pp. 23–29, Bari, Italy.
^ de Boer, Bart (October 2000). "Self-organization in vowel systems". Journal of Phonetics. 28 (4): 441–465. doi:10.1006/jpho.2000.0125.
^ Moulin-Frier, C.; Laurent, R.; Bessière, P.; Schwartz, J. L.; Diard, J. (September 2012). "Adverse conditions improve distinguishability of auditory, motor, and perceptuo-motor theories of speech perception: An exploratory Bayesian modeling study" (PDF). Language and Cognitive Processes. 27 (7–8): 1240–1263. doi:10.1080/01690965.2011.645313. S2CID 55504109.
^ de Boer, B. 2012. Self-organization and language evolution. In M. Tallerman and K. Gibson (eds), 2012. The Oxford Handbook of Language Evolution. Oxford: Oxford University Press, pp. 612-620.
^ Lindblom, B. 1986. Phonetic universals in vowel systems. In J. J. Ohala and J. J. Jaeger (eds.), Experimental Phonology. Orlando: Academic Press, pp. 13–14.
^ Oudeyer, Pierre-Yves (April 2005). "The self-organization of speech sounds". Journal of Theoretical Biology. 233 (3): 435–449. arXiv:cs/0502086. Bibcode:2005JThBi.233..435O. doi:10.1016/j.jtbi.2004.10.025. PMID 15652151. S2CID 3252482.
^ Premack, David & Premack, Ann James. The Mind of an Ape, ISBN 0-393-01581-5.
^ Kimura, Doreen (1993). Neuromotor Mechanisms in Human Communication. Oxford: Oxford University Press. ISBN 978-0-19-505492-7.
^ Newman, A. J.; et al. (2002). "A Critical Period for Right Hemisphere Recruitment in American Sign Language Processing". Nature Neuroscience. 5 (1): 76–80. doi:10.1038/nn775. PMID 11753419. S2CID 2745545.
^ McNeill, D. 1992. Hand and mind. Chicago, IL: University of Chicago Press.
^ McNeill, D. (ed.) 2000. Language and gesture. Cambridge: Cambridge University Press.
^ "Gestural Theory | Gestures, Speech and Sign Language in Language Evolution". blogs.ntu.edu.sg. Retrieved 2019-03-25.
^ MacNeilage, P. 1998. Evolution of the mechanism of language output: comparative neurobiology of vocal and manual communication. In J. R. Hurford, M. Studdert-Kennedy and C. Knight (eds), Approaches to the Evolution of Language. Cambridge: Cambridge University Press, pp. 222–41.
^ Corballis, M. C. 2002. Did language evolve from manual gestures? In A. Wray (ed.), The Transition to Language. Oxford: Oxford University Press, pp. 161-179.
^ Johanna Nichols, 1998. The origin and dispersal of languages: Linguistic evidence. In Nina Jablonski and Leslie C. Aiello, eds., The Origin and Diversification of Language, pp. 127-70. (Memoirs of the California Academy of Sciences, 24.) San Francisco: California Academy of Sciences.
^ Perreault, C.; Mathew, S. (2012). "Dating the origin of language using phonemic diversity". PLOS ONE. 7 (4): e35289. Bibcode:2012PLoSO...735289P. doi:10.1371/journal.pone.0035289. PMC 3338724. PMID 22558135.
^ Maddieson, I. 1984. Patterns of Sounds. Cambridge: Cambridge University Press.
^ Maddieson, I.; Precoda, K. (1990). "Updating UPSID". UCLA Working Papers in Phonetics. 74: 104–111.
^ Hunley, K.; Bowern, C.; Healy, M. (1 February 2012). "Rejection of a serial founder effects model of genetic and linguistic coevolution". Proceedings of the Royal Society B: Biological Sciences. 279 (1736): 2281–2288. doi:10.1098/rspb.2011.2296. PMC 3321699. PMID 22298843.
^ Bowern, Claire (January 2011). "Out of Africa? The logic of phoneme inventories and founder effects". Linguistic Typology. 15 (2). doi:10.1515/lity.2011.015. hdl:1885/28291. S2CID 120276963.

External links

Interactive sagittal section
Design features of speech Archived 2012-07-22 at the Wayback Machine
Evolution of speech (anatomical and neural bases). Archived 2014-06-11 at the Wayback Machine
Ritual and the origins of language.
Decoding Chomsky

[1] Oxford English Dictionary, Noam Chomsky.

[2] Oxford English Dictionary

[3] Kelemen, G. (1963). Comparative anatomy and performance of the vocal organ in vertebrates. In R. Busnel (ed.), Acoustic behavior of animals. Amsterdam: Elsevier, pp. 489–521.

[Riede_2005-4] Riede, T.; Bronson, E.; Hatzikirou, H.; Zuberbühler, K. (Jan 2005). "Vocal production mechanisms in a non-human primate: morphological data and a model" (PDF). J Hum Evol. 48 (1): 85–96. doi:10.1016/j.jhevol.2004.10.002. PMID 15656937.

[Riede_Bronson_2006-5] Riede, T.; Bronson, E.; Hatzikirou, H.; Zuberbühler, K. (February 2006). "Multiple discontinuities in nonhuman vocal tracts – A reply". Journal of Human Evolution. 50 (2): 222–225. doi:10.1016/j.jhevol.2005.10.005.

[Fitch_2000-6] Fitch, W.Tecumseh (July 2000). "The evolution of speech: a comparative review". Trends in Cognitive Sciences. 4 (7): 258–267. CiteSeerX 10.1.1.22.3754. doi:10.1016/S1364-6613(00)01494-7. PMID 10859570. S2CID 14706592.

[7] Stokoe, W. C. 1960. Sign language structure: an outline of the communicative systems of the American deaf. Silver Spring, MD: Linstock Press.

[8] Bellugi, U. and E. S. Klima 1975. Aspects of sign language and its structure. In J. F. Kavanagh and J. E. Cutting (eds), The Role of Speech in Language. Cambridge, Massachusetts: The MIT Press, pp. 171 203.

[Hickok_1996-9] Hickok, G.; Bellugi, U.; Klima, ES. (Jun 1996). "The neurobiology of sign language and its implications for the neural basis of language". Nature. 381 (6584): 699–702. Bibcode:1996Natur.381..699H. doi:10.1038/381699a0. PMID 8649515. S2CID 27480040.

[Kegl_1999-10] Kegl, Judy; Senghas, Ann; Coppola, Marie (1999). "Creation through Contact: Sign Language Emergence and Sign Language Change in Nicaragua". In Michel DeGraff (ed.). Language creation and language change: creolization, diachrony, and development. Cambridge, Massachusetts: MIT Press. ISBN 978-0-262-04168-3. OCLC 39508250.

[Hauser_2002-11] Hauser, M. D.; Chomsky, N; Fitch, WT (22 November 2002). "The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?". Science. 298 (5598): 1569–1579. doi:10.1126/science.298.5598.1569. PMID 12446899.

[Goodall_1986-12] Goodall, Jane (1986). The chimpanzees of Gombe : patterns of behavior. Cambridge, Massachusetts: Belknap Press of Harvard University Press. ISBN 978-0-674-11649-8. OCLC 12550961.

[13] Burling, R (1993). "Primate calls, human language, and nonverbal communication". Current Anthropology. 34: 25–53. doi:10.1086/204132. S2CID 147082731.

[14] Darwin. C. 1872. The Expression of the Emotions in Man and Animals. London: Murray.^{[page needed]}

[15] Ekman, P. 1982. Emotion in the Human Face, 2nd Edition. Cambridge: Cambridge University Press.^{[page needed]}

[16] McNeil, D. 1992. Hand and Mind. What gestures reveal about thought. Chicago and London: University of Chicago Press.^{[page needed]}

[17] Kendon, A. 1988. Sign Languages of Aboriginal Australia. Cambridge: Cambridge University Press.^{[page needed]}

[18] MacNeilage, Peter, 2008. The Origin of Speech. Oxford: Oxford University Press.^{[page needed]}

[Rhys-Evans_Waterside_Ape-19] Rhys-Evans, Peter (29 July 2019). The Waterside Ape. CRC Press. ISBN 9780367145484.

[joordens_et_al-20] Joordens, Josephine C. A. (3 December 2014). "Homo erectus at Trinil on Java used shells for tool production and engraving". Nature. 518 (7538): 228–231. Bibcode:2015Natur.518..228J. doi:10.1038/nature13962. PMID 25470048. S2CID 4461751. Retrieved 11 June 2021.

[cunnane_2010-21] Stephen, Cunnane (2010). Human brain evolution : the influence of freshwater and marine food resources. Hoboken, N.J.: Wiley-Blackwell. ISBN 978-0470452684.

[verhaegen_munro_2004-22] Verhaegen, M.; Munro, S. (January 2004). "Possible preadaptations to speech. A preliminary comparative". Human Evolution. 19 (1): 53–70. doi:10.1007/BF02438909. S2CID 86519517. Retrieved 11 June 2021.

[23] Lieberman, Philip; Crelin, Edmund S.; Klatt, Dennis H. (June 1972). "Phonetic Ability and Related Anatomy of the Newborn and Adult Human, Neanderthal Man, and the Chimpanzee". American Anthropologist. 74 (3): 287–307. doi:10.1525/aa.1972.74.3.02a00020.

[24] Stevens, K. N. 1972. The quantal nature of speech: Evidence from articulatory-acoustic data. In P. B. Denes and E. E. David, Jr. (eds.), Human Communication: A unified view. New York: McGraw-Hill, pp. 51–66.

[:2-25] Ladefoged, P. and Maddieson, I. 1996. The Sounds of the World's Languages. Oxford: Blackwell.

[:0-26] Yule, George (2014). The Study of Language (PDF). Cambridge University Press. ISBN 9781107044197 – via www.dsglynn.univ-paris8.fr.

[27] MacLarnon, A. 2012. The anatomical and physiological basis of human speech production: adaptations and exaptations. In M. Tallerman and K. .Gibson (eds.), The Oxford Handbook of Language Evolution. Oxford: Oxford University Press, pp. 224-235.

[28] MacLarnon, A. M. (1993). The vertebral canal. In A. Walker and R. Leakey (eds.), The Nariokotome Homo erectus skeleton. Cambridge, Massachusetts: Harvard University Press, 359–90.

[MacLarnon_1999-29] MacLarnon AM, Hewitt GP (July 1999). "The evolution of human speech: the role of enhanced breathing control". Am. J. Phys. Anthropol. 109 (3): 341–63. doi:10.1002/(SICI)1096-8644(199907)109:3<341::AID-AJPA5>3.0.CO;2-2. PMID 10407464.

[Maclarnon_Hewitt_2004-30] MacLarnon, Ann; Hewitt, Gwen (2004). "Increased breathing control: Another factor in the evolution of human language". Evolutionary Anthropology: Issues, News, and Reviews. 13 (5): 181–197. doi:10.1002/evan.20032. S2CID 84625135.

[Fitch._WT_2000-31] Fitch, W.T. (2000). "The phonetic potential of nonhuman vocal tracts: comparative cineradiographic observations of vocalizing animals". Phonetica. 57 (2–4): 205–18. doi:10.1159/000028474. PMID 10992141. S2CID 202652500.

[Fitch_2001-32] Fitch, W.T.; Reby, D. (Aug 2001). "The descended larynx is not uniquely human". Proc Biol Sci. 268 (1477): 1669–75. doi:10.1098/rspb.2001.1704. PMC 1088793. PMID 11506679.

[Weissengruber_2002-33] Weissengruber, G.E.; Forstenpointner, G.; Peters, G.; Kübber-Heiss, A.; Fitch, W.T. (Sep 2002). "Hyoid apparatus and pharynx in the lion (Panthera leo), jaguar (Panthera onca), tiger (Panthera tigris), cheetah (Acinonyxjubatus) and domestic cat (Felis silvestris f. catus)". J Anat. 201 (3): 195–209. doi:10.1046/j.1469-7580.2002.00088.x. PMC 1570911. PMID 12363272.

[Lieberman_2007-34] Lieberman, Philip (2007). "The Evolution of Human Speech: Its Anatomical and Neural Bases" (PDF). Current Anthropology. 48 (1): 39–66. doi:10.1086/509092. S2CID 28651524. Archived from the original (PDF) on 2014-06-11. Retrieved 2012-07-22.

[Nishimura_2006-35] Nishimura, T.; Mikami, A.; Suzuki, J.; Matsuzawa, T. (Sep 2006). "Descent of the hyoid in chimpanzees: evolution of face flattening and speech". J Hum Evol. 51 (3): 244–54. doi:10.1016/j.jhevol.2006.03.005. PMID 16730049.

[36] M. Clegg 2001. The Comparative Anatomy and Evolution of the Human Vocal Tract Unpublished thesis, University of London.

[Perreault_2012-37] Perreault, C.; Mathew, S. (2012). "Dating the origin of language using phonemic diversity". PLOS ONE. 7 (4): e35289. Bibcode:2012PLoSO...735289P. doi:10.1371/journal.pone.0035289. PMC 3338724. PMID 22558135.

[38] Jon Hamilton (14 March 2017). "Orangutan's Vocal Feats Hint At Deeper Roots of Human Speech". NPR.

[39] John J. Ohala, 2000. The irrelevance of the lowered larynx in modern Man for the development of speech. Paris, ENST: The Evolution of Language, pp. 171-172.

[40] Fitch, W. T. (2002). Comparative vocal production and the evolution of speech: Reinterpreting the descent of the larynx. In A. Wray (ed.), The Transition to Language. Oxford: Oxford University Press, pp. 21–45.

[41] Wynn & Coolidge, p.27

[42] Wade, Nicholas (19 October 2007). "Neanderthals Had Important Speech Gene, DNA Evidence Shows". The New York Times. Retrieved 18 May 2009.

[43] Lieberman, Philip; Crelin, Edmund S. (Spring 1971). "On the Speech of Neanderthal Man" (PDF). Linguistic Inquiry. 2 (2): 203–222. JSTOR 4177625. Archived from the original (PDF) on 2016-03-06. Retrieved 2019-09-03.

[Nishimura_2003-44] Nishimura, T.; Mikami, A.; Suzuki, J.; Matsuzawa, T. (Jun 2003). "Descent of the larynx in chimpanzee infants". Proc Natl Acad Sci U S A. 100 (12): 6930–3. Bibcode:2003PNAS..100.6930N. doi:10.1073/pnas.1231107100. PMC 165807. PMID 12775758.

[45] Boë, L.J.; et al. (2002). "The potential of Neandertal vowel space was as large as that of modern humans". Journal of Phonetics. 30 (3): 465–484. doi:10.1006/jpho.2002.0170.

[Arensburg_1990-46] Arensburg, B.; Schepartz, LA.; Tillier, AM.; Vandermeersch, B.; Rak, Y. (Oct 1990). "A reappraisal of the anatomical basis for speech in Middle Palaeolithic hominids". Am J Phys Anthropol. 83 (2): 137–146. doi:10.1002/ajpa.1330830202. PMID 2248373.

[47] Arensburg B, Tillier AM, Vandermeersch B, Duday H, Schepartz LA, Rak Y (April 1989). "A Middle Palaeolithic human hyoid bone". Nature. 338 (6218): 758–60. Bibcode:1989Natur.338..758A. doi:10.1038/338758a0. PMID 2716823. S2CID 4309147.

[48] Granat et al., Hyoid bone and larynx in Homo. Estimated position by biometrics, Biom. Hum. et Anthropolol., 2006, 24, 3-4, 243–255.

[49] Boë, L.J. et al., Variation and prediction of the hyoid bone position for modern Man and Neanderthal, Biom. Hum. et Anthropolol., 2006, 24, 3-4, 257–271

[50] Martínez I.; Rosa M.; Arsuaga J.L.; Jarabo P.; Quam R.; Lorenzo C.; Gracia A.; Carretero J.M.; Bermúdez de Castro J.M.; Carbonell E. (July 2004). "Auditory capacities in Middle Pleistocene humans from the Sierra de Atapuerca in Spain". Proceedings of the National Academy of Sciences. 101 (27): 9976–81. Bibcode:2004PNAS..101.9976M. doi:10.1073/pnas.0403595101. PMC 454200. PMID 15213327.

[51] Kay, R. F.; Cartmill, M.; Balow, M. (1998). "The hypoglossal canal and the origin of human vocal behavior". Proceedings of the National Academy of Sciences of the USA. 95 (9): 5417–5419. Bibcode:1998PNAS...95.5417K. doi:10.1073/pnas.95.9.5417. PMC 20276. PMID 9560291.

[52] DeGusta, D.; Gilbert, W. H.; Turner, S. P. (1999). "Hypoglossal canal size and hominid speech". Proceedings of the National Academy of Sciences of the USA. 96 (4): 1800–1804. Bibcode:1999PNAS...96.1800D. doi:10.1073/pnas.96.4.1800. PMC 15600. PMID 9990105.

[53] Jungers, W. L.; Pokempner, A. A.; Kay, R. F.; Cartmill, M. (2003). "Hypoglossal canal size in living hominoids and the evolution of human speech". Human Biology. 75 (4): 473–484. doi:10.1353/hub.2003.0057. PMID 14655872. S2CID 30777048.

[54] Jakobson, R., Gunnar, C., Fant, M. and Halle, M. 1952. Preliminaries to Speech Analysis. Cambridge, Massachusetts: MIT Press.

[55] Jakobson, R. and M. Halle 1956. Fundamentals of Language. The Hague: Mouton.

[56] Jakobson, R. 1938. "Observations sur le classement phonologiques des consonnes", in Proceedings of the 3rd International Congress of Phonetic Sciences, Ghent.

[57] Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.

[58] Chomsky, N. 1964 [1962]. The logical basis of linguistic theory. In H.G. Lunt (ed.), The Proceedings of the Ninth International Congress of Linguists. The Hague: Mouton, pp. 914-77.

[59] Chomsky, N. and Halle, M. 1968. The Sound Pattern of English. New York: Harper and Row.

[60] Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, Massachusetts: MIT Press, pp. 148-192.

[61] Ladefoged, P. (2006). "Features and Parameters for Different Purposes" (PDF). Working Papers in Phonetics. 104: 1–13.

[Oudeyer_2006-62] Oudeyer, Pierre-Yves (2006). Self-organization in the evolution of speech. Oxford University Press; New York: Oxford University Press. ISBN 978-0-19-928915-8. OCLC 65203001.

[63] Lindblom, B., MacNeilage, P., and Studdert-Kennedy, M. 1984. Self-organizing processes and the explanation of language universals. In M. Butterworth, B. Comrie and Ö Dahl (eds), Explanations for language universals. Berlin: Walter de Gruyter and Co., pp. 181-203.

[64] Boer, B. 2005b. Self-organization in language. In C. Hemelrijk (ed.), Self-organisation and evolution of social systems. Cambridge: Cambridge University Press, 123–139.

[Hurford_2000-65] Hurford, J.R. (2000). "Social transmission favours linguistic generalization". In Chris Knight; Michael Studdert-Kennedy; James R Hurford (eds.). The Evolutionary emergence of language: social function and the origins of linguistic form. Cambridge; New York: Cambridge University Press. pp. 324–352. ISBN 978-0-521-78157-2. OCLC 807262339.

[66] Steels, L (1995). "A self-organizing spatial vocabulary". Artificial Life. 2 (3): 319–332. doi:10.1162/artl.1995.2.319. hdl:10261/127969. PMID 8925502.

[67] Steels, L. and Vogt, P. 1997. Grounding adaptive language games in robotic agents. In P. Harvey and P. Husbands (eds.), Proceedings of the 4th European Conference on Artificial Life. Cambridge, Massachusetts: MIT Press, 474–482.

[68] Berrah A-R., Glotin H., Laboissière R., Bessière P., Boë L-J. 1996. From Form to Formation of Phonetic Structures: An Evolutionary Computing Pers- pective, in Proc. ICML 1996 Workshop on Evolutionary Computing and Machine Learning, pp. 23-29, Bari, Italy.

[69] Boer, Bart (October 2000). "Self-organization in vowel systems". Journal of Phonetics. 28 (4): 441–465. doi:10.1006/jpho.2000.0125.

[70] Moulin-Frier, C.; Laurent, R.; Bessière, P.; Schwartz, J. L.; Diard, J. (September 2012). "Adverse conditions improve distinguishability of auditory, motor, and perceptuo-motor theories of speech perception: An exploratory Bayesian modeling study" (PDF). Language and Cognitive Processes. 27 (7–8): 1240–1263. doi:10.1080/01690965.2011.645313. S2CID 55504109.

[71] Boer, B. 2012. Self-organization and language evolution. In M. Tallerman and K. Gibson (eds), 2012. The Oxford Handbook of Language Evolution. Oxford: Oxford University Press, pp. 612-620.

[72] Lindblom. B. 1986. Phonetic universals in vowel systems. In J. J. Ohala and J. J. Jaeger (eds.), Experimental Phonology. Orlando: Academic Press, pp. 13-14.

[73] Oudeyer, Pierre-Yves (April 2005). "The self-organization of speech sounds". Journal of Theoretical Biology. 233 (3): 435–449. arXiv:cs/0502086. Bibcode:2005JThBi.233..435O. doi:10.1016/j.jtbi.2004.10.025. PMID 15652151. S2CID 3252482.

[74] Premack, David & Premack, Ann James. The Mind of an Ape, ISBN 0-393-01581-5.

[75] Kimura, Doreen (1993). Neuromotor Mechanisms in Human Communication. Oxford: Oxford University Press. ISBN 978-0-19-505492-7.

[76] Newman, A. J.; et al. (2002). "A Critical Period for Right Hemisphere Recruitment in American Sign Language Processing". Nature Neuroscience. 5 (1): 76–80. doi:10.1038/nn775. PMID 11753419. S2CID 2745545.

[77] McNeill, D. 1992. Hand and mind. Chicago, IL: University of Chicago Press.

[78] McNeill, D. (ed.) 2000. Language and gesture. Cambridge: Cambridge University Press.

[79] "Gestural Theory | Gestures, Speech and Sign Language in Language Evolution". blogs.ntu.edu.sg. Retrieved 2019-03-25.

[80] MacNeilage, P. 1998. Evolution of the mechanism of language output: comparative neurobiology of vocal and manual communication. In J. R. Hurford, M. Studdert-Kennedy and C. Knight (eds), Approaches to the Evolution of Language. Cambridge: Cambridge University Press, pp. 222-41.

[81] Corballis, M. C. 2002. Did language evolve from manual gestures? In A. Wray (ed.), The Transition to Language. Oxford: Oxford University Press, pp. 161-179.

[82] Johanna Nichols, 1998. The origin and dispersal of languages: Linguistic evidence. In Nina Jablonski and Leslie C. Aiello, eds., The Origin and Diversification of Language, pp. 127-70. (Memoirs of the California Academy of Sciences, 24.) San Francisco: California Academy of Sciences.

[83] Perreault, C.; Mathew, S. (2012). "Dating the origin of language using phonemic diversity". PLOS ONE. 7 (4): e35289. Bibcode:2012PLoSO...735289P. doi:10.1371/journal.pone.0035289. PMC 3338724. PMID 22558135.

[84] Maddieson, I. 1984. Patterns of Sounds. Cambridge: Cambridge University Press.

[85] Maddieson, I.; Precoda, K. (1990). "Updating UPSID". UCLA Working Papers in Phonetics. 74: 104–111.

[86] Hunley, K.; Bowern, C.; Healy, M. (1 February 2012). "Rejection of a serial founder effects model of genetic and linguistic coevolution". Proceedings of the Royal Society B: Biological Sciences. 279 (1736): 2281–2288. doi:10.1098/rspb.2011.2296. PMC 3321699. PMID 22298843.

[87] Bowern, Claire (January 2011). "Out of Africa? The logic of phoneme inventories and founder effects". Linguistic Typology. 15 (2). doi:10.1515/lity.2011.015. hdl:1885/28291. S2CID 120276963.

