Linguistic distance is how different one language or dialect is from another. Although they lack a uniform approach to quantifying linguistic distance between languages, practitioners of linguistics use the concept in a variety of linguistic situations, such as learning additional languages, historical linguistics, language-based conflicts and the effects of language differences on trade.
The measures proposed for linguistic distance reflect varying understandings of the term itself. One approach is based on mutual intelligibility, i.e. the ability of speakers of one language to understand the other language. Under this approach, the higher the linguistic distance, the lower the level of mutual intelligibility.
Because cognate words play an important role in mutual intelligibility between languages, they figure prominently in such analyses. The higher the percentage of cognate (as opposed to non-cognate) words shared by two languages, the lower their linguistic distance. Likewise, the greater the degree of grammatical relatedness (i.e. the cognates mean roughly similar things) and lexical relatedness (i.e. the cognates are easily discernible as related words), the lower the linguistic distance. For example, the Hindustani word pānch is grammatically identical and lexically similar (but not identical) to its Punjabi and Persian cognate panj, and it is grammatically identical but lexically dissimilar to Greek pent- and English five. Conversely, English dish and German Tisch 'table' are lexically (phonologically) similar but grammatically (semantically) dissimilar. Cognates in related languages can even be identical in form but semantically distinct, such as caldo and largo, which mean 'hot' and 'wide' respectively in Italian but 'broth, soup' and 'long' in Spanish.
Distances between languages can also be calculated statistically, by comparing each language's stock of words (an approach called lexicostatistics). In technical terms, what is calculated is the Levenshtein distance, the minimum number of single-character insertions, deletions and substitutions needed to turn one word into another. On this basis, one study compared both Afrikaans and West Frisian with Dutch to see which was closer to Dutch. It found that Dutch and Afrikaans (mutual distance of 20.9%) were considerably closer than Dutch and West Frisian (mutual distance of 34.2%).
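The Levenshtein-based calculation described above can be illustrated with a minimal sketch. It assumes that each word pair is compared as plain spelled strings (published studies typically work from phonetic transcriptions), that each pair's distance is normalized by the length of the longer word, and that the language-level figure is the plain average over all pairs. These choices, the function names, and the sample word pairs are illustrative assumptions, not the procedure of the study cited above.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer word's length (assumed
    normalization), giving a value between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average normalized distance over aligned word pairs (assumed aggregation)."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one word pair is required")
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical two-item word list, using unaccented spellings of forms
# mentioned in the article; real studies use far longer lists.
sample_pairs = [("caldo", "caldo"), ("panch", "panj")]
print(f"{mean_distance(sample_pairs):.1%}")
```

A measure of this kind looks only at word form, so an identical pair such as Italian and Spanish caldo contributes a distance of zero even though the meanings differ.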
However, lexicostatistical methods, which are based on retentions from a common proto-language rather than on innovations, are problematic for a number of reasons, so some linguists argue that they cannot be relied upon when tracing a phylogenetic tree (for example, the highest retention rates can sometimes be found at the opposite, peripheral ends of a language family). Unusual innovativeness or conservativeness of a language can distort linguistic distance and the assumed separation date, examples being the Romani language and the East Baltic languages respectively. On the one hand, continued adjacency of closely related languages after their separation can make some loanwords 'invisible' (indistinguishable from cognates; see etymological nativization), so that from a lexicostatistical point of view these languages appear less distant than they actually are (examples being the Finnic and Saami languages). On the other hand, strong foreign influence on languages spreading far from their homeland can make them share fewer inherited words than they otherwise would (examples being Hungarian and the Samoyedic languages in the East Uralic branch).
Other internal aspects
To overcome these problems with lexicostatistical methods, Donald Ringe, Tandy Warnow and Luay Nakhleh developed a complex phylogenetic method in the 2000s that relies on phonological and morphological innovations.
A 2004 paper by economists Barry Chiswick and Paul Miller proposed a metric for linguistic distance based on empirical observations of how rapidly speakers of a given language gained proficiency in another language when immersed in a society that overwhelmingly communicated in the latter. The study examined the speed of English language acquisition among immigrants of various linguistic backgrounds in the United States and Canada.
- Colin Renfrew; April M. S. McMahon; Robert Lawrence Trask (2000), Time depth in historical linguistics, McDonald Institute for Archaeological Research, 2000, ISBN 978-1-902937-06-9,
... The term 'linguistic distance' is often used to refer to the degree of similarity/ difference between any two language varieties ...
- Li Wei (2000), The bilingualism reader, Psychology Press, 2000, ISBN 978-0-415-21336-3,
... linguistic distance is a notion which still remains problematic (for a discussion, see Hinskens, 1988), it does seem possible to place languages along a continuum based on formal characteristics such as the number of cognates in languages or sets of shared syntactic characteristics ...
- Michael H. Long (2009-07-15), The Handbook of Language Teaching, John Wiley and Sons, 2009, ISBN 978-1-4051-5489-5,
... findings from work on linguistic transfer, typology and 'linguistic distance' ... two related issues arise in these studies: typological distance/phylogenetic relatedness and transfer ... Spanish-Basque bilinguals learning English demonstrated a stronger influence from Spanish, typologically a closer language ...
- Terry Crowley; Claire Bowern (2010-03-04), An Introduction to Historical Linguistics, Oxford University Press US, 2009, ISBN 978-0-19-536554-2,
... Methods that hypothesize relationships in this way are called distance-based methods because they infer the historical relationships from the linguistic distance between languages. Lexicostatistics is a commonly used distance-based ...
- North-western European language evolution: NOWELE, Issues 27-29, Odense University Press, 1996,
... The main reason for the rapid language shift is said to be the lack of linguistic 'distance' between the two codes (both of them being Germanic and therefore genetically closely related) ...
- Marshall B. Reinsdorf; Matthew Jon Slaughter (2009-08-01), International trade in services and intangibles in the era of globalization, University of Chicago Press, 2009, ISBN 978-0-226-70959-8,
... We measure cultural trade costs between the United States and its trading partners using indicators of the linguistic distance between English and other countries' primary languages ...
- Jeffrey A. Frankel; Ernesto Stein; Shang-Jin Wei (1997), Regional trading blocs in the world economic system, Peterson Institute, 1997, ISBN 978-0-88132-202-6,
... The implication is that two countries sharing linguistic/colonial links tend to trade roughly 55 percent more than they would ... a new measure of linguistic distance that is a continuous scalar rather than a discrete dummy variable ...
- William Hernandez Requejo; John L. Graham (2008-03-04), Global negotiation: the new rules, Macmillan, 2008, ISBN 978-1-4039-8493-7,
... Linguisitic distance has been shown to be an important factor in determining the amount of trade between countries ... 'wider' language differences increases transaction costs and makes trade and negotiations less efficient ...
- Jyotirindra Dasgupta, University of California, Berkeley. Center for South and Southeast Asia Studies (1970-01-01), Language conflict and national development: group politics and national language policy in India, University of California Press, 1970, ISBN 978-0-520-01590-6,
... The linguistic distance between East and West Pakistan has therefore tended to increase ...
- Jan D. ten Thije; Ludger Zeevaert (2007-01-01), Receptive multilingualism: linguistic analyses, language policies, and didactic concepts, John Benjamins Publishing Company, 2007, ISBN 978-90-272-1926-8,
... Assuming that intelligibility is inversely related to linguistic distance ... the content words the percentage of cognates (related directly or via a synonym) ... lexical relatedness ... grammatical relatedness ...
- List of Greek and Latin roots in English#P
- Chiswick, B. R.; Miller, P. W. (2005). "Linguistic Distance: A Quantitative Measure of the Distance Between English and Other Languages". Journal of Multilingual and Multicultural Development. 26: 1–11. doi:10.1080/14790710508668395. hdl:10419/20510. ... vocabulary, grammar, written form, syntax and myriad other statistics ... this scalar measure of "linguisitic distance" is demonstrated through an analysis of the determinants of English language proficiency among immigrants ...