Haplogroup R1a

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Ebizur (talk | contribs) at 02:39, 16 February 2010 (→‎Central Eurasian origin proposals). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Haplogroup R1a
Possible time of origin: more recent than 18,000 years BP[1]
Possible place of origin: Asia, probably South Asia. Other possibilities include Central Asia, Middle East, and Eastern Europe.
Ancestor: R1 (R-M173)
Descendants: R1a1a1 to R1a1a8. R-M458 being the most significant (R1a1a7 in Underhill et al. (2009)).
Defining mutations:
1. M420 now defines R1a in the broadest sense.[2]
2. Within R1a, SRY1532.2 also known as SRY10831.2, now defines R1a1, previously R1a.
3. M17 and M198 (equivalent to one another) define R1a1a, often referred to as if equal to R1a.
Highest frequencies: Parts of Eastern Europe, Scandinavia, Central Asia, Siberia and South Asia. (See List of R1a frequency by population)

Haplogroup R1a refers to a major cluster of human Y-chromosome types and is a frequent topic of discussion in human population genetics and genetic genealogy. Men in the same Y DNA haplogroup share a set of differences, or markers, on their Y chromosome, which distinguish them from men in other haplogroups. The Y chromosome is passed from father to son, and so men in R1a all descend from a shared male line ancestor.

R1a can be viewed as a family tree of male lineages. However, one sub-branch, R1a1a, is much more common than the others in all major geographical regions. The mutation that is currently used to define the R1a family most broadly is M420. The recent discovery of M420 resulted in a reorganization of the known family tree of R1a. In particular, this discovery demonstrated that there are R1a lineages which are not in the R1a1 branch leading to R1a1a. These have only been found in Asia, including the area of the Middle East, where R1a1a is relatively uncommon.

R1a1a is particularly common in a large region extending from South Asia and Southern Siberia to Central Europe and Scandinavia.[2] The origin of R1a and R1a1a is uncertain, but both are believed to have originated somewhere within Eurasia, most likely in the area from Eastern Europe to South Asia. The most recent studies indicate that Asia is a more likely region of origin than Europe.

Phylogeny (Family Tree)

Publications in 2009 created major changes in the scientific understanding of R1a.[2][3][4][5][6] In particular, the discovery of the mutation M420 has defined a previously unidentified category of lineages that are distant relatives of R1a1a, the most common and best-known variant of R1a. R1a1a is therefore now known to represent only one sub-branch of a bigger "family tree", and itself has many branches which have already been defined. Each major branching of this tree is identified by a corresponding set of known SNP mutations, which are used to test individuals and to define the relationships between branches of descent.

Roots of R1a

Haplogroup R family tree
Haplogroup R
  Haplogroup R1 (M173)
    R1a (M420)
    R1b (M343)
    R1* (?)
  Haplogroup R2

R1a evolved from a male-line ancestor who was in haplogroup R1. R1 is defined by SNP mutation M173. The R1a clade can be distinguished by several unique markers including M420. R1a also has a similarly common sister-clade, called R1b, which also has M173, but is distinguished by its M343 marker. There is no simple consensus concerning the places in Eurasia where R1, R1a or R1b evolved, although Underhill et al. (2009) recently suggested that "the most distantly related R1a chromosomes [...] have been detected at low frequency in Europe, Turkey, United Arab Emirates, Caucasus and Iran" implying that R1a's origin may perhaps be somewhere in or near these regions. The origins of the most common type of R1a, R1a1a, will be discussed separately below.

Different meanings of "R1a"

Contrasting family trees for R1a

Scheme proposed in YCC (2002)
R1 (M173)
  R1b (M343): sibling clade to R1a
  R1a (SRY1532.2, also known as SRY10831.2)
    R1a*
    R1a1 (M17, M198)
      R1a1*
      R1a1a (M56)
      R1a1b (M157)
      R1a1c (M87, M204, M64.2)
  R1*: all cases without M343 or SRY1532.2 (including a minority of M420+ cases)
As M420 went undetected, M420 lineages were classified as either R1* or R1a (SRY1532.2 [SRY10831.2]).

Scheme as per Underhill et al. (2009)
R1a (M420)
  R1a1 (SRY1532.2)
    R1a1*
    R1a1a (M17, M198)
      R1a1a*
      R1a1a1 (M56)
      R1a1a2 (M157)
      R1a1a3 (M64.2,..)
      R1a1a4 (P98)
      R1a1a5 (PK5)
      R1a1a6 (M434)
      R1a1a7 (M458)
        R1a1a7a (M334)
        R1a1a7*
      R1a1a8 (Page68)[7]
  R1a*
A new layer is inserted covering all old R1a, plus its closest known relatives.

In 2002, a new naming system for haplogroups was proposed by the Y chromosome consortium (YCC), which has now become standard.[8] According to this system, R1 and R1a are "phylogenetic" names, names designed to show a position in a family tree. Names of SNP mutations are also used to name clades or haplogroups. For example, the mutation called "M173" currently identifies R1. Thus R1 can also be called "R-M173". There are also "paragroups", which have an unknown number of branches. For example, men within "Haplogroup R1*" have the M173 marker, but they have no known defining mutations other than those that identify R1 (for example, neither M420 nor M343). When a new branching in a tree is discovered, for example a branch within a paragroup, phylogenetic names need to change. However, by definition, the "mutational" clade names remain the same.

The naming system commonly used for R1a remains inconsistent in different published sources, and requires some explanation. The term "R1a" was originally proposed in YCC (2002) to replace older naming systems such as "Eu19" as used in Semino et al. (2000). Eu19 was a haplogroup defined by mutation M17. The 2002 Y chromosome consortium naming system defined a bigger "R1a", distinguished by mutation SRY1532.2 but including Eu19 as the main part of it, now under the branch name of "R1a1".[9] A still broader family tree for R1a was proposed in 2009 and appeared in published surveys in late 2009.[2][10] This is now defined by the M420 mutation, but once again is mainly made up of the original Eu19. In this newer system, the clade defined by SRY1532.2 moves from "R1a" to "R1a1", and "R1a1" (Eu19 or R-M17) becomes "R1a1a".
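The renaming described above can be summarised as a small lookup table. The following Python sketch is illustrative only: the names CLADE_NAMES and mutational_name are assumptions of this example rather than part of any published nomenclature tool, and only the three markers discussed in this section are covered.

```python
# Phylogenetic name attached to each defining marker under the two schemes
# described above. None means the marker was not yet part of the scheme.
CLADE_NAMES = {
    # marker: (YCC 2002 name, Underhill et al. 2009 name)
    "M420": (None, "R1a"),
    "SRY1532.2": ("R1a", "R1a1"),
    "M17": ("R1a1", "R1a1a"),
}


def mutational_name(marker):
    """Return the marker-based name, which stays the same in both schemes."""
    return "R-" + marker


for marker, (old, new) in CLADE_NAMES.items():
    print(mutational_name(marker), "| 2002:", old or "not defined", "| 2009:", new)
```

The point of the table is that the phylogenetic name in each column shifts when a new layer is added, while the marker-based name in the first column does not.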

The R1a family tree as a whole can therefore now be divided into three major levels of branching, with the largest number of currently defined sub-clades within the dominant and best known branch, R1a1a. The following summary is based upon the large survey of Underhill et al. (2009) as described in sections below and comparing the old and new naming systems.

R1a (R-M420)

This is the broadest definition of R1a, defined by the mutation M420. It is known to have at least two branches: R1a1 (see next section), which makes up the vast majority, and R1a*, the paragroup. In this newest definition, R1a* is defined as M420 positive but SRY1532.2 negative. (Within the pre-2009 scheme, the minority of M420-positive cases lacking SRY1532.2, i.e. R1a* post-2009, would have been classified within R1* (R-M173*). See the cladograms above.)

Underhill et al. (2009) found only isolated samples of the paragroup R1a*, apparently mostly in the Middle East and Caucasus: 1/121 Omanis, 2/150 Iranians, 1/164 in the United Arab Emirates, 3/612 in Turkey. Testing of 7224 more males in 73 other Eurasian populations showed no sign of this category so far. In a study of Jordan, however, Flores et al. (2005) found that no fewer than 20 of the 146 men tested (13.7%), including most notably 20 out of 45 men tested from the Dead Sea area, were positive for M173 (R1) but negative for the key R1a markers SRY10831.2 and M17 (discussed below), as well as for the R1b markers P25 and M269. This makes it likely that these men were either R1a* (R-M420*) or else R1b* (R-M343*). Mutations understood to be equivalent to M420 include M449, M511, M513, L62, and L63.[2][11]

R1a1 (R-SRY1532.2)

R1a1 is currently defined by SRY1532.2, also referred to as SRY10831.2. This family of lineages is dominated by one very large and well-defined R1a1a branch, which is positive for M17 and M198 (see below). The paragroup R1a1* (old R1a*) is positive for the SRY1532.2 marker but lacks both the M17 and M198 markers.

Underhill et al. (2009) again found only limited examples of the R1a1* paragroup, looking at many different surveys. However it does appear to be spread over a wider geographical range in Eurasia than the R1a* paragroup discussed above: 1/51 in Norway, 3/305 in Sweden, 1/57 Greek Macedonians, 1/150 Iranians, 2/734 Ethnic Armenians, 1/141 Kabardians. Sharma et al. (2009) also found 2/51 amongst Kashmir Pandits and 13/57 people tested from the Saharia tribe of Madhya Pradesh, which is the highest level in one locality found so far. SNP mutations understood to be always occurring with SRY1532.2 include M448, M459, and M516.[2]

R1a1a (R-M17 or R-M198)

R1a1a (old R1a1) makes up the vast majority of all R1a, over its entire geographic range, and most statistical or other analysis of R1a is therefore by definition focused upon it. It is defined in various articles by SNP mutations M17 or M198, which always appear together in the same men so far. SNP mutations understood to be always occurring with M17 and M198 include M417, M512, M514, M515.[2] The vast majority of R1a1a has not yet been categorized into branches defined by mutations, and is therefore in a paragroup referred to as R1a1a*. However, R1a1a also has several known sub-clades of its own. So far, eight sub-clades of R1a1a are defined.
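The three levels described above amount to a simple decision rule for placing a tested man in the 2009 tree. The sketch below is a minimal illustration, assuming that every listed marker has been tested and that any marker absent from the input was negative; the function name classify and the choice to list only two sub-clade markers are assumptions of this example.

```python
# Markers along the main R1a line in the 2009 scheme, from root to tip.
MAIN_LINE = [("M420", "R1a"), ("SRY1532.2", "R1a1"), ("M17", "R1a1a")]
# Two of the R1a1a sub-clades named in the text (not the full list of eight).
SUBCLADES = {"M434": "R1a1a6", "M458": "R1a1a7"}


def classify(positive):
    """Return the most derived clade supported by a set of positive markers."""
    if "M173" not in positive:
        return "not R1"
    if "M343" in positive:
        return "R1b"
    clade = "R1"
    for marker, name in MAIN_LINE:
        if marker not in positive:
            return clade + "*"  # paragroup: no further known marker found
        clade = name
    for marker, name in SUBCLADES.items():
        if marker in positive:
            return name
    return "R1a1a*"


print(classify({"M173", "M420"}))
print(classify({"M173", "M420", "SRY1532.2", "M17", "M458"}))
```

A man positive for M173 and M420 only is placed in the paragroup R1a*, which is the case that the pre-2009 scheme could not distinguish from R1*.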

R1a1a subclades

Frequency distribution of R1a1a7 (R-M458)

So far, eight SNP-defined sub-clades of R1a1a are known, R1a1a1 to R1a1a8. Currently, only one of these subclades, R1a1a7, is known to have significant frequencies. R1a1a7 is defined by M458 and was first described in Underhill et al. (2009). M458 is found almost entirely in Europe, though it extends into Turkey and parts of the Caucasus. Its highest frequencies are in Central and Southern Poland, particularly near the river valleys flowing northwards to the Baltic Sea.

R1a1a7 has its own SNP-defined subclade, defined by the M334 marker, also announced in Underhill et al. (2009). However this mutation was only found in one Estonian man and may define a very recently founded and small clade.

Relative frequency of R1a1a6 (R-M434) to R1a1a (R-M17)
Region | People | N | R1a1a-M17 number | R1a1a-M17 freq. (%) | R1a1a6-M434 number | R1a1a6-M434 freq. (%)
Pakistan | Baloch | 60 | 9 | 15 | 5 | 8
Pakistan | Makrani | 60 | 15 | 25 | 4 | 7
Middle East | Oman | 121 | 11 | 9 | 3 | 2.5
Pakistan | Sindhi | 134 | 65 | 49 | 2 | 1
Table only shows positive sets from N = 3667 derived from a sample of 60 Eurasian populations, Underhill et al. (2009).
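The percentages in the table follow directly from the counts. As a minimal sketch, the following Python code recomputes them from the numbers given above and also reports what share of each population's R1a1a chromosomes carry M434; the names ROWS and percent are assumptions of this example.

```python
# (region, people, N, R1a1a-M17 count, R1a1a6-M434 count), from the table above.
ROWS = [
    ("Pakistan", "Baloch", 60, 9, 5),
    ("Pakistan", "Makrani", 60, 15, 4),
    ("Middle East", "Oman", 121, 11, 3),
    ("Pakistan", "Sindhi", 134, 65, 2),
]


def percent(count, total):
    """Frequency of a haplogroup as a percentage of the men sampled."""
    return 100.0 * count / total


for region, people, n, m17, m434 in ROWS:
    # Share of this population's R1a1a chromosomes that also carry M434.
    share = percent(m434, m17)
    print(f"{people}: M17 {percent(m17, n):.1f}%, M434 {percent(m434, n):.1f}%, "
          f"M434 within R1a1a {share:.1f}%")

# Total number of M434 carriers across the four positive populations.
total_m434 = sum(row[4] for row in ROWS)
print("M434 carriers in total:", total_m434)
```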

Concerning other sub-clades, R1a1a3 is defined by the M64.2, M87, and M204 SNPs and is apparently rare; for example, it was found in 1 of 117 males typed in southern Iran.[12] R1a1a6, defined by M434, was announced in Underhill et al. (2009). M434 was detected in 14 people (out of 3667 people tested), all in a restricted geographical range from Pakistan to Oman. This is likely to reflect a recent mutation that took place in the area of Pakistan.

Indications of other sub-clades of R1a1a

In order to seek further knowledge of the family tree within R1a1a, geneticists also study patterns in another type of mutation - specific unstable points on the Y chromosome known as STRs or microsatellites. Although these have a relatively high chance of random mutation each generation, they can often be useful when many are examined at once, and patterns are observed. The resulting pattern gives a kind of "DNA signature" referred to as an STR haplotype. Clusters of distinctly similar haplotypes often reflect common ancestry (which can then be confirmed by SNP investigation). Increasing the number of STR markers to be examined, or increasing the population sample size, reduces uncertainty about closeness of genetic relationship.[5][13] STR clusters which are understood to be major clades of R1a1a are discussed in both Gwozdz (2009) and Klyosov (2009).
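One simple way to compare two STR haplotypes is to count how many repeat units separate them, marker by marker. The sketch below applies this to the two sets of median values tabulated later in this article for R1a1a(xM458) and R1a1a7. It is a simplified measure chosen for illustration, not the method of any cited study, and the function names are assumptions of this example.

```python
# Median repeat counts tabulated later in this article for two groups.
R1A1A_XM458 = {"DYS19": 16, "DYS388": 12, "DYS389I": 13, "DYS389II": 17,
               "DYS390": 25, "DYS391": 11, "DYS392": 11, "DYS393": 13,
               "DYS439": 10, "A7.2": 10}
R1A1A7 = {"DYS19": 16, "DYS388": 12, "DYS389I": 13, "DYS389II": 16,
          "DYS390": 25, "DYS391": 10, "DYS392": 11, "DYS393": 13,
          "DYS439": 11, "A7.2": 10}


def stepwise_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over the shared markers."""
    shared = hap_a.keys() & hap_b.keys()
    return sum(abs(hap_a[m] - hap_b[m]) for m in shared)


def differing_markers(hap_a, hap_b):
    """Names of the shared markers at which the two haplotypes differ."""
    shared = hap_a.keys() & hap_b.keys()
    return sorted(m for m in shared if hap_a[m] != hap_b[m])


print(stepwise_distance(R1A1A_XM458, R1A1A7))
print(differing_markers(R1A1A_XM458, R1A1A7))
```

Small distances between many haplotypes in a sample are what make a cluster visible; as the text notes, adding markers or samples reduces the uncertainty of such comparisons.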

Gwozdz (2009) has identified two clusters within R1a1a7 ("P" and "N"), neither yet defined by SNP. Cluster P was originally identified by Pawlowski (2002) and apparently accounts for about 8% of Polish men, making it the most common clearly identifiable haplotype cluster in Poland. Outside of Poland it is less common. Gwozdz estimated an age of approximately 2000 to 3000 years for cluster P. Cluster N is not concentrated in Poland, but is apparently common in many Slavic areas. Gwozdz also identified at least one large cluster of R1a1a* (not having M458), referred to as cluster K. This cluster is common in Poland but not centered there.

Klyosov (2009) notes a potential clade identified by a mutation on the relatively stable STR marker DYS388 (to an unusual repeat value of 10, instead of the more common 12), noting that this "is observed in northern and western Europe, mainly in England, Ireland, Norway, and to a much lesser degree in Sweden, Denmark, Netherlands and Germany. In areas further east and south that mutation is practically absent". He breaks this proposed clade into two distinct sub-clades, one of which is a successful younger branch for which he estimates "a rather recent common ancestor, who lived around the 4th century CE".

Distribution of R1a1a (R-M17 or R-M198)

Frequency distribution of R1a1a, also known as R-M17 and R-M198, adapted from Underhill et al (2009).

R1a has been found in high frequency at both the eastern and western ends of its core range, for example in some parts of India and Tajikistan on the one hand, and Poland on the other. Throughout all of these regions, R1a is dominated by the R1a1a (R-M17 or R-M198) sub-clade.

Central and Northern Asia

R1a1a frequencies vary widely between populations within central and northern parts of Eurasia, but it is found in areas including Western China and Eastern Siberia. This wide variation is possibly a consequence of population bottlenecks in isolated areas and the large movements of Turco-Mongols during the historic period. For example, exceptionally high frequencies of R1a1a (R-M17 or R-M198; 50 to 70%) are found among the Ishkashimis, Khojant Tajiks, Kyrgyz, and in several peoples of Russia's Altai Republic.[14][15][16] Although levels are comparatively low amongst some Turkic-speaking groups (e.g. Turks, Azeris, Kazakhs, Yakuts), levels are very high in certain Turkic or Mongolic-speaking groups of Northwestern China, such as the Bonan, Dongxiang, Salar, and Uyghurs.[14][17][18] R1a1a is also found among certain indigenous Eastern Siberians, including Kamchatkans and Chukotkans, peaking in the Itel'man at 22%.[19]

South Asia

In South Asia, high levels of R1a1a have been observed in some populations. For example, in the eastern and northern parts of India, R1a1a accounts for 72% of male lineages among high-caste Bengalis from West Bengal such as Brahmins and Kshatriyas, 67% among Uttar Pradesh Brahmins, 60% among Bihar Brahmins, 47% in Punjab, and 33% in Gujarat.[4] It is also found in relatively high frequencies in several South Indian Dravidian-speaking tribes, including the Chenchu and Valmikis of Andhra Pradesh and the Kallar of Tamil Nadu, suggesting that M17 is widespread in tribal southern Indians. To the south of India, it has also been found in >10% of Sinhalese in Sri Lanka.[20]

Middle East and Caucasus

R1a has been found in various forms, in most parts of Western Asia, in widely varying concentrations, from almost no presence in areas such as Jordan, to much higher levels in parts of Turkey and Iran.[21][22][23]

Wells et al. (2001) noted that Iranians in the western part of the country show low R1a1a levels, while males in eastern parts of Iran carried up to 35% R1a. Nasidze et al. (2004) found R1a in approximately 20% of Iranian males from the cities of Tehran and Isfahan. Regueiro et al. (2006), in a study of Iran, noted much higher frequencies in the south than the north.

Turkey also shows high but unevenly distributed R1a levels amongst some sub-populations. For example Nasidze et al. (2005) found relatively high levels amongst Kurds (12%) and Zazas (26%).

Further to the north of these Middle Eastern regions, on the other hand, R1a levels start to increase in the Caucasus, once again in an uneven way. Several populations studied have shown no sign of R1a, while the highest levels so far discovered in the region appear to belong to speakers of the Karachay-Balkar language, amongst whom about one quarter of men tested so far are in haplogroup R1a1a.[2]

Europe

In Europe, R1a, again almost entirely in the R1a1a sub-clade, is found at highest levels among peoples of Eastern European descent (Sorbs, Poles, Russians and Ukrainians; 50 to 65%).[24][25][26] Levels in Hungarians have been noted between 20 and 60%.[27] The Balkans show lower frequencies, and significant variation between areas, for example >30% in Slovenia, Croatia and Greek Macedonia, but <10% in Albania, Kosovo and parts of Greece.[26][28][29] In the Baltic countries R1a frequencies decrease from Lithuania (45%) to Estonia (around 30%).[30]

R1a1 was present in Europe at least 4600 years ago, as demonstrated by Y-DNA with the Y-SNP marker SRY10831.2 extracted from the remains of three individuals near Eulau, Saxony-Anhalt, Germany, discovered in 2005. The discovery demonstrated the appearance of R1a1 with Corded Ware culture in Central Europe.[31][32]

There is a significant presence in peoples of Scandinavian descent, with highest levels in Norway and Iceland, where between 20 and 30% of men are in R1a1a.[33][34] Vikings and Normans may have also carried the R1a1a lineage westward, accounting for at least part of the small presence in the British Isles.[35][36][37][38]

In Southern Europe R1a1a is not normally common but it is widespread and found in significant pockets. Scozzari et al. (2001) found significant levels in the Pas Valley in Northern Spain, and also the areas of Venice, and Calabria in Italy.

Origins and hypothesized migrations of R1a1a

Median STR values for R1a1a
STR site | R1a1a(xM458) median | R1a1a7 median
DYS19 | 16 | 16
DYS388 | 12 | 12
DYS389I | 13 | 13
DYS389II | 17 | 16
DYS390 | 25 | 25
DYS391 | 11 | 10
DYS392 | 11 | 11
DYS393 | 13 | 13
DYS439 | 10 | 11
A7.2 | 10 | 10

Most discussions of R1a origins concern the dominant R1a1a (R-M17 or R-M198) sub-clade. There are two foci of high frequency of R1a1a, one in South Asia, near North India, and the other in Eastern Europe, in the area of Ukraine. Until 2009 claims regarding the oldest R1a populations varied greatly between different articles, with Eastern Europe and South Asia being the main contenders. Such studies generally look at the STR haplotypes of each major population of R1a positive men. (These are the same markers mentioned above as being useful in trying to discover potential new branches within the R1a1a family tree.) Higher variation of STR haplotype in any particular region is normally seen as a rough indicator that a haplogroup has been present longer in that region. In order to gain more insight, the STR haplotypes are also often examined in detail, looking for clusters of more or less related male lines.
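The idea that higher STR variation suggests a longer presence can be illustrated with a small calculation. The sketch below averages the per-marker variance of repeat counts across a sample. The two regional samples are entirely hypothetical values invented for the example, not data from any study, and mean_str_variance is an assumed name.

```python
from statistics import pvariance


def mean_str_variance(haplotypes):
    """Average, over markers, of the variance in repeat count across men."""
    markers = list(haplotypes[0].keys())
    per_marker = [pvariance([h[m] for h in haplotypes]) for m in markers]
    return sum(per_marker) / len(markers)


# HYPOTHETICAL repeat counts for two made-up regional samples (not real data).
region_a = [{"DYS19": 16, "DYS390": 25}, {"DYS19": 15, "DYS390": 24},
            {"DYS19": 17, "DYS390": 26}, {"DYS19": 16, "DYS390": 25}]
region_b = [{"DYS19": 16, "DYS390": 25}, {"DYS19": 16, "DYS390": 25},
            {"DYS19": 16, "DYS390": 24}, {"DYS19": 16, "DYS390": 25}]

var_a = mean_str_variance(region_a)
var_b = mean_str_variance(region_b)
print("region A:", var_a, "region B:", var_b)
```

Under the reasoning described above, the sample with the larger value would be read as the more diverse one; as the text stresses, this is only a rough indicator.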

In 2009, several large studies of both old and new STR data, including Mirabal et al. (2009), Underhill et al. (2009), and Klyosov (2009), concluded that not only are there two separate "poles of the expansion" with similar ages, but also that of these two poles, Asian R1a1a is apparently older than European R1a1a. The data is therefore said to be more consistent with Asian origins for R1a1a, as opposed to European origins, with a particular focus remaining upon South Asia.[39]


South Asian origin hypothesis

Coalescent time estimates for R1a1a(xM458) STR from Underhill et al. (2009)
Location TD
W. India 15,800
Pakistan 15,000
Nepal 14,200
India 14,000
Oman 12,500
N. India 12,400
S. India 12,400
Caucasus 12,200
E. India 11,800
Poland 11,300
Slovakia 11,200
Crete 11,200
Germany 9,900
Denmark 9,700
UAE 9,700

As data collections have built up, an increasing number of studies have found South Asia to have the highest level of diversity of Y-STR haplotype variation within R1a1a. Three recent studies have argued that South Asia is a likely original point of dispersal,[40] while four other studies have concluded that the data is at least consistent with this scenario.[41] The most thorough study as of December 2009, including a collation of retested Y-DNA from previous studies, makes a South Asian R1a1a origin the strongest proposal amongst the various possibilities.[2]

A particular interest has been taken in using R1a1a to investigate the long-presumed connection between Indo-Aryan origins and high caste Brahmins. For example Wells et al. (2001), noted that the Indo-European-speaking Sourashtrans, a population from Tamil Nadu in southern India, have a much higher frequency of M17 (R1a1a) than their Dravidian-speaking neighbours, the Yadhavas and Kallars, adding to the evidence that M17 is a diagnostic Indo-Aryan marker. On the other hand, some authors have not accepted this association. For example Saha et al. (2005) examined R1a1a in South Indian tribals and Dravidian population groups more closely, and their analyses of the haplogroups "indicated no single origin from any lineage but a result of a conglomeration of different lineages from time to time. The phylogenetic analyses indicate a high degree of population admixture and a greater genetic proximity for the studied population groups when compared with other world populations".

Age estimation techniques play a role in whether authors accept or reject any connection between Indo-Aryan languages and R1a1a in any broad sense. In particular, researchers such as Underhill et al. and Mirabal et al. estimate the dispersal of R1a1a in India to be much older than the Indo-Aryan language family.

The proposal of Klyosov (2009) based upon STR cluster analysis is that Indian R1a1a combines two distinct clades not yet defined by SNP. One of these is approximately 4000 years old and appears similar to Eastern European R1a1a, thus apparently in accordance with the theory that some R1a1a came from the direction of the Eurasian Steppe, in association with an ancestral version of Indo-Aryan languages. The other R1a1a cluster is older and more uniquely Indian, and is estimated to have had a common male line ancestor about 7000 years ago according to his approximation. (Klyosov does not use the "evolutionarily effective" Zhivotovsky method as used by Underhill et al.)

Central Eurasian origin proposals

Cordaux et al. (2004) argued, citing data from 3 earlier publications, that R-M17 (R1a1a) Y chromosomes most probably have a central Asian origin.[42] Central Asia is still considered a possible place of origin by Mirabal et al. (2009) after their larger analysis of more recent data. However, these authors do not clearly distinguish the case being made for Central Asia from the case being made for Asia, particularly South Asia, more generally.

Recently, looking at Chinese STR data not included in other studies, Klyosov (2009) concluded that the common source of Indian and European R1a must be somewhere near the modern Chinese ethnic groups known as the Hui, Bonan, Dongxiang and Salar, approximately 20,000 years ago, possibly somewhere near southern Siberia. This will be discussed further in following sections.

Eastern European migration hypotheses

Theories that the earliest generations of R1a1a men originated in Eastern Europe have become less common with the publication of bigger and more international surveys. However suggestions have been made which associate the distribution of R1a clades with several proposed movements of people in history and prehistory in Eastern Europe. As usual, these suggestions mainly concern the R1a1a sub-clade defined by M17 or M198, because this is the dominant R1a clade, and the only one for which there is significant data.

Four approximate time periods are frequently mentioned by different authors, but they are not mutually exclusive given that R1a lineages may have taken part in many different human movements over time in the same geographical region. In an article which is still very widely cited, Semino et al. (2000) proposed that there may have been two expansions, suggesting that the spread of R1a from a point of origin in Ukraine following the Last Glacial Maximum may have been magnified by the expansion of males from the Kurgan culture. In a study of the Balkans, Pericic et al. (2005) saw evidence for "at least three major episodes of gene flow" adding "possibly massive Slavic migration from A.D. 5th to 7th centuries" as a third. Below is a discussion of various proposals about R1a in Europe, broken into different periods.

Although Klyosov (2009) does not use the "evolutionarily effective" Zhivotovsky method, his interpretation of certain Balkan and Chinese data leads him to conclude that R1a1a was already present in Europe around 11,000 years ago, having departed from Asia closer to 20,000 years ago. However, in his scenario modern R1a1a in Europe is mainly a result of much later movements of R1a1a coming from a Balkan point of origin within Europe.

Europe from the Holocene to the Early Bronze Age

Semino et al. (2000) proposed that R1a1a originally spread from a Ukrainian refugium following the Last Glacial Maximum. This proposal is no longer the leading one, though it is still widely cited. Amongst recent publications, Underhill et al. (2009) do propose that R1a1a is old enough for this scenario, but find it more likely that it was initially in Asia. These authors also estimate that R1a1a was in parts of Europe by approximately 11,000 years ago. Most age estimates for R1a1a having such an early presence in Europe come from papers using the "evolutionarily effective" methodology described by Zhivotovsky et al. (2004), the latest such examples being Mirabal et al. (2009) and Underhill et al. (2009). Other methods, such as that used by Klyosov (2009), tend to give much younger estimates for any given set of data. Klyosov and Zhivotovsky were amongst authors involved in an exchange in the journal Human Genetics in 2009 which was relevant to R1a age estimations.[43]
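Why the choice of method changes the estimated age so much can be shown with a deliberately simplified model in which STR variance accumulates linearly, so that elapsed generations equal variance divided by a per-marker mutation rate. The sketch below is not the procedure of any cited paper. Both rates are assumptions: 0.00069 per marker per 25-year generation is the figure usually attributed to Zhivotovsky et al. (2004) and should be checked against that paper, while 0.002 is only a placeholder for a faster, pedigree-style rate. The variance value is hypothetical.

```python
def age_in_years(mean_variance, rate_per_generation, years_per_generation=25):
    """Estimate time to a common ancestor from accumulated STR variance.

    Assumes variance grows linearly at `rate_per_generation` per marker,
    so generations elapsed = variance / rate.
    """
    generations = mean_variance / rate_per_generation
    return generations * years_per_generation


# ASSUMPTION: rate usually attributed to Zhivotovsky et al. (2004); verify before use.
EFFECTIVE_RATE = 0.00069
# ASSUMPTION: placeholder for a faster pedigree-style rate, not from any cited paper.
PEDIGREE_STYLE_RATE = 0.002

variance = 0.3  # HYPOTHETICAL mean per-marker variance, not real data
older = age_in_years(variance, EFFECTIVE_RATE)
younger = age_in_years(variance, PEDIGREE_STYLE_RATE)
print(f"slower rate: about {older:.0f} years; faster rate: about {younger:.0f} years")
```

Because the same variance is divided by a smaller rate, the slower "effective" rate always yields the older date, which is the pattern of disagreement the paragraph above describes.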

Researchers using the "evolutionarily effective" dating method therefore suppose that any Neolithic or more recent dispersals of R1a1a could not represent the initial spread of the whole clade, and might be more visible in the distribution of a subclade or subclades. Underhill et al. (2009) remark on the "geographic concordance of the R1a1a7-M458 distribution with the Chalcolithic and Early Bronze Age Corded Ware (CW) cultures of Europe". However they also note evidence contrary to a connection: Corded Ware period human remains at Eulau from which Y-DNA was extracted of R1a haplogroup appear to be R1a1a*(xM458) (which they found most similar to the modern German R1a1a* haplotype.) An earlier paper speculated that "R1a [in Norway] might represent the spread of the Corded Ware and Battle-Axe cultures from central and east Europe."[34]

Steppe cultures

Diachronic map showing the Centum (blue) and Satem (red) areas. The supposed area of origin of satemization is shown in darker red (Andronovo/Abashevo/Srubna cultures).

From the late Neolithic and into the Iron Age, archaeologists recognize a complex of inter-related and relatively mobile cultures living on the Eurasian steppe, part of which protrudes into Europe. Many of these are in turn associated with the dispersal of Indo-European languages, the most recent dispersal being the one which led to the Indo-Iranian family of languages becoming the dominant modern languages of regions from Kurdistan to Western China, including such civilizations as Persia and India. (With the Slavic and Baltic languages considered to represent a relatively closely related branch.)

Geneticists who believe that they see evidence of R1a1a gene flow from the Eurasian Steppe to India have frequently proposed the involvement of these Steppe cultures, of Indo-European languages, and possibly of specific cultural traits such as Kurgan burials and horse domestication. All of these are generally felt to originate in the specifically European part of the steppes, which stretches as far west as Ukraine.[44]

Such a Steppe origin for R1a1a has also been argued by Keyser et al. (2009) on the basis of DNA results from ancient remains from several South Siberian late Kurgan sites, including some from the Andronovo culture. Nine out of ten male specimens were found to be in R1a1a. Two of the three from the Andronovo culture "matched the most frequent R1a1 haplotype (12 loci) seen in the southern Siberian population" (a haplotype found also in Eastern Europe and Anatolia). More generally, the authors considered 8 of these 9 R1a1a haplotypes to be of a type typical of Slavic, Baltic and South Siberian populations. The R1a1a evidence was felt by the authors to support the proposal that the Steppes Kurgan culture spread from Europe to Siberia. R1a1a has also been found in central European ancient DNA samples from the Late Neolithic and Bronze Age.[31][45][46][47] In combination with the above-mentioned DNA samples found in Central Asia, the ancient DNA evidence is therefore thought to make it highly likely that R1a1a was present in or near any of the normally proposed staging points for the original dispersals of the Indo-Iranian and Balto-Slavic branches of the Indo-European language family, both thought to have originated in Eastern Europe, and both in the so-called "satem" cluster of Indo-European languages. (Balto-Slavic and Indo-Iranian also share the Ruki sound law.)

Based on analyses of STR diversity and clustering, Klyosov (2009) gives the most recent genetics based argument that there was a movement of R1a1a from Eastern Europe (specifically from the Balkans, he proposes) via the Steppes, to India, and associated with languages ancestral to Indo-Iranian and Slavic. According to Klyosov's analysis modern Indian R1a1a is made up of two components, an "Indo European" component which came from the direction of Europe, and another which must have been in India earlier, and which does not appear to derive from European lineages.

European migrations within the Historic Era

The spread of Slavic peoples and languages in late Classical times appears to have played a major role in further increasing the frequency of R1a1a in parts of Central and Eastern Europe, including parts of the Balkans, but if so then by all age estimates this would have been after R1a1a had already dispersed as widely as both Central Europe and India. So this is not an explanation of the origins and dispersal of R1a1a as a whole. Luca et al. (2006), looking at SNP and STR markers occurring in the Czech Republic, suggested there was evidence for a rapid demographic expansion beginning about 60 to 80 generations ago, which would equate to about 1500 years ago (approx. 500 AD) to 2000 years ago (approx. 1 AD) with a generation time of 25 years. Rebala et al. (2007) also detected Y-STR evidence of a recent Slavic expansion from the area of modern Ukraine. This evidence corresponds to population movements during the late Classical Migration Period. Gwozdz (2009), summarizing his extensive analysis of STR clustering in the region of Poland, proposes that R1a1a in that area shows signs that there was a "rapid population expansion somewhat less than 1,500 years ago in the area that is now Poland".
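The conversion from generations to years quoted for Luca et al. (2006) is a simple multiplication, sketched below with the 25-year generation time stated in the text. The function name is an assumption of this example.

```python
def generations_to_years(generations, years_per_generation=25):
    """Convert a number of generations into elapsed years."""
    return generations * years_per_generation


# The 60 to 80 generation range given in the text.
low = generations_to_years(60)
high = generations_to_years(80)
print(f"{low} to {high} years ago")
```

A different assumed generation time shifts the whole range proportionally, which is one reason such dates are given only approximately.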

Middle Eastern origin hypothesis

As mentioned above, R1a haplotypes are less common in most of the Middle East than they are in South Asia, Eastern Europe or much of Central Asia. The region has nevertheless been mentioned in speculation about the origins of the clade, both because there are pockets of high frequency and diversity, for example in some parts of Iran and amongst some Kurdish populations, and because the rarer branches of R1a (R1a*, R1a1*) are more common in some of these regions.

Semino et al. (2000) proposed that a Middle Eastern origin for R1a should be considered, depending upon the strength of arguments for a Middle Eastern origin for Indo-European languages. However, Nasidze et al. (2004) suggested that R1a must have originally arrived there prior to any Kurgan/Indo-European expansion into the area, and that the R haplogroup as a whole including R1a may even have roots near Iran.

Most recently, Underhill et al. (2009) point out, as did Regueiro et al. (2006) and Kivisild et al. (2003), that the evidence used to argue for South Asian origins of R1a does not exclude the possibility of a Middle Eastern origin:

The most distantly related R1a chromosomes, that is, both R1a* and R1a1* (inset, Figure 1), have been detected at low frequency in Europe, Turkey, United Arab Emirates, Caucasus and Iran (Supplementary Table S1[48]). The highest STR diversity of R1a1a*(xM458) chromosomes are observed outside Europe, in particular in South Asia (Figure 1, Supplementary Table S4), but given the lack of informative SNP markers the ultimate source area of haplogroup R1a dispersals remains yet to be refined.

Popular science

Bryan Sykes, in his book Blood of the Isles, gives imaginative names to the founders or "clan patriarchs" of major British Y haplogroups, much as he did for mitochondrial haplogroups in his work The Seven Daughters of Eve. He named R1a1a in Europe the "clan" of a "patriarch" Sigurd, reflecting the theory that R1a1a in the British Isles has Norse origins. This does not mean that there was ever a clan or other large grouping of people dominated by R1a1a or any other major haplogroup. Real clans and ethnic groups are made up of men in many Y haplogroups.

See also

Notes

  1. ^ Karafet et al. (2008)
  2. ^ a b c d e f g h i Underhill et al. (2009)
  3. ^ Mirabal et al. (2009)
  4. ^ a b Sharma et al. (2009)
  5. ^ a b Gwozdz (2009)
  6. ^ Klyosov (2009)
  7. ^ Identified by the authors with the standardized SNP reference rs34351054.
  8. ^ YCC (2002)
  9. ^ SRY1532.2 is also known as SRY10831.2
  10. ^ ISOGG phylogenetic tree
  11. ^ ISOGG phylogeny webpage 2009
  12. ^ Regueiro et al. (2006)
  13. ^ Klyosov (2009a)
  14. ^ a b Wells et al. (2001)
  15. ^ Kharkov et al. (2007)
  16. ^ Tambets et al. (2004)
  17. ^ Wang et al. (2003)
  18. ^ Zhou et al. (2007)
  19. ^ Lell et al. (2002)
  20. ^ Kivisild et al. (2003)
  21. ^ Flores et al. (2005)
  22. ^ Nasidze et al. (2004)
  23. ^ Nasidze et al. (2005)
  24. ^ Balanovsky et al. (2008)
  25. ^ Behar et al. (2003)
  26. ^ a b Semino et al. (2000)
  27. ^ Semino et al. (2000) found a level of 60% but a later study, Tambets et al. (2004), found haplogroup R1a Y-DNA in only 20.4% of a sample of 113 Hungarians. Rosser et al. (2000) found SRY1532b positive lineages in approximately 22% (8/36) of a Hungarian sample. Battaglia et al. (2008) found haplogroup R1a1a-M17 in approximately 57% of a sample of 53 Hungarians.
  28. ^ Rosser et al. (2000)
  29. ^ Pericic et al. (2005)
  30. ^ Kasperaviciūte et al. (2005)
  31. ^ a b Haak et al. (2008)
  32. ^ The Ysearch number for the Eulau remains is 2C46S.
  33. ^ Bowden et al. (2008)
  34. ^ a b c Dupuy et al. (2005)
  35. ^ Irish Heritage DNA Project, R1 and R1a
  36. ^ Passarino et al. (2002)
  37. ^ Capelli et al. (2003)
  38. ^ Garvey, D. "Y Haplogroup R1a1". Retrieved 2007-04-23.
  39. ^ Mirabal et al. (2009) additionally felt the data to be consistent with Central Asian origins, while Underhill et al. (2009) took the data to be consistent with Western Asian origins. Klyosov (2009) presents a more complex scenario in which R1a1a originated in South Siberia, branches headed to Europe and India, and then a branch from Europe also went to India.
  40. ^ see: Sengupta et al. (2005), Sahoo et al. (2006), and Sharma et al. (2009)
  41. ^ see: Kivisild et al. (2003), Mirabal et al. (2009), Underhill et al. (2009) and Gwozdz (2009)
  42. ^ Wells et al. (2001), Semino et al. (2000), and Quintana-Murci et al. (2001)
  43. ^ Klyosov (2009a), Hammer et al. (2009)
  44. ^ For several examples from 2002, see Semino et al. (2000), Passarino et al. (2001), Passarino et al. (2002) and Wells (2002)
  45. ^ Schilz (2006)
  46. ^ Bouakaze et al. (2007)
  47. ^ These samples were probably R1a1a* (M17/M198 positive, M458 negative) according to Underhill et al. (2009).
  48. ^ The authors also refer here to their references 14, Weale et al. (2001), and 41, Regueiro et al. (2006).

References

Projects