= Genetic history of the Iberian Peninsula =

The ancestry of modern Iberians, comprising the Spanish and Portuguese, reflects the Iberian Peninsula's position in the southwestern corner of Europe and broadly aligns with patterns typical of Southern and Western European populations.

As is the case for most of Southern Europe, the principal ancestral source of modern Iberians is the Early European Farmers who arrived during the Neolithic. The strong predominance of Y-chromosome haplogroup R1b, common throughout Western Europe, also testifies to a sizeable input from successive waves of (predominantly male) Western Steppe Herders who originated in the Pontic-Caspian Steppe during the Bronze Age.

== Overview ==
Modern Iberians' genetic inheritance largely derives from the pre-Roman inhabitants of the Iberian Peninsula who were deeply Romanized after the conquest of the region by the ancient Romans:

- Pre-Indo-European and Indo-European speaking pre-Celtic groups: Iberians, Lusitani, Vettones, Turdetani, Aquitani, Conii.
- Celtic groups: Gallaecians, Celtiberians, Turduli and Celtici.

Genetic research on medieval populations of the Iberian Peninsula indicates that, compared with Iron Age groups, there was a discernible ancestry shift associated with the Roman Imperial period toward sources related to those of Italy and Greece, contributing roughly one quarter of the genetic profile. This component remains in the genomes of modern Spaniards and Portuguese, except among a substantial proportion of the Basques, whose genetic makeup appears to remain most closely aligned with that of Iron Age inhabitants of Iberia.

There are also genetic influences from the Alans and post-Roman Germanic groups, including the Suebi, Hasdingi Vandals, and Visigoths. Owing to its location on the Mediterranean Sea, the region, like other Southern European countries, also interacted with a range of Mediterranean peoples, including the Phoenicians, Ancient Greeks, Carthaginians, the Sephardi Jewish community, and the Berbers and Arabs who arrived during the Al-Andalus period. Collectively these groups contributed minor North African and Eastern Mediterranean genetic influences, particularly in the south and west of the Iberian Peninsula.

Similar to Sardinia, Iberia's western geographic position limited settlement from the Middle East and Caucasus region. As a result, it shows lower levels of Western Asian and Middle Eastern admixture than Italy and Greece, with most of that ancestry likely arriving in historic times, particularly during the Roman period, rather than in prehistory.

== Population genetics: methods and limitations ==

The foremost pioneer of the study of population genetics was Luigi Luca Cavalli-Sforza. Cavalli-Sforza used classical genetic markers to analyse DNA by proxy. This method studies differences in the frequencies of particular allelic traits, namely polymorphisms from proteins found within human blood (such as the ABO blood groups, Rhesus blood antigens, HLA loci, immunoglobulins, G-6-P-D isoenzymes, among others). Subsequently, his team calculated genetic distances between populations, based on the principle that two populations that share similar frequencies of a trait are more closely related than populations that have more divergent frequencies of the trait.
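The distance principle described above can be illustrated with a minimal Python sketch. It uses one common form of the Cavalli-Sforza and Edwards chord distance, averaged over loci; whether this exact formula was used in any particular study is an assumption here, and the ABO allele frequencies for the three populations (`pop_x`, `pop_y`, `pop_z`) are invented purely for illustration.

```python
import math

def chord_distance(pop_a, pop_b):
    """Chord distance between two populations, averaged over loci.

    pop_a, pop_b: dict mapping a locus name to a list of allele
    frequencies (same allele order in both, each list summing to 1).
    """
    loci = list(pop_a.keys())
    total = 0.0
    for locus in loci:
        # The sum of sqrt(p_i * q_i) equals 1 when frequencies are identical
        # and shrinks as the two frequency profiles diverge.
        overlap = sum(math.sqrt(p * q) for p, q in zip(pop_a[locus], pop_b[locus]))
        # max() guards against tiny negative values from floating-point rounding.
        total += math.sqrt(2.0 * max(0.0, 1.0 - overlap))
    return (2.0 / (math.pi * len(loci))) * total

# Invented ABO allele frequencies (A, B, O); not real survey data.
pop_x = {"ABO": [0.30, 0.05, 0.65]}
pop_y = {"ABO": [0.28, 0.06, 0.66]}
pop_z = {"ABO": [0.20, 0.20, 0.60]}

print("x vs y:", chord_distance(pop_x, pop_y))
print("x vs z:", chord_distance(pop_x, pop_z))
```

In this toy example `pop_x` and `pop_y` have similar frequencies and so come out closer to each other than `pop_x` and `pop_z`, which is the sense in which similar trait frequencies are read as closer relatedness.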

Since then, population genetics has progressed significantly, and studies using direct DNA analysis are now abundant; these may use mitochondrial DNA (mtDNA), the non-recombining portion of the Y chromosome (NRY) or autosomal DNA. MtDNA and NRY DNA share some features that have made them particularly useful in genetic anthropology. These properties include the direct, unaltered inheritance of mtDNA from mother to offspring and of NRY DNA from father to son, without the 'scrambling' effects of genetic recombination. These loci are also presumed not to be affected by natural selection, so that the major process responsible for changes in base pairs is mutation, the rate of which can be calculated.

Whereas Y-DNA and mtDNA haplogroups represent only a small component of a person's DNA pool, autosomal DNA has the advantage of containing hundreds of thousands of examinable genetic loci, thus giving a more complete picture of genetic composition. Because autosomal DNA undergoes recombination, descent relationships can only be determined on a statistical basis, and a single chromosome can record a different history for each gene. Autosomal studies are much more reliable for showing the relationships between existing populations, but they do not offer the same possibilities for unraveling population histories that mtDNA and NRY DNA studies promise, despite the latter's many complications.

== Analyses of nuclear and ancient DNA ==

Nuclear DNA analysis shows that Spanish and Portuguese populations are most closely related to other populations of western Europe.

A study published in 2019, using samples from 271 Iberians spanning prehistoric and historic times, proposes the following inflexion points in Iberian genomic history:

1. Mesolithic: Hunter-gatherers from north of the Pyrenees.
2. Neolithic: Neolithic farmers of Anatolian origin settle the entire Iberian Peninsula.
3. Chalcolithic: Inflow of Central European hunter-gatherers.
4. Bronze Age: Steppe inflow from Central Europe.
5. Iron Age: Additional Steppe gene flow from Central Europe. The genetic pool of the Basque people remains mostly intact from this point on.
6. Roman period: Genetic inflow from the Central and Eastern Mediterranean. Some additional inflow of North African genes is detected in Southern Iberia.
7. Visigothic period: Inflow from Central Europe.
8. Muslim period: Inflow from Northern Africa, particularly in Southern Iberia.
9. Reconquista: Repopulation from the Northern Iberian Christian kingdoms and subsequent genetic convergence between Northern and Southern Iberia.

The North African–related layer is geographically patterned: it is highest in the south and west (notably southern Portugal and western Andalusia) and very low or nearly absent in the Basque Country and parts of the northeast. Estimates of North African–related ancestry across mainland Iberia range from about 0% to about 10–11%. Early autosomal models that explicitly include ancient or modern North African sources find that mainland Iberia's sub-Saharan admixture is near zero and that most of the "African-like" signal reflects North African input. Any small sub-Saharan component in southwest Europe typically arrived alongside North African migrants, reflects the choice of proxy populations, or both. Ancient DNA further shows that ANA is a distinct North African lineage that is poorly represented by present-day sub-Saharan groups, so apparent West African affinities can reflect deep shared ancestry rather than recent gene flow.

== Haplogroups ==

===Y-chromosome haplogroups===

As in other Western European populations, Y-DNA haplogroup R1b is the most frequent among Spaniards and Portuguese, occurring at over 70% throughout most of Spain. R1b is particularly dominant in Galicia, the Basque Country and Catalonia, where it occurs at a rate of over 80%. In Iberia, most men with R1b belong to the subclade R-P312 (R1b1a1a2a1a2; as of 2017). The distribution of haplogroups other than R1b varies widely from one region to another. In Portugal as a whole, R1b accounts for 50-60% of male lineages, with some areas in the northwest reaching over 80%.

R1b prevails in much of Western Europe; in Iberia, its most prevalent subclade is R-DF27 (R1b1a1a2a1a2a). This subclade is found in over 60% of the male population in the Basque Country and in 40-48% in Madrid, Alicante, Barcelona, Cantabria, Andalucia, Asturias and Galicia. R-DF27 constitutes well over half of the total R1b in the Iberian Peninsula. Subsequent in-migration by members of other haplogroups and subclades of R1b did not affect its overall prevalence, although its share falls to only two thirds of the total R1b in Valencia and along the coast more generally. R-DF27 is also a significant subclade of R1b in parts of France and Britain. R-S28/R-U152 (R1b1a1a2a1a2b) is the prevailing subclade of R1b in Northern Italy, Switzerland and parts of France, but it represents less than 5.0% of the male population in Iberia. Ancient samples from the central European Bell Beaker culture, Hallstatt culture and Tumulus culture belonged to this subclade. R-S28/R-U152 is somewhat more frequent in Seville, Barcelona, Portugal and the Basque Country, at 10-20% of the total population, but it occurs at frequencies of only 3.0% in Cantabria and Santander, 2.0% in Castile and León, 6% in Valencia, and under 1% in Andalusia.

;Frequencies of Y-DNA haplogroups among Sephardic Jews

| Population | I1 | I2*/I2a | I2 | R1a | R1b | G | J2 | J*/J1 | E1b1b (E-M215) | T | Q |
| Sephardic Jews | 0% | 1% | 0% | 5% | 13% | 15% | 25% | 22% | 9% | 6% | 2% |

Haplogroup J, predominantly subclades of Haplogroup J-M172 (J2), occurs at relatively low frequencies, being almost absent in Northern Iberia and reaching roughly 5–15% in Central and Southern Iberia. Haplogroup E has an overall frequency of about 10%: E-M78 (E1b1b1a1 in 2017) and E-M81 (E1b1b1b1a in 2017) each account for about 4.0%, with an additional ~1.0% from Haplogroup E-M123 (E1b1b1b2a1) and ~1.0% from unspecified subclades within E-M96.

===Mitochondrial DNA===

There have been a number of studies of mitochondrial DNA (mtDNA) haplogroups in Europe. In contrast to Y-DNA haplogroups, mtDNA haplogroups do not show as much geographical patterning and are more evenly distributed. Apart from the outlying Sami, all Europeans, including Iberians, are characterized by the predominance of haplogroups H, U and T. The lack of observable geographic structuring of mtDNA may be due to socio-cultural factors, namely patrilocality and a lack of polyandry.

The subhaplogroups H1 and H3 have been studied in more detail and have been associated with the Magdalenian expansion from Iberia c. 13,000 years ago:

- H1 encompasses an important fraction of Western European mtDNA, reaching its local peak among contemporary Basques (27.8%) and appearing at a high frequency among other Iberians. Its frequency is above 10% in many other parts of Europe (France, Sardinia, British Isles, Alps, large portions of Eastern Europe), and above 5% in nearly all the continent. Its subclade H1b is most common in eastern Europe and NW Siberia.
- H3 represents a smaller fraction of European mtDNA than H1 but has a somewhat similar distribution, with peaks among Basques (13.9%), Galicians (8.3%) and Sardinians (8.5%). Its frequency decreases towards the northeast of the continent. Studies have suggested that haplogroup H3 is highly protective against AIDS progression.

A 2007 European-wide study including Spanish Basques and Valencian Spaniards found Iberian populations to cluster the furthest from other continental groups, implying that Iberia holds the most ancient European ancestry. In this study, the most prominent genetic stratification in Europe was found to run from the north to the south-east, while another important axis of differentiation runs east–west across the continent. It also found, despite the differences, that all Europeans are closely related.

===Subregions===

==== Spain ====

;Frequencies of Y-DNA haplogroups in Spanish regions

| Region | Sample size | C | E | G | I | J2 | JxJ2 | R1a | R1b | Notes |
| Aragon | 34 | | 6% | 0% | 18% | 12% | 0% | 3% | 56% | |
| Andalusia East | 95 | | 4% | 3% | 6% | 9% | 3% | 1% | 72% | |
| Andalusia West | 73 | | 15% | 4% | 5% | 14% | 1% | 4% | 54% | |
| Asturias | 20 | | 15% | 5% | 10% | 15% | 0% | 0% | 50% | |
| Basques | 116 | | 1% | 0% | 8% | 3% | 1% | 0% | 87% | |
| Castilla La Mancha | 63 | | 4% | 10% | 2% | 6% | 2% | 2% | 72% | |
| Castile North-East | 31 | | 9% | 3% | 3% | 3% | 0% | 0% | 77% | |
| Castile North-West | 100 | | 19% | 5% | 3% | 8% | 1% | 2% | 60% | |
| Catalonia | 80 | >0% | 3% | 6% | 3% | 6% | 0% | 0% | 81% | |
| Extremadura | 52 | | 18% | 4% | 10% | 12% | 0% | 0% | 50% | |
| Galicia | 88 | | 17% | 6% | 10% | 7% | 1% | 0% | 57% | |
| Valencia | 73 | >0% | 10% | 1% | 10% | 5% | 3% | 3% | 64% | |
| Mallorca | 62 | | 9% | 6% | 8% | 8% | 2% | 0% | 66% | |
| Menorca | 37 | | 19% | 0% | 3% | 3% | 0% | 3% | 73% | |
| Ibiza | 54 | | 8% | 13% | 2% | 4% | 0% | 0% | 57% | |
| Seville | 155 | | 7% | 4% | 12% | 8% | 3% | 1% | 60% | |
| Huelva | 22 | | 14% | 0% | 9% | 14% | 0% | 0% | 59% | |
| Cádiz | 28 | | 4% | 0% | 14% | 14% | 4% | 0% | 51% | |
| Córdoba | 27 | | 11% | 0% | 15% | 15% | 0% | 0% | 56% | |
| Málaga | 26 | | 31% | 4% | 0% | 15% | 0% | 8% | 43% | |
| León | 60 | | 10% | 7% | 3% | 5% | 2% | 7% | 62% | |
| Cantabria | 70 | | 13% | 9% | 6% | 3% | 3% | 4% | 58% | |
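
Several of the regional samples above are small, so the reported percentages carry considerable sampling uncertainty. As a rough illustration (not part of the cited studies), the following Python sketch computes an approximate 95% Wilson score interval for a few R1b frequencies copied from the table. The function name `wilson_interval` is introduced here for illustration, and because the published percentages are rounded, the intervals are only indicative.

```python
import math

def wilson_interval(freq, n, z=1.96):
    """Approximate Wilson score interval for a proportion `freq` seen in `n` samples.

    z=1.96 corresponds to a confidence level of roughly 95%.
    """
    denom = 1 + z * z / n
    centre = (freq + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(freq * (1 - freq) / n + z * z / (4 * n * n))
    return centre - half, centre + half

# (region, sample size, reported R1b frequency) taken from the table above.
rows = [("Basques", 116, 0.87), ("Seville", 155, 0.60), ("Asturias", 20, 0.50)]

for region, n, r1b in rows:
    lo, hi = wilson_interval(r1b, n)
    print(f"{region}: R1b {r1b:.0%}, approx. 95% interval {lo:.0%} to {hi:.0%} (n={n})")
```

For Asturias (n = 20), for example, the interval spans roughly 30% to 70%, so differences of a few percentage points between small regional samples should be read with caution.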

==== Portugal ====

A 2015 study updated and expanded earlier work on Portuguese mitochondrial DNA by analysing 292 complete control-region sequences from continental Portugal and applying a stringent quality protocol (including double sequencing). It also examined H-specific coding-region SNPs to refine haplogroup assignments and generated complete mitogenomes for samples in haplogroups U4 and U5. Overall, the results indicate a typical Western European haplogroup profile in mainland Portugal, alongside high mitochondrial genetic diversity, with no clear evidence of internal substructure within the country.

The Atlantic modal haplotype (AMH), also known as haplotype 15, is a Y-chromosome haplotype defined by a set of Y-STR microsatellite markers and is strongly associated with haplogroup R1b. It was identified before many of the SNPs now used to distinguish R1b subclades, and is therefore frequently cited in older literature. It corresponds most closely to subclade R1b1a2a1a(1) [L11]. The AMH reaches its highest frequencies in the Iberian Peninsula, as well as in Great Britain, Ireland and western France. Within Iberia, it is estimated at around 50-60% in Portugal overall, exceeding 80% in some areas of northwestern Portugal and approaching 90% in Galicia (NW Spain); the highest values are reported among the Basques (NE Spain).

The AMH is the most common haplotype among male populations in Atlantic Europe. It is characterized by the following marker alleles:
- DYS388 12
- DYS390 24
- DYS391 11
- DYS392 13
- DYS393 13
- DYS394 14 (also known as DYS19)
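
The marker list above can be treated as a simple lookup table. The following Python sketch compares a Y-STR profile against the AMH by summing the differences in repeat counts across the six markers. The helper names (`amh_step_distance`, `is_amh_or_neighbour`), the one-step threshold and the sample profile are assumptions made for illustration and are not taken from any specific study.

```python
# Atlantic modal haplotype alleles as listed above (DYS394 is also known as DYS19).
AMH = {
    "DYS388": 12,
    "DYS390": 24,
    "DYS391": 11,
    "DYS392": 13,
    "DYS393": 13,
    "DYS394": 14,
}

def amh_step_distance(haplotype):
    """Sum of absolute repeat-count differences from the AMH over the six markers."""
    return sum(abs(haplotype[marker] - allele) for marker, allele in AMH.items())

def is_amh_or_neighbour(haplotype, max_steps=1):
    """True if the profile matches the AMH or differs by at most `max_steps` repeats."""
    return amh_step_distance(haplotype) <= max_steps

# Hypothetical sample: identical to the AMH except for one repeat fewer at DYS390.
sample = dict(AMH, DYS390=23)

print(amh_step_distance(sample))
print(is_amh_or_neighbour(sample))
```

A profile must supply all six markers for the comparison to run; a missing marker raises a `KeyError` rather than being silently ignored.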

==See also==

- Genetic history of Europe
