
Monday, November 13, 2017

Who's your (proto) daddy Western Europeans?

Considering the increasingly large amounts of paleogenomic data being released online nowadays, it's no longer practical for me to try to highlight most archaeological cultures and even genetic clusters in my Principal Component Analyses (PCA) of the ancient world. Thus, from now on, I'll be focusing attention in such PCA on the main population shifts that have led to the formation of the modern-day West Eurasian gene pool and genetic substructures, like on the PCA plot below, which includes the new Lipson et al. 2017 data (available at the Reich Lab here).

The relevant PCA datasheet is available here. By grouping several hundred ancient samples into just nine clusters, I'm attempting to highlight four key processes and resulting genetic shifts in Europe, the Near East and Central Asia:

- European forager populations mixing with genetically much more southern early farmers of Near Eastern origin, mostly during the Neolithic, bringing about the total disintegration of the Europe to Siberia Hunter-Gatherer cline

- "Old Europeans" getting overrun and largely absorbed by Y-haplogroup R1-rich Kurgan pastoralists from the Pontic-Caspian steppe during the Eneolithic and Bronze Age, leading to the formation of at least one major new cline from the Bronze Age steppe into post-Kurgan expansion Europe

- the ancient Near East "imploding" or becoming significantly more compact in terms of genetic structure, likely due to a variety of major population expansions from the Chalcolithic onwards from the eastern and western parts of the Fertile Crescent, as well as probably the Caucasus and Europe (note how the post-Neolithic western Asian cluster stretches out towards Europe)

- fully nomadic and very wide ranging pastoral and warrior cultures dominating the entire Eurasian steppe during the Iron Age, leading to the emergence of progressively more East Asian-admixed populations from west to east across the Eurasian steppe
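For anyone who wants to play with the datasheet themselves, here's a minimal sketch of the grouping step: collapsing individual samples into a handful of labelled clusters and computing each cluster's average position on the plot. The column names, sample IDs, cluster labels and coordinates are all made up for illustration, so adjust them to whatever the actual datasheet uses.

```python
import csv
import io
from collections import defaultdict

# Hypothetical datasheet excerpt: IDs, cluster labels and coordinates are made up
datasheet = """sample,cluster,PC1,PC2
S1,Cluster_A,0.10,0.02
S2,Cluster_A,0.12,0.04
S3,Cluster_B,-0.05,0.01
S4,Cluster_B,-0.07,0.03
S5,Cluster_B,-0.06,0.02
"""

def cluster_centroids(text):
    """Average the PC1/PC2 coordinates of all samples sharing a cluster label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # running PC1 sum, PC2 sum, count
    for row in csv.DictReader(io.StringIO(text)):
        acc = sums[row["cluster"]]
        acc[0] += float(row["PC1"])
        acc[1] += float(row["PC2"])
        acc[2] += 1
    return {name: (x / n, y / n) for name, (x, y, n) in sums.items()}

centroids = cluster_centroids(datasheet)
for name, (pc1, pc2) in centroids.items():
    print(f"{name}: PC1={pc1:.3f} PC2={pc2:.3f}")
```

Centroids are only a rough summary, of course. A cline like the Hunter-Gatherer one is better described by its spread than by a single point.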

An interesting outcome of the denser sampling across space and time in West Eurasia is that Y-haplogroup R1b, once so elusive in the ancient DNA record, is now popping up all over the place. The new Lipson et al. dataset, for instance, includes two R1b "Old Europeans" from Blatterhole in Germany dated to the Middle Neolithic. Below is the same PCA as above except with all of the ancients belonging to R1b marked with an X. The two Blatterhole samples are sitting in the largely empty space between the European/Siberian Hunter-Gatherer cline and most of the "Old Europe" cluster. The relevant PCA datasheet is available here.

So it may seem that we're back to square one in the long running effort to pinpoint the origin of Y-haplogroup R1b-L51, which encompasses almost 100% of modern-day Western European R1b lineages, and thus probably ranks as Europe's most common Y-haplogroup. But at this stage I'd say no, because R1b-L51 is a subclade of R1b-M269, of which the oldest sample comes from the Bronze Age steppe. In fact, as can be seen in the above PCA, this sample is sitting in exactly the right spot to be one of those pastoralists who overran "Old Europe", or at least a very close relative thereof.

Or am I wrong? Feel free to let me know in the comments.

I didn't bother creating a similar plot of ancient samples belonging to Y-haplogroup R1a, because, unlike R1b, this marker is still non-existent in samples from outside of Eastern Europe and Siberia dating to before the late Neolithic. And I doubt that this is simply due to a lack of the right ancient material. Moreover, the recent discovery of Y-haplogroup R1a-M417, which encompasses almost 100% of all modern-day R1a lineages on the planet, in a North Pontic steppe sample belonging to the Eneolithic Sredny Stog culture means that it's game over for the naysayers as far as the steppe origin of most modern-day R1a lineages is concerned (see here and here).

In other words, if you're still hoping to see R1a, and especially R1a-M417, pop up in non-steppe derived ancient individuals in, say, such far away places as South Asia, then you'll probably be waiting forever.

For the linguistic implications of all of this, see...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Update 15/11/2017: After a couple of days of messing around with the Lipson et al. dataset, I'm certain that Late Copper Age sample Protoboleraz_LCA I2788 shows significant steppe-related admixture. This is the only sample from Lipson et al. with such an obvious signal of steppe-related input that had enough data to be analyzed individually by me with PCA and D-stats.

For the time being, amongst the best proxies for this signal appear to be Yamnaya_Samara and Samara_Eneolithic. But it's likely that the real source of the admixture is yet to enter the ancient DNA record, or at least my dataset. When it does, it'll probably be an Eneolithic pastoralist population from the North Pontic steppe.

Yamnaya_Samara also gives the best statistical fit as the single source population in qpAdm (see here). It's an important result, because it suggests that steppe peoples very similar to Yamnaya were already expanding on and out of the steppe as far back as ~3500 BCE, and perhaps a few hundred years earlier.

Thursday, November 9, 2017

Descendants of Greeks in the medieval Himalayas?

Below is an abstract from the upcoming Human Evolution 2017 conference (Cambridge, UK, November 20-22). When the paper comes out, it'll be interesting to see how Harney, Patterson et al. uncovered the Greek affinities of some of these individuals: uniparental markers, rare alleles? The accompanying pic is from Wikipedia.

The skeletons of Roopkund Lake: Genomic insights into the mysterious identity of ancient Himalayan travelers

Eadaoin Harney, Niraj Rai, Nick Patterson, Kumarasamy Thangaraj, David Reich

The high-altitude lake of Roopkund, situated over 5000 meters above sea level in the Himalayas, remains frozen for almost 11 months out of the year. When it melts, it reveals the skeletons of several hundred ancient individuals, thought to have died during a massive hail storm during the 8th century, A.D. There has been a great deal of speculation about the possible identity of these individuals, but their origins remain enigmatic. We present genome-wide ancient DNA from 17 individuals from the site of Roopkund. We report that these individuals cluster genetically into two distinct groups, consistent with observed morphological variation. Using population genetic analyses, we determine that one group appears to be composed of individuals with broadly South Asian ancestry, characterized by diffuse clustering along the Indian Cline. The second group appears to be of West Eurasian related ancestry, showing affinities with both Greek and Levantine populations.

Tuesday, October 31, 2017

Genetic ancestry online store (to be updated regularly)

It's an unfortunate reality that most commercial genetic ancestry tests out there are rather lame. They're not wrong per se, but that's probably the best that can be said about them. And let's be honest, that's no longer enough considering how far this area of science has come in recent years.

To try and remedy this problem, I'll be offering a wide range of highly accurate and unique, but low cost, ancestry tests here, in my makeshift online store, based on analyses on this blog. These tests will focus on either recent or ancient ancestry, or both, using the latest reference samples from scientific literature whenever possible. To make a purchase, send your request, autosomal genotype data (from AncestryDNA, FTDNA or 23andMe) and money (via PayPal) to eurogenesblog [at] gmail [dot] com.

Let's start things rolling with my genetic and linguistic landscape of Europe north of the Alps, Balkans and Pyrenees (see here). For a mere $6 USD I will pinpoint your location on the plot below amongst a variety of modern-day and ancient individuals. You'll also receive the principal component coordinates, which you can use to model your ancestry proportions or produce heat plots (for instance, like here). Please keep in mind, however, that to ensure sensible results in this particular analysis, practically all of your ancestry has to derive from Central, Eastern and/or Northern Europe. Most of my other tests won't be so restrictive.
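To give a rough idea of how principal component coordinates can be turned into ancestry proportions, here's a minimal sketch. It assumes the simplest possible approach: treat your position on the plot as a weighted average of a few reference populations, and search for the non-negative weights (summing to 1) that land closest to you. The reference names and coordinates are made up, and the brute-force grid search is just my choice for illustration, not necessarily what any particular tool does.

```python
from itertools import product

def mix(weights, sources):
    """Weighted average of source coordinates, one value per PC."""
    dims = len(sources[0])
    return tuple(sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims))

def fit_proportions(target, sources, steps=100):
    """Grid-search non-negative weights summing to 1 that minimise
    the squared distance to the target in PC space (1% steps by default)."""
    best_w, best_err = None, float("inf")
    n = len(sources)
    for combo in product(range(steps + 1), repeat=n - 1):
        last = steps - sum(combo)
        if last < 0:
            continue  # weights would exceed 100%
        weights = tuple(c / steps for c in combo) + (last / steps,)
        est = mix(weights, sources)
        err = sum((e - t) ** 2 for e, t in zip(est, target))
        if err < best_err:
            best_w, best_err = weights, err
    return best_w, best_err

# Hypothetical PC1/PC2 coordinates, made up for illustration only
references = {
    "Ref_A": (0.050, -0.020),
    "Ref_B": (-0.030, 0.040),
    "Ref_C": (0.010, 0.060),
}
names = list(references)
sources = [references[n] for n in names]

# A synthetic test individual built as 60% Ref_A and 40% Ref_B
target = mix((0.6, 0.4, 0.0), sources)

weights, err = fit_proportions(target, sources)
result = dict(zip(names, weights))
print(result, err)
```

With only two dimensions and three references the answer is unique as long as the references aren't sitting on a straight line. In practice you'd want more dimensions than that, and keep in mind that a good fit in PC space isn't proof that the model is historically sensible.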

I'll be updating this plot regularly with many more ancient samples as they become available, but your coordinates will remain relevant as I do so.

Please note that my online store will be closed throughout December, but stay tuned next year for many more offers.

See also...

Fund-raising offer: Basal-rich K7 and/or Global 10 genetic map

Monday, October 30, 2017

On the wrong end of a steppe herder's cudgel (?)

From a new paper at the International Journal of Osteoarchaeology:

In this study, we examine trauma on human remains from the Tripolye site of Verteba Cave in western Ukraine. The remains of 36 individuals, including 25 crania, were buried in the gypsum cave as secondary interments. The frequency of cranial trauma is 30-44% among the 25 crania; six males, four females and one adult of indeterminate sex displayed cranial trauma. Of the 18 total fractures, 10 were significantly large and penetrating suggesting lethal force. Over half of the trauma is located on the posterior aspect of the crania, suggesting the victims were attacked from behind. Sixteen of the fractures observed were perimortem and two were antemortem. The distribution and characteristics of the fractures suggest that some of the Tripolye individuals buried at Verteba Cave were victims of a lethal surprise attack.


Recent paleogenomic studies have indicated that the nomadic pastoralists of the Pontic-Caspian steppe were involved in large-scale population movements at precisely this time, expanding westward farther into continental Europe (Haak et al., 2015). Such a massive population movement likely resulted in lethally violent interactions between indigenous populations and the newly arriving migrants.

Madden et al., Violence at Verteba Cave, Ukraine: New Insights into the Late Neolithic Intergroup Conflict, International Journal of Osteoarchaeology, online: 27 October 2017, DOI: 10.1002/oa.2633
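A quick note on the 30-44% figure in the abstract, which isn't explained there. My guess, and it's only a guess, is that the 11 individuals with cranial trauma (six males, four females and one adult of indeterminate sex) were divided by either all 36 individuals or just the 25 crania. The arithmetic at least fits:

```python
# Counts taken from the abstract; the choice of denominators is my assumption
affected = 6 + 4 + 1   # males + females + adult of indeterminate sex
crania = 25            # crania recovered
individuals = 36       # total individuals buried in the cave

low = affected / individuals   # trauma frequency if all individuals are counted
high = affected / crania       # trauma frequency if only crania are counted

print(f"{affected} affected: {low:.1%} to {high:.1%}")
```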

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Massive migration from the steppe is a source for Indo-European languages in Europe (Haak et al. 2015 preprint)

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Genetic and linguistic structure across space and time in Northern Europe

I feel that I need to take another run at this, and demonstrate more clearly why my new PCA, the one that I introduced in the recent Tollense Valley warrior blog post (see here), should prove very useful for analyzing both genetic and ethnolinguistic links in Northern Europe between modern-day populations and ancient samples. This is particularly true for samples from late prehistory to early history, which is when the main ethnolinguistic groups that today dominate Northern Europe formed. Judging by some of the reactions in the comments, not everyone was convinced, so let's try this again.

Below is a new version of the said PCA that focuses on several ancient individuals who, based on their archaeological contexts, should show strong genetic affinities to modern-day speakers of Celtic, Germanic and Slavic languages in Northern Europe. These are three Iron Age samples from what is now England, one Iron Age sample from what is now Sweden, and two Medieval samples from what is now Bohemia, Czech Republic, respectively. The relevant datasheet is available here.

And clearly these ancients do show the expected genetic affinities considering where they cluster relative to modern-day Northern Europeans in the two most significant dimensions of genetic variation. Moreover, despite the fact that the Anglo-Saxon and English Iron Age samples were all excavated from sites in eastern England, the Anglo-Saxons cluster between the English Iron Age individuals and the singleton Scandinavian Iron Age sample. This of course makes perfect sense, considering that the Anglo-Saxons were Germanic speakers with recent ancestry from very near to Scandinavia.

So everything seems in good order, and for now it's very difficult for me to consider that those Tollense Valley warriors who cluster alongside modern-day Slavic speakers on my PCA are not ethnolinguistically closer to them than to Celtic and Germanic speakers.

On the other hand, my standard PCA of West Eurasian genetic variation does a comparatively lousy job at matching ethnolinguistic origins with genetic structure, at least in Northern Europe. Note below, for instance, that the same Celtic and Germanic samples from England and Scandinavia form a tight cluster between the two Slavs from Bohemia. Hence, based on this PCA it would be very difficult, perhaps impossible, to correctly predict the ethnolinguistic ties of these ancients just by looking where they cluster relative to modern-day Germanics, Slavs and so on. Right click and open in a new tab to enlarge to the max.

But this is not surprising, because this PCA is based on a wider, more diverse range of populations, and so rather than being dominated by relatively recent, ethnolinguistic-specific genetic drift within Northern Europe, it's much more reflective of deeper, more basic genetic relationships across West Eurasia.

See also...

Tollense Valley Bronze Age warriors were very close relatives of modern-day Slavs

Saturday, October 28, 2017

Global distributions of lactase persistence alleles (Liebert et al. 2017)

The series of maps below is from a new paper by Liebert et al. at Human Genetics. Almost certainly, any population with a sizable level of the 13910*T allele has relatively recent (post-Mesolithic) ancestry from Europe. In that context, note the presence of 13910*T in South Asia and North Central Africa. Populations in these regions also show high frequencies of two Y-chromosome haplogroups that are present in samples from Mesolithic Eastern Europe: R1a and R1b-V88, respectively. It's hard to imagine that this is a coincidence.

Liebert, A., López, S., Jones, B.L. et al., World-wide distributions of lactase persistence alleles and the complex effects of recombination and selection, Hum Genet (2017).

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

R1b-V88: out of the Balkans and into Africa?

Thursday, October 26, 2017

Ancient Guanches genetically most similar to modern-day Berbers (Rodríguez-Varela et al. 2017)

Over at Current Biology at this LINK. Emphasis is mine:

Summary: The origins and genetic affinity of the aboriginal inhabitants of the Canary Islands, commonly known as Guanches, are poorly understood. Though radiocarbon dates on archaeological remains such as charcoal, seeds, and domestic animal bones suggest that people have inhabited the islands since the 5th century BCE [1, 2, 3], it remains unclear how many times, and by whom, the islands were first settled [4, 5]. Previously published ancient DNA analyses of uniparental genetic markers have shown that the Guanches carried common North African Y chromosome markers (E-M81, E-M78, and J-M267) and mitochondrial lineages such as U6b, in addition to common Eurasian haplogroups [6, 7, 8]. These results are in agreement with some linguistic, archaeological, and anthropological data indicating an origin from a North African Berber-like population [1, 4, 9]. However, to date there are no published Guanche autosomal genomes to help elucidate and directly test this hypothesis. To resolve this, we generated the first genome-wide sequence data and mitochondrial genomes from eleven archaeological Guanche individuals originating from Gran Canaria and Tenerife. Five of the individuals (directly radiocarbon dated to a time transect spanning the 7th–11th centuries CE) yielded sufficient autosomal genome coverage (0.21× to 3.93×) for population genomic analysis. Our results show that the Guanches were genetically similar over time and that they display the greatest genetic affinity to extant Northwest Africans, strongly supporting the hypothesis of a Berber-like origin. We also estimate that the Guanches have contributed 16%–31% autosomal ancestry to modern Canary Islanders, here represented by two individuals from Gran Canaria.

Rodríguez-Varela et al., Genomic Analyses of Pre-European Conquest Human Remains from the Canary Islands Reveal Close Affinity to Modern North Africans, Current Biology (2017).

Tollense Valley Bronze Age warriors were very close relatives of modern-day Slavs

This is strongly suggested by the Principal Component Analysis (PCA) below, which shows that many of the Tollense Valley warriors (Welzin_BA) cluster in the Slavic-specific part of the plot. The relevant datasheet is available here.

I designed this PCA with the sole purpose of using Balto-Slavic-specific genetic drift to differentiate Slavs from Germans, except of course those Germans with a lot of Slavic ancestry, who are usually from eastern Germany and Austria. I can assure you, people who don't harbor significant Slavic ancestry never cluster in this part of the plot.

The only other ancient samples that cluster in the Slavic zone are, as expected, an early Slav from Bohemia and, interestingly, a Bronze Age individual from what is now Hungary. But we've already seen strong genetic, and indeed genealogical, links between another Hungarian Bronze Age genome and present-day Slavs (see figure 3 here).

So what's going on? Did the proto-Slavs come into existence during the Bronze Age, as opposed to the more generally accepted early Medieval Period? And did they expand from what is now Hungary? Or did they migrate there from the Baltic region? Thanks to Matt in the comments for the table below.

See also...

Tollense Valley Bronze Age battle: preliminary ancient DNA analysis

Genetic and linguistic structure across space and time in Northern Europe

Sunday, October 22, 2017

Tollense Valley Bronze Age battle: preliminary ancient DNA analysis

This dissertation, I'm guessing, is a prelude to a paper on the genetic origins of the victims of what was probably a large scale Bronze Age battle in the Tollense Valley, northern Germany:

Addressing challenges of ancient DNA sequence data obtained with next generation methods.

I blogged about the Tollense Valley project last year, following a Science feature which posited that the battle fallen may have come from very different parts of Europe (see here). But judging by the results in this thesis, that might not be the case after all. Emphasis is mine:

The 21 samples available to this study stem from skeletal remains found in the Tollense valley in north eastern Germany and date to the bronze age (ca. 3200 BP), except for sample WEZ16, which dates to the neolithic (ca. 5000 BP) and was found in a burial context. Although several samples from the Welzin site have been dated using the C 14 method, from the samples used for this study only the neolithic WEZ16 (2960BC ±66) and the Bronze Age sample WEZ15 (1007BC ±102) were radiocarbon dated. All individuals except WEZ16 were found in a non burial context, widely dispersed and dis-articulated [48] along the river bank of the Tollense river.


The PCA in Figure 4.24 shows modern Eurasian individuals in grey and ancient individuals in colour according to their assigned population (for details on the modern populations see Figure A.48). The majority of Welzin individuals fall within the variation of modern populations from the northern central part of Europe (compare Figure A.48), with hunter gatherers, the Yamnaya and the LBK populations appearing on the outer range of PC1 and PC2.


Outliers from the Welzin cluster are: WEZ16, which falls closer to the Sardinians and neolithic LBK along PC2, WEZ54, which clusters with the Basques and also fall closer to LBK individuals along PC2, WEZ57, which falls in between the former individual and the Welzin cluster, and WEZ56, which separates from the main cluster of Welzin individuals along PC2 in the opposite direction as the former three, towards the Corded Ware or Yamnaya.


The ancient population that share the most drift with the Welzin group are WHG and the SHG population followed by the Unetice, the Bell Beaker and the Corded Ware. Starting with the Unetice the following f3 values fall in the range of the standard error of each other. The average difference between two consecutive f3 values is 0.0021 ± 0.0024 and the average standard error in each f3 value is 0.0037 ± 0.0007. The most similar modern populations are the Polish, Austrians and the Scottish.


Any interpretation regarding possible parties that might have been involved in the conflict in the Tollense valley ∼ 3200 ago can only be speculative with regards to the here shown data. With the resolution given here, an educated guess for different involved parties could be, that both parties were relatively local and more closely related than any ancient DNA study was able to separate so far. Maybe similar to people from Hessen versus people from Rhineland-Palatinate in modern Germany.

Sell, Christian, Addressing challenges of ancient DNA sequence data obtained with next generation methods, Mainz : Univ. iii, 109 Seiten, 2017, Urn:nbn:de:hebis:77-diss-1000012793
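The point in the thesis about f3 values falling within each other's standard error is worth spelling out, because it means the ranking of the top populations isn't really resolved. Here's a minimal sketch of that kind of check. The population names and numbers are made up (the thesis only gives averages), and combining the two standard errors as if the estimates were independent is a simplifying assumption on my part.

```python
# Hypothetical outgroup-f3 results: (population, f3, standard error); made-up numbers
results = [
    ("Pop_1", 0.2300, 0.0036),
    ("Pop_2", 0.2281, 0.0038),
    ("Pop_3", 0.2259, 0.0037),
    ("Pop_4", 0.2100, 0.0035),
]

def unresolved_pairs(ranked):
    """Return consecutive pairs in the ranking whose f3 difference is no
    larger than their combined standard error (independence assumed)."""
    ranked = sorted(ranked, key=lambda r: r[1], reverse=True)
    pairs = []
    for (p1, f1, se1), (p2, f2, se2) in zip(ranked, ranked[1:]):
        combined_se = (se1 ** 2 + se2 ** 2) ** 0.5
        if abs(f1 - f2) <= combined_se:
            pairs.append((p1, p2))
    return pairs

pairs = unresolved_pairs(results)
print(pairs)
```

Any pair that comes back from this can't be confidently ordered, so a claim like "most similar to X, then Y" shouldn't be pushed too hard for those populations.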

See also...

Tollense Valley Bronze Age warriors were very close relatives of modern-day Slavs

Saturday, October 21, 2017

Hilariously wrong

From a recent paper at Forensic Science International:

The most commonly found haplogroups [among Lithuanians] are R1a and N, hence it can be argued that Lithuanians originate from Pakistan/Northwest India and East China/Taiwan.

Jankauskiene et al., Population data and forensic genetic evaluation with the Yfiler™ Plus PCR Amplification kit in the Lithuanian population, Forensic Science International, DOI:

For a reality check see here...

R1a: The beast among Y-haplogroups