

Article

Ancient genomes reveal origin and rapid trans-Eurasian migration of 7th century Avar elites

Graphical abstract

Authors
Guido Alberto Gnecchi-Ruscone,
Anna Szécsényi-Nagy, István Koncz, ...,
Zuzana Hofmanová, Choongwon Jeong,
Johannes Krause

Correspondence
guido_gnecchi@eva.mpg.de (G.A.G.-R.),
cwjeong@snu.ac.kr (C.J.),
krause@eva.mpg.de (J.K.)

In brief
The Avars were a mysterious population
that settled the Carpathian Basin in 567/
68 CE, and their origins have remained
enigmatic. Genomic analyses of 66 pre-
Avar and Avar-period individuals,
integrated with archaeological and
historical data, suggest that Avar elites
underwent a long-distance, trans-
Eurasian migration from the East Asian
steppe.

Highlights
• Long-distance and rapid trans-Eurasian migration during the 7th century Avar period
• Striking genetic similarity between early Avar elites and the Rouran in Mongolia
• Substantial genetic variation mirroring social and micro-geographic structure
• High eastern Eurasian ancestry maintained in the Avar period elites for 200 years

Gnecchi-Ruscone et al., 2022, Cell 185, 1–12


April 14, 2022 © 2022 The Authors. Published by Elsevier Inc.
https://doi.org/10.1016/j.cell.2022.03.007
Please cite this article in press as: Gnecchi-Ruscone et al., Ancient genomes reveal origin and rapid trans-Eurasian migration of 7th century
Avar elites, Cell (2022), https://doi.org/10.1016/j.cell.2022.03.007

Ancient genomes reveal origin
and rapid trans-Eurasian migration
of 7th century Avar elites
Guido Alberto Gnecchi-Ruscone,1,32,* Anna Szécsényi-Nagy,2,32 István Koncz,3 Gergely Csiky,4 Zsófia Rácz,3
A.B. Rohrlach,1,5 Guido Brandt,6 Nadin Rohland,7,8 Veronika Csáky,2 Olivia Cheronet,9 Bea Szeifert,2 Tibor Ákos Rácz,10
András Benedek,11 Zsolt Bernert,12 Norbert Berta,13 Szabolcs Czifra,12 János Dani,14 Zoltán Farkas,13 Tamara Hága,14
Tamás Hajdu,15 Mónika Jászberényi,10 Viktória Kisjuhász,16 Barbara Kolozsi,14 Péter Major,13 Antónia Marcsik,17


1Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany
2Institute of Archaeogenomics, Research Centre for the Humanities, Eötvös Loránd Research Network, 1097 Budapest, Hungary
3Institute of Archaeological Sciences, ELTE Eötvös Loránd University, 1088 Budapest, Hungary
4Institute of Archaeology, Research Centre for the Humanities, Eötvös Loránd Research Network, 1097 Budapest, Hungary
5ARC Centre of Excellence for Mathematical and Statistical Frontiers, School of Mathematical Sciences, The University of Adelaide, Adelaide, SA 5005, Australia
6Max Planck Institute for the Science of Human History, 07745 Jena, Germany
7Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
8Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
9Department of Evolutionary Anthropology, University of Vienna, 1030 Vienna, Austria
10Ferenczy Museum Center, 2000 Szentendre, Hungary
11Móra Ferenc Museum, 6720 Szeged, Hungary
12Hungarian National Museum, 1113 Budapest, Hungary
13Salisbury Ltd., 1016 Budapest, Hungary
14Déri Museum, 4026 Debrecen, Hungary
15Dept. of Biological Anthropology, Eötvös Loránd University (ELTE), 1117 Budapest, Hungary
16Aquincum Museum and Archaeological Park, 1031 Budapest, Hungary
17Dept. of Biological Anthropology, Szeged University, 6701 Szeged, Hungary
18Katona József Museum, 6000 Kecskemét, Hungary
19Department of Art History, Istanbul Medeniyet University, 34720 Istanbul, Turkey
20Research Centre for the Humanities, Eötvös Loránd Research Network, 1097 Budapest, Hungary
21Wosinsky Mór Museum, 7100 Szekszárd, Hungary


SUMMARY

The Avars settled the Carpathian Basin in 567/68 CE, establishing an empire lasting over 200 years. Who they
were and where they came from is highly debated. Contemporaries have disagreed about whether they were,
as they claimed, the direct successors of the Mongolian Steppe Rouran empire that was destroyed by the
Turks in 550 CE. Here, we analyze new genome-wide data from 66 pre-Avar and Avar-period Carpathian
Basin individuals, including the 8 richest Avar-period burials and further elite sites from the Avar empire’s core
region. Our results provide support for a rapid long-distance trans-Eurasian migration of Avar-period elites.
These individuals carried Northeast Asian ancestry matching the profile of preceding Mongolian Steppe pop-
ulations, particularly a genome available from the Rouran period. Some of the later elite individuals carried an
additional non-local ancestry component broadly matching the steppe, which could point to a later migration
or reflect greater genetic diversity within the initial migrant population.

INTRODUCTION

Long-distance migration to Europe is rarely reported in historical sources. The Avars, who arrived in the Carpathian Basin from the Central Eurasian steppes in 567–568 CE, are an iconic exception. Their empire, ruled by a khagan, dominated eastern Central Europe for over 200 years, until it was overcome by the Franks around 800 CE (Curta, 2021; Daim, 1992; Pohl, 2018; Szádeczky-Kardoss,

This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Bernadett Ny. Kovacsóczy,18 Csilla Balogh,19 Gabriella M. Lezsák,20 János Gábor Ódor,21 Márta Szelekovszky,14
Tamás Szeniczey,15 Judit Tárnoki,22 Zoltán Tóth,23 Eszter K. Tutkovics,24 Balázs G. Mende,2 Patrick Geary,25
Walter Pohl,26,27 Tivadar Vida,3 Ron Pinhasi,9 David Reich,7,8,28,29 Zuzana Hofmanová,1,30 Choongwon Jeong,31,* and
Johannes Krause1,33,*
22Damjanich Museum, 5000 Szolnok, Hungary
23Dobó István Museum, 3300 Eger, Hungary
24Rétközi Múzeum, 4600 Kisvárda, Hungary
25Institute for Advanced Study, Princeton, NJ 08540, USA
26Institute for Medieval Research, Austrian Academy of Sciences, 1020 Vienna, Austria
27Institute of Austrian Historical Research, University of Vienna, 1010 Vienna, Austria
28Department of Human Evolutionary Biology, Cambridge, MA 02138, USA
29Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA
30Department of Archaeology and Museology, Faculty of Arts, Masaryk University, 60200 Brno, Czech Republic
31School of Biological Sciences, Seoul National University, 08826 Seoul, Republic of Korea
32These authors contributed equally
33Lead contact

*Correspondence: guido_gnecchi@eva.mpg.de (G.A.G.-R.), cwjeong@snu.ac.kr (C.J.), krause@eva.mpg.de (J.K.)


https://doi.org/10.1016/j.cell.2022.03.007

1990). The Byzantine texts agree that their move had been triggered by the rise of the first Turkic khaganate in the 550s, centered in what is now Mongolia, when Turks destroyed an empire called Rouran by its Chinese neighbors (Kradin, 2005). However, the texts do not agree on who these Avars were, or where exactly they came from. In fact, the Turks claimed that they were only Pseudo-Avars who had appropriated the prestigious name Avars and the lofty title of khagan but were in reality Ogurs, a Turkic-speaking people in western Central Eurasia. While we can conclude that the Rouran most likely called themselves Avars, to what extent the European Avars were descended from them has been debated (Dobrovits, 2003; Pohl, 2018). Here, we present genomic data that provide a new basis to reconstruct the early medieval long-distance movements of steppe peoples (Alves et al., 2016; Damgaard et al., 2018; Gnecchi-Ruscone et al., 2021) and an opportunity to integrate genetic, historical, and archaeological evidence (Savelyev and Jeong, 2020).

Before the Avars arrived, the Romans had occupied the western part of the Carpathian Basin and the Sarmatians the eastern part (c. 1–400 CE). The Romans were replaced by the short-lived empire of the Huns (400–455 CE), and by diverse Germanic-speaking groups: Goths and Longobards in Pannonia, Gepids along the Tisza (400 to c. 568). In 567/68, the Longobards destroyed the Gepid kingdom and moved to Italy, while the Avars conquered the Carpathian Basin and its local population (Pohl, 2018). This study focuses on this momentous change and its genetic impact.

Previous studies utilizing uniparental markers (Csáky et al., 2020; Neparáczki et al., 2019) have provided suggestive genetic evidence, but genome-wide data for reconstruction of the origins of the Avar-period population are missing. We use nuclear DNA to gain insights into the following questions: (1) can the origin of a core Avar group from eastern Central Asia be confirmed from their genomic profile? (2) Were the elites of the newly arrived steppe warriors genetically homogeneous or did they have mixed ancestries? (3) How do the elites relate to the preceding local population?

The rich archaeological material of the Avar period in the Carpathian Basin (late 6th c.–early 9th c. CE) consists of c. 600 settlements and c. 100,000 excavated burials, which show great heterogeneity, especially in the early phases (Bálint, 2019; Daim, 2003; Vida, 2016). We generated genome-wide data for 66 ancient individuals, covering both the pre-Avar (17 in a Sarmatian and one in a Hunnic context, 4th–5th c. CE) and Avar periods (N = 48, 7th–8th century CE) (Figure 1). The latter originate from three distinct regions. First, 25 individuals were collected from the Danube-Tisza Interfluve (DTI). Of these, 8 were high-status male burials (Bócsa-Kunbábony group) dated to the second third of 7th c. CE, after the failed siege of Constantinople in 626 CE (Hurbanič, 2019), containing exquisite gold- and silver-decorated weapons and belts, various insignia, and other prestige objects with Inner Asian parallels (Bálint, 2019; Vida, 2016). The remaining 17 individuals were taken from contexts with direct contact to this elite group (Data S1).

We also considered a group of burials mostly found east of the Tisza river (Transtisza), an assumed secondary power center, which share numerous cultural similarities with the 6th–7th CE population of the eastern European steppes. These are characterized by solitary burials or small burial grounds, the deposition of partial or complete animals, and ledge or end niche graves. Twelve of the 17 samples from the region belong to this “Transtisza group” (Gulyás, 2016; Lőrinczy, 2017). The remaining 6 Avar-period individuals were collected from neighboring regions in the Carpathian Basin and reflect the heterogeneity of archaeological material and the burial practices during the Avar period, including two richly furnished elite burials from Transdanubia (Kölked), where the archaeological record indicates a variety of local groups showing strong connections to the Mediterranean as well as the Merovingian world in western Europe (Vida, 2018).

RESULTS

Ancient DNA dataset and quality control
After applying an in-solution enrichment protocol for 1.24 million informative single-nucleotide polymorphisms (SNPs) (Fu et al., 2013a; Mathieson et al., 2015), we sequenced the enriched libraries to a median 2.9× coverage for the “1,240K” target sites, covering 13,749–1,119,583 target SNPs at least once (median


Figure 1. Geographic and temporal locations of ancient individuals in this study


(A) A map of Eurasia with geographic coordinates of the ancient individuals analyzed in this study marked by color-filled shapes. Dark yellow shades mark the
steppe ecoregion. Newly produced genomes from the Carpathian Basin pre-Avar period are highlighted with white outlined symbols.
(B) A zoom-in map of the Carpathian Basin showing geographic coordinates of the newly analyzed ancient samples from the Avar period with symbols referring to
specific archaeological and social categories and colored according to the regions, as defined in the bottom left and right legend respectively.
(C) A rough timeline of Carpathian Basin and Mongolia from 200 BC to 950 AD.
See also Table S1.


Figure 2. Principal component analyses


(A) Pre-Avar Eurasian PCA (top) and west Eurasian PCA (bottom). New data are highlighted with white outlined and filled symbols.
(B) Eurasian PCA of newly produced Avar period individuals. Colors refer to key regions within the Carpathian Basin. Filled symbols are individuals retrieved from
elite contexts. Specific archaeological categories discussed in the text are shown with different symbol shapes.
See also Figure S1A.
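
As a rough illustration of why admixed individuals spread along a cline in a PCA such as the one in Figure 2, the following minimal sketch projects simulated genotypes onto principal components. Everything in it is a toy assumption: the allele frequencies, group sizes, the 50/50 mixture, and the use of diploid genotypes (the study itself works with pseudohaploid calls and projects ancient samples onto present-day variation). It is not the authors' pipeline.

```python
import numpy as np


def pca_scores(genotypes: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project individuals (rows) onto the top principal components."""
    centered = genotypes - genotypes.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
n_snps = 2000

# Hypothetical allele frequencies for two diverged source populations.
freq_west = rng.uniform(0.05, 0.95, n_snps)
freq_east = np.clip(freq_west + rng.normal(0.0, 0.25, n_snps), 0.01, 0.99)
# A hypothetical group carrying equal ancestry from both sources.
freq_mixed = 0.5 * freq_west + 0.5 * freq_east


def sample(freq: np.ndarray, n: int) -> np.ndarray:
    """Draw diploid genotypes (0, 1 or 2 alternate alleles) for n individuals."""
    return rng.binomial(2, freq, size=(n, freq.size)).astype(float)


genotypes = np.vstack(
    [sample(freq_west, 20), sample(freq_east, 20), sample(freq_mixed, 10)]
)
scores = pca_scores(genotypes)

pc1_west = scores[:20, 0].mean()
pc1_east = scores[20:40, 0].mean()
pc1_mixed = scores[40:, 0].mean()
print(f"PC1 means: west={pc1_west:.2f}, mixed={pc1_mixed:.2f}, east={pc1_east:.2f}")
```

In this toy setting the mixed group falls between the two sources on PC1, which is the qualitative pattern the text describes for individuals spread along the west-to-east Eurasian cline.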

887,064 SNPs). We find low mitochondrial contamination estimates (1%) for 61 of 66 individuals (one 3% and four unestimated) and low nuclear contamination estimates (<1%) for 36 of 38 males (Table S1). The remaining individuals do not have sufficient mitochondrial or X chromosome coverage to properly estimate contamination. We carry all 66 individuals into the downstream analysis considering that individuals with no contamination estimate still show similar genetic profiles as others with low contamination (including also a first-degree relative pair) and that all individuals have clear ancient DNA damage patterns and expected sex chromosomes coverage ratios (STAR Methods; Table S1). To address specific questions detailed below we imputed the missing position of the 1,240K panel in order to obtain diploid genotype calls that we could statistically phase and perform haplotype-based local ancestry analyses following a thorough evaluation of the success of the imputation and exclusion of poorly imputed samples (i.e., samples with average genome-wide coverage <1.43×; STAR Methods; Figure S2A). To identify if more than just mean coverage was an important parameter when considering which samples we could include for imputation, we modeled the change in pairwise mismatch rate (pMMR) using mean coverage and contamination as predictor variables (see STAR Methods). We found that while mean coverage was a significant predictor, contamination was not. Hence, spurious signals of ancestry induced by minor levels of contamination did not significantly affect measures of pMMR before and after imputation, and we expect that the very minor levels of damage found between the two terminal bases of each read will also have no significant effect.

The genomic structure of the pre-Avar-period population
Principal component analysis (PCA) reveals markedly different genetic profiles between pre-Avar- and Avar-period individuals from the Carpathian Basin (Figures 2A and 2B, respectively). With the exception of the only two Hunnic-period genomes available (Hun_P_Budapest_5c, Hun_P_NTransdanubia_5c),


one newly generated and one previously published (Gnecchi-Ruscone et al., 2021), all pre-Avar individuals from the Carpathian Basin fall in the genetic variability of west Eurasians. They mostly overlap present-day central and eastern Europeans, although a few individuals align with southern Europeans, especially a previously described Longobard-period group (Szolad_south_6c) that overlaps present-day southern Italians and Greeks (Figures 2A and S1; Amorim et al., 2018). Comparing within the Carpathian Basin’s available ancient populations (hereafter referred to as “local” populations), the 16 newly analyzed 4th–5th century CE late Sarmatian/Hunnic-period individuals fall close to but deviate from the 8th–4th century BCE individuals of the region (Carpathian_Basin_IA) and are closer to, but only partially overlap with, the 6th century CE Longobard-period ones (Amorim et al., 2018). Among the late Sarmatian-period individuals, the ones from Transtisza (LS_P_Transtisza_4-5c) overlap with the Szolad_others_6c group, while the ones from DTI (LS_P_DTI_4-5c) are still closest to LS_P_Transtisza_4-5c individuals but shifted toward Hun_P_NTransdanubia_5c and the Iron Age groups from the steppes (IA_PonticSteppe_4cBCE and IA_SouthernUrals_5cBCE) (Damgaard et al., 2018; Gnecchi-Ruscone et al., 2021; Krzewińska et al., 2018; Unterländer et al., 2017). These observations on PCA are supported and confirmed by the qpWave/qpAdm modeling of LS_P_Transtisza_4-5c and LS_P_DTI_4-5c (Figure S5C).

The genomic structure of the Avar-period population
Contrary to the preceding periods, the Avar-period individuals show considerable genetic variability as the sampled individuals are spread along the entire cline from West Eurasian to Northeast Asian populations (Figure 2B). Despite this overall heterogeneity, there are clear patterns of genetic substructure corresponding both with Carpathian Basin’s geography and social-archaeological categories (Figure 2B). All of the individuals from the DTI elite sites have Northeast Asian ancestry profiles falling along a genetic cline of present-day populations from the Altai to Mongolia and to the Amur River Basin (Figure 2B). They also broadly fall within the variability of ancient individuals associated with some of the main late Iron Age to early Medieval eastern steppe archaeological cultural horizons (Figures 2A and 2B). Among them, a wider group consists of 3rd c. BCE–1st c. CE Xiongnu period individuals from the Mongolian plateau (N = 46, Xiongnu_1c) that are overall highly heterogeneous and have been previously grouped into three clusters, based on their genetic profiles (Jeong et al., 2020). Other individuals include the 1st–3rd c. CE Xianbei period individuals from the Amur River Basin (N = 3, AR_Xianbei_P_2c) (Ning et al., 2020) and the Altai (N = 6, Berel_4c) (Gnecchi-Ruscone et al., 2021); two Hun-period individuals, one individual from the early 5th century Carpathian Basin, one individual from the 4th century Kazakh steppe (Hun_P_Budapest_5c and Hun_P_KazakhSteppe_4c) (Gnecchi-Ruscone et al., 2021; Nagy, 2010) and one 6th c. CE Rouran-period individual from present-day Mongolia (Rouran_P_6c) (Li et al., 2018). All of these individuals, albeit variably mixed with other sources, have been shown to trace their eastern Eurasian ancestry component to a genetic profile referred to as the “ancient northeast Asians” (ANA) (Jeong et al., 2019).

The genetic analysis does not use cultural assignments or chronological information, but nonetheless clusters Avar-period DTI elite individuals according to the Avar chronology (early, middle, and late). All of the early-Avar-period individuals (DTI_early_elite), except for an infant and a burial with typical characteristics of the Transtisza group (Figure 2B), form a tight cluster with a high level of ANA ancestry. They are located between present-day Mongolic- (e.g., Buryats and Khamnigans) and Tungusic/Nivkh-speaking populations (e.g., Negidals, Nanai, Ulchi, and Nivkhs) together with the only available ancient genome from the Rouran-period Mongolia and are close to the three AR_Xianbei_P_2c individuals in the PCA (Figures 2A and 2B). Three out of six middle Avar-period individuals (DTI_middle_elite) fall within the DTI_early_elite cluster, two (DTI_middle_elite_o) slightly depart from it and the last one (a child burial) clusters in an intermediate position along the Eurasian PC1. The late Avar-period individuals (DTI_late_elite) are distinct from the earlier ones as they are all shifted toward west Eurasia in PCA (Figure 2B).

Modeling the eastern steppe ancestry of the elites in the core of the Avar empire
Genetic ancestry modeling performed with the qpWave/qpAdm framework confirmed that the DTI_early_elite individuals and the three DTI_middle_elite can be modeled as carrying 88%–98% of their ancestry from an ANA-related gene pool (90% on average) while the DTI_late_elite individuals carry between 70% and 80% of such an ancestry source (Figures 3 and S1). We chose AR_Xianbei_P_2c from the Mogushan archaeological site (n = 3), contextually dated to 50–250 CE, as a proxy for ANA due to its relatively high coverage, its being closer in time and closer genetically to the Avar-period individuals with the easternmost ancestry pattern, and because of a historical connection between the Xianbei and Rouran Empires (Golden, 2013). Furthermore, local ancestry analyses performed on the imputed-phased data revealed that when masking the genomic regions of western ancestry, all of the Avar-period elite individuals cluster tightly together close to the position of AR_Xianbei_P_2c individuals (Figure S2C). Nevertheless, replacing AR_Xianbei_P_2c with other ancient eastern Eurasian populations that carry high proportions of ANA ancestry also yields fitting models with no qualitative difference (Table S2). Among them, Rouran_P_6c yields fitting models as a single source of ancestry for many of the DTI early/middle elites, although these models have low statistical power due to low coverage of the Rouran_P_6c individual (Table S2). In turn, Rouran_P_6c can be modeled with the same two-way sources at comparable proportions as the DTI early elite individuals (Figure 3).

The remaining ancestry, 10% on average for the DTI early and middle elite and 20%–30% in DTI late elite individuals, comes from a source carrying higher western Eurasian ancestry (Figures 3 and S1; Table S2). With the exception of two DTI late individuals, a broad range of ancient populations equally provide working models, both from the Pontic steppes and 4th–6th century Carpathian Basin groups (Table S2). We replicated the qpWave/qpAdm modeling grouping the individuals based on chronology to enhance statistical power (with the exclusion of


Figure 3. Ancestry deconvolution performed with qpWave/qpAdm


Representative two- or three-way admixture models for the Avar period individuals. Early Avar-period on the left and middle-late Avar period on the right. The
figure reports the overall best models resulting from the evaluation of all the individual-based, group-based and local ancestry-based qpAdm models following
the rationale detailed in the STAR Methods section. Three possible sources of ancestry are tested, an eastern Asian steppe source (AR_Xianbei_P_2c is used for
all models reported in the figure), a Carpathian Basin source (blue side, represented by either one among the three Longobard-period Szólád groups
or the Sarmatian-period group), and a Pontic-to-Kazakh steppe source (green side, represented by either one of the Iron Age groups from the steppe or the
North_Caucasus_7c). Figures S1, S4, and S5 show the specific sources within these geographic categories that provided fitting models for each of the tested individuals.
See also Figures S1, S4, and S5 and Tables S2 and S4.
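
To make the notion of a two-way admixture proportion concrete, here is a minimal sketch that estimates a mixing weight by least squares on allele frequencies. This is explicitly not qpWave/qpAdm, which operates on f4-statistics against outgroup sets and assesses fit with a block jackknife; the frequencies, the noise level, and the function name `two_way_proportion` are all hypothetical, and the true weight of 0.9 is chosen only to echo the roughly 90% ANA-related ancestry reported for the early elites.

```python
import numpy as np


def two_way_proportion(
    target: np.ndarray, source_a: np.ndarray, source_b: np.ndarray
) -> float:
    """Least-squares alpha in: target ~ alpha * source_a + (1 - alpha) * source_b."""
    diff = source_a - source_b
    alpha = np.dot(target - source_b, diff) / np.dot(diff, diff)
    return float(np.clip(alpha, 0.0, 1.0))  # keep the weight a valid proportion


rng = np.random.default_rng(1)
n_snps = 5000

# Hypothetical allele frequencies standing in for an eastern and a western source.
freq_east = rng.uniform(0.05, 0.95, n_snps)
freq_west = rng.uniform(0.05, 0.95, n_snps)

true_alpha = 0.9  # assumed share of eastern ancestry in the toy target
freq_target = true_alpha * freq_east + (1.0 - true_alpha) * freq_west
# Add small sampling noise so the estimate is not trivially exact.
noisy_target = freq_target + rng.normal(0.0, 0.02, n_snps)

alpha_hat = two_way_proportion(noisy_target, freq_east, freq_west)
print(f"estimated eastern proportion: {alpha_hat:.3f}")
```

The sketch recovers the assumed weight closely because the two toy sources are well differentiated; with closely related candidate sources, as with the western proxies discussed in the text, such an estimate would be far less able to tell the sources apart.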

the outliers and one from each pair of related individuals). DTI_early_elite provides a good fit only with the Iron Age IA_Chandman_3cBCE from western Mongolia and a marginal non-fit with IA_SouthernUrals_5cBCE (χ² p = 0.07 and 0.049, respectively; Table S3). DTI_late_elite provides working models only with western proxies closer in time (1st millennium CE), but the geographic source could not be fully resolved since local groups as well as a north Caucasus group from the 7th century (North_Caucasus_7c) and a southern Kazakhstan group from the 4th century (Konyr_Tobe_4c) provide fitting models (Table S3). To further investigate the issue of local versus steppe/Caucasus origin of the western source in DTI_late_elite, we applied three-way competing models (STAR Methods), contrasting pairwise combinations of local versus non-local sources (Figure S1D). These revealed that for most individuals (DTI_late_elite1) except two (DTI_late_elite2) a non-local source was preferred over a local one, for which North_Caucasus_7c or present-day North Ossetians provided the best fits. Group-based two-way models on these two subgroups confirmed this pattern (Tables 1 and S3), thus revealing higher resolution to distinguish between western sources with a two-way model. We also repeated qpAdm/qpWave ancestry modeling on the imputed-phased data after performing local ancestry analysis and masking the eastern Eurasian genomic regions (i.e., analyzing only the western ancestry of the tested individuals; STAR Methods and Figures S2B and S2C). Both individual and group-based analyses were consistent with the findings obtained on the pseudohaploid data (Figure S2B). A caveat regarding the non-local source identified, however, is that no data from the core area of 4th–6th c. CE Hunnic realm north of Persia are yet available to test for more-specific alternative geographic sources.

The inferred dates of admixture in the DTI elites corroborate the distinction between the early and late elite individuals: the early period elites, as well as the three middle period elite individuals, present older admixture dates compatible with the date estimated for the Rouran-period individual, while the late elites and the two middle elite outliers show substantially more recent admixture dates (Figures 4 and S3). Per-individual admixture date estimates for the early/middle elites range from the 4th century BCE to the 3rd century CE and the group-based one falls in between, around the 1st century BCE (Figures 4 and S3). By contrast, the late elite individuals show more recent admixture date estimates that fall within the Rouran or early European Avar periods, and more precisely toward the end of the Rouran period in the group-based analysis (Figures 4 and S3).

Finally, even if we found a considerable number of closely related individuals among the DTI elites within the same sites and within both early and late periods (five 1st and one 2nd degree pairs, including a trio; Data S1), they show no signs of recent inbreeding (except for one individual) (Figure S4). Nevertheless, the East Central Asian paternal lineage N-F4205 that was


Table 1. p values of the group-based qpWave/qpAdm models of the two DTI_late_elite groups

Second sources (the first is DTI_early_elite) of two-way qpAdm models with OG1

Targets            North_Caucasus_7c   Szolad_north_6c   LS_P_Transtisza_4-5c   LS_P_DTI_4-5c
DTI_late_elite1    0.66 (a)            5.5 × 10^-8       0.0004                 0.0007
DTI_late_elite2    0.008               0.701 (a)         0.89 (a)               0.57 (a)

DTI_late_elite1 (A1809, A1813, A1814, A1815, I18222) include individuals with predominantly non-local signal while DTI_late_elite2 (A1810, I18225)
include individuals with predominantly local signal from the three-way models in Figure S1D and Table S3.
(a) p values > 0.05
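
The reading of Table 1 can be reproduced with a short sketch that applies the footnoted p > 0.05 criterion to the tabulated values. The dictionary layout and the function name `fitting_sources` are illustrative choices, and the Szolad_north_6c entry for DTI_late_elite1 is taken to be 5.5 × 10^-8, which is an assumption about an extraction-damaged cell (the outcome does not depend on the exact exponent).

```python
# p values transcribed from Table 1 (two-way qpAdm models, first source DTI_early_elite).
P_VALUES = {
    "DTI_late_elite1": {
        "North_Caucasus_7c": 0.66,
        "Szolad_north_6c": 5.5e-8,  # assumed reading of a garbled cell
        "LS_P_Transtisza_4-5c": 0.0004,
        "LS_P_DTI_4-5c": 0.0007,
    },
    "DTI_late_elite2": {
        "North_Caucasus_7c": 0.008,
        "Szolad_north_6c": 0.701,
        "LS_P_Transtisza_4-5c": 0.89,
        "LS_P_DTI_4-5c": 0.57,
    },
}

THRESHOLD = 0.05  # models with p above this value are not rejected


def fitting_sources(p_values: dict, threshold: float = THRESHOLD) -> dict:
    """Return, per target group, the second sources whose model is not rejected."""
    return {
        target: sorted(src for src, p in row.items() if p > threshold)
        for target, row in p_values.items()
    }


fits = fitting_sources(P_VALUES)
for target, sources in fits.items():
    print(target, "->", ", ".join(sources))
```

Only the non-local North_Caucasus_7c source passes for DTI_late_elite1, whereas only the three local Carpathian Basin sources pass for DTI_late_elite2, matching the distinction drawn in the table note.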

detected previously as typical for the elite males buried in this region (Csáky et al., 2020) shows continuity throughout the middle and late Avar periods (Table S1). All twenty Avar-period males from the DTI carried the N1a1a1a1a (N-F4218) lineage, and all but one could be assigned to the N-F4205 sub-branch, typical for present-day Mongolian and Transbaikalian populations (Ilumäe et al., 2016). The mtDNA diversity has instead been shown to be remarkably higher (Csáky et al., 2020). This, coupled with the absence of genomic signs of inbreeding or of evidently reduced population sizes (Figure S4), could suggest patrilocality/patrilineality practices as well as higher female exogamy that would have prevented inbreeding.

The heterogeneous ancestry in the regions surrounding the Avar empire’s core
The remaining 23 individuals come from the neighboring regions surrounding the DTI. Compared with the elites of the Avar empire’s core region, they are widely spread in PCA space along a west-to-east ancestry cline, from the local preceding gene pools (represented by Sarmatian- and Longobard-period individuals) to the genetically easternmost DTI_early_elite cluster (Figures 2B and S1). Nevertheless, only individuals retrieved from Transtisza group burials carry >50% of ANA-related ancestry (7 out of 13 Transtisza group individuals; Figures 3 and S5A; Table S4). All except one were sampled from the two elite sites (Figure S5A). The remaining Transtisza group individuals show little to no detectable ANA-related admixture (Figures 3 and S5A). However, two Transtisza group individuals with no ANA ancestry show a higher affinity toward Mediterranean populations, clustering with present-day Sicilians and Maltese, and plot at the very end of the Szolad_south_6c genetic cluster (Figure S1). For this reason, they cannot be fully modeled with any of the sources used in our study, although the Szolad_south_6c provides the models with best fit as a single source when the different local preceding groups are contrasted in competitive models (Figure S5B). The genetic heterogeneity of Transtisza group individuals is remarkable because this population has mostly been ascribed to immigrants from the Pontic steppes (Gulyás, 2016; Lőrinczy, 2017). It is likely due to varying degrees of admixture at different time points in the past (Figures 4 and S3): the two individuals with the highest ANA ancestry, indistinguishable from the DTI early elites, show similarly old admixture dates, while the rest with more western Eurasian ancestry have younger and more variable admixture dates ranging from the 1st century CE to just a few generations before (Figures 4 and S3). This suggests that despite a common cultural practice, individuals with very different genetic backgrounds were inhumated in these types of cemeteries. The last two elite-related individuals included in our dataset are the ones found at the early Avar-period site of Kölked in south Transdanubia. They show two very different genetic profiles as the female (A1825) falls on top of the preceding local population, while the male (A1824) has a unique genetic ancestry with respect to all of the other individuals described so far, showing both a shift toward east Eurasia as well as toward the Iranian/Caucasus-related gene pool, plotting on top of present-day Tajiks and close to ancient Caucasus steppes such as North_Caucasus_7c, Konyr_Tobe_4c and Kangju_3c (Figures 2B and S1). In agreement with this observation, A1825 can be modeled as deriving 100% of its ancestry from local preceding sources, while A1824 can only be modeled as a mixture between a 20% ANA source (e.g., AR_Xianbei_P_2c) and 80% North_Caucasus_7c, or present-day North Ossetians (Figures 3 and S5A; Table S4). The 5th century CE admixture date obtained for this individual corroborates the interpretation of a non-local, recently admixed ancestry (Figures 4 and S3). The remaining 9 late-Avar-period individuals show minor (<40%) to almost negligible (<5%) admixture with ANA-related sources, while the major ancestry component can be approximated by one of the preceding local groups for most of the individuals (Figures 3 and S5A; Table S4). Admixture dates confirm a general pattern of more recent admixture occurring mostly during the earlier phases of the Avar empire although some slightly earlier dates, in the 5th century CE, could point to admixture events with eastern sources that occurred during the Hunnic period (Figures 4 and S3). Furthermore, our data suggest mostly unidirectional gene flow as, with some exceptions previously described (e.g., the two child/infant burials; Figure S1C), most of the recently admixed individuals are found in non-elite sites.

DISCUSSION

Our results provide robust genetic support for the Northeast Asian ancestry of the Avar-period elite in the core region of the Avar empire (DTI) from the middle third of the 7th c. CE to the early 8th c. CE Carpathian Basin (early to middle Avar period). We show a striking genetic match with a Rouran-period individual as well as with ancient individuals from Xiongnu and especially Xianbei periods from the eastern Asian steppe. During the late Avar period, we observe a shift among the elite in the Avar core area toward a more recently admixed ancestry. Even if late Avar individuals still preserve a predominant northern East Asian component, the western Eurasian source that best fits the remaining 20%–30% of their ancestry is mostly a non-local one (i.e., it does not match the gene pools of the available preceding Carpathian Basin populations). Instead, it rather


Figure 4. Admixture dating obtained with DATES


Admixture dates and respective error bars obtained with DATES (between a western and eastern Eurasian source) plotted against the Euclidean distance from PC1
and PC2 coordinates of the Rouran_P_6c genome (PCA of Figure 2B). The colored bands mark the chronology of the three Avar and Rouran periods, used to derive
the dates of admixture from the estimated number of generations since admixture. All dates shown are individual-based except for the DTI elites (early, middle, and late), for which group-based date estimates are shown; for these groups, the individuals’ median Euclidean distance from Rouran_P_6c is used.
See also Figure S3.
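
The two quantities plotted in Figure 4 can be illustrated with a minimal sketch. It is not the DATES software itself. It only shows the arithmetic of turning an estimated number of generations since admixture into a calendar date, and of measuring a Euclidean distance in PC1/PC2 space. The generation length of 29 years, the function names, and the example burial date are assumptions made here for illustration and are not taken from the study.

```python
import math

# Assumption: 29 years per generation; replace with the value used in the study's methods.
GENERATION_YEARS = 29.0


def admixture_year(burial_year_ce, generations, generation_years=GENERATION_YEARS):
    """Calendar year (CE) of admixture for an individual buried in burial_year_ce."""
    return burial_year_ce - generations * generation_years


def admixture_interval(burial_year_ce, generations, std_err,
                       generation_years=GENERATION_YEARS):
    """(earliest, latest) year CE spanning +/- one standard error in generations."""
    centre = admixture_year(burial_year_ce, generations, generation_years)
    half_width = std_err * generation_years
    return (centre - half_width, centre + half_width)


def pc_distance(sample_pc, reference_pc):
    """Euclidean distance between two (PC1, PC2) coordinate pairs."""
    return math.hypot(sample_pc[0] - reference_pc[0],
                      sample_pc[1] - reference_pc[1])


if __name__ == "__main__":
    # Hypothetical individual: buried in 650 CE, admixed 5 +/- 1 generations earlier.
    print(admixture_year(650, 5))
    print(admixture_interval(650, 5, 1))
    # Hypothetical PC coordinates for a sample and for the reference genome.
    print(pc_distance((0.03, 0.04), (0.0, 0.0)))
```

In this sketch the error bar scales linearly with the assumed generation length, so a different generation time shifts both the point estimate and the width of the interval. Years before 1 CE would come out as zero or negative and would need separate handling.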

matches the steppes north of the Caucasus, although the scarcity of comparative data from the steppe in the 1st millennium CE calls for a future investigation of possible better alternative sources.
This non-local admixture, together with the retention of a high level of eastern ancestry in the late elites and the absence of genetic inbreeding, may point to continued migration from the steppes after the initial arrival of the Avars in the Carpathian Basin. Alternatively, it may reflect admixture from lower-status groups, not included in this study, who are attested as having arrived together with the Avar elite. The high levels of eastern ancestry among the elite, maintained a century and more after the initial immigration, may indicate a considerable size of the incoming population and marital networks restricted to it but may also point to later movements and continuing contacts with Central Asia. Both males (N = 15) and females (N = 12) were detected with eastern profiles, which suggests that both women and men had arrived from eastern Asia.
In contrast to the overall uniform northern East Asian genetic profiles of the DTI elites, the individuals retrieved from other early Avar-period elite sites in the neighboring regions are much more heterogeneous. The high-status individuals from the Transtisza group still carry the highest proportions of northern East Asian ancestry found outside the DTI. The more complex admixture patterns (more inter-individual variability in terms of admixing sources, proportions, and dates of admixture) could be the result of different time-layers of connections with the eastern steppe, possibly going back to the Hunnic period or later movements from the steppes to the Carpathian Basin. Particularly remarkable are the two elite individuals from Kölked in southern Transdanubia, who display a similar western/southern European cultural habitus. However, they show very different genetic histories from each other and with respect to the DTI elites. The woman carries a local unadmixed pre-Avar-period genetic profile and is buried with grave goods associated with late-antique, Merovingian, and Byzantine traditions (Vida, 2018).


The male’s genetic ancestry points to the steppes north of the Caucasus or other Iranianate regions.
These results indicate the emergence of a genetically heterogeneous local elite stratum under the rule of the immigrant Avar elite population. Similarly, the more western genetic profile of non-elite early- and late-Avar-period individuals, as far as represented in the present study, shows a stronger connection with the pre-Avar-period population of the Carpathian Basin. A number of recently admixed non-elite individuals, especially in the later period, reveal a variable amount of mostly unidirectional gene flow from the Northeast Asian immigrants to the local population. The Avar Empire was among the longest lasting of a series of political shifts and population changes in the Carpathian Basin between the 5th and 10th centuries (Bálint, 2019; Pohl, 2018). Evidence of similar population shifts during the Hunnic (5th century), Longobard (6th century), or Magyar (9th–10th centuries) migrations is much less robust than the evidence presented here. The only two Hunnic-period genomes available, analyzed above (Hun_P_Budapest_5c and Hun_P_NTransdanubia_5c), suggest a wide genetic variation for this mobile group (see Figure 2A). Data collected from the Longobard period analyzed in Amorim et al. (2018) indicate a heterogeneous population organized along a north-south European cline and, given the lack of genomic data from near-contemporary northern Europe, no clear evidence of a direct connection between the Danubian region and Scandinavia can be drawn. It remains to be seen to what extent genomic data will allow clear distinctions between central/northern European ancestries, for instance, in the Longobard kingdom. For the Magyars, only uniparental markers have been analyzed so far, and are of limited use with regard to the ca. 20%–30% of "central-inner" Asian origin of maternal and paternal lineages (Neparáczki et al., 2018, 2019).
In contrast to these poorly documented periods, our results provide robust genetic support for the Northeast Asian ancestry of the Avar-period elite in the core region of the Avar empire (DTI in the Carpathian Basin) from the middle third of the 7th century CE to the early 8th century CE (early to middle Avar period). These results also allow us to confidently conclude that an outstanding genetic variability existed in the early Medieval Carpathian Basin, covering almost the entire genetic variation of present-day Eurasia and providing clear evidence of long-distance trans-Eurasian migrations.

Limitations of the study
The results in this study indicate the emergence of a genetically heterogeneous local elite stratum under the rule of the immigrant Avar elite population. These results are biased by non-random selection of the burials that we analyzed, and an important direction for future work is to carry out larger-sample-size studies, including of entire cemeteries, in order to capture, as best as it can be done for past populations, the full spectrum of society. Additional samples and hence resolution from Northeast Asia would also be able to better characterize the source region of the incoming population.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead contact
  ◦ Materials availability
  ◦ Data and code availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Archaeological material and ethics permission
  ◦ Historical background
  ◦ Archaeological background
• METHOD DETAILS
  ◦ Archeological dating
  ◦ Ancient DNA processing and sequencing
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Sequence data processing
  ◦ Imputation
  ◦ Evaluation of imputation performance
  ◦ Genetic relatedness
  ◦ Compilation of population genetic data
  ◦ Population genetic analysis
  ◦ Haplotype phasing and local ancestry analyses
  ◦ Runs of homozygosity

SUPPLEMENTAL INFORMATION

Supplemental information can be found online at https://doi.org/10.1016/j.cell.2022.03.007.

ACKNOWLEDGMENTS

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement no. 856453 ERC-2019-SyG), the Max Planck Society, the Hungarian Scientific Research Fund (OTKA-NKFI; grant number: NN113157), the Szilágyi Family Foundation, the János Bolyai Research Scholarship of the Hungarian Academy of Sciences (to A.S.-N.), the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2020R1C1C1003879), the Allen Discovery Center program (a Paul G. Allen Frontiers Group advised program of the Paul G. Allen Family Foundation), the Czech Grant Agency (GACR 21-17092X), John Templeton Foundation grant 61220, and the Howard Hughes Medical Institute. We would like to thank Ronny Barr of the Multimedia Department of the Max Planck Institute for Evolutionary Anthropology for the graphical support.

AUTHOR CONTRIBUTIONS

Conceptualization, J.K., C.J., T.V., A.S.-N., G.A.G.-R., D.R., I.K., G.C., and Z.R.; formal analysis, G.A.G.-R., C.J., A.S.-N., and A.B.R.; investigation, A.S.-N., G.B., N.R., V.C., O.C., and B.S.; resources, T.Á.R., A.B., Z.B., N.B., S.C., J.D., Z.F., T. Hága, T. Hajdu, M.J., V.K., B.K., P.M., A.M., B.N.K., C.B., G.M.L., J.G.Ó., M.S., T.S., J.T., Z.T., E.K.T., B.G.M., and R.P.; visualization, G.A.G.-R., C.J., A.S.-N., and A.B.R.; data curation, G.A.G.-R., A.S.-N., and J.K.; writing – original draft, G.A.G.-R., C.J., A.S.-N., I.K., G.C., Z.R., and Z.H.; writing – review & editing, W.P., P.G., T.V., D.R., and J.K.; supervision, G.A.G.-R., Z.H., C.J., and J.K.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: October 4, 2021
Revised: January 28, 2022
Accepted: March 4, 2022
Published: April 1, 2022

REFERENCES

1000 Genomes Project Consortium, Auton, A., Brooks, L.D., Durbin, R.M., Garrison, E.P., Kang, H.M., Korbel, J.O., Marchini, J.L., McCarthy, S., McVean, G.A., and Abecasis, G.R. (2015). A global reference for human genetic variation. Nature 526, 68–74.
Allentoft, M.E., Sikora, M., Sjögren, K.-G., Rasmussen, S., Rasmussen, M., Stenderup, J., Damgaard, P.B., Schroeder, H., Ahlström, T., Vinner, L., et al. (2015). Population genomics of Bronze Age Eurasia. Nature 522, 167–172.
Alves, I., Arenas, M., Currat, M., Sramkova Hanulova, A., Sousa, V.C., Ray, N., and Excoffier, L. (2016). Long-distance dispersal shaped patterns of human genetic diversity in Eurasia. Mol. Biol. Evol. 33, 946–958.
Amorim, C.E.G., Vai, S., Posth, C., Modi, A., Koncz, I., Hakenbeck, S., La Rocca, M.C., Mende, B., Bobo, D., Pohl, W., et al. (2018). Understanding 6th-century barbarian social organization and migration through paleogenomics. Nat. Commun. 9, 3547.
Bálint, C. (2019). The Avars, Byzantium and Italy: A Study in Chronology and Cultural History (Archaeolingua).
Breuer, E. (2005). Byzanz an der Donau: eine Einführung in Chronologie und Fundmaterial zur Archäologie im Frühmittelalter im mittleren Donauraum (Breuer, Eric).
Chang, C.C., Chow, C.C., Tellier, L.C., Vattikuti, S., Purcell, S.M., and Lee, J.J. (2015). Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7.
Csáky, V., Gerber, D., Koncz, I., Csiky, G., Mende, B.G., Szeifert, B., Egyed, B., Pamjav, H., Marcsik, A., Molnár, E., et al. (2020). Genetic insights into the social organisation of the Avar period elite in the 7th century AD Carpathian Basin. Sci. Rep. 10, 948.
Curta, F. (2021). The long sixth century in eastern Europe (East Central and Eastern Europe in the Middle Ages, 450–1450) (BRILL).
Dabney, J., Knapp, M., Glocke, I., Gansauge, M.-T., Weihmann, A., Nickel, B., Valdiosera, C., García, N., Pääbo, S., Arsuaga, J.-L., and Meyer, M. (2013). Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl. Acad. Sci. USA 110, 15758–15763.
Daim, F. (1992). Awarenforschungen I-II. Archaeologia Austriaca Monographien (Institut für Ur- und Frühgeschichte der Universität Wien).
Daim, F. (2003). Avars and Avar archaeology. An introduction. In Regna and Gentes: The Relationship Between Late Antique and Early Medieval Peoples and Kingdoms in the Transformation of the Roman World, W. Pohl, H.-W. Goetz, and J. Jarnut, eds. (BRILL), pp. 463–570.
Damgaard, P.B., Marchi, N., Rasmussen, S., Peyrot, M., Renaud, G., Korneliussen, T., Moreno-Mayar, J.V., Pedersen, M.W., Goldberg, A., Usmanova, E., et al. (2018). 137 ancient human genomes from across the Eurasian steppes. Nature 557, 369–374.
de Barros Damgaard, P., Martiniano, R., Kamm, J., Moreno-Mayar, J.V., Kroonen, G., Peyrot, M., Barjamovic, G., Rasmussen, S., Zacho, C., Baimukhanov, N., et al. (2018). The first horse herders and the impact of early Bronze Age steppe expansions into Asia. Science 360, eaar7711.
Delaneau, O., Howie, B., Cox, A.J., Zagury, J.-F., and Marchini, J. (2013). Haplotype estimation using sequencing reads. Am. J. Hum. Genet. 93, 687–696.
DePristo, M.A., Banks, E., Poplin, R., Garimella, K.V., Maguire, J.R., Hartl, C., Philippakis, A.A., del Angel, G., Rivas, M.A., Hanna, M., et al. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498.
Dobrovits, M. (2003). "They Called Themselves Avar" - Considering the Pseudo-Avar Question in the Work of Theophylaktos. In Eran ud Aneran. Studies presented to Boris Il’ic Marsak on the occasion of his 70th birthday, M. Compareti, P. Raffetta, and G. Scarcia, eds. (Libreria Editrice Cafoscarina), pp. 175–186.
Fu, Q., Meyer, M., Gao, X., Stenzel, U., Burbano, H.A., Kelso, J., and Pääbo, S. (2013b). DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl. Acad. Sci. USA 110, 2223–2227.
Fu, Q., Mittnik, A., Johnson, P.L.F., Bos, K., Lari, M., Bollongino, R., Sun, C., Giemsch, L., Schmitz, R., Burger, J., et al. (2013a). A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 23, 553–559.
Fu, Q., Posth, C., Hajdinjak, M., Petr, M., Mallick, S., Fernandes, D., Furtwängler, A., Haak, W., Meyer, M., Mittnik, A., et al. (2016). The genetic history of Ice Age Europe. Nature 534, 200–205.
Gansauge, M.-T., Aximu-Petri, A., Nagel, S., and Meyer, M. (2020). Manual and automated preparation of single-stranded DNA libraries for the sequencing of DNA from ancient biological remains and other sources of highly degraded DNA. Nat. Protoc. 15, 2279–2300.
Garam, É. (2005). Avar kori nemzetségfő sírja Maglódon. Das awarenzeitliche Sippenhäuptlingsgrab von Maglód. ComArchHung, 407–436.
Gnecchi-Ruscone, G.A., Khussainova, E., Kahbatkyzy, N., Musralina, L., Spyrou, M.A., Bianco, R.A., Radzeviciute, R., Martins, N.F.G., Freund, C., Iksan, O., et al. (2021). Ancient genomic time transect from the Central Asian Steppe unravels the history of the Scythians. Sci. Adv. 7, eabe4414.
Golden, P.B. (2013). Some notes on the Avars and Rouran. In The Steppe Lands and the World Beyond Them: Studies in Honor of Victor Spinei on His 70th Birthday, V. Spinei, F. Curta, B.-P. Maleon, W. Treadgold, and J. Shepard, eds. (Editura Universității "Alexandru Ioan Cuza" din Iași), pp. 43–66.
Gulyás, B. (2016). One people, two regions? Thoughts on the early Avar period system of relationships in eastern Europe beyond the Tisza river. Hungarian Archaeology. http://www.hungarianarchaeology.hu/?page_id=279#post-6843.
Haak, W., Lazaridis, I., Patterson, N., Rohland, N., Mallick, S., Llamas, B., Brandt, G., Nordenfelt, S., Harney, E., Stewardson, K., et al. (2015). Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211.
Hurbanič, M. (2019). The Avar Siege of Constantinople in 626 (Springer).
Ilumäe, A.-M., Reidla, M., Chukhryaeva, M., Järve, M., Post, H., Karmin, M., Saag, L., Agdzhoyan, A., Kushniarevich, A., Litvinov, S., et al. (2016). Human Y chromosome haplogroup N: a non-trivial time-resolved phylogeography that cuts across language families. Am. J. Hum. Genet. 99, 163–173.
Jeong, C., Balanovsky, O., Lukianova, E., Kahbatkyzy, N., Flegontov, P., Zaporozhchenko, V., Immel, A., Wang, C.-C., Ixan, O., Khussainova, E., et al. (2019). The genetic history of admixture across inner Eurasia. Nat. Ecol. Evol. 3, 966–976.
Jeong, C., Wang, K., Wilkin, S., Taylor, W.T.T., Miller, B.K., Bemmann, J.H., Stahl, R., Chiovelli, C., Knolle, F., Ulziibayar, S., et al. (2020). A dynamic 6,000-year genetic history of Eurasia’s eastern steppe. Cell 183, 890–904.e29.
Jeong, C., Wilkin, S., Amgalantugs, T., Bouwman, A.S., Taylor, W.T.T., Hagan, R.W., Bromage, S., Tsolmon, S., Trachsel, C., Grossmann, J., et al. (2018). Bronze Age population dynamics and the rise of dairy pastoralism on the eastern Eurasian steppe. Proc. Natl. Acad. Sci. USA 115, E11248–E11255.
Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P.L.F., and Orlando, L. (2013). mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684.
Jun, G., Wing, M.K., Abecasis, G.R., and Kang, H.M. (2015). An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Res. 25, 918–925.
Kennett, D.J., Plog, S., George, R.J., Culleton, B.J., Watson, A.S., Skoglund, P., Rohland, N., Mallick, S., Stewardson, K., Kistler, L., et al. (2017). Archaeogenomic evidence reveals prehistoric matrilineal dynasty. Nat. Commun. 8, 14115.
Kloss-Brandstätter, A., Pacher, D., Schönherr, S., Weissensteiner, H., Binna, R., Specht, G., and Kronenberg, F. (2011). HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Hum. Mutat. 32, 25–32.
Korneliussen, T.S., Albrechtsen, A., and Nielsen, R. (2014). ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356.

Kradin, N.N. (2005). From tribal confederation to empire: the evolution of the Rouran society. Acta Orient. Acad. Sci. Hung. 58, 149–169.
Krzewińska, M., Kılınç, G.M., Juras, A., Koptekin, D., Chyleński, M., Nikitin, A.G., Shcherbakov, N., Shuteleva, I., Leonova, T., Kraeva, L., et al. (2018). Ancient genomes suggest the eastern Pontic-Caspian steppe as the source of western Iron Age nomads. Sci. Adv. 4, eaat4457.
Lamnidis, T.C., Majander, K., Jeong, C., Salmela, E., Wessman, A., Moiseyev, V., Khartanovich, V., Balanovsky, O., Ongyerth, M., Weihmann, A., et al. (2018). Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe. Nat. Commun. 9, 5018.
Lazaridis, I., Nadel, D., Rollefson, G., Merrett, D.C., Rohland, N., Mallick, S., Fernandes, D., Novak, M., Gamarra, B., Sirak, K., et al. (2016). Genomic insights into the origin of farming in the ancient Near East. Nature 536, 419–424.
Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R.; 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079.
Li, J., Zhang, Y., Zhao, Y., Chen, Y., Ochir, A., Sarenbilige, Z., Zhu, H., and Zhou, H. (2018). The genome of an ancient Rouran individual reveals an important paternal lineage in the Donghu population. Am. J. Phys. Anthropol. 166, 895–905.
Lipatov, M., Sanjeev, K., Patro, R., and Veeramah, K. (2015). Maximum likelihood estimation of biological relatedness from low coverage sequencing data. Preprint at bioRxiv. https://doi.org/10.1101/023374.
Lipson, M., Szécsényi-Nagy, A., Mallick, S., Pósa, A., Stégmár, B., Keerl, V., Rohland, N., Stewardson, K., Ferry, M., Michel, M., et al. (2017). Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature 551, 368–372.
Loh, P.-R., Lipson, M., Patterson, N., Moorjani, P., Pickrell, J.K., Reich, D., and Berger, B. (2013). Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254.
Lőrinczy, G. (2017). Frühawarenzeitliche Bestattungssitten im Gebiet der Großen Ungarischen Tiefebene östlich der Theiß. Archäologische Angaben und Bemerkungen zur Geschichte der Region im 6. und 7. Jahrhundert. Acta Archaeol. Acad. Sci. 68, 137–169.
Maples, B.K., Gravel, S., Kenny, E.E., and Bustamante, C.D. (2013). RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288.
Mathieson, I., Alpaslan-Roodenberg, S., Posth, C., Szécsényi-Nagy, A., Rohland, N., Mallick, S., Olalde, I., Broomandkhoshbacht, N., Candilio, F., Cheronet, O., et al. (2018). The genomic history of southeastern Europe. Nature 555, 197–203.
Mathieson, I., Lazaridis, I., Rohland, N., Mallick, S., Patterson, N., Roodenberg, S.A., Harney, E., Stewardson, K., Fernandes, D., Novak, M., et al. (2015). Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503.
McColl, H., Racimo, F., Vinner, L., Demeter, F., Gakuhari, T., Moreno-Mayar, J.V., van Driem, G., Gram Wilken, U., Seguin-Orlando, A., de la Fuente Castro, C., et al. (2018). The prehistoric peopling of Southeast Asia. Science 361, 88–92.
Monroy Kuhn, J.M., Jakobsson, M., and Günther, T. (2018). Estimating genetic kin relationships in prehistoric populations. PLoS One 13, e0195491.
Nagy, M. (2010). A hun-age burial with male skeleton and horse bones found in Budapest. In Neglected Barbarians. Studies in the Early Middle Ages, F. Curta, ed. (Brepols Publishers), pp. 137–175.
Narasimhan, V.M., Patterson, N., Moorjani, P., Rohland, N., Bernardos, R., Mallick, S., Lazaridis, I., Nakatsuka, N., Olalde, I., Lipson, M., et al. (2019). The formation of human populations in South and Central Asia. Science 365.
Neparáczki, E., Maróti, Z., Kalmár, T., Kocsy, K., Maár, K., Bihari, P., Nagy, I., Fóthi, E., Pap, I., Kustár, Á., et al. (2018). Mitogenomic data indicate admixture components of Central-Inner Asian and Srubnaya origin in the conquering Hungarians. PLoS One 13, e0205920.
Neparáczki, E., Maróti, Z., Kalmár, T., Maár, K., Nagy, I., Latinovics, D., Kustár, Á., Pálfi, G., Molnár, E., Marcsik, A., et al. (2019). Y-chromosome haplogroups from Hun, Avar and conquering Hungarian period nomadic people of the Carpathian Basin. Sci. Rep. 9, 16569.
Ning, C., Li, T., Wang, K., Zhang, F., Li, T., Wu, X., Gao, S., Zhang, Q., Zhang, H., Hudson, M.J., et al. (2020). Ancient genomes from northern China suggest links between subsistence changes and human migration. Nat. Commun. 11, 2700.
Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., Genschoreck, T., Webster, T., and Reich, D. (2012). Ancient admixture in human history. Genetics 192, 1065–1093.
Patterson, N., Price, A.L., and Reich, D. (2006). Population structure and eigenanalysis. PLoS Genet. 2, e190.
Peltzer, A., Jäger, G., Herbig, A., Seitz, A., Kniep, C., Krause, J., and Nieselt, K. (2016). EAGER: efficient ancient genome reconstruction. Genome Biol. 17, 60.
Pohl, W. (2018). The Avars: A Steppe Empire in Central Europe (Cornell University Press), pp. 567–822.
R Development Core Team (2021). R: A language and environment for statistical computing: reference index (R Foundation for Statistical Computing).
Raghavan, M., Skoglund, P., Graf, K.E., Metspalu, M., Albrechtsen, A., Moltke, I., Rasmussen, S., Stafford, T.W., Jr., Orlando, L., Metspalu, E., et al. (2014). Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature 505, 87–91.
Ralf, A., González, D.M., Zhong, K., and Kayser, M. (2018). Yleaf: software for human Y-chromosomal haplogroup inference from next-generation sequencing data. Mol. Biol. Evol. 35, 1820.
Reich, D., Patterson, N., Campbell, D., Tandon, A., Mazieres, S., Ray, N., Parra, M.V., Rojas, W., Duque, C., Mesa, N., et al. (2012). Reconstructing Native American population history. Nature 488, 370–374.
Renaud, G., Slon, V., Duggan, A.T., and Kelso, J. (2015). Schmutzi: estimation of contamination and endogenous mitochondrial consensus calling for ancient DNA. Genome Biol. 16, 224.
Ringbauer, H., Novembre, J., and Steinrücken, M. (2021). Parental relatedness through time revealed by runs of homozygosity in ancient DNA. Nat. Commun. 12, 5425.
Rohland, N., Glocke, I., Aximu-Petri, A., and Meyer, M. (2018). Extraction of highly degraded DNA from ancient bones, teeth and sediments for high-throughput sequencing. Nat. Protoc. 13, 2447–2461.
Rohland, N., Harney, E., Mallick, S., Nordenfelt, S., and Reich, D. (2015). Partial uracil-DNA-glycosylase treatment for screening of ancient DNA. Philos. Trans. R. Soc. Lond. B Biol. Sci. 370, 20130624.
Salter-Townshend, M., and Myers, S. (2019). Fine-scale inference of ancestry segments without prior knowledge of admixing groups. Genetics 212, 869–889.
Savelyev, A., and Jeong, C. (2020). Early nomads of the Eastern Steppe and their tentative connections in the West. Evol. Hum. Sci. 2, e20.
Schubert, M., Lindgreen, S., and Orlando, L. (2016). AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes 9, 88.
Sikora, M., Pitulko, V.V., Sousa, V.C., Allentoft, M.E., Vinner, L., Rasmussen, S., Margaryan, A., de Barros Damgaard, P., de la Fuente, C., Renaud, G., et al. (2019). The population history of northeastern Siberia since the Pleistocene. Nature 570, 182–188.
Spiliopoulou, A., Colombo, M., Orchard, P., Agakov, F., and McKeigue, P. (2017). GeneImp: fast imputation to large reference panels using genotype likelihoods from ultralow coverage sequencing. Genetics 206, 91–104.
Szádeczky-Kardoss, S. (1990). The Avars. In The Cambridge History of Early Inner Asia, D. Sinor, ed. (Cambridge University Press), pp. 206–228.
Szenthe, G. (2019). The "Late Avar" Reform and the "Long eighth century": A Tale of the Hesitation between Structural Transformation and the Persistent

Nomadic Traditions (7th to 9th century AD). Acta Archaeol. Acad. Sci. Hung. 70, 215–250.
Unterländer, M., Palstra, F., Lazaridis, I., Pilipenko, A., Hofmanová, Z., Groß, M., Sell, C., Blöcher, J., Kirsanow, K., Rohland, N., et al. (2017). Ancestry and demography and descendants of Iron Age nomads of the Eurasian Steppe. Nat. Commun. 8, 14615.
Venables, W.N., and Ripley, B.D. (2003). Modern Applied Statistics with S (Springer Science and Business Media).
Vida, T. (2008). Conflict and coexistence: the local population of the Carpathian Basin under Avar rule (sixth to seventh century). In The other Europe in the Middle Ages. Avars, Bulgars, Khazars, and Cumans, F. Curta, ed. (BRILL), pp. 13–46.
Vida, T. (2016). "They asked to Be Settled in Pannonia." A Study on Integration and Acculturation – the Case of the Avars. In Between Byzantium and the Steppe: archaeological and Historical Studies in Honour of Csanad Balint on the Occasion of His 70th Birthday, Á. Bollók, G. Csiky, and T. Vida, eds. (Institute of Archaeology, Research Centre for the Humanities, Hungarian Academy of Sciences), pp. 51–70.
Vida, T. (2018). Being Avar! A case study for changes in the social display of identity in the early Avar period. In Lebenswelten zwischen Archäologie und Geschichte: Festschrift für Falko Daim zu seinem 65. Geburtstag, J. Drauschke, E. Kislinger, K. Kühtreiber, T. Kühtreiber, G. Scharrer-Liska, and T. Vida, eds. (Monographien des Römisch-Germanischen Zentralmuseums), pp. 419–436.


STAR+METHODS

KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER


Biological samples
Ancient skeletal element This study A1801
Ancient skeletal element This study A1802
Ancient skeletal element This study A1803
Ancient skeletal element This study A1804
Ancient skeletal element This study A1805
Ancient skeletal element This study A1806
Ancient skeletal element This study A1807
Ancient skeletal element This study A1808
Ancient skeletal element This study A1809
Ancient skeletal element This study A1810
Ancient skeletal element This study A1811
Ancient skeletal element This study A1812
Ancient skeletal element This study A1813
Ancient skeletal element This study A1814
Ancient skeletal element This study A1815
Ancient skeletal element This study A1816
Ancient skeletal element This study A1817
Ancient skeletal element This study A1818
Ancient skeletal element This study A1819
Ancient skeletal element This study A1820
Ancient skeletal element This study A1821
Ancient skeletal element This study A1822
Ancient skeletal element This study A1823
Ancient skeletal element This study A1824
Ancient skeletal element This study A1825
Ancient skeletal element This study I20801
Ancient skeletal element This study I20800
Ancient skeletal element This study I20798
Ancient skeletal element This study I20799
Ancient skeletal element This study I16812
Ancient skeletal element This study I16741
Ancient skeletal element This study I18742
Ancient skeletal element This study I18743
Ancient skeletal element This study I18744
Ancient skeletal element This study I18224
Ancient skeletal element This study I18223
Ancient skeletal element This study I18225
Ancient skeletal element This study I18222
Ancient skeletal element This study I16759
Ancient skeletal element This study I18174
Ancient skeletal element This study I18184
Ancient skeletal element This study I18185
Ancient skeletal element This study I16743
Ancient skeletal element This study I16744
Ancient skeletal element This study I16753
Ancient skeletal element This study I16752
Ancient skeletal element This study I16751
Ancient skeletal element This study I16750
Ancient skeletal element This study A181013
Ancient skeletal element This study A181014
Ancient skeletal element This study A181015
Ancient skeletal element This study A181016
Ancient skeletal element This study A181017
Ancient skeletal element This study A181018
Ancient skeletal element This study A181019
Ancient skeletal element This study A181020
Ancient skeletal element This study A181021
Ancient skeletal element This study A181022
Ancient skeletal element This study A181023
Ancient skeletal element This study A181024
Ancient skeletal element This study A181025
Ancient skeletal element This study A181026
Ancient skeletal element This study A181027
Ancient skeletal element This study A181028
Ancient skeletal element This study I20802
Ancient skeletal element This study A181029
Chemicals, peptides, and recombinant proteins
Distilled Water DNA free, UltraPure Thermo Fisher Scientific Cat# 10977035
0.5 M EDTA pH 8.0 Thermo Fisher Scientific Cat# AM9261
Proteinase K Thermo Fisher Scientific Cat# AM2548
Isopropanol Sigma Aldrich Cat# I9516
Guanidine hydrochloride Sigma Aldrich Cat# G4505
Sodium Acetate Solution (3 M), pH 5.2 Thermo Fisher Scientific Cat# R1181
Tween-20 Sigma Aldrich Cat# P2287
Buffer PE Qiagen Cat# 19065
Buffer PB Qiagen Cat# 19066
Tris-EDTA buffer solution Sigma Aldrich Cat# 93283
10x Buffer Tango Thermo Fisher Scientific Cat# BY5
ATP 100 mM Thermo Fisher Scientific Cat# R0441
BSA 20mg/mL Roche Cat# 10711454001
dNTP Mix Thermo Fisher Scientific Cat# R1121
USER enzyme New England Biolabs Cat# M5505
Uracil Glycosylase inhibitor (UGI) New England Biolabs Cat# M0281
T4 Polynucleotide Kinase New England Biolabs Cat# M0201
T4 DNA Polymerase New England Biolabs Cat# M0203
Bst DNA Polymerase, large fragment New England Biolabs Cat# M0275L
Ethanol Merck Cat# 1009831000
10x T4 Ligase Buffer Thermo Fisher Scientific Cat# EL0011
T4 DNA Ligase Thermo Fisher Scientific Cat# EL0011
10x Thermopol Buffer New England Biolabs Cat# B9004S
Ampure XP Bioscience Cat# BCI-A63881
Agilent D1000 ScreenTapes Agilent Technologies Cat# 5067-5582
Agilent D1000 Ladder Agilent Technologies Cat# 5067-5586
Agilent D1000 Reagents Agilent Technologies Cat# 5067-5583
Agarose Lonza Cat# 50004
HyperLadder 25bp (formerly HyperLadder V) Bioline Cat# BIO-33057
ECO Safe Nucleic Acid Staining Solution 20,000X Thermo Fisher Scientific Cat# 3910001
2X Hi-RPM Hybridization Buffer Agilent Technologies Cat# 5190-0403
PfuTurbo Cx Hotstart DNA Polymerase Agilent Technologies Cat# 600412
Herculase II Fusion DNA Polymerase Agilent Technologies Cat# 600679
Sodium hydroxide pellets Fisher Scientific Cat# 10306200
Sera-Mag Magnetic Speed-beads, Carboxylate-Modified (1 µm, 3EDAC/PA5) GE LifeScience Cat# 65152105050250
Dynabeads MyOne Streptavidin Thermo Fisher Scientific Cat# 65602
SSC Buffer (20x) Thermo Fisher Scientific Cat# AM9770
GeneAmp 10x PCR Gold Buffer Thermo Fisher Scientific Cat# 4379874
Salmon sperm DNA Thermo Fisher Scientific Cat# 15632-011
Human Cot-I DNA Thermo Fisher Scientific Cat# 15279011
5M NaCl Sigma Aldrich Cat# S5150
1M NaOH Sigma Aldrich Cat# 71463
1 M Tris-HCl pH 8.0 Sigma Aldrich Cat# AM9856
50x Denhardt’s solution Thermo Fisher Scientific Cat# 750018
Methanol, certified ACS VWR Cat# EM-MX0485-3
Acetone, certified ACS VWR Cat# BDH1101-4LP
Dichloromethane, certified ACS VWR Cat# EMD-DX0835-3
Hydrochloric acid, 6N, 0.5N & 0.01N VWR Cat# EMD-HX0603-3
5 M Sodium chloride solution Sigma-Aldrich Cat# S5150-1L
20% SDS Serva Cat# 39575.01
PEG-4000 Thermo Fisher Scientific Cat# EL0011
PEG-8000 Promega Cat# V3011
Critical commercial assays
MinElute PCR Purification Kit QIAGEN Cat# 28006
TwistAmp Basic Kit TwistDX Cat# TABAS03kit
Qubit dsDNA HS Assay Kit, 500 assays Thermo Fisher Scientific Cat# Q32854
High Pure Extender Assembly from the Roche High Pure Viral Nucleic Acid Large Volume Kit, 40 reactions Roche Cat# 5114403001
MiSeq Reagent Kit v3 (150 cycle) Illumina Cat# MS-102-3001
NextSeq 500/550 High Output Kit v2 (150 cycles) Illumina Cat# FC-404-2002
HiSeq Cluster Kit SR Illumina Cat# GD-410-1001
HiSeq 4000 SBS Kit (50/75 cycles) Illumina Cat# FC-410-1001/2
DyNAmo Flash SYBR Green qPCR Kit Thermo Fisher Scientific Cat# F415L
Maxima SYBR Green kit Thermo Fisher Scientific Cat# K0251
Oligo aCGH/Chip-on-Chip Hybridization Kit Agilent Technologies Cat# 5188-5220
Deposited data
Raw and analyzed data (European Nucleotide Archive) This study ENA: PRJEB50368
1240K Genotype data (Edmond Data Repository of Max Planck Society) This study https://edmond.mpdl.mpg.de/
Software and algorithms
EAGER 1.92.55 Peltzer et al., 2016 https://eager.readthedocs.io/en/latest/
AdapterRemoval 2.2.0 Schubert et al., 2016 https://github.com/MikkelSchubert/adapterremoval
BWA 0.7.12 Li and Durbin, 2009 http://bio-bwa.sourceforge.net/
DeDup 0.12.2 Peltzer et al., 2016 https://github.com/apeltzer/DeDup
mapDamage 2.0.6 Jónsson et al., 2013 https://github.com/ginolhac/mapDamage
bamUtil 1.0.13 https://github.com/statgen/bamUtil https://github.com/statgen/bamUtil
CircularMapper Peltzer et al., 2016 https://github.com/apeltzer/CircularMapper
ANGSD 0.910 Korneliussen et al., 2014 http://www.popgen.dk/angsd/index.php/ANGSD
Schmutzi Renaud et al., 2015 https://github.com/grenaud/schmutzi
SAMtools 1.3 Li et al., 2009 http://www.htslib.org/doc/samtools.html
pileupCaller https://github.com/stschiff/sequenceTools https://github.com/stschiff/sequenceTools
GATK v3.5 DePristo et al., 2011 https://gatk.broadinstitute.org/hc/en-us
GeneImp 1.4 Spiliopoulou et al., 2017 https://pm2.phs.ed.ac.uk/geneimp/
SHAPEIT v2.r790 Delaneau et al., 2013 https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html
pMMRCalculator https://github.com/TCLamnidis/pMMRCalculator https://github.com/TCLamnidis/pMMRCalculator
HaploGrep2 Kloss-Brandstätter et al., 2011 https://haplogrep.i-med.ac.at/
Yleaf v1.0 Ralf et al., 2018 https://github.com/genid/Yleaf
READ Monroy Kuhn et al., 2018 https://bitbucket.org/tguenther/read/src/master/
lcMLkin Lipatov et al., 2015 https://github.com/COMBINE-lab/maximum-likelihood-relatedness-estimation
EIGENSOFT v6.0.1 Patterson et al., 2006 https://github.com/DReichLab/EIG
ADMIXTOOLS 5.1 Patterson et al., 2012 https://github.com/DReichLab/AdmixTools
DATES v753 Narasimhan et al., 2019 https://github.com/priyamoorjani/DATES
MOSAIC v1.3 Salter-Townshend and Myers, 2019 https://maths.ucd.ie/mst/MOSAIC/
RFMix v2.03 Maples et al., 2013 https://github.com/slowkoni/rfmix
PLINK v. 1.9 Chang et al., 2015 https://www.cog-genomics.org/plink/
hapROH v0.3 Ringbauer et al., 2021 https://pypi.org/project/hapROH/
R v4.0.5 R Development Core Team, 2021 https://www.r-project.org/

RESOURCE AVAILABILITY

Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Johannes
Krause (krause@eva.mpg.de).

Materials availability
This study did not generate new unique reagents.

Data and code availability


The newly produced aligned sequence data are deposited in the European Nucleotide Archive (ENA) with the following accession
number: PRJEB50368. Haploid genotype data for the 1240K panel are available in EIGENSTRAT format via the Edmond Data Repository
of the Max Planck Society (https://edmond.mpdl.mpg.de/).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Archaeological material and ethics permission


The authors declare that they requested and obtained permission from the stakeholders and from the excavating and processing
anthropologists and archaeologists for the destructive DNA analyses of the anthropological material presented in this study.
We generated new genome-wide data from skeletal remains of 66 ancient individuals from the following sites (see Data S1 for
archeological details):
Sarmatian period sites
Kecskemét, Mindszenti-dűlő (n = 8); Archaeological context newly reported.
Hajdúnánás, Fürj-halom-dűlő (n = 8); Archaeological context newly reported.
Derecske, Karakas-dűlő (n = 1); Archaeological context newly reported.
Hun period sites
Árpás, Dombiföld-Szérűskert (n = 1); Archaeological context newly reported.
Avar period sites
Békésszentandrás, Benda-tanya (n = 1); Archaeological context reported in Csáky et al. (2020).
Berettyóújfalu, Nagy-Bócs-dűlő (n = 1); Archaeological context newly reported.
Budapest, Csepel-Kavicsbánya (n = 1); Archaeological context reported in Csáky et al. (2020).
Kunpeszér, Felsőpeszéri út (n = 6); Archaeological context reported in Csáky et al. (2020).
Kecskemét, Sallai út (n = 1); Archaeological context reported in Csáky et al. (2020).
Debrecen, Bordás-tanya (n = 1); Archaeological context newly reported.
Derecske, Bikás-dűlő (n = 1); Archaeological context newly reported.
Derecske, Hosszú-lapos (n = 2); Archaeological context newly reported.
Derecske, Karakas-dűlő (n = 2); Archaeological context newly reported.
Hajdúböszörmény, Homokbánya IV (n = 1); Archaeological context newly reported.
Kölked, Feketekapu (n = 2); Archaeological context newly reported.
Kövegy, Nagy-földek (n = 1); Archaeological context newly reported.
Kunbábony (n = 1); Archaeological context reported in Csáky et al. (2020).
Kunszállás, Fülöpjakab (n = 7); Archaeological context reported in Csáky et al. (2020).
Petőfiszállás (n = 1); Archaeological context reported in Csáky et al. (2020).
Albertirsa Site 22 (n = 2); Archaeological context newly reported.
Albertirsa, Szentmártoni út (n = 5); Archaeological context newly reported.
Alsónyék, Elkerülő út (n = 1); Archaeological context newly reported.
Szalkszentmárton, Táborállás (n = 1); Archaeological context reported in Csáky et al. (2020).
Szarvas, Kovács-halom (n = 4); Archaeological context reported in Csáky et al. (2020).
Tiszapüspöki, Holt-Tisza-part and Felső-földek (n = 3); Archaeological context newly reported.
Visonta, Nagycsapás (n = 3); Archaeological context newly reported.
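
As a quick consistency check on the site list, the per-site counts can be tallied by period. The following is only an illustrative sketch: the counts are transcribed by hand from the list above, in the order given, and the variable and function names are our own.

```python
# Individuals per site, transcribed from the site list (order as listed).
SITE_COUNTS = {
    "Sarmatian period": [8, 8, 1],
    "Hun period": [1],
    "Avar period": [1, 1, 1, 6, 1, 1, 1, 2, 2, 1, 2, 1, 1, 7, 1, 2, 5, 1, 1, 4, 3, 3],
}


def tally(site_counts):
    """Return per-period totals and the grand total of sampled individuals."""
    per_period = {period: sum(counts) for period, counts in site_counts.items()}
    return per_period, sum(per_period.values())


per_period, total = tally(SITE_COUNTS)
for period, n in per_period.items():
    print(f"{period}: {n}")
print(f"Total: {total}")
```

The tally gives 17 Sarmatian-period, 1 Hun-period and 48 Avar-period individuals, matching the stated total of 66.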

Historical background
In the 5th and 6th centuries, the Carpathian Basin (modern Hungary and the adjacent lowlands of southern Slovakia, western
Romania, northern Serbia and eastern Croatia and Austria) was a region particularly affected by migrations and changes of political
dominion, which are well-documented in contemporary Latin and Greek texts. Since the beginning of the first millennium, the Roman
Empire had ruled over the land west and south of the Danube, most of it in the province of Pannonia. Sarmatians, who had arrived
from the Pontic Steppes north of the Black Sea in the 1st century CE, lived between the Danube and the mountains of Transylvania.
Around 400, this relatively stable configuration was shattered by the arrival of the Huns. These Huns had arrived from the Central
Asian steppes in the lands north of the Black Sea in c. 375. In this region, the Goths, a people of Germanic language, had ruled
for some time; now, part of them became subjects of the Huns, while others fled to Roman territory.
The arrival of the Huns toppled the already precarious balance between the Roman Empire and its ‘barbarian’ neighbours (modern
scholars use this contemporary derogatory term for people outside the Ancient Civilisation in a descriptive sense, for lack of a better
word). In the course of what is often called ‘the Great Migration’, the Western Roman Empire dissolved, until the last western emperor
was dethroned in a coup by his army in 476. Many scholars have regarded this as a causal relation: barbarians caused ‘the Fall of
Rome’. However, internal problems of Roman society and the instability of its political system facilitated the barbarian takeover
rather than merely resulting from it. The increasing Roman demand for soldiers was a decisive pull factor for barbarian immigrants.
Also, the Eastern Roman Empire that we call Byzantium continued to exist for another millennium. Still, barbarian raids caused much
damage, and in the course of events, the Roman order was destroyed in some frontier provinces, such as Pannonia.
The contemporary sources classify the many barbarian groups increasingly operating on Roman territory by ethnic terms. This was
certainly not inaccurate, because we also get some glimpses of self-identification with these peoples, Huns, Goths, Franks or
Longobards, in the sources. Ethnicity was supposed to be based on common origin, but ethnic categories were in reality handled
rather flexibly. Changes of ethnic affiliation are repeatedly reported, and the composition of migrating groups often shifted. For
instance, the Roman diplomat Priscus met a Hun warrior at Attila’s court, who turned out to have been a Roman prisoner of war
who now preferred the life of the barbarians. Thus, we should not take these ethnonyms for granted as indicators of actual common
ancestry.
In the Carpathian Basin, the initially dispersed groups of Hunnic steppe riders united and initiated a series of disastrous raids on the
Balkan provinces, Gaul and Northern Italy. In circa 450 CE, under the rule of Attila, the short-lived Empire of the Huns reached the
apogee of its power. After the death of Attila in 453, the Hunnic realm collapsed, and its numerous components (for the most part
Germanic-speaking groups) built their own realms or joined the Roman forces. Along the margins of the Carpathian Basin, Rugians,
Heruls, Sciri, Suebi and Sarmatians created ephemeral regional kingdoms which were all gone by 500. The only relatively stable
kingdom that emerged was that of the Gepids along the Tisza river, which lasted until 567. In Pannonia, Ostrogoths established their
rule, but then crossed into Roman heartlands in the 470s and ultimately conquered Italy in the 490s (where their kingdom lasted until
552/54, for some time expanding back into Pannonia). At that point, Longobards began to settle in Pannonia. In the mid-6th century,
they came into conflict with the Gepids, whose kingdom they finally destroyed in 567.
A year later, the Longobards collected their forces and those of numerous other groups (Gepids, Suebi, Pannonian provincials,
Sarmatians and others), and marched into Italy, where the kingdom they established played a major role until the Frankish king
Charlemagne conquered it in 774. Unlike in the Carpathian Basin, a late-antique society had more or less subsisted in Italy, with urban
centers, a well-organised Christian Church and legally defined property rights over a dependent class of farmers. Only remnants of
late-antique culture had survived in Pannonia into the 6th century, and with few exceptions the Roman cities were largely deserted.
The Carpathian Basin was rather thinly populated in most parts, except for the Gepid core areas. Its population seems to have been
rather mixed in origin, although the dominant warrior groups shared many forms of cultural expression – among them, inhumation of
the deceased with full dress and grave goods, which allows us to obtain precious information about some aspects of their lifestyle
and funerary practices.
When the Gepids were crushed and the Longobards left, the Avars took advantage of the situation and moved into the Carpathian
Basin, which they would dominate for over 200 years. They had arrived in the Pontic Steppes north of the Black Sea about ten years
before, led by the khagan (which was their rulers’ title) Baian. Soon, they defeated the mostly smaller groups of steppe warriors who
had replaced the Huns in the course of the later 5th century. These were mostly Turkic-speaking peoples such as the Utigurs, the
Cutrigurs or the Onogurs, sometimes generically called Bulgars in the texts. North of the Lower Danube, the Avars encountered Slavic
populations who had only appeared recently in the written record. The Avars had entered into an alliance with the Byzantine (or East
Roman) Empire in 558/59, and often sent envoys to its capital Constantinople (modern Istanbul). Only after they had consolidated
their rule over the Carpathian Basin did they start a series of major raids into the Balkan provinces, culminating in a siege of
Constantinople, together with a Persian army, in 626. The siege failed, and the mobilizing power of the Avar khagans collapsed.
Unlike the Huns, they rarely attacked their western neighbors, the Longobards in Italy and the Franks, whose Merovingian dynasty
dominated much of modern France and Germany.
Like the Huns, the Avars had arrived from the Central Asian steppes. We have better information about the circumstances of their
migration from written sources than about the Huns. With the Huns, we can only assume that they were in some ways connected to
the Xiongnu, who had ruled over a powerful steppe empire centered in modern Mongolia and repeatedly challenged the Chinese
empire of the Han Dynasty between the 2nd century BCE and the 1st century CE. Further west, Huns only appear in the 4th century:
the Hephthalites and some smaller realms in the steppes between the Caspian Sea and the Hindukush, and, of course, the European
Huns. In the east, Xiongnu power was eventually replaced by two other steppe peoples. One of them were the Xianbei, who mostly
lived in what is now Manchuria, temporarily extending their rule over parts of Mongolia in the 2nd/3rd century and over Northern China
in the 5th/6th centuries. The other people were called Rouran by the Chinese, and ruled by khagans. It is quite likely that they were
known as Avars in the steppe. Their empire was destroyed by the emerging khaganate of the Turks in 552.
What happened then is described in reports based on diplomatic exchanges between the Turks and Byzantium. The Turks denied that
the European Avars were directly descended from the Rouran, which would have given their khagans the ancient legitimacy of the Rouran
khaganate. They claimed that those who had fled Turkish expansion in the 550s were a mixed people called Warchonites, composed of
groups of former Rouran subjects, mainly Ogurs who had lived in western Central Asia. They had only adopted the prestigious name
Avars to frighten those whom they encountered on their westward migration. Indeed, we know from Chinese sources that many among
the Rouran had been killed by Turks and Chinese, and others had fled eastward, and reputedly ended up in Korea. Still, it is plausible
that groups from the Western Eurasian steppes took part in the Avar migration, and that at least the Avar khagans and
their core group were actually descended from the Rouran. The results of the present article support this view. The written sources also
indicate that Cutrigurs, Utigurs and Bulgars from the Pontic steppes accompanied the Avars into the Carpathian Basin.
In the early Avar Empire, the archaeological record demonstrates that the population of the Carpathian Basin remained quite
heterogeneous, at least culturally. The late-Roman element was reinforced by captives from the Balkan provinces and elsewhere settled
as dependent laborers. In the course of the 7th century, following the unsuccessful siege of Constantinople, the cultural differences
within the Avar settlement area largely faded out. Whether or not this led to wide-spread population admixture remains to be seen and
is one of the primary issues being investigated through HistoGenes, a European Research Council (ERC) project analysing genomic,
historical, and archaeological data in the Carpathian Basin between the fifth and tenth centuries.

Archaeological background
The Avars arrived in the Carpathian Basin at the end of the 6th century and united its local peoples and communities,
including the remnants of the Romanised population in the territory of former Pannonia and various Barbarian groups with
heterogeneous backgrounds (Daim 2003; Bálint 2019). This heterogeneity is attested in the rich archaeological material consisting
of ca. 600 settlements and approx. 100,000 excavated burials. After its peak in the mid-7th century the Khaganate remained a
regional political power until the 8th century. The Avar Khaganate was made up of networks of overlapping groups organized along
territorial, cultural, social and other lines.
The concentration of high-status male burials (Bócsa-Kunbábony circle) suggests that the 7th-century Avar power center lay in the DTI.
These members of the early Avar elite (the leaders of the early Avar polity and the khagan’s military retinue) were buried with gold- and
silver-decorated weapons and belts, various insignia and valuable prestige objects. These artefacts are superbly crafted products
of the period’s goldsmithing that indicate their owners’ prominent social position and long-distance connections, and show that they
remained part of the Eurasian steppe network. An eagle-headed gold sheet may have been a nomadic insignia of rank and power;
it has been conjectured that it adorned a sceptre. The lunula from a Kunbábony burial fits into an Inner Asian aristocratic
tradition (Vida 2016; Bálint 2019).
A group with a different material culture, interpreted as Eastern European nomads, settled the Trans-Tisza region. Their burials share
numerous similarities with 6th-7th-century Eastern European populations and include Byzantine-influenced jewellery items and dress
accessories probably adopted in the Pontic region. These groups are characterised by solitary burials or small burial grounds in the
Early Avar period, the deposition of partial or complete sacrificial animals, and ledge or niche graves (Gulyás 2016; Lőrinczy 2017).
Some elements of these burial customs can be traced until the end of the Avar period.
The West-Carpathian Basin (the former Roman province of Pannonia) served as a border zone of the Khaganate, where archaeological
material attests to the survival and coexistence of various local groups with heterogeneous cultural backgrounds, showing strong
connections to the Mediterranean (certain brooch and earring types) as well as the Merovingian world in Central and Western Europe.
The archaeological heritage of the surviving local communities showing western European connections comprises burials with
weapons (spatha, spear, sax) and Merovingian dress accessories (suspended belts, leg bindings, etc.). The rich, high quality find
assemblages in this region (for example Kölked) probably reflect a community with internal autonomy, which was obliged to render
armed service to the Avars under the leadership of their own local elite (Vida, 2008).
The Avar elite bound together the peoples of the Carpathian Basin with immense integrative power in the 7th century and ensured the
powerful political position of the Avars in Europe. The "speed" and success of acculturation and integration are reflected by the
emergence of a uniform material culture across the Avar Khaganate within the 8th century, expressing the unity of the Avar political
organisation (Szenthe, 2019).

METHOD DETAILS

Archeological dating
Three samples have been 14C dated and in these cases 14C results agree with archeological dating (Data S1). The remaining samples
have been dated archaeologically. The core of our work (the elite burials) has a very well established archeological dating based on
the types and type chronology of artifacts found in the graves (pseudo-buckles, jewelry) (Bálint, 2019; Breuer, 2005; Daim, 2003;
Garam, 2005). Since they contain such a high concentration of fast-changing types, their dating can be considered very accurate
within a few decades. The empty or poorly furnished burials are more problematic for dating but constitute a minority of our samples
and they were dated based on their larger context.

Ancient DNA processing and sequencing


Laboratory experiments were performed at the Ancient DNA clean room facilities of the Institute of Archaeogenomics, Research
Centre for the Humanities, Budapest, at the laboratory of the Max Planck Institute for the Science of Human History, Jena, at the
Ancient DNA Lab, University of Vienna, and at Harvard Medical School, Boston. From each bone or tooth specimen, 30 to 70 mg
of powder was sampled and used for the following steps. DNA extraction and genomic library preparation of 42 samples labelled
with an initial "A" were done in Budapest, following well-established ancient DNA workflow protocols (Csáky et al., 2020; Lipson
et al., 2017). The DNA extraction was performed based on the protocol of Dabney et al. (2013), with some modifications also described
by Lipson et al. (2017). DNA libraries were prepared using UDG-half treatment methods (Rohland et al., 2015). We included
library negative controls and/or extraction negative controls in every batch. Unique P5 and P7 adapter combinations were used
for every library (Rohland et al., 2015). Double barcoded adaptor-ligated libraries were then amplified with TwistAmp Basic (Twist
DX Ltd), purified with AMPure XP beads (Agilent) and checked on a 3% agarose gel. Mitogenome capture was performed according
to Csáky et al. (2020) in Budapest and sequenced with 2x75 cycles on Illumina MiSeq. This study reports new mitogenome
sequences for the samples A181013-A181029; others from the Budapest lab were published in Csáky et al. (2020).
In Jena, an equimolar pool of 42 UDG-half libraries was prepared for shotgun sequencing on an Illumina HiSeq 4000 platform using
a single-end 76-cycle sequencing kit. All of the sequenced libraries showed high human endogenous DNA proportions (Table S1) and
ancient DNA characteristic damage patterns at the end of the fragments (5’ end at least 0.05%; Table S1). The libraries were
enriched using an in-solution enrichment protocol consisting of DNA probes specifically targeting 1,237,207 single nucleotide
markers (SNPs) genome-wide known to be variable in worldwide sets of human populations (Fu et al., 2013b; Haak et al., 2015;
Mathieson et al., 2015). For this, all libraries were re-amplified using the Herculase II Fusion DNA Polymerase (Agilent) to achieve
1-2 µg of total DNA in 5.2 µl (200-400 ng/µl); they were then purified using the MinElute DNA purification kit (QIAGEN) and their
concentrations were measured on a NanoDrop spectrophotometer (Thermo Fisher Scientific).
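
The target concentration range quoted for the re-amplified libraries follows from a simple unit conversion. The sketch below reads the quantities as 1-2 µg of DNA in 5.2 µl; the function name is purely illustrative and not part of any protocol.

```python
def concentration_ng_per_ul(mass_ug: float, volume_ul: float) -> float:
    """Convert a DNA mass in micrograms and a volume in microlitres to ng/µl."""
    return mass_ug * 1000.0 / volume_ul


low = concentration_ng_per_ul(1.0, 5.2)
high = concentration_ng_per_ul(2.0, 5.2)
print(f"{low:.0f}-{high:.0f} ng/µl")
```

This gives roughly 192-385 ng/µl, which the text rounds to 200-400 ng/µl.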
In Boston, all 24 samples (between 33 and 42 mg of powder) were incubated in 750 µl lysis buffer overnight, and from 1/5th of the
lysis buffer DNA was extracted using Dabney binding buffer and silica magnetic beads utilizing an automated workflow on an Agilent
Bravo workstation (Rohland et al., 2018). The entire extract was then converted into libraries; 6 of the 24 libraries were built using a
single-stranded library preparation protocol following USER treatment and finished with dual indexing (Gansauge et al., 2020), and 18
libraries were built using a double-stranded library preparation that retains a subset of terminal uracils but removes all of them internally
(Rohland et al., 2015); both library preparations were run on Agilent Bravo workstations and the double-stranded preparation utilized
silica magnetic beads instead of MinElute columns for the cleanup steps to be compatible with an automated workflow. Target
enrichment of the above-mentioned nuclear SNPs (and the mitochondrial genome) was done as referenced above.
Overall, the 66 enriched libraries were sequenced to 0.01-5.3x coverage for the "1240K" target sites (with a median of 2.9x),
covering 13,749 to 1,119,583 target SNPs at least once (with a median of 975,215 SNPs).

QUANTIFICATION AND STATISTICAL ANALYSIS

Sequence data processing


The fastq files containing the sequenced reads were processed through the EAGER 1.92.55 pipeline wrapper (Peltzer et al., 2016).
Prior to running EAGER, the first 7 bp of each read were trimmed to remove internal barcode sequences using a custom Python
script. To remove adaptors and short reads (< 30 bp), AdapterRemoval v2.2.0 was used (Schubert et al., 2016). The reads were
then mapped to the Human Reference Genome Hs37d5 with the bwa v0.7.12 aln/samse alignment algorithm (Li and Durbin, 2009)
with the parameters "-n" and "-l" set to 0.01 and 32, respectively. Only the reads with Phred mapping quality ≥ 30 were then retained
using "-q" (q30-reads) in Samtools v1.3 (Li et al., 2009). We then used DeDup v0.12.2 to remove PCR duplicates (Peltzer et al., 2016).
To estimate the amount of C to T taphonomic deamination at the ends of the mapped fragments we used mapDamage v2.0 (Jónsson
et al., 2013) run on a subset of 100,000 q30-reads. Exogenous human autosomal DNA contamination was estimated in males by
assessing the X chromosome heterozygosity levels with ANGSD v0.910 (Korneliussen et al., 2014) and mitochondrial DNA
contamination in males and females was estimated via Schmutzi (Renaud et al., 2015; Table S1). PileupCaller
(https://github.com/stschiff/sequenceTools) was used to carry out genotype calling from the q30 reads with the "--randomHaploid"
option, which calls haploid genotypes by randomly choosing one high quality base (Phred base quality score ≥ 30) on the 1240K
panel (pseudodiploid calls).
Given that UDG-half treated libraries still preserve a certain amount of C to T deamination at the last 2 bp of the mapped fragments,
the transition alleles were called after masking 2 bp at both ends of the q30 reads with the trimBam module of bamUtil v.1.0.13 (Jun
et al., 2015) while transversions were called on the un-masked reads.
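
The calling logic described above (random sampling of one high-quality base, with read ends masked only at transition sites) can be illustrated with a minimal sketch. This is not the pileupCaller or bamUtil implementation: the per-site observation layout, the function names and the parameter names are assumptions made for illustration only.

```python
import random

# Allele pairs affected by cytosine deamination (C>T and its complement G>A).
TRANSITION_PAIRS = {frozenset("CT"), frozenset("AG")}


def is_transition(ref: str, alt: str) -> bool:
    """True when the two alleles of a SNP form a transition (C/T or A/G)."""
    return frozenset((ref, alt)) in TRANSITION_PAIRS


def call_pseudo_haploid(observations, ref, alt, mask_bp=2,
                        min_base_quality=30, rng=random):
    """Pick one allele at random from the reads covering a SNP.

    observations: iterable of (base, base_quality, distance_to_nearest_read_end),
    where distance 0 is the terminal base of a read (hypothetical layout).
    Returns the sampled allele, or None when no usable observation remains.
    """
    mask_ends = is_transition(ref, alt)
    usable = [
        base
        for base, quality, distance in observations
        if quality >= min_base_quality
        and base in (ref, alt)
        and not (mask_ends and distance < mask_bp)
    ]
    return rng.choice(usable) if usable else None


# C/T site: the terminal T (possible damage) is masked, so only C can be drawn.
print(call_pseudo_haploid([("T", 35, 0), ("C", 40, 12)], ref="C", alt="T"))
# A/C transversion site: terminal bases are kept.
print(call_pseudo_haploid([("C", 35, 0)], ref="A", alt="C"))
```

Masking only at transition sites keeps as many observations as possible at transversions, which are not affected by residual deamination.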
Sequencing data (bam files) were screened for SNPs collected from the Y chromosomal tree of ISOGG version 15.34 (https://isogg.org/tree/)
by Yleaf v1.0 (Ralf et al., 2018), which resulted in Y-haplogroup classifications that were checked manually and presented
with defining terminal SNPs in Table S1. Sequences from mitogenome capture were aligned with bwa aln applying parameters -o 1 -e
5 -n 0.01 -l 32, after barcode trimming and merging the fastq files. Mapping quality was ensured by samtools view command and
samtools rmdup removed the PCR duplicates from the bam files. We used Schmutzi to obtain the consensus sequence of the
Mitochondrial DNA and HaploGrep2 (Kloss-Brandstätter et al., 2011) to assign haplogroups (Table S1).

Imputation
The 2-bp masked q30 reads of 66 Avar individuals were used to call genotype likelihoods (GL) with GATK v3.5 (DePristo et al., 2011).
GL were called for all the 29,083,171 biallelic positions with a minor allele count of 5 or higher contained in the 1000 Genomes Phase 3
release (1KG; 1000 Genomes Project Consortium et al., 2015) using the "UnifiedGenotyper" module with a minimum base quality score
of 30 ("-mbq 30"). These GL were then used for statistical imputation via GeneImp v1.4 software (Spiliopoulou et al., 2017) by
imputing all the 1KG SNPs and using all the 2,504 individuals of the 1KG statistically phased genomes as reference dataset.
Imputation was run independently using three different window lengths "kl" {15, 20, 25}, following the developers’ instructions.
For each imputed SNP the average genotype call probability (GP) across the three different runs was considered and only the
SNPs overlapping the 1240K panel were then finally extracted and merged with the 1240KHO dataset. We first qualitatively
assessed the performance of imputation considering different posterior probability (PP) thresholds for calling a non-missing
genotype (i.e. 0.9, 0.95, 0.99, 0.999). This was done by comparing the results from the imputed and pseudodiploid data on the
following analyses: projecting them on the PCA calculated on present-day Eurasian populations, f4-statistics in the form f4(Chimp,
Reference; indX_pseudohap, indX_imputed), runs of homozygosity (Figure S4B) and pairwise mismatch rate (pMMR) using the tool
pMMRCalculator (https://github.com/TCLamnidis/pMMRCalculator) that extends the pMMR formula to consider diploid data with
heterozygous genotypes that could result in a partial match (i.e. 0.5, if one allele matches but not the other) (Figure S2; Table S1).
The 0.9 PP threshold was retained for downstream analyses as it overall gave results more consistent with the respective
pseudodiploid data.
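
As a rough illustration of the thresholding step, the sketch below averages genotype probabilities over several imputation runs and calls a genotype only when the averaged probability of the best genotype reaches the chosen threshold. The data layout (one probability triple per run) and the function name `call_from_runs` are assumptions for this example, and applying the threshold to the averaged probability is one plausible reading of the text rather than a confirmed detail of the pipeline.

```python
def call_from_runs(gp_runs, threshold=0.9):
    """Call a genotype from genotype probabilities of several imputation runs.

    gp_runs: list of (p_hom_ref, p_het, p_hom_alt) triples, one per run
             (hypothetical layout for this sketch).
    Returns the alternative-allele count (0, 1 or 2), or None (missing)
    when the averaged probability of the best genotype is below threshold.
    """
    n_runs = len(gp_runs)
    # Average each genotype's probability across runs.
    mean_gp = [sum(run[g] for run in gp_runs) / n_runs for g in range(3)]
    best = max(range(3), key=lambda g: mean_gp[g])
    return best if mean_gp[best] >= threshold else None
```

Raising `threshold` from 0.9 to 0.999 trades more missing genotypes for fewer uncertain calls, which is the trade-off the comparison of thresholds is meant to assess.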
To test for the effect of spurious base calls on imputation, we fit a linear regression model with the mean pMMR for each individual
as the response variable, and the mean coverage and contamination estimate as predictor variables using the R statistical software
package (R Development Core Team, 2021). Since we used contamination estimates from ANGSD, which are based on the X chromosome and
therefore available only for males, we restricted the analysis to male individuals and removed any closely related pairs. Using the boxcox function from the MASS package (Venables and Ripley, 2003), we identified
that a log transformation was required. Contamination was found to not be a significant predictor for the mean change in pMMR
estimates (p=0.152), but coverage was found to be significant (p=7.539e-9).

Evaluation of imputation performance


The pMMR calculated on the pseudodiploid data and on the diploid imputed data should be equivalent for the same pair of individuals. Hence we
developed an algorithm based on the mean difference of the pMMR values to identify an empirical threshold for including or excluding
imputed samples, while simultaneously taking into account the difference in the number of overlapping sites due to sample quality
and imputation.
Consider $N$ individuals with associated, ordered mean coverage $c = (c_1, \ldots, c_N)$. For individuals $j$ and $k$, let $n^h_{jk}$ and $n^i_{jk}$ be the
number of overlapping sites, and $x^h_{jk}$ and $x^i_{jk}$ be the number of observed mismatches, for the pseudo-haploid and the
imputed data, respectively. If we assume that these mismatch counts follow a binomial distribution, i.e., $x^h_{jk} \sim B(n^h_{jk}, p^h_{jk})$ and
$x^i_{jk} \sim B(n^i_{jk}, p^i_{jk})$, then we can calculate a z-score for the mismatch counts of the form

$$z_{jk} = \frac{\hat{p}^h_{jk} - \hat{p}^i_{jk}}{\sqrt{\hat{p}_{jk}\,(1 - \hat{p}_{jk}) \big/ (n^h_{jk} + n^i_{jk})}},$$

where

$$\hat{p}^h_{jk} = x^h_{jk} / n^h_{jk}, \quad \hat{p}^i_{jk} = x^i_{jk} / n^i_{jk} \quad \text{and} \quad \hat{p}_{jk} = (x^h_{jk} + x^i_{jk}) / (n^h_{jk} + n^i_{jk}).$$

We then calculate a mean z-score for all pairwise z-scores of the form

$$\bar{z} = \frac{2}{N(N-1)} \sum_{j,k} z_{jk},$$

with standard deviation $s_{\bar{z}}$ found via 1000 bootstrap samples.

We then find the smallest value of $c_m$ such that if we omit individuals with mean coverage less than or equal to $c_m$, we observe that

$$\left| \bar{z} / s_{\bar{z}} \right| < 2,$$

indicating that the pMMR values for the pseudo-haploid and the imputed pairs are not significantly different (Figure S2A).
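
The z-score comparison described above can be sketched in Python as follows. The function names, the input layout (one tuple of counts per pair) and the choice to bootstrap by resampling the pairwise z-scores with replacement are assumptions made for this example; the text does not specify what is resampled. The denominator follows the form of the z-score as given in the text.

```python
import math
import random

def pair_z(x_h, n_h, x_i, n_i):
    """z-score comparing pseudo-haploid and imputed mismatch rates for one pair.

    x_h, n_h: mismatches and overlapping sites in the pseudo-haploid data.
    x_i, n_i: mismatches and overlapping sites in the imputed data.
    """
    p_h = x_h / n_h
    p_i = x_i / n_i
    p = (x_h + x_i) / (n_h + n_i)          # pooled mismatch rate
    return (p_h - p_i) / math.sqrt(p * (1 - p) / (n_h + n_i))

def mean_z_and_sd(pairs, n_boot=1000, seed=0):
    """Mean pairwise z-score and its bootstrap standard deviation.

    pairs: list of (x_h, n_h, x_i, n_i) tuples, one per pair of individuals.
    The bootstrap resamples pairwise z-scores (an assumption of this sketch).
    """
    zs = [pair_z(*p) for p in pairs]
    z_bar = sum(zs) / len(zs)
    rng = random.Random(seed)
    boots = []
    for _ in range(n_boot):
        sample = [rng.choice(zs) for _ in zs]
        boots.append(sum(sample) / len(sample))
    mu = sum(boots) / n_boot
    sd = math.sqrt(sum((b - mu) ** 2 for b in boots) / (n_boot - 1))
    return z_bar, sd
```

A coverage cutoff search would then repeat `mean_z_and_sd` on the pairs that remain after dropping low-coverage individuals, stopping at the first cutoff for which the absolute ratio of the two returned values falls below 2.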

Genetic relatedness
Genetic relatedness between individuals was assessed by measuring the pairwise mismatch rate (pMMR, i.e., the rate of mismatching
alleles) between each pair of individuals (Jeong et al., 2018; Kennett et al., 2017). The pairwise mismatch rate provides an indication of
close genetic relationships, such as identical individuals/twins and first- and second-degree relatives (Jeong et al., 2018; Kennett et al.,
2017). We also calculated a similar statistic implemented in READ (Monroy Kuhn et al., 2018), which instead uses a windowed approach
that allows standard errors to be calculated for the estimated degrees of relatedness, and tests for relationships up to the second degree.
Finally, we also employed lcMLkin (Lipatov et al., 2015), a method that instead uses genotype likelihoods to estimate kinship
parameters, allowing us to differentiate between parent/child and sibling/sibling relationships for first degree relatives (Figure S4C).
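
For pseudo-haploid data, the pairwise mismatch rate reduces to a simple count over the sites covered in both individuals. The following is a minimal sketch, assuming each individual is represented as a list with one allele per site and `None` for missing data; the function name `pmmr` is hypothetical and this is not the code of any of the tools cited above.

```python
def pmmr(alleles_a, alleles_b):
    """Pairwise mismatch rate between two pseudo-haploid individuals.

    alleles_a, alleles_b: equal-length sequences with one allele per site
                          (e.g. 'A', 'C', 'G', 'T') and None where missing.
    Returns (mismatch_rate, n_overlapping_sites); the rate is NaN when the
    two individuals share no covered site.
    """
    overlap = 0
    mismatches = 0
    for a, b in zip(alleles_a, alleles_b):
        if a is None or b is None:
            continue  # site not covered in both individuals
        overlap += 1
        if a != b:
            mismatches += 1
    rate = mismatches / overlap if overlap else float("nan")
    return rate, overlap
```

Returning the number of overlapping sites alongside the rate makes it easy to discard pairs whose estimate rests on too few SNPs.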

Compilation of population genetic data


The genotype data of 66 individuals was merged with a reference genome-wide panel of 2280 modern individuals genotyped with the
microarray technology using the commercial HumanOrigins chip (Jeong et al., 2019; Lazaridis et al., 2016; Patterson et al., 2012) and
previously published ancient individuals’ genotypes obtained with the same 1240K capture sequencing or a pull down of 1,240K sites
from shotgun sequencing data (Allentoft et al., 2015; Amorim et al., 2018; de Barros Damgaard et al., 2018; Damgaard et al., 2018; Fu
et al., 2016; Gnecchi-Ruscone et al., 2021; Jeong et al., 2018, 2019, 2020; Krzewińska et al., 2018; Lamnidis et al., 2018; Lazaridis
et al., 2016; Li et al., 2018; Mathieson et al., 2015, 2018; McColl et al., 2018; Narasimhan et al., 2019; Ning et al., 2020; Raghavan et al.,
2014; Sikora et al., 2019; Unterländer et al., 2017) downloaded from the Allen Ancient DNA Resource
(https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data) or merged in-house. We formed
two different datasets used for the different analyses as specified below: the "1240K" dataset containing all the SNPs overlapping
the 1,240K panel and the "1240KHO" dataset containing the SNPs overlapping with the HumanOrigins SNP chip (~600K sites).

Population genetic analysis


Principal component analysis (PCA) was calculated with smartpca v16000 in EIGENSOFT v6.0.1 package (Patterson et al., 2006) on
the 1240KHO dataset using the lsqproject and the autoshrink options to project the genotypes of the ancient individuals (containing
variable amounts of missing data) on top of the principal components calculated on the set of modern populations. PCA was
computed on the set of West to East Eurasian populations excluding South Asians of the 1240KHO dataset as well as on a set of
West Eurasian only groups (comprising Continental Europe, the Mediterranean area and the Caucasus only; Figures 2 and
S1). For the full list of modern individuals used, see Gnecchi-Ruscone et al. (2021).
The software qpAdm (v632) of the ADMIXTOOLS package (Patterson et al., 2012) was used to run the f4-statistic based ancestry
analyses on the 1240K dataset (Haak et al., 2015; Reich et al., 2012). Standard errors (SE) for the computed f-statistics were
estimated using a block jackknife with 5 centiMorgan (cM) blocks. We used the default ‘‘allsnps: NO’’ parameter, therefore calculating
all the underlying f4-statistics using the SNP overlap between all groups for each test, and excluded three individuals with < 200,000
SNPs covered on the 1240K dataset. Two sets of outgroups were used that included representatives of Eurasian and relevant
non-Eurasian ancient lineages when available, and present-day proxies otherwise. OG1 includes two Paleolithic human branches (Villabruna
and MA1): Mbuti.DG, Levant_N, Anatolia_N, Iran_N, Villabruna, Onge.DG, Mixe.DG, DevilsCave_N.SG, MA1, Kolyma_M.SG, YR_LN.
OG2 includes more recent representatives for these lineages (Iron_Gates_HG, EHG, Russia_Bolshoy): Mbuti.DG, Levant_N,
Anatolia_N, Iran_N, Iron_Gates_HG, EHG, Onge.DG, Mixe.DG, DevilsCave_N.SG, Russia_Bolshoy, Kolyma_M.SG, YR_LN. We
did not observe major differences between the two sets, showing the robustness of the models to certain variation in the outgroups.
We did observe cases where OG2 had higher resolution (i.e. higher capacity to distinguish between different combinations of close
proxies) because of the 200,000 SNPs increase in coverage using this outgroup set for each test. We therefore considered OG2 for
the individual-based models for drawing historical inferences. qpAdm modeling on the imputed and local ancestry masked data was
run on the 1240KHO dataset (as the HO modern populations were used to infer ancestry tracts, only the overlapping sites could be
used) and therefore the OG2 outgroup set was slightly modified to include the higher number of individuals available on the HO panel
for the modern populations (HO populations without the .DG suffix): Mbuti, Levant_N, Onge, Iran_N, Iron_Gates_HG, EHG, Mixe,
Anatolia_N, DevilsCave_N.SG, Russia_Bolshoy, Kolyma_M.SG, YR_LN.
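
To illustrate how a block jackknife over 5 cM blocks yields a standard error, the sketch below uses a plain delete-one-block jackknife on a statistic that is a mean of per-SNP values. This is a simplification for illustration: the function names and input layout are assumptions, and the sketch does not reproduce the weighting scheme or the code of ADMIXTOOLS.

```python
import math

def assign_block(chrom, pos_cm, block_cm=5.0):
    """Map a SNP to a jackknife block from its chromosome and genetic position."""
    return (chrom, int(pos_cm // block_cm))

def block_jackknife(block_sums, block_counts):
    """Delete-one-block jackknife for a mean of per-SNP statistic values.

    block_sums[b]:   sum of the per-SNP values falling in block b.
    block_counts[b]: number of SNPs in block b.
    Returns (estimate, standard_error).
    """
    g = len(block_sums)
    total_sum = sum(block_sums)
    total_n = sum(block_counts)
    estimate = total_sum / total_n
    # Recompute the statistic with each block left out in turn.
    leave_one_out = [
        (total_sum - s) / (total_n - n)
        for s, n in zip(block_sums, block_counts)
    ]
    loo_mean = sum(leave_one_out) / g
    variance = (g - 1) / g * sum((t - loo_mean) ** 2 for t in leave_one_out)
    return estimate, math.sqrt(variance)
```

Dropping whole blocks rather than single SNPs is what accounts for linkage between neighboring sites, which would otherwise make the standard error too small.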
In performing the 2- or 3-way individual-based qpAdm modeling reported in Figure 3A we used the following rationale: 1) if the
2-way models on the pseudodiploid data using AR_Xianbei_P_2c + a combination of local and steppe ‘‘western’’ sources had the
resolution to discriminate between a local and a steppe source, these results were taken (Figures S1C and S7; Tables S2 and S4).
2) For individuals showing no evident signs of admixture with non-European components, 2-way models contrasting different
local sources were tested (Figure S5B). 3) For the DTI elite individuals that showed a higher degree of homogeneity (genetic and
archaeological), group-based analyses were also performed and these higher-resolution tests were considered in the final selection
(Table S3). 4) For the cases where the grouping would not have been meaningful (e.g. because of the high genetic heterogeneity of the
Transtisza group) or when the group-based models did not completely resolve the admixing sources (i.e. both local and steppe
sources work for the DTI_late_elite), we performed individual-based 3-way models including in the same model the competing local
vs steppe sources + AR_Xianbei_P_2c (Figure S1D). For the DTI late case we performed additional group-based tests with two
different subgroups showing opposing signals in the 3-way models (DTI_late_elite1 and DTI_late_elite2). We finally verified these
results based on 2-way individual-based models using the masked local ancestry data (Figure S2B).
DATES v.753 (https://github.com/priyamoorjani/DATES) was used to estimate when the eastern steppe admixture events
occurred in the Avar-period individuals/populations. This method is based on the same general principle as other admixture dating
methods (Loh et al., 2013; Narasimhan et al., 2019), and assuming a 2-way admixture event (i.e. two admixing sources) it captures the
decay of ancestry covariance coefficients (AC) which can be calculated between every pair of available SNPs over increasing genetic
distance windows (Narasimhan et al., 2019). An exponential function is then fitted to the decay of Weighted AC in order to infer the
number of generations since admixture (Loh et al., 2013). As sources, we used the Sarmatians as both the pre-Avar and pre-Hunnic
period local population, and as the eastern steppe proxy we chose the temporally distal but, more importantly, un-admixed and
well-covered LBA/IA group of Ulaanzuukh_SlabGrave in Mongolia, an ANA proxy (Jeong et al., 2018; Figures 4 and S3).
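
The principle of fitting an exponential to the decay of ancestry covariance can be sketched as follows. This is not the DATES implementation: it assumes a pure exponential of the form A·exp(−n·d), with d the genetic distance in Morgans and n the number of generations since admixture, and it fits that curve by ordinary least squares on the log scale, which requires all covariance values to be positive. The function name `fit_decay` and the input layout are hypothetical.

```python
import math

def fit_decay(dist_cm, covariance):
    """Estimate generations since admixture from an ancestry covariance decay.

    dist_cm:    genetic distance bins in centiMorgans.
    covariance: weighted ancestry covariance per bin (must all be > 0
                for the log-linear fit used in this sketch).
    Returns (generations, amplitude) for covariance = amplitude * exp(-generations * d),
    with d expressed in Morgans.
    """
    xs = [d / 100.0 for d in dist_cm]            # cM -> Morgans
    ys = [math.log(c) for c in covariance]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)
```

A steeper decay corresponds to more generations since admixture, because recombination has had more time to break up ancestry tracts.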

Haplotype phasing and local ancestry analyses


Haplotype phasing for the imputed ancient data together with the set of worldwide modern populations present in the 1240KHO
dataset was performed using SHAPEIT2 v2.r790 (Delaneau et al., 2013) with default parameters and using HapMap phase 3
recombination maps.
Local ancestry analyses were performed using MOSAIC v1.3 (Salter-Townshend and Myers, 2019). For each analysis group, we
used 132 present-day Eurasian populations as a donor panel and ran MOSAIC assuming two or three ancestry components (-a 2 or
-a 3). Otherwise, we used default parameters. After running MOSAIC, we extracted genomic segments with high posterior probability
for being assigned to each ancestry. For each ancestry component, we took a position from each phased haplotype if the posterior
probability from MOSAIC is 0.9 or higher. Then we merged two phased haplotypes from each individual to create ancestry-specific
genotypes for each individual and ancestry component. If both alleles passed the threshold, we took the diploid genotype. If only one
allele passed the threshold (either the other belongs to other ancestry components or it has low posterior probability), we took the
single allele as a pseudo-haploid genotype. This was to maximize per-ancestry genotype coverage.
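
The merging rule described above can be written compactly. The sketch below handles a single site of a single individual; the function name and the tuple-based representation are assumptions for illustration and are not taken from MOSAIC or from the authors' scripts.

```python
def ancestry_specific_genotype(alleles, posteriors, threshold=0.9):
    """Build the ancestry-specific genotype at one site for one individual.

    alleles:    (a1, a2), the alleles carried on the two phased haplotypes.
    posteriors: (p1, p2), posterior probability that each haplotype belongs
                to the target ancestry at this site.
    Returns a tuple with two alleles (diploid genotype), one allele
    (pseudo-haploid genotype) or no allele (missing).
    """
    return tuple(a for a, p in zip(alleles, posteriors) if p >= threshold)
```

For example, `ancestry_specific_genotype(("A", "G"), (0.95, 0.50))` keeps only the first haplotype's allele, giving a pseudo-haploid call at that site.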
As an independent validation, local ancestry was inferred with a different method, RFMix v2.03 (Maples et al., 2013), which
identifies chunks of contiguous ancestry in phased data using a conditional random field (CRF) parametrized by random
forests and trained on a reference panel of haplotypes from two or three distinct sets of populations. As for MOSAIC, phased
1240KHO modern populations from Europe, the Caucasus and the Middle East were used as the western Eurasian reference (WE), while
East Asians and Siberians were used as the eastern Eurasian reference (EE). The default of 8 generations since admixture was used,
as well as the default CRF and random forest window sizes. The most likely assignment of subpopulation per CRF region (the ‘‘.msp.tsv’’
output) was used for each haploid genome to independently mask the regions assigned to EE or WE. The two haploid masked
genomes of each individual were re-collapsed by taking the haploid allele where one region is masked, and taking the diploid
call where both regions are assigned to the same ancestry.
We used the ancestry-specific genotype data to run PCA (Figure S2C) and qpWave/qpAdm (Figure S2B). The qpAdm results from
both local ancestry methods (MOSAIC and RFMix) were consistent with each other and with the main evidence obtained via the
competing 3-way qpAdm models performed on the pseudohaploid data. We otherwise advise caution in interpreting results
supported only by these or similar imputation- and haplotype-based methods, since their application to ancient data is still at an early
stage of research and has not been fully validated.

Runs of homozygosity
We tested for the presence of inbreeding in the studied individuals by calculating runs of homozygosity (ROH; i.e. long stretches of
homozygous segments along the genome of an individual).
On imputed data we called ROH using the function ‘‘--homozyg’’ implemented in PLINK v. 1.9 (Chang et al., 2015) with the following
parameters: ‘‘--homozyg-window-missing 25’’, ‘‘--homozyg-snp 65’’ and ‘‘--homozyg-kb 2000’’, on a dataset previously filtered for minor
allele frequency (‘‘--maf 0.1’’). We also applied hapROH (Ringbauer et al., 2021), a new method that allows estimating ROH from
pseudo-haploid data (1240K dataset). We studied ROH > 2, 4, 8, 12 and 20 cM. Detected ROH blocks above 4 cM were plotted using
the Python package implemented in hapROH (https://pypi.org/project/hapROH/) (Figure S4A).
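
To make the notion of a run of homozygosity concrete, the sketch below scans diploid genotypes along one chromosome and reports stretches of consecutive homozygous calls that exceed a minimum genetic length and SNP count. It is a deliberately naive illustration: it implements neither PLINK's sliding-window scan nor hapROH's model for pseudo-haploid data, and the function name, the input layout and the choice to let missing genotypes neither extend nor break a run are all assumptions.

```python
def find_roh(positions_cm, genotypes, min_cm=4.0, min_snps=65):
    """Report runs of consecutive homozygous genotypes on one chromosome.

    positions_cm: sorted genetic positions (cM) of the SNPs.
    genotypes:    alternative-allele counts (0, 1, 2) or None for missing.
    Returns a list of (start_cm, end_cm, n_snps) tuples.
    """
    runs = []
    start = end = None
    n_snps = 0
    # A trailing heterozygous sentinel flushes the final run.
    for pos, g in list(zip(positions_cm, genotypes)) + [(None, 1)]:
        if g is None:
            continue  # missing call: neither extends nor breaks the run
        if g == 1:
            # Heterozygous site ends the current run; keep it if long enough.
            if start is not None and n_snps >= min_snps and end - start >= min_cm:
                runs.append((start, end, n_snps))
            start = end = None
            n_snps = 0
        else:
            if start is None:
                start = pos
            end = pos
            n_snps += 1
    return runs
```

Real tools tolerate occasional heterozygous calls within a run to absorb genotyping error, which this sketch does not attempt.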

Supplemental figures

Figure S1. West Eurasia and Eurasia PCAs and Most relevant qpAdm models of DTI Avar-period elites/elite associated individuals
(A and B) (A) West Eurasian PCA and (B) Eurasian PCA. The symbols and color scheme are the same as in Figure 2.
(C) From left to right qpAdm models for: DTI early Avar period elite individuals; DTI middle Avar period elite individuals; DTI late Avar period elite individuals. Black
boxes highlight the outlier infant (A1817, DTI early Avar period) and child (I18744, DTI middle Avar period).
(D) Three-way competing models of DTI late-Avar-period individuals contrasting local + non-local sources in the same model. A transparency factor is added to
the models presenting poor fits (p < 0.05), related to Figures 1, 2, and 3 and Tables S1 and S2.

Figure S2. Summary of imputation quality control and post-imputation analyses


(A) Scatter plot of pairwise mismatch rate for the pseudohaploid data (x axis) versus the pairwise mismatch rate for the imputed data (y axis). Individuals are
filtered to have the proposed coverage cutoff of 1.433. Points are colored by number of overlapping SNPs in each pairwise mismatch rate calculation. The red line
is the y = x line. Transparent points indicate pMMR values for individuals that were not included because they fell below our coverage threshold.
(B) Two-way qpAdm models and PCAs for the West Eurasian component (i.e., masking East Asian ancestry tracts) of DTI late Avar period individuals with Mosaic
(left) and RFMix (right).
(C) West Eurasian (left) and East Eurasian (right) PCAs projecting the masked local ancestry tracts of DTI late individuals performed with both methods (Mosaic
and RFMix) and pseudohaploid data of individuals representative of local (Sarmatian period) and non-local (North_Caucasus_7C) individuals, related to Figure 3
and Table S1.

Figure S3. All individual-based Avar-period admixture dates obtained with DATES plotted against Euclidean distance from PC1 and PC2 of
the Rouran-period individual
(A) Late Avar period.
(B) Middle Avar period.
(C) Early Avar period. The Rouran-period genome is also shown with its estimated admixture date and at 0 distance from itself on the x axis.
(D–F) Ancestry covariance decay plot for the group-based analyses (WE = western, EA = eastern sources), related to Figure 4.

Figure S4. Assessment of genetic inbreeding and relatedness


(A) hapROH analyses reporting only individuals showing runs of homozygosity (ROH) tracts longer than 4 cM grouped in 4 length bins. On the right, the result of
simulated ROH patterns corresponding to recent inbreeding or small population sizes as provided by the hapROH pipeline.
(B) Comparison between ROH on imputed diploid calls run with PLINK ‘‘--homozyg’’ function and hapROH including also the 2–4-cM bins and only overlapping
individuals for comparison.
(C) Genetic relatedness estimated with READ (left) and lcMLkin (right), related to Figures 1 and 2.

Figure S5. qpAdm models for the non-DTI Avar-period individuals and the Sarmatian-period groups
(A) From left to right: two-way models for: early-Avar-period Transtisza group elite associated individuals; early-Avar-period Transtisza group non-elite asso-
ciated individuals; late-Avar-period non-elite associated individuals; early-Avar-period elite associated (Kölked-Feketekapu site) individuals.
(B) Two-way individual based qpAdm models contrasting different local sources for individuals unresolved with the two-way eastern + western proxies’ models.
Models for I6750 and I18185 are still non-optimal as they all have infeasible admixture proportions (>>100% for a single source) and large SE despite some having
p values > 0.05. Nevertheless, Szolad_south_6c as a unique source shows the least deviant models overall.
(C) qpAdm models for the two Sarmatian period groups: LS_P_DTI_4-5c and LS_P_Transtisza_4-5c. LS_P_Transtisza_4-5c can be modeled without any extra
component from the steppe and matches the Szolad_others_6c profile, while LS_P_DTI_4-5c requires additional gene flow from the steppe with different
surrogate proxies providing working models, related to Figure 3 and Tables S2 and S4.
