Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data
TLDR
A statistical model is presented for inferring the patterns of population splits and mixtures in multiple populations, and it is shown that a simple bifurcating tree does not fully describe the data; in contrast, many migration events are inferred.
Abstract:
Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.
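The graph-based covariance idea in the abstract can be illustrated with a minimal sketch. It is not the TreeMix implementation. It assumes the usual Gaussian view of drift, in which the expected covariance between two populations' allele frequencies is proportional to the drift they share on the way from the root, and it treats an admixed population as a weighted average of two source populations. The toy tree, the branch lengths, the population names, and the helpers `tree_covariance` and `add_admixed` are all hypothetical, and covariances are expressed in units of drift (the ancestral heterozygosity factor is omitted). The weight 0.16 only echoes the scale of the Cambodian estimate quoted above; it is not derived from any data.

```python
from itertools import product

# Hypothetical toy tree: each population maps to the set of branches on its
# path from the root. Branch lengths are illustrative drift amounts.
BRANCH_LENGTH = {"r-ab": 0.02, "ab-a": 0.01, "ab-b": 0.03, "r-c": 0.05}
PATHS = {
    "A": {"r-ab", "ab-a"},
    "B": {"r-ab", "ab-b"},
    "C": {"r-c"},
}


def tree_covariance(paths, branch_length):
    """Expected covariance = total length of branches shared by two root paths."""
    cov = {}
    for p, q in product(paths, repeat=2):
        cov[p, q] = sum(branch_length[b] for b in paths[p] & paths[q])
    return cov


def add_admixed(cov, name, source, parent, w):
    """Add a population drawing fraction w from `source` and 1 - w from `parent`.

    Assumption: no further drift after the mixture event, so the new
    population's frequency is exactly w * source + (1 - w) * parent.
    """
    pops = sorted({p for p, _ in cov})
    out = dict(cov)
    for q in pops:
        # Covariance is linear in each argument.
        mixed = w * cov[source, q] + (1 - w) * cov[parent, q]
        out[name, q] = out[q, name] = mixed
    # Variance of a weighted sum of two correlated variables.
    out[name, name] = (
        w * w * cov[source, source]
        + (1 - w) ** 2 * cov[parent, parent]
        + 2 * w * (1 - w) * cov[source, parent]
    )
    return out


cov = tree_covariance(PATHS, BRANCH_LENGTH)
cov_mix = add_admixed(cov, "M", source="C", parent="A", w=0.16)

for pair in [("A", "B"), ("A", "C"), ("M", "B"), ("M", "C")]:
    print(pair, round(cov_mix[pair], 6))
```

In this toy setup a pure tree predicts zero covariance between A's lineage and C, whereas the mixed population M covaries with both B and C. Excess covariance of this kind, which no tree can produce, is the sort of signal that motivates adding migration edges to the graph.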
Citations
Journal Article
Ancient Admixture in Human History
Nick Patterson, Priya Moorjani, Yontao Luo, Swapan Mallick, Nadin Rohland, Yiping Zhan, Teri Genschoreck, Teresa Webster, David Reich
TL;DR: A suite of methods for learning about population mixtures is presented, implemented in a software package called ADMIXTOOLS; the methods support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture.
Journal Article
A high-coverage genome sequence from an archaic Denisovan individual
Matthias Meyer, Martin Kircher, Marie Theres Gansauge, Heng Li, Fernando Racimo, Swapan Mallick, Joshua G. Schraiber, Flora Jay, Kay Prüfer, Cesare de Filippo, Peter H. Sudmant, Can Alkan, Qiaomei Fu, Ron Do, Nadin Rohland, Arti Tandon, Michael Siebauer, Richard E. Green, Katarzyna Bryc, Adrian W. Briggs, Udo Stenzel, Jesse Dabney, Jay Shendure, Jacob O. Kitzman, Michael F. Hammer, Michael V. Shunkov, A.P. Derevianko, Nick Patterson, Aida M. Andrés, Evan E. Eichler, Montgomery Slatkin, David Reich, Janet Kelso, Svante Pääbo
TL;DR: The genomic sequence provides evidence for very low rates of heterozygosity in the Denisova, probably not because of recent inbreeding, but instead because of a small population size, and illuminates the relationships between humans and archaics, including Neandertals, and establishes a catalog of genetic changes within the human lineage.
Journal Article
Massive migration from the steppe was a source for Indo-European languages in Europe
Wolfgang Haak, Iosif Lazaridis, Nick Patterson, Nadin Rohland, Swapan Mallick, Bastien Llamas, Guido Brandt, Susanne Nordenfelt, Eadaoin Harney, Kristin Stewardson, Qiaomei Fu, Alissa Mittnik, Eszter Bánffy, Christos Economou, Michael Francken, Susanne Friederich, Rafael Garrido Pena, Fredrik Hallgren, Valery Khartanovich, Aleksandr Khokhlov, Michael Kunst, Pavel Kuznetsov, Harald Meller, Oleg Mochalov, Vayacheslav Moiseyev, Nicole Nicklisch, Sandra Pichler, Roberto Risch, Manuel Ángel Rojo Guerra, Christina Roth, Anna Szécsényi-Nagy, Joachim Wahl, Matthias Meyer, Johannes Krause, Dorcas Brown, David W. Anthony, Alan Cooper, Kurt W. Alt, David Reich
TL;DR: The authors generated genome-wide data from 69 Europeans who lived between 8,000 and 3,000 years ago by enriching ancient DNA libraries for a target set of almost 400,000 polymorphisms.
Journal Article
fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Data Sets
TL;DR: Efficient algorithms are developed for approximate inference of the model underlying the STRUCTURE program using a variational Bayesian framework, together with heuristic scores to identify the number of populations represented in a data set and a new hierarchical prior to detect weak population structure in the data.
Journal Article
Ancient human genomes suggest three ancestral populations for present-day Europeans
Iosif Lazaridis, Nick Patterson, Alissa Mittnik, Gabriel Renaud, Swapan Mallick, Karola Kirsanow, Peter H. Sudmant, Joshua G. Schraiber, Sergi Castellano, Mark Lipson, Bonnie Berger, Christos Economou, Ruth Bollongino, Qiaomei Fu, Kirsten I. Bos, Susanne Nordenfelt, Heng Li, Cesare de Filippo, Kay Prüfer, Susanna Sawyer, Cosimo Posth, Wolfgang Haak, Fredrik Hallgren, Elin Fornander, Nadin Rohland, Dominique Delsate, Michael Francken, Jean-Michel Guinet, Joachim Wahl, George Ayodo, Hamza A. Babiker, Graciela Bailliet, Elena Balanovska, Oleg Balanovsky, Ramiro Barrantes, Gabriel Bedoya, Haim Ben-Ami, Judit Bene, Fouad Berrada, Claudio M. Bravi, Francesca Brisighelli, George B.J. Busby, Francesco Calì, Mikhail Churnosov, David E. C. Cole, Daniel Corach, Larissa Damba, George van Driem, Stanislav Dryomov, Jean-Michel Dugoujon, Sardana A. Fedorova, Irene Gallego Romero, Marina Gubina, Michael F. Hammer, Brenna M. Henn, Tor Hervig, Ugur Hodoglugil, Aashish R. Jha, Sena Karachanak-Yankova, Rita Khusainova, Elza Khusnutdinova, Rick A. Kittles, Toomas Kivisild, William Klitz, Vaidutis Kučinskas, Alena Kushniarevich, Leila Laredj, Sergey Litvinov, Theologos Loukidis, Robert W. Mahley, Béla Melegh, Ene Metspalu, Julio Molina, Joanna L. Mountain, Klemetti Näkkäläjärvi, Desislava Nesheva, Thomas B. Nyambo, Ludmila P. Osipova, Jüri Parik, Fedor Platonov, Olga L. Posukh, Valentino Romano, Francisco Rothhammer, Igor Rudan, Ruslan Ruizbakiev, Hovhannes Sahakyan, Antti Sajantila, Antonio Salas, Elena B. Starikovskaya, Ayele Tarekegn, Draga Toncheva, Shahlo Turdikulova, Ingrida Uktveryte, Olga Utevska, René Vasquez, Mercedes Villena, Mikhail Voevoda, Cheryl A. Winkler, Levon Yepiskoposyan, Pierre Zalloua, Tatijana Zemunik, Alan Cooper, Cristian Capelli, Mark G. Thomas, Andres Ruiz-Linares, Sarah A. Tishkoff, Lalji Singh, Kumarasamy Thangaraj, Richard Villems, David Comas, Rem I. Sukernik, Mait Metspalu, Matthias Meyer, Evan E. Eichler, Joachim Burger, Montgomery Slatkin, Svante Pääbo, Janet Kelso, David Reich, Johannes Krause
TL;DR: It is shown that most present-day Europeans derive from at least three highly differentiated populations: west European hunter-gatherers, who contributed ancestry to all Europeans but not to Near Easterners; ancient north Eurasians related to Upper Palaeolithic Siberians; and early European farmers, who were mainly of Near Eastern origin but also harboured west European hunter-gatherer related ancestry.
References
Journal Article
The neighbor-joining method: a new method for reconstructing phylogenetic trees.
Naruya Saitou, Masatoshi Nei
TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Journal Article
Inference of population structure using multilocus genotype data
TL;DR: A model-based clustering method is proposed for using multilocus genotype data to infer population structure and assign individuals to populations; it can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Journal Article
Estimation of average heterozygosity and genetic distance from a small number of individuals
TL;DR: It is shown that the number of individuals to be used for estimating average heterozygosity can be very small if a large number of loci are studied and the average heterozygosity is low.
Journal Article
Application of Phylogenetic Networks in Evolutionary Studies
Daniel H. Huson, David Bryant
TL;DR: This article reviews the terminology used for phylogenetic networks, covers both split networks and reticulate networks (how they are defined and how they can be interpreted), and outlines the beginnings of a comprehensive statistical framework for applying split network methods.
Book
Solving least squares problems
TL;DR: Because the lm function provides many features, it is rather complicated; the function lsfit, which computes only the coefficient estimates and the residuals, is used as a model instead.