Open Access Journal Article

Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data

TLDR
A statistical model for inferring the patterns of population splits and mixtures in multiple populations is presented, and it is shown that a simple bifurcating tree does not fully describe the data; instead, many migration events are inferred.
Abstract
Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.
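To make the idea of a Gaussian approximation to genetic drift concrete, here is a minimal sketch. It is not the TreeMix implementation. It assumes a toy tree ((A,B),C), hypothetical drift parameters, and a simple mean-centered covariance estimator with no sample-size correction. All function names and values are illustrative assumptions. Populations that share more drift history (here A and B) should show higher covariance in allele frequency than pairs that do not.

```python
import random


def simulate_frequencies(n_snps, drift, seed=0):
    """Simulate allele frequencies on the toy tree ((A,B),C).

    `drift` maps a branch name to a hypothetical drift parameter c. The change
    in frequency along a branch is drawn as Normal(0, c * p0 * (1 - p0)), where
    p0 is the root frequency. Values are not clipped to [0, 1]; this is only
    reasonable for small c and intermediate p0.
    """
    rng = random.Random(seed)
    freqs = {"A": [], "B": [], "C": []}

    def step(c, het):
        # Gaussian drift increment with variance c * het.
        return rng.gauss(0.0, (c * het) ** 0.5)

    for _ in range(n_snps):
        p0 = rng.uniform(0.2, 0.8)          # root allele frequency
        het = p0 * (1.0 - p0)
        p_ab = p0 + step(drift["AB"], het)  # ancestor shared by A and B
        freqs["A"].append(p_ab + step(drift["A"], het))
        freqs["B"].append(p_ab + step(drift["B"], het))
        freqs["C"].append(p0 + step(drift["C"], het))
    return freqs


def sample_covariance(freqs):
    """Covariance of allele frequencies between populations.

    Each SNP is centered on its mean across populations, because the root
    frequency is not observed in real data.
    """
    names = sorted(freqs)
    n_snps = len(freqs[names[0]])
    cov = {(i, j): 0.0 for i in names for j in names}
    for k in range(n_snps):
        mean_k = sum(freqs[n][k] for n in names) / len(names)
        centered = {n: freqs[n][k] - mean_k for n in names}
        for i in names:
            for j in names:
                cov[(i, j)] += centered[i] * centered[j] / n_snps
    return cov


# Hypothetical branch drift parameters: A and B share a long internal branch.
drift = {"AB": 0.05, "A": 0.01, "B": 0.01, "C": 0.02}
W = sample_covariance(simulate_frequencies(20000, drift))
for pair in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(pair, round(W[pair], 5))
```

In this sketch a tree predicts a specific covariance pattern. When the observed covariance between two populations is larger than any tree can explain, that excess is the kind of signal that motivates adding a migration edge to the graph.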


Citations
Journal Article

Ancient Admixture in Human History

TL;DR: A suite of methods for learning about population mixtures is presented, implemented in a software package called ADMIXTOOLS; the methods support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture.
Journal Article

fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Data Sets

TL;DR: Efficient algorithms are developed for approximate inference of the model underlying the STRUCTURE program using a variational Bayesian framework, and useful heuristic scores are proposed to identify the number of populations represented in a data set, along with a new hierarchical prior to detect weak population structure in the data.
Journal Article

Ancient human genomes suggest three ancestral populations for present-day Europeans

Iosif Lazaridis, +136 more - 18 Sep 2014
TL;DR: It is shown that most present-day Europeans derive from at least three highly differentiated populations: west European hunter-gatherers, who contributed ancestry to all Europeans but not to Near Easterners; ancient north Eurasians related to Upper Palaeolithic Siberians; and early European farmers, who were mainly of Near Eastern origin but also harboured west European hunter-gatherer related ancestry.
References
Journal Article

The neighbor-joining method: a new method for reconstructing phylogenetic trees.

TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Journal Article

Inference of population structure using multilocus genotype data

TL;DR: A model-based clustering method is described for using multilocus genotype data to infer population structure and assign individuals to populations; it can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Journal Article

Estimation of average heterozygosity and genetic distance from a small number of individuals

TL;DR: It is shown that the number of individuals to be used for estimating average heterozygosity can be very small if a large number of loci are studied and the average heterozygosity is low.
Journal Article

Application of Phylogenetic Networks in Evolutionary Studies

TL;DR: This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted and outlines the beginnings of a comprehensive statistical framework for applying split network methods.
Book

Solving least squares problems

TL;DR: Because the lm function provides many features, it is rather complicated; the function lsfit, which computes only the coefficient estimates and the residuals, is used as a model instead.