Population structure and eigenanalysis
TLDR
Principal components analysis, an approach to studying population structure that was first applied to genetic data by Cavalli-Sforza and colleagues, is discussed, and results from modern statistics are used to develop formal significance tests for population differentiation.
Abstract:
Current methods for inferring population structure from genetic data do not provide formal significance tests for population differentiation. We discuss an approach to studying population structure (principal components analysis) that was first applied to genetic data by Cavalli-Sforza and colleagues. We place the method on a solid statistical footing, using results from modern statistics to develop formal significance tests. We also uncover a general “phase change” phenomenon in the ability to detect structure in genetic data, which emerges from the statistical theory we use. Its main implication is that, for a fixed but large dataset size, divergence between two populations (as measured, for example, by a statistic like FST) below a threshold is essentially undetectable, whereas a little above the threshold, detection is easy. This means that we can predict the dataset size needed to detect structure.
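The abstract does not state the threshold itself. As a rough illustration of how a threshold of this kind lets one predict the dataset size, the sketch below assumes the form usually quoted for two equal-size populations, FST ≈ 1/sqrt(n·m), where n is the number of markers and m the total number of individuals. That formula, and the function names `detection_threshold_fst` and `markers_needed`, are assumptions made for illustration and are not taken from the text above; the numbers are order-of-magnitude guides only.

```python
import math


def detection_threshold_fst(n_markers: int, n_samples: int) -> float:
    """Approximate FST below which structure is assumed undetectable.

    Assumption (not stated in the abstract): threshold ~ 1 / sqrt(n * m)
    for two equal-size populations, n markers and m individuals in total.
    """
    if n_markers <= 0 or n_samples <= 0:
        raise ValueError("n_markers and n_samples must be positive")
    return 1.0 / math.sqrt(n_markers * n_samples)


def markers_needed(fst: float, n_samples: int) -> float:
    """Invert the assumed threshold: markers at which `fst` sits at the threshold."""
    if fst <= 0 or n_samples <= 0:
        raise ValueError("fst and n_samples must be positive")
    return 1.0 / (fst * fst * n_samples)


if __name__ == "__main__":
    # Larger datasets push the assumed threshold down.
    for n, m in [(1_000, 100), (10_000, 100), (100_000, 1_000)]:
        print(f"n={n:>7} markers, m={m:>5} samples: "
              f"threshold FST ~ {detection_threshold_fst(n, m):.5f}")
    # Roughly how many markers are needed to reach FST = 0.001 with 1,000 samples.
    print(f"markers needed ~ {markers_needed(0.001, 1_000):.0f}")
```

Under this assumption the threshold depends only on the product of markers and samples, so doubling either one has the same effect. In practice one would want to sit comfortably above the threshold rather than at it.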