phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data.
Paul J. McMurdie, Susan Holmes
TLDR
The phyloseq project for R is a new open-source software package dedicated to the object-oriented representation and analysis of microbiome census data in R, which supports importing data from a variety of common formats, as well as many analysis techniques.
Abstract:
Background: The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data.
Results: Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research.
Conclusions: The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.
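Two of the techniques named in the abstract, agglomeration and diversity analysis, can be illustrated with a minimal sketch. The code below is a plain Python illustration of the concepts, not phyloseq's R API (phyloseq exposes these operations through R functions such as `tax_glom()` and `estimate_richness()`). The toy count table, the OTU-to-genus mapping, and the helper names `agglomerate` and `shannon` are all hypothetical and were chosen only for this example.

```python
import math
from collections import defaultdict

# Hypothetical toy census data: read counts per OTU for each sample.
counts = {
    "sampleA": {"otu1": 10, "otu2": 10, "otu3": 0},
    "sampleB": {"otu1": 5, "otu2": 0, "otu3": 15},
}

# Hypothetical taxonomy: the genus assigned to each OTU.
genus_of = {"otu1": "Bacteroides", "otu2": "Bacteroides", "otu3": "Prevotella"}


def agglomerate(counts, rank_of):
    """Sum OTU counts that share the same label at a taxonomic rank."""
    merged = {}
    for sample, otus in counts.items():
        totals = defaultdict(int)
        for otu, n in otus.items():
            totals[rank_of[otu]] += n
        merged[sample] = dict(totals)
    return merged


def shannon(abundances):
    """Shannon diversity index (natural log) of a collection of counts."""
    abundances = list(abundances)
    total = sum(abundances)
    if total == 0:
        return 0.0
    h = 0.0
    for n in abundances:
        if n > 0:  # zero counts contribute nothing to the index
            p = n / total
            h -= p * math.log(p)
    return h


# Agglomerate OTUs to genus level, then compute per-sample OTU-level diversity.
genus_counts = agglomerate(counts, genus_of)
shannon_by_sample = {s: shannon(otus.values()) for s, otus in counts.items()}
```

Here `genus_counts` holds one merged count per genus for each sample, and `shannon_by_sample` holds one alpha-diversity value per sample. A real analysis would work from an imported abundance table and taxonomy rather than hand-written dictionaries.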
Citations
Journal Article
Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2
Evan Bolyen, Jai Ram Rideout, Matthew R. Dillon, Nicholas A. Bokulich, Christian C. Abnet, Gabriel A. Al-Ghalith, Harriet Alexander, Eric J. Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E. Bisanz, Kyle Bittinger, Asker Daniel Brejnrod, Colin J. Brislawn, C. Titus Brown, Benjamin J. Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily K. Cope, Ricardo Silva, Christian Diener, Pieter C. Dorrestein, Gavin M. Douglas, Daniel M. Durall, Claire Duvallet, Christian F. Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M. Gauglitz, Sean M. Gibbons, Deanna L. Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin A. Huttley, Stefan Janssen, Alan K. Jarmusch, Lingjing Jiang, Benjamin D. Kaehler, Kyo Bin Kang, Christopher R. Keefe, Paul Keim, Scott T. Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan G. I. Langille, Joslynn S. Lee, Ruth E. Ley, Yong-Xin Liu, Erikka Loftfield, Catherine A. Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J. McIver, Alexey V. Melnik, Jessica L. Metcalf, Sydney C. Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A. Navas-Molina, Louis-Félix Nothias, Stephanie B. Orchanian, Talima Pearson, Samuel L. Peoples, Daniel Petras, Mary L. Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam R. Rivers, Michael S. Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R. Spear, Austin D. Swafford, Luke R. Thompson, Pedro J. Torres, Pauline Trinh, Anupriya Tripathi, Peter J. Turnbaugh, Sabah Ul-Hasan, Justin J. J. van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William A. Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C. Weber, Charles H. D. Williamson, Amy D. Willis, Zhenjiang Zech Xu, Jesse R. Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight, J. Gregory Caporaso
TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and R.K.P. and partial support was also provided by the following: grants NIH U54CA143925 and U54MD012388.
Journal Article
ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data
TL;DR: An R package, ggtree, provides programmable visualization and annotation of phylogenetic trees. It can read more tree file formats than other software and supports visualization of phylo, multiPhylo, phylo4, phylo4d, obkData and phyloseq tree objects defined in other R packages.
Journal Article
Waste not, want not: why rarefying microbiome data is inadmissible.
Paul J. McMurdie, Susan Holmes
TL;DR: It is advocated that investigators avoid rarefying altogether, and supporting statistical theory is provided that simultaneously accounts for library size differences and biological variability using an appropriate mixture model.
Journal Article
Microbiome Datasets Are Compositional: And This Is Not Optional.
TL;DR: The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis.
Journal Article
Human gut microbes impact host serum metabolome and insulin sensitivity
Helle Krogh Pedersen, Valborg Gudmundsdottir, Henrik Nielsen, Tuulia Hyötyläinen, Trine G. Nielsen, Benjamin A. H. Jensen, Kristoffer Forslund, Falk Hildebrand, Edi Prifti, Gwen Falony, Florence Levenez, Joël Doré, Ismo Mattila, Damian R. Plichta, Päivi Pöhö, Lars Hellgren, Manimozhiyan Arumugam, Shinichi Sunagawa, Sara Vieira-Silva, Torben Jørgensen, Jacob Bak Holm, Kajetan Trošt, Karsten Kristiansen, Susanne Brix, Jeroen Raes, Jun Wang, Torben Hansen, Peer Bork, Søren Brunak, Matej Orešič, S. Dusko Ehrlich, Oluf Pedersen
TL;DR: It is shown how the human gut microbiome impacts the serum metabolome and associates with insulin resistance in 277 non-diabetic Danish individuals and suggested that microbial targets may have the potential to diminish insulin resistance and reduce the incidence of common metabolic and cardiovascular disorders.
References
Journal Article
R: A language and environment for statistical computing.
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Book
ggplot2: Elegant Graphics for Data Analysis
TL;DR: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics.
Journal Article
QIIME allows analysis of high-throughput community sequencing data.
J. Gregory Caporaso, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D. Bushman, Elizabeth K. Costello, Noah Fierer, Antonio Gonzalez Peña, Julia K. Goodrich, Jeffrey I. Gordon, Gavin A. Huttley, Scott T. Kelley, Dan Knights, Jeremy E. Koenig, Ruth E. Ley, Catherine A. Lozupone, Daniel McDonald, Brian D. Muegge, Meg Pirrung, Jens Reeder, Joel Sevinsky, Peter J. Turnbaugh, William A. Walters, Jeremy Widmann, Tanya Yatsunenko, Jesse R. Zaneveld, Rob Knight
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Journal Article
Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities
Patrick D. Schloss, Sarah L. Westcott, Thomas Ryabin, Justine R. Hall, Martin Hartmann, Emily B. Hollister, Ryan A. Lesniewski, Brian B. Oakley, Donovan H. Parks, Courtney J. Robinson, Jason W. Sahl, Blaz Stres, Gerhard G. Thallinger, David J. Van Horn, Carolyn F. Weber
TL;DR: mothur is used as a case study to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments.