Journal ISSN: 1753-6561

BMC Proceedings 

BioMed Central
About: BMC Proceedings is an academic journal published by BioMed Central. It publishes mainly in the areas of Population and Gene. Its ISSN is 1753-6561, and it is open access. Over its lifetime, the journal has published 2481 publications, which have received 15704 citations. The journal is also known as BioMed Central proceedings.


Papers
Journal Article
TL;DR: This work assesses the power of the well-established DArT marker platform, combined with Illumina short-read sequencing, to generate a linkage map for a segregating outcrossed F1 population derived from E. grandis BRASUZ1, the donor of the Eucalyptus reference genome.
Abstract: Background Wider genome coverage and higher-throughput genotyping methods have become increasingly important to meet the resolution and speed necessary for a variety of applications in genomics and molecular breeding of forest trees. Developed more than 10 years ago [1], the Diversity Arrays Technology (DArT) has attracted increasing interest worldwide because it has efficiently satisfied the requirements of throughput, genome coverage and inter-specific transferability for over 40 different plant species to date, including Eucalyptus [2] and recently Pinus (Dione Alves-Freitas, this meeting). DArT is based on genome complexity reduction using restriction enzymes, followed by hybridization to microarrays to simultaneously assay hundreds to thousands of markers across a genome. Genome complexity reduction for genotyping has now been taken to another level by combining it with next generation sequencing (NGS) technologies. Such a strategy has been used for rapid SNP discovery in different organisms [3], and has been proposed as a way to genotype with RAD (Restriction-associated DNA) sequencing [4] and, more recently, with a similar method generally termed GbS (Genotyping-by-Sequencing) [5]. In this work we assessed the power of the now well-established DArT marker platform in combination with Illumina short-read sequencing to generate a linkage map for a segregating outcrossed F1 population derived from E. grandis BRASUZ1, the donor of the Eucalyptus reference genome.

298 citations

Journal Article
TL;DR: The elastic net, lasso, adaptive lasso and the adaptive elastic net all had similar accuracies but outperformed ridge regression and ridge regression BLUP in terms of the Pearson correlation between predicted GEBVs and the true genomic value as well as the root mean squared error.
Abstract: Background Genomic selection (GS) is emerging as an efficient and cost-effective method for estimating breeding values using molecular markers distributed over the entire genome. In essence, it involves estimating the simultaneous effects of all genes or chromosomal segments and combining the estimates to predict the total genomic breeding value (GEBV). Accurate prediction of GEBVs is a central and recurring challenge in plant and animal breeding. The existence of a bewildering array of approaches for predicting breeding values using markers underscores the importance of identifying approaches able to efficiently and accurately predict breeding values. Here, we comparatively evaluate the predictive performance of six regularized linear regression methods (ridge regression, ridge regression BLUP, lasso, adaptive lasso, elastic net and adaptive elastic net) for predicting GEBVs using dense SNP markers.
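The difference between the ridge, lasso and elastic net penalties named in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis. It assumes the textbook special case of an orthonormal marker matrix and the objective (1/2)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2, in which each estimate is a simple transformation of the ordinary least-squares effect. The function names and the example effects are hypothetical.

```python
def soft_threshold(b, lam):
    """Lasso-style shrinkage: move b toward zero by lam, setting small effects to exactly 0."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def ridge_shrink(b, lam):
    """Ridge shrinkage: scale every effect by 1 / (1 + lam); no effect becomes exactly 0."""
    return b / (1.0 + lam)


def elastic_net_shrink(b, lam1, lam2):
    """Naive elastic net: soft-threshold by lam1, then scale by 1 / (1 + lam2)."""
    return soft_threshold(b, lam1) / (1.0 + lam2)


# Hypothetical least-squares marker effects (illustrative values only).
ols_effects = [2.0, -0.3, 0.05, -1.2]
lam = 0.5

ridge = [ridge_shrink(b, lam) for b in ols_effects]
lasso = [soft_threshold(b, lam) for b in ols_effects]
enet = [elastic_net_shrink(b, lam, lam) for b in ols_effects]

for name, est in (("ridge", ridge), ("lasso", lasso), ("elastic net", enet)):
    print(name, [round(v, 3) for v in est])
```

Under these assumptions, ridge keeps every marker in the model with a reduced effect, while the lasso and elastic net set small effects to exactly zero. The adaptive variants, which reweight the penalty per marker, are not shown.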

274 citations

Journal Article
TL;DR: The predictive accuracy of random forests, stochastic gradient boosting (boosting) and support vector machines (SVMs) for predicting genomic breeding values using dense SNP markers was evaluated and the utility of RF for ranking the predictive importance of markers for pre-screening markers or discovering chromosomal locations of QTLs was explored.
Abstract: Genomic selection (GS) involves estimating breeding values using molecular markers spanning the entire genome. Accurate prediction of genomic breeding values (GEBVs) presents a central challenge to contemporary plant and animal breeders. The existence of a wide array of marker-based approaches for predicting breeding values makes it essential to evaluate and compare their relative predictive performances to identify approaches able to accurately predict breeding values. We evaluated the predictive accuracy of random forests (RF), stochastic gradient boosting (boosting) and support vector machines (SVMs) for predicting genomic breeding values using dense SNP markers and explored the utility of RF for ranking the predictive importance of markers for pre-screening markers or discovering chromosomal locations of QTLs. We predicted GEBVs for one quantitative trait in a dataset simulated for the QTLMAS 2010 workshop. Predictive accuracy was measured as the Pearson correlation between GEBVs and observed values using 5-fold cross-validation and between predicted and true breeding values. The importance of each marker was ranked using RF and plotted against the position of the marker and associated QTLs on one of five simulated chromosomes. The correlations between the predicted and true breeding values were 0.547 for boosting, 0.497 for SVMs, and 0.483 for RF, indicating better performance for boosting than for SVMs and RF. Accuracy was highest for boosting, intermediate for SVMs and lowest for RF but differed little among the three methods and relative to ridge regression BLUP (RR-BLUP).
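The accuracy measure described in this abstract, the Pearson correlation between predictions and observed values under 5-fold cross-validation, can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the out-of-fold predictions are pooled before computing one correlation, which the abstract does not specify. The names `cv_accuracy` and `fit_single_marker`, and the toy data, are hypothetical. The single-marker learner is only a stand-in for RF, boosting or an SVM.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(X, y, fit, k=5, seed=0):
    """Pooled out-of-fold Pearson correlation.

    fit(X_train, y_train) must return a function mapping one row to a prediction.
    """
    preds = [None] * len(y)
    for fold in kfold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = model(X[i])
    return pearson(preds, y)


def fit_single_marker(X_train, y_train):
    """Stand-in learner: least-squares regression on the first marker only."""
    x = [row[0] for row in X_train]
    mx, my = sum(x) / len(x), sum(y_train) / len(y_train)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y_train)) / sxx if sxx else 0.0
    return lambda row: my + slope * (row[0] - mx)


# Toy genotypes coded 0/1/2 and a phenotype driven by the first marker (made-up data).
X = [[i % 3, (i * 7) % 3] for i in range(30)]
y = [2.0 * row[0] + 0.1 * ((i * 5) % 7) for i, row in enumerate(X)]

print(round(cv_accuracy(X, y, fit_single_marker), 3))
```

Any learner that follows the same `fit` contract can be passed to `cv_accuracy`, so several methods can be compared on identical folds by reusing the same `seed`.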

210 citations

Journal Article
TL;DR: The data set simulated for Genetic Analysis Workshop 17 was designed to mimic a subset of data that might be produced in a full exome screen for a complex disorder and related risk factors in order to permit workshop participants to investigate issues of study design and statistical genetic analysis.
Abstract: The data set simulated for Genetic Analysis Workshop 17 was designed to mimic a subset of data that might be produced in a full exome screen for a complex disorder and related risk factors in order to permit workshop participants to investigate issues of study design and statistical genetic analysis. Real sequence data from the 1000 Genomes Project formed the basis for simulating a common disease trait with a prevalence of 30% and three related quantitative risk factors in a sample of 697 unrelated individuals and a second sample of 697 individuals in large, extended pedigrees. Called genotypes for 24,487 autosomal markers assigned to 3,205 genes and simulated affection status, quantitative traits, age, sex, pedigree relationships, and cigarette smoking were provided to workshop participants. The simulating model included both common and rare variants with minor allele frequencies ranging from 0.07% to 25.8% and a wide range of effect sizes for these variants. Genotype-smoking interaction effects were included for variants in one gene. Functional variants were concentrated in genes selected from specific biological pathways and were selected on the basis of the predicted deleteriousness of the coding change. For each sample, unrelated individuals and family, 200 replicates of the phenotypes were simulated.

172 citations

Journal Article
TL;DR: Two pathway analysis tools, ArrayUnlock and Ingenuity Pathways Analysis (IPA), are described for the post-analysis of microarray data, in the context of the EADGENE and SABRE post-analysis workshop.
Abstract: Once a list of differentially expressed genes has been identified from a microarray experiment, a subsequent post-analysis task is required to find the main biological processes associated with the experimental system. This paper describes two pathway analysis tools, ArrayUnlock and Ingenuity Pathways Analysis (IPA), for the post-analysis of microarray data, in the context of the EADGENE and SABRE post-analysis workshop. The dataset used in this study came from an experimental chicken infection performed to study the host reactions after a homologous or heterologous secondary challenge with two species of Eimeria. Analysing the same microarray data source with both commercial pathway analysis tools in parallel made it possible to identify several biological and/or molecular functions altered in the chicken Eimeria maxima infection model, including several immune system related pathways. Biological functions differentially altered in the homologous and heterologous second infection were identified. Similarly, the effect of timing in a homologous second infection was characterized by several biological functions. Functional analysis with ArrayUnlock and IPA provided information on functional differences in the three comparisons of the chicken infection, leading to similar conclusions. ArrayUnlock improved the annotation of the chicken genome by adding InterPro annotations to the data set file. IPA provides two powerful tools for understanding the pathway analysis results: the networks and the canonical pathways, which showed several pathways related to an adaptive immune response.

151 citations

Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2023    14
2022    7
2021    19
2020    16
2019    14
2018    60