Extensions of BLUP Models for Genomic Prediction in Heterogeneous Populations: Application in a Diverse Switchgrass Sample

Guillaume P Ramstein, M. D. Casler
Published 2017 · Biology

Genomic prediction is a useful tool to accelerate genetic gain in selection using DNA marker information. However, this technology usually relies on models that are not designed to accommodate population heterogeneity, which results from differences in marker effects across genetic backgrounds. Previous studies have proposed diverse approaches to cope with population heterogeneity: (i) ignoring it, thereby relying on the robustness of standard approaches; (ii) reducing it, by selecting homogeneous subsets of individuals in the sample; or (iii) modelling it, by using interactive models. In this study we assessed all three approaches, applying existing and novel procedures for each of them. All procedures developed are based on deterministic optimizations, can account for heteroscedasticity, and are applicable in contexts of admixed populations. In a case study on a diverse switchgrass sample, we compared the procedures to a control where predictions rely on homogeneous subsamples. Compared to the control, ignoring heterogeneity was often not detrimental, and sometimes beneficial, to prediction accuracy. Reducing heterogeneity did not result in further increases in accuracy. However, in scenarios of limited subsample sizes, a novel procedure, which accounted for redundancy within subsamples, outperformed the existing procedure, which only considered relationships to selection candidates. Modelling heterogeneity resulted in substantial increases in accuracy in the cases where accounting for population heterogeneity yielded a highly significant improvement in fit. Our study exemplifies the advantages and limits of the various approaches, which are promising in various contexts of population heterogeneity, e.g. prediction based on historical datasets or dynamic breeding.
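
As a rough illustration of the difference between ignoring and modelling heterogeneity, the sketch below contrasts a standard genomic BLUP kernel with a simple interaction-style kernel that adds group-specific marker deviations. It is not the procedure developed in the paper. The function names, the simulated diploid genotypes, the two-group structure, the equal weighting of the two kernel terms, and the fixed variance ratio `lam` are all assumptions made for the example; in practice variance components would be estimated, for instance by REML.

```python
import numpy as np

def vanraden_grm(X):
    """Genomic relationship matrix from an (n x m) matrix of 0/1/2 allele counts."""
    p = X.mean(axis=0) / 2.0                  # allele frequency per marker
    Z = X - 2.0 * p                           # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def interaction_kernel(G, groups):
    """G plus a within-group copy of G (main effects plus group-specific deviations)."""
    same = (groups[:, None] == groups[None, :]).astype(float)
    return G + G * same                       # elementwise product zeroes cross-group blocks

def blup_predict(K, y, train, test, lam):
    """BLUP of test genetic values; lam is the assumed ratio sigma_e^2 / sigma_g^2."""
    mu = y[train].mean()                      # intercept taken as the training mean (simplification)
    K_tt = K[np.ix_(train, train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

# Simulated example: two groups whose marker effects only partly agree (hypothetical data).
rng = np.random.default_rng(0)
n, m = 60, 200
groups = np.repeat([0, 1], n // 2)
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)
shared = rng.normal(size=m)
effects = np.stack([shared + rng.normal(size=m), shared + rng.normal(size=m)])
y = np.einsum("ij,ij->i", X, effects[groups]) + rng.normal(scale=5.0, size=n)

idx = rng.permutation(n)
train, test = idx[:45], idx[45:]

G = vanraden_grm(X)
K = interaction_kernel(G, groups)
pred_ignore = blup_predict(G, y, train, test, lam=1.0)   # approach (i): ignore heterogeneity
pred_model = blup_predict(K, y, train, test, lam=1.0)    # approach (iii): model heterogeneity
```

Approach (ii), reducing heterogeneity, would correspond here to restricting `train` to a subset of individuals before calling `blup_predict`; how that subset is chosen is the substance of the paper and is not reproduced in this sketch.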