Extensions Of BLUP Models For Genomic Prediction In Heterogeneous Populations: Application In A Diverse Switchgrass Sample

Guillaume P Ramstein, M. D. Casler
Published 2017 · Biology

Genomic prediction uses DNA marker information to accelerate genetic gain in selection. However, this technology usually relies on models that are not designed to accommodate population heterogeneity, which results from differences in marker effects across genetic backgrounds. Previous studies have proposed three ways to cope with population heterogeneity: (i) ignoring it, thereby relying on the robustness of standard approaches; (ii) reducing it, by selecting homogeneous subsets of individuals in the sample; or (iii) modelling it, by using interactive models. In this study we assessed all three approaches, applying existing and novel procedures for each of them. All procedures developed are based on deterministic optimizations, can account for heteroscedasticity, and are applicable to admixed populations. In a case study on a diverse switchgrass sample, we compared the procedures to a control where predictions rely on homogeneous subsamples. Compared to the control, ignoring heterogeneity was often not detrimental, and sometimes beneficial, to prediction accuracy. Reducing heterogeneity did not result in further increases in accuracy. However, in scenarios of limited subsample sizes, a novel procedure, which accounted for redundancy within subsamples, outperformed the existing procedure, which only considered relationships to selection candidates. Modelling heterogeneity resulted in substantial increases in accuracy in the cases where accounting for population heterogeneity yielded a highly significant improvement in fit. Our study illustrates the advantages and limits of these approaches, which are promising in various contexts of population heterogeneity, e.g., prediction based on historical datasets or dynamic breeding.
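
As a rough illustration of the baseline that these approaches extend, the sketch below fits a generic genomic BLUP (GBLUP) on simulated data and contrasts training on the whole sample ("ignoring" heterogeneity) with training on a single homogeneous subsample (the control). It is not the authors' procedure. The diploid 0/1/2 marker coding, the VanRaden-style relationship matrix, the fixed heritability `h2`, the simulated two-subpopulation data, and all function names are assumptions made for illustration only.

```python
import numpy as np


def vanraden_grm(M):
    """Genomic relationship matrix from an (n x m) matrix of 0/1/2 allele counts."""
    p = M.mean(axis=0) / 2.0                     # allele frequency per marker
    Z = M - 2.0 * p                              # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_predict(G, y, train, test, h2):
    """Predict phenotypes of `test` individuals from `train` individuals.

    h2 is an assumed heritability; lam = (1 - h2) / h2 is the ratio of
    residual to genetic variance. In practice it would be estimated (e.g. by REML).
    """
    lam = (1.0 - h2) / h2
    V = G[np.ix_(train, train)] + lam * np.eye(len(train))
    ones = np.ones(len(train))
    # Generalized least squares estimate of the intercept
    mu = ones @ np.linalg.solve(V, y[train]) / (ones @ np.linalg.solve(V, ones))
    alpha = np.linalg.solve(V, y[train] - mu)
    # BLUP of test genetic values: covariance with training set times alpha
    return mu + G[np.ix_(test, train)] @ alpha


# Simulated sample (hypothetical): two subpopulations whose allele
# frequencies differ and whose marker effects are only partly shared.
rng = np.random.default_rng(0)
n_per, m = 60, 300
freqs = rng.uniform(0.1, 0.9, size=(2, m))
M = np.vstack([rng.binomial(2, f, size=(n_per, m)) for f in freqs]).astype(float)
group = np.repeat([0, 1], n_per)

shared = rng.normal(size=m)
effects = np.stack([shared + 0.5 * rng.normal(size=m) for _ in range(2)])
g = np.array([M[i] @ effects[group[i]] for i in range(2 * n_per)])
y = g + rng.normal(scale=g.std(), size=g.size)   # roughly 50% heritability

G = vanraden_grm(M)
test = np.where(group == 0)[0][:20]              # selection candidates
train_within = np.where(group == 0)[0][20:]      # control: same subpopulation only
train_all = np.setdiff1d(np.arange(2 * n_per), test)  # ignore heterogeneity

pred_within = gblup_predict(G, y, train_within, test, h2=0.5)
pred_all = gblup_predict(G, y, train_all, test, h2=0.5)

for name, pred in [("within subsample", pred_within), ("whole sample", pred_all)]:
    print(name, "accuracy:", np.corrcoef(pred, g[test])[0, 1])
```

Which training set predicts better here depends on the simulated effect sharing and sample sizes, so the printed accuracies should not be read as reproducing the study's results. Reducing heterogeneity would correspond to choosing `train` by an optimization criterion, and modelling it would require adding subpopulation-specific terms to the covariance; neither is attempted in this sketch.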


