
Bayesian Inference Of The Number Of Factors In Gene-expression Analysis: Application To Human Virus Challenge Studies

Bo Chen, M. Chen, John W. Paisley, A. Zaas, C. Woods, G. Ginsburg, A. Hero, J. Lucas, D. Dunson, L. Carin
Published 2009 · Computer Science, Biology, Medicine

Background: Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis.

Results: Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches.

Conclusions: Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
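To make concrete how a Beta Process style prior can leave the number of factors open, the following is a minimal sketch of a prior draw from a finite beta-Bernoulli approximation. It is not the authors' model or their Gibbs/VB inference. The function names, the truncation level `k_max`, the hyperparameters `a` and `b`, and the parameterization pi_k ~ Beta(a/K, b(K-1)/K) are assumptions chosen for illustration. Each candidate factor k receives a usage probability pi_k, each sample switches factor k on with that probability, and a factor counts as "used" only if at least one sample switches it on.

```python
import random


def sample_factor_usage(n_samples, k_max, a=2.0, b=1.0, seed=0):
    """Draw a binary sample-by-factor usage matrix from a finite
    beta-Bernoulli prior (illustrative assumption, not the paper's code)."""
    rng = random.Random(seed)
    # Usage probability for each of the k_max candidate factors.
    # With a/k_max small, most pi_k fall near zero, so few factors are used.
    pis = [
        rng.betavariate(a / k_max, b * (k_max - 1) / k_max)
        for _ in range(k_max)
    ]
    # z[i][k] = 1 if sample i uses factor k, else 0.
    z = [[1 if rng.random() < p else 0 for p in pis] for _ in range(n_samples)]
    return pis, z


def active_factors(z):
    """Indices of factors used by at least one sample."""
    return [k for k in range(len(z[0])) if any(row[k] for row in z)]


pis, z = sample_factor_usage(n_samples=20, k_max=50)
used = active_factors(z)
print(len(used), "of", len(pis), "candidate factors are used in this draw")
```

In a full model, the binary matrix would gate real-valued factor scores or loadings, and posterior inference over it (for example by Gibbs sampling or VB, as the abstract describes) would determine how many factors the data support. The sketch only shows why the prior favors using a small subset of the candidates.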