Bayesian Inference Of The Number Of Factors In Gene-expression Analysis: Application To Human Virus Challenge Studies

Bo Chen, M. Chen, John W. Paisley, A. Zaas, C. Woods, G. Ginsburg, A. Hero, J. Lucas, D. Dunson, L. Carin
Published 2009 · Computer Science, Biology, Medicine

Background: Nonparametric Bayesian techniques have recently been developed to extend the sophistication of factor models, allowing one to infer the appropriate number of factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis.

Results: Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), rhinovirus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches.

Conclusions: Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
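
As a rough illustration of how a sparseness-promoting prior can leave the number of factors to the data, the sketch below draws a binary factor-usage matrix from a finite beta-Bernoulli prior, a common truncated approximation to the Beta Process. It is a minimal sketch and not the paper's algorithm: the function name `sample_factor_usage`, the truncation level `K`, and the hyperparameters `a` and `b` are all assumptions made here for illustration, and no Gibbs or VB inference is performed.

```python
import random


def sample_factor_usage(n_samples, K, a=1.0, b=1.0, seed=0):
    """Draw a binary factor-usage matrix from a finite beta-Bernoulli prior.

    Assumed (illustrative) model, with K > 1 candidate factors:
        pi_k ~ Beta(a / K, b * (K - 1) / K)   usage probability of factor k
        z_ik ~ Bernoulli(pi_k)                does sample i use factor k?
    Factors whose column of Z is all False are effectively switched off.
    """
    rng = random.Random(seed)
    # Most pi_k are close to zero when K is large, which promotes sparsity.
    pi = [rng.betavariate(a / K, b * (K - 1) / K) for _ in range(K)]
    Z = [[rng.random() < pi[k] for k in range(K)] for _ in range(n_samples)]
    # Count factors used by at least one sample.
    active = sum(any(row[k] for row in Z) for k in range(K))
    return Z, active


Z, active = sample_factor_usage(n_samples=20, K=50)
print(f"{active} of 50 candidate factors are used by at least one sample")
```

Under this prior only a small subset of the `K` candidate factors tends to be active even when `K` is set generously. In a full model, posterior inference would update `pi` and `Z` from the expression data, so the count of active factors becomes an inferred quantity rather than a fixed choice.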

This paper is referenced by
Discovering Words from Continuous Speech
Niklas Vanhainen (2012)
Adaptive Learning and Unsupervised Clustering of Immune Responses Using Microarray Random Sequence Peptides
Anna Malin (2013)
A crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection
S. Fourati (2018)
Bayesian Nonparametric Models for Multiway Data Analysis
Zenglin Xu (2015)
Nonparametric Bayesian Dictionary Learning and Count and Mixture Modeling
M. Zhou (2013)
Word Discovery with Beta Process Factor Analysis
Niklas Vanhainen (2012)
Sparse latent factor models with interactions: Analysis of gene expression data
V. D. Mayrink (2013)
T-Relief : Feature Selection for Temporal High-Dimensional Gene Expression Data
M. Radovic (2016)
Bayesian Dictionary Learning for Single and Coupled Feature Spaces
Li He (2013)
Non-parametric Bayesian dictionary learning for image super resolution
L. He (2011)
Effect of environmental stress on regulation of gene expression in the yeast
E. Gross (2015)
Inference of gene networks associated with the host response to infectious disease
Zhe Gan (2016)
Classifying Temporal microarray Data by Selecting Informative genes
Q. Lou (2013)
Non‐rotational Tucker3 core simplification
M. Bayat (2016)
Max-Margin Discriminant Projection via Data Augmentation
B. Chen (2015)
Méthodes bayésiennes pour l'analyse génétique
Cecile Bazot (2013)
Bayesian Modeling and Computation for Mixed Data
Kai Cui (2012)
Analysis of Temporal High-Dimensional Gene Expression Data for Identifying Informative Biomarker Candidates
Q. Lou (2012)
Dynamic transcriptional signatures and network responses for clinical symptoms in influenza-infected human subjects using systems biology approaches
P. Linel (2014)
Predicting viral infection by selecting informative biomarkers from temporal high-dimensional gene expression data
Q. Lou (2012)
On Approximations of the Beta Process in Latent Feature Models
L. A. Labadi (2014)
Bayesian modeling of temporal properties of infectious disease in a college student population
Zhengming Xing (2014)
A Factor Analysis Approach for Clustering Patient Reported Outcomes.
J. Oh (2016)
Minimum redundancy maximum relevance feature selection approach for temporal gene expression data
M. Radovic (2016)
Modeling Time Series and Sequences: Learning Representations and Making Predictions
Wenzhao Lian (2015)
Synergy-COPD: a systems approach for understanding and managing chronic diseases
D. Gomez-Cabrero (2014)
A Host Transcriptional Signature for Presymptomatic Detection of Infection in Humans Exposed to Influenza H1N1 or H3N2
C. Woods (2013)
What was old is new again: using the host response to diagnose infectious disease
E. Ko (2015)
Personalized medicine: progress and promise.
I. Chan (2011)
A Diagnostic Platform Predicts Presymptomatic Exposure to Respiratory Viral Infection.
Monica Zamisch (2016)