
Supervised Harvesting Of Expression Trees

T. Hastie, R. Tibshirani, D. Botstein, P. Brown
Published 2000 · Biology, Medicine

Background: We propose a new method for supervised learning from gene expression data. We call it 'tree harvesting'. This technique starts with a hierarchical clustering of genes, then models the outcome variable as a sum of the average expression profiles of chosen clusters and their products. It can be applied to many different kinds of outcome measures such as censored survival times, or a response falling in two or more classes (for example, cancer classes). The method can discover genes that have strong effects on their own, and genes that interact with other genes.

Results: We illustrate the method on data from a lymphoma study, and on a dataset containing samples from eight different cancers. It identified some potentially interesting gene clusters. In simulation studies we found that the procedure may require a large number of experimental samples to successfully discover interactions.

Conclusions: Tree harvesting is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worthy of further investigation.
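The idea in the abstract (cluster the genes hierarchically, then build the outcome model from cluster averages and their products) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a quantitative outcome only, uses 1 minus Pearson correlation with average linkage as the clustering distance, and replaces whatever selection rule the paper uses with a simple greedy residual fit; the function names (`hierarchical_clusters`, `cluster_average`, `harvest`) and the synthetic data are hypothetical.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def center(v):
    m = mean(v)
    return [x - m for x in v]

def corr_dist(a, b):
    """Distance between two genes: 1 - Pearson correlation of their profiles."""
    ca, cb = center(a), center(b)
    den = (sum(x * x for x in ca) * sum(x * x for x in cb)) ** 0.5
    return 1.0 - sum(x * y for x, y in zip(ca, cb)) / den if den > 0 else 1.0

def hierarchical_clusters(expr):
    """Average-linkage clustering of genes (assumed choice).
    Returns every node of the tree as a tuple of gene indices."""
    active = [(g,) for g in range(len(expr))]
    nodes = list(active)
    while len(active) > 1:
        pairs = [(i, j) for i in range(len(active)) for j in range(i + 1, len(active))]
        def linkage(p):
            return mean([corr_dist(expr[a], expr[b])
                         for a in active[p[0]] for b in active[p[1]]])
        i, j = min(pairs, key=linkage)
        merged = active[i] + active[j]
        active = [c for k, c in enumerate(active) if k not in (i, j)] + [merged]
        nodes.append(merged)
    return nodes

def cluster_average(expr, genes):
    """Average expression profile of a gene cluster, one value per sample."""
    return [mean([expr[g][s] for g in genes]) for s in range(len(expr[0]))]

def harvest(expr, y, n_terms=2):
    """Greedy sketch: at each step add the cluster average (or a product with an
    already chosen cluster average) that most reduces the residual sum of squares.
    Assumption: a stagewise residual fit stands in for the paper's selection rule."""
    profiles = [(c, cluster_average(expr, c)) for c in hierarchical_clusters(expr)]
    resid = center(y)
    model, first_order = [], []
    for _ in range(n_terms):
        cands = [(("avg", c), p) for c, p in profiles]
        for c1, p1 in first_order:  # product (interaction) candidates
            cands += [(("prod", c1, c2), [a * b for a, b in zip(p1, p2)])
                      for c2, p2 in profiles]
        best = None
        for label, feat in cands:
            f = center(feat)
            ss = sum(x * x for x in f)
            if ss == 0:
                continue
            beta = sum(r * x for r, x in zip(resid, f)) / ss
            gain = beta * beta * ss  # drop in residual sum of squares
            if best is None or gain > best[0]:
                best = (gain, label, beta, f, feat)
        _, label, beta, f, feat = best
        resid = [r - beta * x for r, x in zip(resid, f)]
        model.append((label, beta))
        if label[0] == "avg":
            first_order.append((label[1], feat))
    return model, resid

# Synthetic data (hypothetical): genes 0-2 track one latent signal, genes 3-5
# track another, and the outcome depends on the first signal only.
random.seed(0)
n = 30
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
expr = [[z + random.gauss(0, 0.1) for z in (z1 if g < 3 else z2)] for g in range(6)]
y = [2 * a + random.gauss(0, 0.1) for a in z1]

model, resid = harvest(expr, y)
for label, beta in model:
    print(label, round(beta, 3))
```

On this toy data the first selected term should be a cluster drawn from genes 0 to 2, since only those genes carry the signal that drives the outcome. Censored survival times and class outcomes, which the abstract also mentions, would need a different loss than the squared error used here.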
This paper is referenced by
Gene expression data analysis with a dynamically extended self-organized map that exploits class information
S. Mavroudi (2002)
Functional genomics approaches to understanding brain disorders.
P. Shilling (2002)
Efficient Sparse Modeling With Automatic Feature Grouping
Leon Wenliang Zhong (2012)
Consistent Group Identification and Variable Selection in Regression With Correlated Predictors
D. B. Sharma (2013)
Weighted Distance Based Discriminant Analysis: The R Package WeDiBaDis
I. Irigoien (2016)
Bayesian Sparse Learning for High Dimensional Data
Minghui Shi (2011)
A comparison of cluster analysis methods using DNA methylation data
K. Siegmund (2004)
A novel point density based validity index for clustering gene expression datasets
M. A. Wani (2017)
A graph laplacian prior for variable selection and grouping
F. Liu (2011)
Introduction to Pattern Recognition and Bioinformatics
P. Maji (2014)
Averaged gene expressions for regression.
M. Y. Park (2007)
Histogram based Hierarchical Data Representation for Microarray Classification
Philippe Jean Salembier Clairon (2012)
Hierarchical testing of variable importance
Nicolai Meinshausen (2008)
Structure-based variable selection for survival data
V. Lagani (2010)
Neuroimaging: Diagnostic Boundaries and Biomarkers
S. Galderisi (2019)
A nonparametric test of independence between 2 variables
B. Li (2017)
Regression Approaches for Microarray Data Analysis
M. Segal (2003)
An Empirical Study of Stability of Feature Selection Algorithms
L. Yu (2008)
Classification of High-throughput Data Using Correlation-shared Gene Clusters
Pingzhao Hu (2011)
Permutation-validated principal components analysis of microarray data
J. Landgrebe (2001)
Discussion of: Treelets - An adaptive multi-scale basis for sparse unordered data
P. Bickel (2008)
Bioinformatic methods for integrating whole-genome expression results into cellular networks.
D. Cavalieri (2005)
Survival associated pathway identification with group Lp penalized global AUC maximization
Z. Liu (2010)
Microarray Gene Expression Data with Linked Survival Phenotypes: Diffuse Large-B-Cell Lymphoma Revisited
Uc San Francisco (2005)
Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data
Z. Wang (2010)
CIT: identification of differentially expressed clusters of genes from microarray data
D. Rhodes (2002)
Molecular Signatures from Gene Expression Data
Ramón Díaz-Uriarte (2004)
Stable feature selection with ensembles of multi-reliefF
Qifeng Zhou (2014)
Additive risk models for survival data with high-dimensional covariates.
S. Ma (2006)
Interrelated Clustering: An Approach for Gene Expression Data Analysis
C. Tang (2003)
Bias and benefit induced by intra-species homologies in guilt by association methods to predict protein function
L. Bréhélin (2006)