Supervised Harvesting Of Expression Trees

T. Hastie, R. Tibshirani, D. Botstein, P. Brown
Published 2000 · Biology, Medicine

Background: We propose a new method for supervised learning from gene expression data. We call it 'tree harvesting'. This technique starts with a hierarchical clustering of genes, then models the outcome variable as a sum of the average expression profiles of chosen clusters and their products. It can be applied to many different kinds of outcome measures, such as censored survival times or a response falling in two or more classes (for example, cancer classes). The method can discover genes that have strong effects on their own, and genes that interact with other genes.

Results: We illustrate the method on data from a lymphoma study, and on a dataset containing samples from eight different cancers. It identified some potentially interesting gene clusters. In simulation studies we found that the procedure may require a large number of experimental samples to successfully discover interactions.

Conclusions: Tree harvesting is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worthy of further investigation.
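The idea in the abstract (cluster the genes, average each cluster, then build the outcome model from chosen cluster averages and their products) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a quantitative outcome, merges clusters by the correlation distance between their average profiles, and selects terms by greedily fitting one term at a time to the current residual. All function names (`mean_profile`, `hierarchical_clusters`, `harvest`) and the toy data are hypothetical.

```python
import random

def mean_profile(X, genes):
    """Average expression profile (one value per sample) of a set of genes."""
    n = len(X[0])
    return [sum(X[g][j] for g in genes) / len(genes) for j in range(n)]

def corr(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5 if saa > 0 and sbb > 0 else 0.0

def hierarchical_clusters(X):
    """Agglomerative clustering of genes (rows of X).

    Assumption: the distance between two clusters is 1 - correlation of
    their average profiles. Returns every cluster ever formed, singletons
    included, as tuples of gene indices.
    """
    active = [(g,) for g in range(len(X))]
    all_clusters = list(active)
    while len(active) > 1:
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                d = 1 - corr(mean_profile(X, active[i]), mean_profile(X, active[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = active[i] + active[j]
        active = [c for k, c in enumerate(active) if k not in (i, j)] + [merged]
        all_clusters.append(merged)
    return all_clusters

def fit_1d(z, r):
    """Least-squares fit of r on z: returns (slope, intercept, residual sum of squares)."""
    n = len(z)
    mz, mr = sum(z) / n, sum(r) / n
    szz = sum((v - mz) ** 2 for v in z)
    if szz == 0:
        return 0.0, mr, sum((w - mr) ** 2 for w in r)
    b = sum((v - mz) * (w - mr) for v, w in zip(z, r)) / szz
    a = mr - b * mz
    return b, a, sum((w - a - b * v) ** 2 for v, w in zip(z, r))

def harvest(X, y, n_terms=2):
    """Greedily pick cluster averages, or products with already chosen clusters.

    Simplification: each step fits a single term to the current residual
    rather than refitting the full model.
    """
    profiles = {c: mean_profile(X, c) for c in hierarchical_clusters(X)}
    resid, chosen = list(y), []
    for _ in range(n_terms):
        candidates = [((c,), p) for c, p in profiles.items()]
        for label, q in chosen:
            if len(label) == 1:  # interaction: chosen cluster times any cluster
                candidates += [(label + (c,), [u * v for u, v in zip(p, q)])
                               for c, p in profiles.items()]
        best = min(candidates, key=lambda cand: fit_1d(cand[1], resid)[2])
        b, a, _ = fit_1d(best[1], resid)
        resid = [r - a - b * z for r, z in zip(resid, best[1])]
        chosen.append(best)
    return [label for label, _ in chosen], resid

# Toy data: genes 0-2 share one signal, genes 3-5 another; y depends on the first group.
random.seed(0)
s1 = [random.gauss(0, 1) for _ in range(20)]
s2 = [random.gauss(0, 1) for _ in range(20)]
X = ([[s + random.gauss(0, 0.1) for s in s1] for _ in range(3)]
     + [[s + random.gauss(0, 0.1) for s in s2] for _ in range(3)])
y = [2 * v for v in mean_profile(X, (0, 1, 2))]
terms, resid = harvest(X, y, n_terms=1)
print(terms)
```

Each returned term is a tuple of clusters: one cluster for a main effect, two for a product term. Censored survival times or class outcomes, which the abstract also mentions, would need a different fitting criterion than the squared error used here.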
This paper is referenced by
10.1093/bioinformatics/18.11.1446
Gene expression data analysis with a dynamically extended self-organized map that exploits class information
S. Mavroudi (2002)
10.1517/14622416.3.1.31
Functional genomics approaches to understanding brain disorders.
P. Shilling (2002)
10.1109/TNNLS.2012.2200262
Efficient Sparse Modeling With Automatic Feature Grouping
Leon Wenliang Zhong (2012)
10.1080/15533174.2012.707849
Consistent Group Identification and Variable Selection in Regression With Correlated Predictors
D. B. Sharma (2013)
10.32614/RJ-2016-057
Weighted Distance Based Discriminant Analysis: The R Package WeDiBaDis
I. Irigoien (2016)
Bayesian Sparse Learning for High Dimensional Data
Minghui Shi (2011)
10.1093/bioinformatics/bth176
A comparison of cluster analysis methods using DNA methylation data
K. Siegmund (2004)
10.1504/IJDMB.2017.084027
A novel point density based validity index for clustering gene expression datasets
M. A. Wani (2017)
A graph laplacian prior for variable selection and grouping
F. Liu (2011)
10.1007/978-3-319-05630-2_1
Introduction to Pattern Recognition and Bioinformatics
P. Maji (2014)
10.1093/BIOSTATISTICS/KXL002
Averaged gene expressions for regression.
M. Y. Park (2007)
Histogram based Hierarchical Data Representation for Microarray Classification
Philippe Jean Salembier Clairon (2012)
10.1093/BIOMET/ASN007
Hierarchical testing of variable importance
Nicolai Meinshausen (2008)
10.1093/bioinformatics/btq261
Structure-based variable selection for survival data
V. Lagani (2010)
10.1007/978-3-319-97307-4_1
Neuroimaging: Diagnostic Boundaries and Biomarkers
S. Galderisi (2019)
10.1002/sam.11363
A nonparametric test of independence between 2 variables
B. Li (2017)
10.1089/106652703322756177
Regression Approaches for Microarray Data Analysis
M. Segal (2003)
An Empirical Study of Stability of Feature Selection Algorithms
L. Yu (2008)
Classification of High-throughput Data Using Correlation-shared Gene Clusters
Pingzhao Hu (2011)
10.1186/gb-2002-3-4-research0019
Permutation-validated principal components analysis of microarray data
J. Landgrebe (2001)
10.1214/08-AOAS137B
Discussion of: Treelets - An adaptive multi-scale basis for sparse unordered data
P. Bickel (2008)
10.1016/S1359-6446(05)03433-1
Bioinformatic methods for integrating whole-genome expression results into cellular networks.
D. Cavalieri (2005)
10.1186/1748-7188-5-30
Survival associated pathway identification with group Lp penalized global AUC maximization
Z. Liu (2010)
Microarray Gene Expression Data with Linked Survival Phenotypes: Diffuse Large-B-Cell Lymphoma Revisited
UC San Francisco (2005)
10.2202/1544-6115.1550
Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data
Z. Wang (2010)
10.1504/IJDMB.2017.10004930
A novel point density based validity index for clustering gene expression datasets
M. A. Wani (2017)
10.1093/bioinformatics/18.1.205
CIT: identification of differentially expressed clusters of genes from microarray data
D. Rhodes (2002)
Molecular Signatures from Gene Expression Data
Ramón Díaz-Uriarte (2004)
10.1109/ICNC.2014.6975929
Stable feature selection with ensembles of multi-reliefF
Qifeng Zhou (2014)
10.1111/J.1541-0420.2005.00405.X
Additive risk models for survival data with high-dimensional covariates.
S. Ma (2006)
10.1142/9789812564498_0008
Interrelated Clustering: An Approach for Gene Expression Data Analysis
C. Tang (2003)
Bias and benefit induced by intra-species homologies in guilt by association methods to predict protein function
L. Bréhélin (2006)