A Tutorial On Support Vector Machine-based Methods For Classification Problems In Chemometrics.

Jan Luts, Fabian Ojeda, Raf Van de Plas, Bart De Moor, Sabine Van Huffel, Johan A. K. Suykens
Published 2010 · Chemistry, Medicine
This tutorial provides a concise overview of support vector machines and several closely related techniques for pattern classification. It starts with the formulation of support vector machines for classification and then explains the method of least squares support vector machines. Approaches to retrieve a probabilistic interpretation are covered, and it is explained how the binary classification techniques can be extended to multi-class methods. Kernel logistic regression, which is closely related to iteratively weighted least squares support vector machines, is discussed. Several practical aspects of these methods are addressed: feature selection, parameter tuning, unbalanced data sets, model evaluation, and statistical comparison. The concepts are illustrated on three real-life applications in the fields of metabolomics, genetics, and proteomics.
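As a rough illustration of the least squares support vector machine classifier mentioned in the abstract, the sketch below trains a binary classifier by solving a single linear system instead of a quadratic program. It follows the standard LS-SVM classifier formulation and is not taken from the tutorial itself. The function names, the RBF kernel choice, the values of `gamma` and `sigma`, and the toy data are all assumptions made for this example.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def train_ls_svm(X, y, gamma=10.0, sigma=1.0):
    """Return (alpha, b) for labels y_i in {-1, +1}.

    Solves the LS-SVM classifier system
        [ 0   y^T             ] [ b     ]   [ 0 ]
        [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    with Omega_ij = y_i * y_j * K(x_i, x_j).
    """
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = float(y[i])
        A[i + 1][0] = float(y[i])
        for j in range(n):
            A[i + 1][j + 1] = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
        A[i + 1][i + 1] += 1.0 / gamma  # regularization on the diagonal
    solution = solve_linear_system(A, [0.0] + [1.0] * n)
    return solution[1:], solution[0]


def predict(X_train, y_train, alpha, b, x, sigma=1.0):
    """Classify x by the sign of sum_i alpha_i * y_i * K(x_i, x) + b."""
    score = b + sum(
        a * yi * rbf_kernel(xi, x, sigma)
        for a, yi, xi in zip(alpha, y_train, X_train)
    )
    return 1 if score >= 0 else -1


# Hypothetical toy data: two well-separated clusters.
X_train = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y_train = [-1, -1, -1, 1, 1, 1]
alpha, b = train_ls_svm(X_train, y_train)
label_near_origin = predict(X_train, y_train, alpha, b, [0.5, 0.5])
label_far = predict(X_train, y_train, alpha, b, [3.5, 3.5])
```

Unlike the standard SVM, every training point generally receives a nonzero `alpha`, so the resulting model is not sparse. The pure-Python solver is only meant for small examples; a numerical library would be the usual choice for real data.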