Support Vector Machine Classification of
Microarray Gene Expression Data
UCSC-CRL-99-09
Michael P. S. Brown
William Noble Grundy (1)
David Lin
Nello Cristianini (2)
Charles Sugnet
Manuel Ares, Jr.
David Haussler
Department of Computer Science
University of California, Santa Cruz
Santa Cruz, CA 95065
{mpbrown,bgrundy,dave,haussler}@cse.ucsc.edu
Center for Molecular Biology of RNA
Department of Biology
University of California, Santa Cruz
Santa Cruz, CA 95065
Department of Engineering Mathematics
University of Bristol
Bristol, UK
June 12, 1999
Abstract:
We introduce a new method of functionally classifying genes using gene
expression data from DNA microarray hybridization experiments. The
method is based on the theory of support vector machines (SVMs). We
describe SVMs that use different similarity metrics including a simple
dot product of gene expression vectors, polynomial versions of the dot
product, and a radial basis function. Compared to the other SVM
similarity metrics, the radial basis function SVM appears to provide
superior performance in identifying sets of genes with a common
function using expression data. In addition, SVM performance is
compared to four standard machine learning algorithms. SVMs have many
features that make them attractive for gene expression analysis,
including their flexibility in choosing a similarity function,
sparseness of solution when dealing with large data sets, the ability
to handle large feature spaces, and the ability to identify outliers.
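As a rough illustration of the three similarity metrics named above, the following minimal sketch computes each one for a pair of expression vectors. The exact functional forms used here (the "+ 1" offset in the polynomial kernel and the 2*sigma^2 scaling in the radial basis function), the parameter names degree and sigma, and the example vectors are assumptions chosen for illustration; they are not taken from this abstract.

```python
import math


def dot(x, y):
    """Plain dot product of two equal-length expression vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # Simplest similarity: the dot product itself.
    return dot(x, y)


def polynomial_kernel(x, y, degree=2):
    # Assumed form: (x . y + 1)^degree. The offset of 1 is a common
    # convention, not something stated in the abstract.
    return (dot(x, y) + 1.0) ** degree


def rbf_kernel(x, y, sigma=1.0):
    # Assumed Gaussian form: exp(-||x - y||^2 / (2 * sigma^2)).
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical expression profiles for two genes over three experiments.
gene_a = [0.5, -1.0, 2.0]
gene_b = [1.0, 0.0, 1.5]

for name, kernel in [("linear", linear_kernel),
                     ("polynomial", polynomial_kernel),
                     ("rbf", rbf_kernel)]:
    print(name, kernel(gene_a, gene_b))
```

Any of these functions could be supplied to an SVM trainer as its similarity measure; the ease of swapping one for another is the flexibility the abstract refers to.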
Keywords: Gene Microarrays, Gene Expression, Support Vector
Machines, Pattern Classification, Functional Gene Annotation
Running head: SVM Classification of Gene Expression Data
Michael Brown
1999-11-05