


1 Introduction

The Sequence Alignment and Modeling system (SAM) is a collection of software tools for creating, refining, and using a type of statistical model called a linear hidden Markov model for biological sequence analysis. Linear hidden Markov models model only primary structure (sequence) information; long-range interactions, such as base pairing in RNA, require more complex models such as stochastic context-free grammars, as described by Sakakibara et al. (NAR 22(23):5112-5120), also available from the UCSC computational biology WWW site.

The algorithms and methods have been described in several papers, some of which are available on our WWW site,

http://www.cse.ucsc.edu/research/compbio/sam.html.

A tutorial on the use of SAM and the iterative SAM-T98 method (the direct predecessor of SAM-T99) is also available at our WWW site,

http://www.cse.ucsc.edu/research/compbio/ismb99.tutorial.html.

SAM-T2K is used at the official SCOP Superfamily server at http://stash.mrc-lmb.cam.ac.uk/SUPERFAMILY

The primary papers from UCSC (copies of these papers and several others are available from the SAM WWW site) include:

We would appreciate references to the first article in all work that cites or uses the SAM system, the second for all work that cites or uses the SAM-T2K method, and the third article in work that cites or uses HMM methods similar to SAM.

Because the software is an active research tool, there is a vast selection of options, many of which have, through experimental study, been set to reasonable defaults.

The SAM software and documentation copyright is held by the Regents of the University of California. A signed license is required to obtain a copy of SAM, downloadable from the SAM WWW site, with no fee for educational research use. If you have suggestions for enhancements, new ways of using SAM, or other comments, please contact us.

SAM incorporates the readseq package by D. G. Gilbert, who allows it to be freely copied and used. The hmmedit and sae programs use ACEdb by Richard Durbin and Jean Thierry-Mieg. The source code for hmmedit and sae is available from ftp://ftp.cse.ucsc.edu/pub/protein/hmmeditsaesrc.tar.Z.

SAM includes the BLAST matrix library for use with SAM's Smith and Waterman implementation. This work of the U. S. Government is available at http://www.ncbi.nlm.nih.gov/BLAST

To be informed of future releases, please send your e-mail address to sam-info@cse.ucsc.edu for addition to our mailing list. Please also use this address for any questions or comments you may have.

You will also find Sean Eddy's system, HMMER, to be of interest.

Martin Madera and Julian Gough have written a Perl converter between SAM and HMMer 2.0 formats. (The SAM programs only work with HMMer 1.7.)

http://www.mrc-lmb.cam.ac.uk/genomes/julian/convert/convert.html

1.1 Acknowledgments

We thank I. Saira Mian and Finn Drablos for their important early evaluations of the system. Finally, we thank the entire UCSC Computational Biology Group (now forming the core of a Center for Biomolecular Engineering), led by David Haussler, who got this whole thing started. This work was supported in part by NSF grants CDA-9115268, IRI-9123692, BIR 94-08579, MIP-9423985, DBI-9808007, and EIA-9905322; DOE grants 94-12-048216 and DE-FG03-99ER62849; ONR grant N00014-91-J-1162; NIH grant GM17129; a grant from the Danish Natural Science Research Council; and a gift from Digital Equipment Corporation.


SAM
sam-info@cse.ucsc.edu
UCSC Computational Biology Group