CMPE 280B Spring 1999 Home Page

Bioinformatics Research Seminar


This course is a weekly research seminar that assumes that students have already taken CMPS 243 (Bioinformatics) or have substantial background in biology, chemistry, or statistics.

Room:
Registrar scheduled us in 371 Applied Sciences, but after the first meeting we'll be in 215 AS, which is more comfortable.
Time:
12:30-1:40 Fridays

The seminar will be a journal club, in which students take turns presenting papers from the literature. Everyone is expected to read all the papers, and to present one or two (depending on how many students take the course). We may also have presentations of original research, both by UCSC researchers and by visitors.

I will post a list of papers that we might want to read here, as I think of them. I welcome suggestions for other papers to read!

This quarter I plan to concentrate on proteins, particularly structure prediction, though students with a strong interest in other areas of bioinformatics can suggest other papers for us to read. Many of the more chemistry-related papers on this list have been taken from Lydia Gregoret's reading list for Chem 200A.


Tentative schedule

2 April 1999 Administrative details, choosing papers

9 April 1999 Two papers
Twilight zone of protein sequence alignments Burkhard Rost Protein Engineering 12(2):85-94, Feb 1999.

A nice article on structure prediction in the twilight zone. Rost revisits the HSSP analysis to see whether the 20% homology rule still holds with today's larger databases, and explores some factors affecting the reliability of twilight-zone structure prediction, such as the "more similar than identical" rule. The article is very dense but has some interesting take-home messages.

VOLUNTEER= MELISSA CLINE.

Pfam: A Comprehensive Database of Protein Families Based on Seed Alignments
Sonnhammer, E.L.L and Eddy, S.R. and Durbin, R.. Proteins 28:405-420, 1997.

Pfam, which has gone through several releases now, is the most-respected collection of protein multiple alignments that are based on sequence data.

VOLUNTEER=MARK DIEKHANS.

The latest PFAM paper is available from: ftp://ftp.sanger.ac.uk/pub/databases/Pfam/NAR_1999_paper.pdf
Mark says "I put copies of the original PFAM paper in AS 215. They are on the table to the right as you come in from the outside door. Both online papers are in ~markd/pub/pfam/*, however these are small follow-on papers probably not of much use without the context of the original paper."

16 April 1999
PHD: predicting one-dimensional protein structure by profile-based neural networks
Rost, Burkhard. Methods in Enzymology 266:525-539, 1996.

Until quite recently, Burkhard Rost's PHD program was the best secondary-structure predictor around. The ones that do better now (PSIPRED and our own predict-2nd) use much the same technology, but larger training sets and better multiple alignments.

VOLUNTEER= SPENCER TU

Conservation and prediction of solvent accessibility in protein families
Burkhard Rost and Chris Sander. Proteins: Structure, Function, and Genetics 20(3):216-226, Nov 1994.

Prediction of solvent accessibility, using neural nets like those used for secondary-structure prediction.

VOLUNTEER= SPENCER TU

Protein Secondary Structure Prediction Using Local Alignments
Asaf A. Salamov, Victor V. Solovyev Journal of Molecular Biology, v 268, n 1, April 25, 1997, 31-36.

Uses local alignments and multiple alignments with a variant of the nearest-neighbor algorithm to get a claimed accuracy higher than PHD's, but the exact details of the test are not clear, and the accuracy measure used is very sensitive to small details in the definition of "helix" and "strand".
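
To illustrate why such an accuracy measure depends on the definitions, here is a minimal sketch of a per-residue three-state accuracy (often called Q3) scored against two different reductions of the eight DSSP states to helix/strand/coil. The two mappings, the example strings, and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
def q3(predicted, observed):
    """Fraction of residues whose 3-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Two hypothetical ways of reducing DSSP's 8 states to 3.
# Any state not listed is treated as coil (C).
REDUCE_STRICT = {"H": "H", "E": "E"}
REDUCE_LOOSE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_states(dssp, mapping):
    """Map an 8-state DSSP string to a 3-state string."""
    return "".join(mapping.get(state, "C") for state in dssp)


# Made-up DSSP assignment and prediction for a 16-residue fragment.
dssp = "CHHHHGGGCEEEEBCC"
prediction = "CHHHHHHHCEEEECCC"

strict_score = q3(prediction, reduce_states(dssp, REDUCE_STRICT))
loose_score = q3(prediction, reduce_states(dssp, REDUCE_LOOSE))
print(strict_score, loose_score)
```

The same prediction gets a different score under each reduction, which is why accuracy figures from different papers are hard to compare unless the definitions match.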

VOLUNTEER= SPENCER TU

23 April 1999
Crystallography made crystal clear: a guide for users of macromolecular models.
Rhodes, Gale.

This book explains what is in a PDB file. We probably don't have time to cover the whole book, but Chapters 2 and 8 may be particularly relevant.

VOLUNTEER=NGUYET MANH

30 April 1999
Knowledge-based potentials for proteins.
Manfred Sippl. Current Opinion in Structural Biology 5:229-235, 1995.

Sippl has been one of the more successful practitioners of threading as a fold-prediction technique (see CASP2 and CASP3 results). I'm not sure which of his many papers has the best presentation of his techniques: the 1990 JMB papers, the 1993 Journal of Computer-Aided Molecular Design paper, or this one.

If a student is interested in threading as a technique, it may be worth reading several of Sippl's papers (as well as some of his competitors' papers), and selecting the best of the group to read.

Knowledge-based potentials--back to the roots.
Koppensteiner, WA and Sippl, Manfred. Biochemistry (Mosc) 1998 Mar;63(3):247-52.

A more recent review paper. We may read just this one, if it contains the most useful information of the previous papers.

VOLUNTEER=DAVID KULP

Note: we actually ended up with two different papers by Sippl:

Manfred J. Sippl, "Calculation of Conformational Ensembles from Potentials of Mean Force: an Approach to the Knowledge-based Prediction of Local Structures in Globular Proteins", J. Mol. Biol. (1990) 213, 859-883.

Presents scoring functions for structure-sequence alignment based on statistics of distance and amino acid pairs. Takes a rather physical view of the numbers, and normalizes by using pseudocounts.

M. J. Sippl and J. Markus, "Predictive Power of Mean Force Pair Potentials" in Protein Structure by Distance Analysis, Copenhagen, 1993.

Expands on the method from JMB, '90, v213. In addition, includes z-score significance and various parameter tuning results.
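
As a rough illustration of a z-score significance measure, the sketch below compares the score of one sequence-structure pairing against a background of alternative pairings. The function name and the energy values are made up for illustration, and the choice of the population standard deviation is an assumption; this is not the exact procedure from the paper.

```python
import statistics


def z_score(native_energy, decoy_energies):
    """Number of standard deviations the native energy lies from the decoy mean.

    With energies, lower is better, so a strongly negative z-score
    suggests the native pairing stands out from the background.
    """
    mean = statistics.mean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / spread


# Hypothetical energies for alternative (decoy) pairings and one native pairing.
decoys = [-2.0, 0.0, 2.0, 4.0, 6.0]
native = -6.0
print(z_score(native, decoys))
```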

7 May 1999
Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology
K. Sjölander and K. Karplus and M. P. Brown and R. Hughey and A. Krogh and I. S. Mian and D. Haussler
CABIOS 12(4):327-345, August 1996

This paper gives all the math for Dirichlet mixtures in a fairly tutorial form. Dirichlet mixtures are essential to extracting maximum information from a multiple alignment.
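
The central computation can be sketched as a posterior-mean estimate of the residue probabilities in one alignment column: each mixture component is weighted by how well it explains the observed counts, and the components' pseudocount-adjusted estimates are averaged with those weights. The four-letter alphabet, the two-component mixture, and the function names below are assumptions made for brevity; real mixtures cover the 20 amino acids with parameters fit to alignment data.

```python
import math


def log_beta(x):
    """Log of the multivariate Beta function of a vector of positive values."""
    return sum(math.lgamma(v) for v in x) - math.lgamma(sum(x))


def posterior_probs(counts, mixture):
    """Estimate letter probabilities from counts under a Dirichlet mixture.

    mixture is a list of (q, alpha) pairs: mixture coefficient q and the
    Dirichlet parameter vector alpha for that component.
    """
    # Log weight of each component: log q + log B(n + alpha) - log B(alpha).
    # Terms that depend only on the counts are the same for every component
    # and cancel when the weights are normalized.
    log_w = []
    for q, alpha in mixture:
        shifted = [n + a for n, a in zip(counts, alpha)]
        log_w.append(math.log(q) + log_beta(shifted) - log_beta(alpha))
    top = max(log_w)
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Weighted average of each component's pseudocount-adjusted estimate.
    n_total = sum(counts)
    probs = [0.0] * len(counts)
    for weight, (_, alpha) in zip(weights, mixture):
        denom = n_total + sum(alpha)
        for i, (n, a) in enumerate(zip(counts, alpha)):
            probs[i] += weight * (n + a) / denom
    return probs


# Toy two-component mixture over a four-letter alphabet (illustrative values).
toy_mixture = [
    (0.5, [2.0, 0.5, 0.5, 0.5]),
    (0.5, [0.5, 0.5, 0.5, 2.0]),
]
print(posterior_probs([3, 0, 0, 0], toy_mixture))
```

With no observations the estimate falls back to the mixture's prior mean; as counts accumulate, the component that best matches the column dominates.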

VOLUNTEER=CHRISTIAN BARRETT

14 May 1999
Three structure-structure alignment papers

"Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins" by Gerstein and Levitt Protein Science 1998 vol 7, 445-456

This method of structure-structure alignment directly matches the backbones of two structures, by using repeated cycles of Needleman-Wunsch type dynamic programming and least-squares fitting, to determine an alignment minimizing coordinate difference.
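
The dynamic-programming half of such a cycle can be sketched as a global alignment over a residue-by-residue similarity matrix. This is only a minimal sketch: the linear gap penalty, the toy similarity values, and the function name are assumptions, and the least-squares superposition that would recompute the similarities between cycles is not shown.

```python
def needleman_wunsch(sim, gap=-1.0):
    """Global alignment over sim[i][j]; returns (score, aligned index pairs)."""
    n, m = len(sim), len(sim[0])
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table: match residue i with j, or skip one residue with a gap.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim[i - 1][j - 1],
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the corner to recover which residues were matched.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + sim[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy similarity matrix: residue i of structure A resembles residue i+1 of B.
toy_sim = [[2.0 if j == i + 1 else -1.0 for j in range(4)] for i in range(3)]
best_score, matched = needleman_wunsch(toy_sim)
print(best_score, matched)
```

In a structural setting the similarity for each residue pair would typically be derived from the distance between the two residues after superposition, so each refit changes the matrix and the alignment is recomputed.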

VOLUNTEER= SUGATO BASU

Overview of efficient structure-structure aligners

VOLUNTEER= MELISSA CLINE

DALI "Protein Structure Comparison by Alignment of Distance Matrices" by L. Holm and C. Sander Journal of Molecular Biology 1993 vol 233, 123-138 http://www2.ebi.ac.uk/dali/dali_jmb.html

Describes the DALI method for optimal pairwise alignment of protein structures, which uses an elastic similarity score between contact patterns in distance matrices and a Monte Carlo optimization to assemble the alignments.

Two recent DALI papers from the EMBL group are available online:

VOLUNTEER= SUGATO BASU

21 May 1999 Kernel methods and support vector machines
Two Fisher kernel papers are available locally:
http://www.cse.ucsc.edu/research/compbio/discriminative/Jaakola1-1998.ps
http://www.cse.ucsc.edu/research/compbio/discriminative/Jaakola2-1998.ps
For Support Vector Machines, there is a 43-page tutorial at:
http://svm.research.bell-labs.com/papers/tutorial_web_page.ps.gz

VOLUNTEERS= DAVID KULP, MARK DIEKHANS (kernel method)

28 May 1999
"Prediction of Protein Side-chain Rotamers from a Backbone-dependent Rotamer Library: A New Homology Modeling Tool"

VOLUNTEER= CHRISTIAN BARRETT

4 June 1999
Support Vector Machine Classification of Microarray Gene Expression Data by Michael Brown, William Grundy, David Lin, Nello Cristianini, Charles Sugnet, Manuel Ares, and David Haussler
A continuation of the kernel-method theme, presenting new, unpublished research.

VOLUNTEER= Michael Brown


Possible other papers

CASP2 and CASP3 papers
Any of the papers from the special issue of Proteins: Structure, Function, and Genetics (Supplement 1, 1997) about the CASP contest. If Supplement 1, 1999 becomes available in time, we can read papers from that also.

Structure Prediction of proteins---where are we now?
Burkhard Rost and Chris Sander. Current Opinion in Biotechnology 5:372-380, 1994.

A brief, somewhat dated overview of protein structure prediction, describing 1-d (alignment and 2ary structure prediction), 2-d (contact maps), and 3d approaches. No math, but 110 citations, some with good annotation.

Areas, Volumes, Packing and Protein Structure.
Richards, FM. Ann Rev Biophys Bioeng 6:151-176, 1977.

An early paper on the dense packing of protein cores.

Tertiary templates for proteins: Use of packing criteria in the enumeration of allowed sequences for different structural classes.
Ponder, JW & Richards, FM. J Mol Biol 193:775-791, 1987.

This paper is an early one on using rotamer libraries to do fold prediction.

Dominant Forces in Protein Folding.
Dill, KA. Biochemistry 29:7133-7155, 1990.

This paper makes a strong argument that burial of hydrophobics is the main driving force for protein folding.

Forces contributing to the conformational stability of proteins.
Pace CN; Shirley BA; McNutt M; Gajiwala K. Faseb Journal 10:75-83, 1996.

This paper makes an argument for hydrogen bonding being as important as hydrophobicity in stabilizing proteins.

Principles of membrane protein assembly and structure.
von Heijne, G. Progress in Biophysics and Molecular Biology, 1996, 66:113-139.

A brief guide to phylogenetic software.
Douglas J. Eernisse. Trends in Genetics 1998 Nov, 14(11):473-5.

A very brief overview of available phylogeny software. (Available on-line from http://www.scidirect.com, but easier to access through http://bob.ucsc.edu/library/science/ej.html.)
