CMP242 Project Ideas
Spring 1998

First, here are some project topics from previous classes (this includes projects from CMP290D, Neural Networks, so there are a more neural net projects than one would expect):
  1. Mark Brautigam: using decision trees to predict the next note in a Bach fugue.
  2. Clark Steinback: training a neural net to lip read using Dom Massaro's talking head (see more about the talking head below).
  3. John Panzer: identifying splice sites in humna DNA sequences. (These are places where the genes are interupted by "introns". This is an extremely important problem, now that there is a large scientific effort to sequence the human genome. We'll provide data sets for this.)
  4. Raksas Hang: Using decision trees to predict what amino acid will appear in a certain position in a protein structure, based on the physical characteristcs of the protein at that position.
  5. David Kulp: discovering "motifs" in protein sequences using data compression/information theory methods. Motifs are short patterns that occur (with some variations) in within many protein sequences, and usually serve some important functional role from a biochemical viewpoint.
  6. Kimmen Sjolander: representing properties of amino acids with Dirichlet mixture densities, a type of Bayesian modeling.
  7. Melissa Cline: using Bayesian methods to predict whether or not two amino acids will be close enough to interact in a protein structure.
  8. Brian Chess: using a neural network to diagnose faulty circuits
  9. Bruce Sherrod and Alejandro Borgia: Comparing decision tree methods with exponentiated gradient methods to predict disk idle times. This is part of a system developed by David Helmbold and Darrell Long to adaptively decide whether to spin down a disk to save energy or keep it going in anticipation of getting a disk access in the near future. The ML part was essentially to learn to predict the latency time between successive disk accesses.
  10. Hal Duncan: Implementng and testing the hierarchical mixure of experts method of Jordan et. al. using synthetic data. Compared to some other methods.
  11. Giulia Pagallo: using decision trees to learn Boolean functions in disjunctive normal form (she later did a PhD on this) (many other students have used decision tree learning methods on various data sets, e.g. a Chem grad used them with some success to predict whether certain drugs would bind chemically to their target molecules.)
  12. Joel Dewitt: modeling self organization of early visual system
  13. Joel Dewitt: improving the capacity of Hopfield neural nets
  14. Brad Gluko: face recognition by encoder neural net
  15. Aaron Ferrucci: feedforward neural net to play tic-tac-toe
  16. Ian Barland: 2d pole balancing adaptive control system
  17. Kyle Smith: cat-and-mouse game of pursuit in a 2d world with obstacles (reinforcement learning)
  18. Tae-Soon Park: signal processing with a feedforward neural net
  19. Steve Cook: recurrent neural net that plays music
  20. Judy Leslie: classification of data collected from high energy accelerator physics experiments with a feedforward neural net
  21. John Chachere: function approximation with generalized radial basis kernels and wavelets (theory project)
  22. Haluk Konuk: Learning in feedforward neural nets using phantom targets
  23. Jean Cailton: Using second order gradient information to improve learning rates in feedforward nets (theory project)
  24. Peter Hughes: genetic "breeding" of better learning feedforward neural nets
  25. Jeanne Rich: learning to control a robotic arm
  26. Koji Amakawa: teaching a robot arm to reach while avoiding obstacles
  27. Koji Amakawa: Inverse problems involving electrical waves in heart muscle (worked with Alex Pang).
  28. Jason Zien (followed this suggestion of Darrell Long:) Use a neural net to learn to predict the time it takes the read/write head of a disk to move from one track to another (called {\em seek time}). This would have important practical applications in disk scheduling.

Suggestions:

  1. Several of the most successful previous projects fall under the category of what I call the generic project for 242. In this project you implement a machine learning method and use it in some application area. The input might be patient symptoms and the output might be a prediction or whether or not the patient has cancer. Or it might be like Jason Zien's project above. The possibilites are endless. In the generic project, you get some data, divide it into training and tests sets, use the training set to train a net, and use the test set to test the generalization performance of the learned decision rule to new data. Look on the CMPS242 Machine Learning Resources Page for ml methods and datasets you can use in your project. I expect most people to do projects of this type. Generally they are the most successful. Don't think it is too easy! You have to do a lot of experiments, and have a good intuition/knowledge about learning methods to get them to work on real datasets.
  2. (Darrell Long; Pat Mantey) Use a machine learning method to analyse some of the weather data from the REINAS project. Goal could be short time weather prediction or adaptive control of weather sensing devices. This is an important $4,000,000 project here at UCSC and other facilities in the Monterey Bay. Thye are already using machine learning methods. Give them some better ones!
  3. (Darrell Long; David Helmbold; Charlie McDowell) Use machine learning methods to create an adaptive algorithm that improves some aspect of computer system performance, like Jason Zien's project and Bruce Sherrod and Alejandro Borgia's projects mentioned above. There are many other performance issues that could be handled adaptively. A related problem is adaptive user level software. For example, there is a project at UCSC called Aida, whose goal is to build an affordable, intelligent electronic secretary. See the Aida project web page and talk to David, Charlie and Darrell for more ideas.
  4. Learning on the internet: There are lots of fun (and profitable) ways to apply learning methods to the internet to do things like improve the performance of search engines, design recommendation systems, etc. (An example of a recommendation system is a web site where people go to get a recommendation for what movie to see. When you get there, you type in what movies you have seen and how you liked them, then the system picks out a movie or two you haven't seen, but would probably like, and recommends them. It does this by comparing your information to a large database of information from other users of the site, using pattern recognition/ml methods to find something that you haven't seen, but that was seen and liked by other people who have also seen and liked the same movies as you. Of course, in the course of handling your request, the system collects info from you that increases its database and thus helps it to get better at making recommendations in general.) Our friends at ATT Bell Labs (Rob Schapire schapire@research.att.com and Yoav Freund yoav@research.att.com and others there) may have some specific suggestions if you write to them. They are doing great work in this area. You may get a job offer.
  5. bioinformatics: With the advent of the human genome project, there are now many exciting datasets consisting of human DNA and DNA from other organisms. Design methods to find genes and other important features in the data (contact David Kulp (dkulp@cse) Betty Lazareva (betty@cse) or Birong Hu (bhu@cse)). There are also large databases of protein sequences. Many of the projects listed above used that data. Contact Kevin Karplus (karplus$cse) Richard Hughey (rph@cse) Christian Barrett (cbarrett@cse) or Melissa Cline (cline@cse). Predicting the structure of RNA sequences is also an important problem (see Michael Brown (mpbrown@cse)). All of these people have ongoing research that you could work on, doing machine learning aspects of the research for your class project. Arun Jagota (jagota@cse) also works in this area, and will be preparing some special datasests for class use.
  6. Ideas from Arun Jagota: PROJECT 1: Modify various popular (discriminative) algorithms for binary classification to incorporate asymmetric misclassification costs. Specifically given two classes, 0 and 1, let C_0 and C_1 denote the costs of misclassifying a 0 as a 1 and a 1 as a 0 respectively. Most existing algorithms simply assign a unit cost for either type of misclassification. Modify such algorithms to exploit asymmetric costs. Compare the performance of these modified algorithms with those of the original ones on some problems (e.g. certain ones in biosequence analysis) in which C_0 and C_1 can be very different.

    PROJECT 2: Experimentally evaluate a clustering algorithm of Tishby et al (NIPS*97) that takes two sequences and determines the likelihood that they have a common ancestor. This algorithm has potential applications to biosequence analysis.

    PROJECT 3: Machine Learning Applied to Combinatorial Optimization. Many combinatorial optimization problems are apparently intractable (NP-hard). Yet many arise in applications and therefore need to be solved (at least approximately). This is fertile ground for heuristic algorithms, which usually find adequately good solutions very quickly. One largely untapped source of new heuristic algorithms for combinatorial optimization is based on ideas from machine learning. The principle is as follows. Design a combinatorial optimization algorithm with some adaptive parameters. Let the algorithm learn the structure of the problem, as it evolves, and capture this learning into its parameters. Hopefully such an algorithm will improve its performance as it evolves. A few concrete instantiations of this idea, based on reinforcement learning principles have been tried in the recent past (Jagota et al; Others). Empirical studies have demonstrated that adaptive (i.e., learning) versions of the algorithms consistently outperform nonadaptive ones. This lends some credence to this idea. This project will involve the student to survey and understand the handful of previous papers on this topic, and to expand the frontier of this research direction. (Designing and implementing new algorithms of this type; solving new problems with this approach; or improving existing algorithms). Arun has access to a lot of combinatorial optimization benchmark problems, and a lot is known about these problems, e.g. optimal solutions, how other algorithms perform on them). Most are at a DIMACS archive that he can refer you to.

  7. (Bob Levinson) Use a machine learning method to try to classify chess endgame positions as won or lost. (Ross Quinlan uses decision trees to do this in the second paper in the class reader.) There are chess endgame databases with classified examples available in the UC Irvine database, and Bob may have some handy as well. Bob also suggested trying to solve the 8-puzzle. An interesting technique to explore here is reinforcement learning. Here the learning algorithm is not told explicitly what it should have done for each instance that it is trained on, but only how well it did according to some performance measure, and even this feedback does not come until many steps have been taken. Systems that learn to plan, and to play games, often have only this type of feedback, e.g. you aren't told what move you should have made at each step, you only find out whether you win or lose. In Bob's research on chess, he uses a specific technique called "temporal difference learning". I have some references if you are interested.
  8. (Manfred Warmuth) There are various algorithms that Manfred Warmuth, David Helmbold, Jurki Kivinen and Nick Littlestone have developed that change weights multiplicatively, instead of additively as in the typical gradient descent method. It would be nice to test these. Manfred and David have the details. Manfred will discuss these in some guest lectures, I hope. I have also done some work with them on this.
  9. (Sharon Daniel and Dom Massaro) Professor of Psychology, Dominic Massaro at the Perceptual Science Laboratory at UCSC has developed a completely animated synthetic talking head . The talking head can be heard, communicates paralinguistic as well as linguistic information, and is controlled by a text-to-speech system.

    Professor Massaro and Assistant Professor of Art, Sharon Daniel, are collaborating to develop the talking head into a conversational agent in dynamic, user-defined, virtual environment. The conversational agent will interact with the human user in the most natural manner possible including the ability to listen and understand, as well as speak fluently. The agent's virtual environment will avoid the rigid "desktop" and "page" metaphors of current graphical user interfaces and offer a more flexible, dynamic, graphical user interface. This agent/environment will be user-defined and fully interactive -- providing a new model for intelligent, programmable, Graphical User Interface design.

    A hypothetical "pathological" agent is one possible content model for the development of the conversational agent interface. Artificially Intelligent pathological "agents" are extensions of the inherent clinical pathologies of a "normal" personality. Each individual personality incorporates, to a greater or lesser degree, all the traits which have, to date, been defined as pathologies in clinical psychology -- for example, obsessive-compulsive disorder, paranoia, and schizophrenia. A "normal" personality merely exhibits these "disorders" to a degree that is acceptable within a given social or cultural context.

    The manifestation of an individual's pathological traits through artificial intelligence - as Minsky-like conversational or behavioral "agents" -- might allow an individual to "safely" explore and extend those aspects of his or her personality and to question the cultural interpretation of behaviors as "disorders." A pathological agent based built as a neural network or using machine learning methods might, in addition to any possible research or therapeutic function, provide interesting conversation.

    We welcome any experiments with or assistance in building this agent model.

  10. (theory) Try to find a simple, incremental learning rule for the perceptron (see Duda and Hart book on reserve) that will find a separating hyperplane whenever on exists and never take time exponential in the number of examples or the dimension of the space.
  11. Explore algorithms that learn to predict the next symbol in a discrete time series. Examples include text, speech, music and genetic sequences. Humans are pretty good at this. If I say "put out the c ..." you can guess what the next letter might be. The problem is to learn to extract features or remember "chunks" from the previous text that will be useful in predicting the next letter. People have done this with simple Markov models, hidden Markov models (see paper by Rabiner which I will put on reserve.) (Lempel-Ziv type data compression schemes can also be used.) Once you get something running, you can let it create its own text (or music or ...) by starting it off with some context and letting it predict the rest, using its own predictions as further context. If you have two different prediction methods then you can let them have a "conversation". When the first speaker gets to the point where it is not sure what should come next, the other picks up from there, using the recent letters that were generated by the first speaker as context.
  12. Like Peter Hughes' project, explore the interaction between learning and evolution by selectively "breeding" populations of decision trees or neural nets that learn to do some task in order to survive. The ones that perform best on the task can "mate" to produce hybrid offspring, and these can be further tested, etc.
  13. What happens if the above decision trees or neural nets can also communicate with each other and share knowledge?
  14. design learning algorithms that are "active" in the sense that they create their own training instances. This corresponds to scientists doing experiments to learn about nature.
  15. What if the distribution generating the examples is slowly changing over time, while you are trying to learn a decision rule. In this case the optimal deecision rule also changes slowly. What learning algorithms are better than others in these cases, and how can they be adapted to handle this situation?
  16. Success in learning can depend heavily on the way that you choose to represent your training instances. If you use low level features, such as pixels in an image of a character for character recognition, then the learning task is hard. If you use higher level features, such as the presence or absence of edges of various orientations in the image, then the learning task becomes easier. In most learning methods, when you are representing a single real valued input, it can make a difference if you represent it as a single input feature, or if you break it up into several binary input features, each of which tells whether or not the real valued input was in a certain range. Explore the effect of chosing different input representations on a single learning problem and/or a single learning algorithm.

Questions regarding about page content should be directed to haussler@cse.ucsc.edu.
Last modified April 2, 1998.

Back to the CMP 242 Class Page.