A short probabilistic modeling course for compbio grads

(Formerly Bio E 241)

Brief description of syllabus

Catalog description:

This course reviews the statistical and algorithmic foundations of bioinformatics viewed through the lens of paleogenetics, the science of "Jurassic Park", i.e., the reconstruction of ancient genes and genomes by reverse Bayesian inference under various stochastic models of molecular evolution. Such methods, first proposed in the 1960s by Linus Pauling (and others), are now within reach of practical experimentation due to the falling cost of DNA synthesis technology. Applications of these methods are yielding insight into the origin of life and of the human species, and may become powerful tools of synthetic biology. Lectures will review the theoretical content; homework and laboratory exercises will involve writing and applying programs for computational reconstruction of ancient protein and DNA sequences and other measurably evolving entities, both biological (e.g., gene families) and otherwise (e.g., natural language).

Some further questions addressed by the class...

Genome evolution and ecology

What are the mathematical dynamics of genome evolution? How does random mutation of strings generate the most effective nanotechnology known? Can we build models of communities of genomes? Map the full spectrum of evolutionary timescales? Predict the course of evolution, or direct it? Reconstruct the past?

How have others responded to these questions?

Grammars for biological sequences

The metaphor of DNA sequence as "the language of life" is so over-used as to have become trite. Yet, remarkably, there are deep mathematical parallels between the information structure of DNA and that of natural language. How do these similarities arise? How can they be measured and put to use, particularly in high-throughput mode?

This class examines these questions from the point of view of someone interested in developing probabilistic modeling algorithms from principles of evolution and biophysics.

We will also examine several other kinds of probabilistic model, useful both in bioinformatics and in other areas of applied machine learning.

Probabilistic modeling

Probabilistic methods (including graphical models, HMMs, Gaussian processes, stochastic grammars, Markov random fields, and Dirichlet processes) and their associated algorithms (such as Markov chain Monte Carlo, expectation maximization, and variational Bayes) are a mainstay of modern computer science applications. They are natural successors to earlier approaches such as expert systems, artificial intelligence and neural networks.

One area in which probabilistic methods have made a particularly strong impact is computational biology. Studying probabilistic algorithms in the context of molecular biology offers a uniquely interesting background to these methods. Not only is the probabilistic analysis directly transferable to other applications in CS and scientific informatics, but the application to post-genomic biology also provides an entry point into such fast-moving areas as synthetic biology, human genome evolution, molecular ecology and gene circuit analysis.

This class will develop probabilistic modeling techniques, particularly time-evolving random processes and stochastic grammars. A strong emphasis on underlying theoretical techniques will be complemented by reference to working code that can be applied to real problems in phylogenetics and genomic analysis.
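
As a rough illustration of a time-evolving random process applied to ancestral reconstruction, the sketch below computes the posterior distribution over an unobserved ancestral nucleotide given two observed descendants. It is not taken from the course materials. It assumes the Jukes-Cantor substitution model, a uniform prior over the ancestral base, and branch lengths measured in expected substitutions per site; the function names (`jc_prob`, `ancestor_posterior`, `reconstruct`) are hypothetical and chosen only for this example.

```python
import math

ALPHABET = "ACGT"


def jc_prob(a, b, t):
    """Jukes-Cantor probability that base a is replaced by base b
    after a branch of length t (expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    if a == b:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def ancestor_posterior(x, y, t1, t2):
    """Posterior over the ancestral base, given descendants x and y
    at the ends of branches of length t1 and t2.

    Assumes a uniform prior (0.25) on the ancestral base, which is
    also the stationary distribution of the Jukes-Cantor model."""
    weights = {a: 0.25 * jc_prob(a, x, t1) * jc_prob(a, y, t2)
               for a in ALPHABET}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def reconstruct(seq1, seq2, t1, t2):
    """Site-by-site maximum a posteriori ancestor of two aligned,
    gap-free sequences of equal length. Ties are broken arbitrarily
    (by alphabet order)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ancestor = []
    for x, y in zip(seq1, seq2):
        post = ancestor_posterior(x, y, t1, t2)
        ancestor.append(max(post, key=post.get))
    return "".join(ancestor)


if __name__ == "__main__":
    # The descendant on the shorter branch carries more weight.
    print(ancestor_posterior("A", "C", 0.05, 0.5))
    print(reconstruct("ACGT", "ACGA", 0.05, 0.5))
```

Real reconstructions work on full phylogenies (using Felsenstein's pruning algorithm rather than a single two-leaf tree), handle insertions and deletions, and use richer substitution models; this two-leaf case only shows the Bayesian step in its simplest form.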

A central theme of the course is the increasingly popular use of evolutionary grammars as a foundation for genomics algorithms (see PhylogeneticAlignmentReader). We will also cover other aspects of Stochastic Biology.

Extensions

Each year some new material is added to the class. Here are a few possibilities for this year:

Email IanHolmes if you have preferences among these, and/or wish to lead discussion of paper(s) on this topic for class credit.

-- IanHolmes - 06 Jun 2009
