BioE131/231
Introduction to Computational Biology.
Course info
- Code: BioE131/231
- Title: Introduction to Computational Biology
- Instructor: Ian Holmes (office hours: Wednesdays 11am-noon and Fridays 1-2pm, 374C Stanley Hall)
- GSI: Allison Berke (office hours: Mondays 11am-noon and Fridays 12-1pm, 6112 Etcheverry, and by appointment: berke at berkeley.edu)
- When:
- Lecture log
Announcements
- Final (with solutions) posted here: FinalExam
- Links for in-class exercises (including handouts): Broken Telephone Tree, Data Compression Exercise
- The midterm grades are now posted on bSpace and the answer sheet is at the following page: MidtermExam
- The final presentations will be held in 180 Tan Hall on Tuesday December 16 from 12:30-3:30pm
- Please edit the BadLinks page to describe any problems that you find with uninstalled programs or broken links when you are doing the labs, and I will try to fix them
- The JGI tour is coming up -- please sign up on that page if you're even vaguely interested in going -- JGI needs an estimate of numbers to book an appropriately sized bus
- Congratulations on doing the midterm exam -- now, please take a moment to complete the midterm survey and give us some feedback!
- Final project topics now online...
- Note the room change to 458 Evans Hall!
See also the BioE131 fact sheet.
- Key dates
- Academic calendar (Fall 2008)
- Midterm exam: Wednesday October 29, 3:10-4pm
- Final project topics will be announced at this time
- JGI tour: Friday November 14, 2-4:30pm
- Final exam: Wednesday December 10, 3:10-4pm, 458 Evans Hall
- Final project presentations: Tuesday December 16, 12:30-3:30pm, 180 Tan Hall
- Final project reports are due at 11:59pm on Tuesday, December 16. To turn in your report, post it on your wiki page with no viewing restrictions.
- Grading scheme
- 50% homework assignments
- Lowest homework grade will be discarded
- 15% midterm exam
- 15% final exam
- 20% final project
- Extensions/Alternate Exam Dates
- Requests for extensions on homework due dates must be submitted to Ian Holmes via email at least 2 days before the homework due date, clearly stating the reason(s) for the request.
- Please plan to attend the scheduled exam/presentation times. Requests for alternate exam/presentation dates must be submitted to Ian Holmes via email as soon as possible, clearly stating the reason(s) for the request. An alternate date request is likely to be granted only under extreme circumstances (family emergencies, major illness, etc.).
- Teamwork
- We want to encourage you to work together, so for regular homework assignments you may submit jointly with up to one other student, as long as you identify who you worked with. However, we also want to encourage mixing, so you can submit no more than three homework assignments with any one partner. After that you will have to work with someone else.
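As a rough illustration of the grading scheme listed above, the weighted total can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the official grade calculation: it assumes every grade is expressed as a percentage (0-100) and that all homework assignments carry equal weight, neither of which is specified on this page. The function name `course_score` and the sample grades are hypothetical.

```python
def course_score(homeworks, midterm, final_exam, project):
    """Weighted course score in percent.

    Assumptions (not stated in the course info): all inputs are
    percentages in the range 0-100, and homeworks are equally weighted.
    """
    if len(homeworks) < 2:
        raise ValueError("need at least two homework grades to drop one")
    # Discard the single lowest homework grade, then average the rest.
    kept = sorted(homeworks)[1:]
    homework_avg = sum(kept) / len(kept)
    # Weights from the grading scheme: 50% homework, 15% midterm,
    # 15% final exam, 20% final project.
    return (0.50 * homework_avg
            + 0.15 * midterm
            + 0.15 * final_exam
            + 0.20 * project)


# Hypothetical grades, for illustration only.
example = course_score([90, 80, 100, 60], midterm=70, final_exam=80, project=90)
print(f"course score: {example:.1f}%")
```

With these made-up grades, the lowest homework (60) is dropped before averaging, so a single weak assignment does not affect the total.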
Student wiki
Notes and handouts
Slides
DISCLAIMER: The PowerPoint slides below were created using Microsoft Office on an Apple Mac.
They may not display properly on a Windows PC, or in OpenOffice.
Periodically I receive requests to generate PDFs from these PowerPoint files, but unfortunately this is not a feasible option.
The reason: I do not know of any conversion program that can convert PowerPoint files into PDFs in "batch" (command-line) mode.
Thus, every PDF would have to be generated manually, and this would have to be repeated every time a change was made to the PowerPoint slides.
The PDFs would then have to be uploaded to the webserver.
Inevitably, the PDFs would sometimes lag behind the PPT versions, creating huge potential for complexity and confusion.
Therefore I am making the slides available in their original (Mac) form only.
If you cannot read the slides, and you did not take notes during class, please find someone in the class who has a Mac and ask them to do the conversion for you.
I apologize for the inconvenience, but given that the slides are essentially provided as a courtesy, I simply cannot commit to generating separate versions for people with PCs;
I have only limited time/resources, and keeping the slides current and updated is a higher priority than keeping them platform-independent.
Computational virus design
Scripting compbio applications
DNA pattern recognition & analysis
Genome and pathway databases
Information content of DNA
Other material
Some material is presented on the chalkboard rather than the projector, so there are no PowerPoint slides for these topics.
Wikipedia has a lot of material. For example:
- Biology & biophysics:
- Probability & information:
- Computational systems biology:
- Sequence analysis:
- etc. (please feel free to start your own page of links!)
Syllabus
Approximate sequence of lectures (see also BioE131 weekly schedule):
- Introductory case study
- Overview of syllabus; available means of assessment:
- group & individual presentations; literature reviews; class participation; homework; exam(s); project
- Review of fundamental molecular biology
- Biophysical principles of RNA and protein folding
- Overview of biological databases
- Introduction to Unix
- Introduction to Perl programming: loops, variables, subroutines; file manipulation; data structures
- Assemblers, compilers, interpreters & virtual machines: machine code, C, Perl and Java
- Biophysics of synthetic biology: RNA folding kinetics & viral genome design
- Sequence alignment algorithms: Needleman-Wunsch, Smith-Waterman, Gotoh, BLAST
- Genome annotation; biological ontologies, pathway databases
- Probabilistic inference; Bayes' theorem; experimental error; expectation and variance
- Quick refresher in basic distributional analysis...
- Basic combinatorics; binomials and multinomials
- Geometric, exponential, Poisson distributions
- Gaussian distribution; mixture distributions
- Quantitative measures of information; illustration via data compression
- Log-likelihood ratios and substitution matrices; coding & cryptography
- Probabilistic models for sequence motifs; "sequence logos"
- Algorithmic complexity & "big-O" notation: examples from compbio
- Finite state machines; multiple alignment; phylogenetic reconstruction
- Rate variation, evolutionary trace and phylogenetic profiling; applications to design
- Structural biology, protein structure prediction, protein design
- Clustering algorithms: K-means, K-medians; application to microarray data analysis
- Sequence assembly & metagenomics; examples (human microbiome; bioenergy)
- Guest lectures: computational biology at Berkeley
Lab practicals
- Unix (9/1 and 9/3)
- Using Wiki (9/8 and 9/10, homework due 9/17)
- Biological Databases (9/15 and 9/17)
- Perl Basics (9/22 and 9/24, homework due 10/1)
- Perl Hashes & Arrays (9/29 and 10/1, homework due 10/15)
- Perl Pattern Matching (10/6 and 10/8)
- RNA folding (10/13 and 10/15, homework due 10/22)
- Sequence Alignment (10/20 and 10/22, homework due 10/31)
- Information Content of DNA (10/27 and 10/29, homework due 11/5)
- Bacterial Gene Prediction (11/3 and 11/5, double homework due 11/21)
- Primate Phylogeny (11/10 and 11/12)
- Pathway Mining (11/17 and 11/19; diffusion limited aggregation homework due 12/3, phylogeny homework due 12/10)
- Protein Visualization (11/24 and 11/26)
- Catch-up lab; project work
Homework exercises
Homework exercises will be assigned in labs and posted on the individual lab pages.
Programming assignments will be graded both for form (style) and function (correctness). Stylistic expectations will be outlined on the style guidelines page by the time the first assignment is given.
Textbooks
No textbook purchase is required to take the class.
References to the following textbook (which can be freely downloaded) appear occasionally as recommended reading:
- The MacKay Book: MacKay, DJC. Information Theory, Inference and Learning Algorithms. ISBN:0521642981
- Can be downloaded from here (copyright grants permission to view but not print)
The following Perl books from O'Reilly may be a useful supplement to what's taught in class:
- The following are the two "classic" Perl tutorial and reference books:
- The "Perl for Bioinformatics" series has more bioinformatics-oriented examples:
Other resources