Bio E 131 Final Project 2009

Your task is to set up and deploy a website, including a genome browser, for an RNA virus genome. The purpose of this website is to be an educational and/or research resource that serves as a portal to additional investigation of the virus, incorporating robust links to external databases, review information, and/or novel analysis.

Each student is assigned one virus; you may, at your option, team up with other students, to a maximum team size of 4. Teams are just one organizational mechanism for collaborating: since every student is assigned a different virus, and since (when grading) we will reward efforts that benefit everyone's projects, sharing of tools and data between teams is actively encouraged. The assigned viruses fall into phylogenetic clades, and we suggest (but do not require) that teams group students whose assigned viruses are phylogenetically similar, since such teams are likely to benefit most from working together.

There is a great deal of flexibility in how you execute your project, since the basic project requirement (set up a JBrowse genome browser for your virus) can and should be extended in a number of different and complementary ways, outlined below.

Students will make presentations on their projects during RRR week, following which there will be a period of peer review. During the peer review period, each student will anonymously review and rank a minimum of five projects outside their own team. These rankings will be used extensively (albeit not exclusively) to grade the projects. The five mandatory projects that each student will review will be assigned later during the project period. Students may rank more than five other projects; this is one of several ways to accrue extra credit.

The presentations will not be graded directly but should be viewed as an opportunity to advertise the strengths of the project to peer reviewers, so as to maximize their rankings of your work.

To find out what your virus is, go here: Project Assignments

The basic project

The project involves developing a website, or more specifically, a collection of interlinked HTML files together with JavaScript files allowing your viral genome annotations to be viewed in a web browser.

As noted, you will be using JBrowse for the genome browser component; one reason for this is that JBrowse does not require any code to be run dynamically on the server when a client requests a page. Therefore, you can test your website by creating static files and directories on the DECF computers and loading these files into Firefox.

The minimum requirement for your project is to

  • select a representative genome for your assigned species of virus
  • obtain FASTA-format sequence and GFF-format gene annotations
    • this may involve writing/finding/running scripts to convert from other formats, e.g. Genbank
  • run the JBrowse preparation scripts on these files
  • test that the JBrowse browser displays in Firefox (or Safari/IE)
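
As an illustration of the format-conversion step, here is a minimal sketch in Python of turning the feature table of a GenBank flat file into GFF3 lines. It is not a complete converter, and several things in it are simplifying assumptions rather than anything prescribed by the project: the function name `genbank_features_to_gff3`, the `source` label written to column 2, collapsing compound locations such as `join(...)` into one span from the smallest to the largest coordinate, writing phase 0 for every CDS, and keeping only the first line of a multi-line qualifier. Established tools (e.g. in Bioperl) handle these cases properly; treat this as a starting point for understanding the two formats.

```python
import re
from urllib.parse import quote

# A feature line starts with exactly five spaces, then the feature key,
# then the location (e.g. "     CDS             complement(150..290)").
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S.*)$")


def genbank_features_to_gff3(genbank_text, seqid, source="genbank"):
    """Return GFF3 text for the FEATURES table of one GenBank record.

    seqid goes in GFF column 1 and should match the name used in your FASTA file.
    """
    features = []  # each entry: [key, location, qualifiers]
    in_table = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table:
            continue
        if line and not line.startswith(" "):
            break  # ORIGIN, CONTIG, etc. end the feature table
        match = FEATURE_LINE.match(line)
        if match:
            features.append([match.group(1), match.group(2).strip(), {}])
        elif features:
            text = line.strip()
            qualifiers = features[-1][2]
            if text.startswith("/"):
                name, _, value = text[1:].partition("=")
                qualifiers.setdefault(name, value.strip('"'))
            elif not qualifiers:
                features[-1][1] += text  # location wrapped onto the next line
            # anything else continues a multi-line qualifier value; ignored here

    gff_lines = ["##gff-version 3"]
    for key, location, qualifiers in features:
        coords = [int(n) for n in re.findall(r"\d+", location)]
        if key == "source" or not coords:
            continue
        # Simplification: only a leading complement(...) is treated as minus strand.
        strand = "-" if location.startswith("complement") else "+"
        phase = "0" if key == "CDS" else "."  # assumption: CDS starts in frame
        name = qualifiers.get("gene") or qualifiers.get("product")
        attributes = "Name=" + quote(name, safe=" ") if name else "."
        gff_lines.append("\t".join([
            seqid, source, key, str(min(coords)), str(max(coords)),
            ".", strand, phase, attributes,
        ]))
    return "\n".join(gff_lines) + "\n"
```

You would call this with the text of your GenBank record and whatever sequence name you used in the FASTA header, then inspect the output by eye before running the JBrowse preparation scripts on it.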

Completion of these requirements will amount roughly to a basic C grade. This grade can be improved by implementing the extensions below, and/or by contributing collaborative resources (tutorial pages, Perl scripts, general pages collecting links/data about genes shared with other viruses, etc.) that other class members can use to improve their projects.

Mitch Skinner, developer of JBrowse, will be available in 381 Stanley on Tuesdays from 3-5pm to answer questions about JBrowse setup and configuration.

Extensions

Extensions are very open-ended, within the general framework of providing a web-based resource for browsing viral genome annotations. Some ideas are as follows:

  • Incorporating links to other databases and resources, either from your web pages, or from JBrowse itself (JBrowse can incorporate outgoing links from the gene features, or via intermediate "link pages" that you can create, if e.g. you have more than one outgoing link from each gene feature). For example....
    • Pubmed citations
    • PDB (Protein Data Bank)
    • Wikipedia (page for your virus, or for genes in your virus)
    • Genbank
    • The RNA Virus Database in Oxford (this is an excellent source of data on many of the assigned viruses, though it will require some SQL experience in your team)
    • Databases, websites or publications specific to your virus
    • This is by no means a complete list! Credit will be given for useful and consistent cross-linking.
  • Performing bioinformatics analyses and incorporating the results of these, or other annotation data, into your genome browser, e.g.
    • Running homology searches against Swissprot (for protein matches), Interpro (for protein domain profile matches), Rfam (for RNA family profile matches), etc.
    • Aligning your genome to closely-related genomes (e.g. other students' assignments...) and incorporating tracks/statistics that summarize these alignments into your JBrowse view (e.g. plotting a column-by-column conservation score as a WIGgle track in JBrowse, using whatever particular conservation scoring metric you deem appropriate)
    • Annotating signals in the genome that have been discussed in the literature or predicted by your own analyses, e.g. RNA structures that are conserved or relevant to function, transcription factor or other binding sites, protein active sites, packaging or replication signals, etc.
    • Identifying other features of interest that you are able to find (either by your own analysis or by examining the literature), e.g. mutational or recombinational hot-spots, drug resistance mutations, etc.
    • Adding tracks showing the GC content, information content, or other sequence statistics
  • Writing overview/summary pages discussing any of the following issues at a molecular level, with reference to your genome:
    • Biology, morphology, pathology, evolution of your virus
    • Engineering applications or modifications of the virus
    • Clinical/therapeutic considerations (e.g. drug resistance)
    • Comparisons to related or other viruses (e.g. shared genes, different genes, structural or other similarities)
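
To make the sequence-statistics and conservation ideas above more concrete, here is a hedged Python sketch that computes two simple per-position scores and writes them in fixedStep wiggle format. The function names, the window size, and the choice of "fraction of aligned sequences identical to the reference base" as a conservation metric are all assumptions made for illustration; you are free (and encouraged) to pick a metric you consider more appropriate. The alignment is assumed to be a set of equal-length rows with `-` for gaps, and columns where your own genome has a gap are skipped so that scores stay in your genome's coordinates.

```python
def gc_content_windows(sequence, window=100):
    """GC fraction for consecutive, non-overlapping windows of the sequence."""
    seq = sequence.upper()
    scores = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        scores.append((chunk.count("G") + chunk.count("C")) / window)
    return scores


def identity_to_reference(reference_row, other_rows):
    """Per reference position, the fraction of other rows matching the reference.

    reference_row and other_rows are rows of one alignment (same length,
    '-' for gaps). Columns where the reference has a gap are skipped.
    """
    scores = []
    for col, ref_char in enumerate(reference_row):
        if ref_char == "-":
            continue  # this column has no coordinate in the reference genome
        matches = sum(1 for row in other_rows
                      if row[col].upper() == ref_char.upper())
        scores.append(matches / len(other_rows))
    return scores


def to_fixed_step_wiggle(seqid, scores, step=1, span=1, start=1):
    """Format scores as a fixedStep wiggle block (wiggle positions are 1-based)."""
    lines = [f"fixedStep chrom={seqid} start={start} step={step} span={span}"]
    lines += [f"{score:.3f}" for score in scores]
    return "\n".join(lines) + "\n"
```

For a GC track with 100-base windows you would pass `step=100, span=100` to `to_fixed_step_wiggle`; for the per-base conservation score the defaults of 1 apply. The `seqid` should again match the sequence name used elsewhere in your browser data.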

An A grade would likely require at least one extension in two and possibly three of the above categories, with the top grades going to projects that also win consistently good peer rankings.

Consistency and coherence will be valued in these extensions. For example, if your overview page discusses drug resistance mutations and your genome browser includes a track for drug resistance mutations, then this would be seen as a plus. If you also incorporated links to a database of mutation phenotypes for this virus, this would be a very strong plus since you would then be spanning all three categories.
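
The intermediate "link pages" mentioned among the extensions can be generated rather than written by hand, which also helps keep cross-linking consistent. The following Python sketch builds one static HTML page per gene from a list of (resource, identifier) pairs. The function name `build_link_page` and the `URL_TEMPLATES` table are illustrative assumptions; the URL patterns are ones in common use at the time of writing, but you should check each against the resource itself, and add entries for any virus-specific databases you link to.

```python
from html import escape
from urllib.parse import quote

# Assumed URL patterns; verify each one before relying on it.
URL_TEMPLATES = {
    "GenBank": "https://www.ncbi.nlm.nih.gov/nuccore/{}",
    "PubMed": "https://pubmed.ncbi.nlm.nih.gov/{}/",
    "PDB": "https://www.rcsb.org/structure/{}",
    "Wikipedia": "https://en.wikipedia.org/wiki/{}",
}


def build_link_page(gene_name, identifiers):
    """Return a static HTML page listing outgoing links for one gene.

    identifiers is a list of (resource, identifier) pairs, where resource
    is a key of URL_TEMPLATES.
    """
    items = []
    for resource, identifier in identifiers:
        url = URL_TEMPLATES[resource].format(quote(identifier))
        items.append(
            f'<li><a href="{escape(url)}">'
            f"{escape(resource)}: {escape(identifier)}</a></li>"
        )
    title = escape(gene_name)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n<ul>\n"
        + "\n".join(items)
        + "\n</ul></body></html>\n"
    )
```

Writing the returned string to a file such as `links/<gene>.html` (a hypothetical layout) gives you a page that a gene feature in the browser can point to when it has more than one outgoing link.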

Incentives for collaboration

The project is intended to emphasize several features of real science: collaboration (and the development of collaborative tools), peer review, and working with real data.

The collaborative aspect is particularly emphasized. The collective goal of this project should be viewed as developing a web-browsable database of RNA virus genome annotations. No two people are assigned the exact same virus and so there will be considerable benefits to collaboration, including collaboration between teams as well as collaboration within teams.

While there is a ranking aspect to the grading scheme, it is by no means a zero-sum game. If the collective output of the class exceeds expectations, then more high grades will be awarded than would otherwise be the case. This is designed to provide an incentive for development of community resources, and contribution to such resources may boost individual grades.

The model we are copying here is, of course, the scientific reputation economy. Scientific culture encourages you to release your secrets, rather than hoard them, because in doing so you will accrue credit in the form of reputation and citations. This culture has evolved and thrived because it creates an incentive structure that benefits humanity as a whole, while recognizing individual contributions. Here we are explicitly attempting to incentivize collaboration by including it as a factor in your final grade, but the principle is the same: we want to leverage the process to create a kick-ass collective project.

As an example, you are asked (as part of the basic project) to set up the JBrowse genome browser. This is a new and experimental genome browser and while some documentation and tutorial information does exist for this browser, it is by no means comprehensive (yet), nor are there any tutorials explicitly aimed at the level of this class. An example of how collaborative contributions might improve your grade would be if you wrote up your experiences with JBrowse as a tutorial wiki page at an early stage during the project, making this tutorial available to other class members, and (for the Win!) encouraging other students to contribute and improve your tutorial page. This is not meant to be a prescriptive example; there are many other similar options (e.g. starting a mailing list for class discussions of JBrowse, or helping answer questions on the existing gmod-ajax mailing list for JBrowse). The point is that we (the graders) will be actively looking for such examples of contributions that not only enhance one particular project, but enhance the collective output of the class as a whole. This is entirely compatible with individual grades; essentially, it reflects the "reputation economy" found in large-scale academic research consortia.

A few examples of good collaborative contributions:

  • Early stages
    • Setting up a central installation of JBrowse and its pre-reqs (e.g. Bioperl) that other teams can use
    • Development or distribution of a useful Perl script for converting file formats
    • Documentation or creation of tutorial pages
  • Later stages
    • Contributions to external user-curated resources such as Wikipedia (positive and successful contributions to Wikipedia will be viewed VERY favorably)
    • Providing early & detailed feedback on other student projects, and/or ranking more than five other projects
  • Overall
    • Developing common components that facilitate consistency between different team entries (e.g. cascading style sheets, common data sets or gene pages, etc.)
    • Taking a leadership role in organizing/co-ordinating collaborations between teams
    • Setting yourself up as a "service provider" to perform a particular specialized analysis for everyone in the class (e.g. running Rfam searches, doing multiple alignments, configuring or even running JBrowse, etc.)

Formal deliverables

The following are the deliverables that you must submit for grading.

  • At an early stage (see timeline) each team will be required to submit team names and member lists, which we will use to avoid conflicts of interest when assigning reviewers.
  • The address of the "landing page" for your website. This will likely be a path to a file on the DECF accounts (although you are free to host your pages as a website on a webserver if you have access to one).
    • The path MUST be accessible to other students, either locally (on the DECF computers) or globally (over the web), so that other students can review your site. Make sure you test this!
  • Zipfile or tarball representing a snapshot of your website, submitted via bSpace
    • This does not need to include portions that are posted on biowiki.org, wikipedia, etc., as long as those portions are directly linked to from your website
    • You may keep updating/modifying your site during the review period, but we would like a snapshot for our records
  • Statement of individual member contributions, including ...
    • contributions to external sites (Wikipedia, biowiki.org, etc.)
    • collaborative tools or resources developed (tutorials, Perl scripts, etc.)
  • Presentation by your team (5 minutes per team member)
  • Rankings of assigned projects for peer review (optionally accompanied by more detailed critiques)
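
Two of the deliverables above lend themselves to a quick script: checking that your site is readable by other students, and producing the snapshot archive. The Python sketch below is one way to do both. It assumes that ordinary Unix permission bits govern access to your files, which may not be true on every file system, and it only checks files and directories at or below the directory you give it (parent directories must also be accessible). The function names are hypothetical. It complements, and does not replace, asking a classmate to open your landing page.

```python
import os
import stat
import tarfile


def find_unreadable(site_root):
    """List paths under site_root that other users cannot read (or enter)."""
    problems = []
    for dirpath, dirnames, filenames in os.walk(site_root):
        mode = os.stat(dirpath).st_mode
        # Directories need both read and execute permission for "others".
        if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
            problems.append(dirpath)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems


def make_snapshot(site_root, archive_path):
    """Write a gzip-compressed tarball of site_root to archive_path."""
    top_level = os.path.basename(os.path.normpath(site_root))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(site_root, arcname=top_level)
    return archive_path
```

An empty list from `find_unreadable` means no permission problems were found below the site directory. Write the snapshot archive somewhere outside the site directory so that it does not end up inside itself.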

Timeline

  • 20 Nov - final project announced; virus assignments made (see Project Assignments)
  • 25 Nov - final project team names & membership lists due
  • 4 Dec - final project peer-review viruses assigned
  • 7 Dec - final project presentations
  • 9 Dec - final project presentations
  • 10 Dec - all submitted final project materials due (paths, zipfiles, statements)
  • 17 Dec - all final project peer rankings due