Genome Wiki

The idea of a genome wiki was articulated by Salzberg in "Genome re-annotation: a wiki solution?", Genome Biol. 2007;8:102.

Here, the term means a web-based genome browser to which annotation tracks can be uploaded persistently, so that they can be shared with other users.

This page presents a single point of view on what a "genome wiki" should be, acknowledging that the term is ambiguous and the concept somewhat flexible.

Ideal properties

"A wiki would allow the community of experts to work out the best name for each gene, to indicate uncertainty where appropriate and to discuss alternative annotations." - Salzberg, "Genome re-annotation: a wiki solution?", Genome Biol. 2007;8:102.

An ideal genome wiki should have the following properties:

  1. core wiki functionality: a wiki-wiki, Web 2.0 sort of feel, empowering community collaboration
    • powerful search
    • ability to upload, share, discuss, tag & edit track annotations
    • management of user accounts
    • social-networking tools: who else is working on, or near, this genome?
  2. bioinformatic granularity: the UI should offer per-feature operations (e.g. on individual genes or exons) as well as per-track or per-genome operations:
    • simple, fast, fluid, responsive, real-time genome browser interface
    • editing of features from within the genome browser; track & feature merge operations
    • browsing of feature & track revision histories
    • appropriate consortium-oriented access controls (privacy, sharing, approval)
  3. robust database properties
  4. portability
    • open source
    • client works in any web browser
    • server is agnostic to hardware platform
    • well-documented (for users and developers)
  5. compatibility
    • compatible with standard web apps & protocols (e.g. PageRank, RSS, OpenID, Wikipedia, deep search, semantic web...)
    • compatible with standard bioinformatics formats (GFF, BED, WIG, MIAME, etc.)
    • close integration with established databases (UCSC, Ensembl, Wikipedia) and terminologies (GO, InterPro, etc.)
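
To make "bioinformatic granularity" concrete, here is a minimal sketch of what per-feature editing with a browsable revision history might look like underneath the UI. It assumes features arrive as GFF3 lines (nine tab-separated columns, 1-based inclusive coordinates). The names Feature, Revision, Track and parse_gff3_line are hypothetical and are not taken from JBrowse, Genboree or any other tool mentioned on this page; percent-encoding and multi-valued attributes are ignored for brevity.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Feature:
    """One annotated interval, mirroring the nine GFF3 columns."""
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Feature:
    """Parse a single GFF3 feature line (no percent-decoding; sketch only)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    attributes = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
    return Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                   cols[5], cols[6], cols[7], attributes)


@dataclass(frozen=True)
class Revision:
    """One saved version of a feature, with who changed it and why."""
    author: str
    comment: str
    feature: Feature


class Track:
    """A track whose features are versioned individually (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._history: Dict[str, List[Revision]] = {}

    def save(self, feature_id: str, feature: Feature,
             author: str, comment: str) -> None:
        # Append rather than overwrite, so earlier versions stay browsable.
        self._history.setdefault(feature_id, []).append(
            Revision(author, comment, feature))

    def current(self, feature_id: str) -> Feature:
        return self._history[feature_id][-1].feature

    def history(self, feature_id: str) -> List[Revision]:
        return list(self._history[feature_id])


if __name__ == "__main__":
    track = Track("community gene models")
    gene = parse_gff3_line(
        "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=abc")
    track.save("gene1", gene, "alice", "initial upload")
    # A per-feature edit: only this gene gets a new revision.
    track.save("gene1", replace(gene, end=2100), "bob", "extend 3' end")
    for rev in track.history("gene1"):
        print(rev.author, rev.feature.start, rev.feature.end, rev.comment)
```

The point of the sketch is the unit of versioning: edits are recorded per feature rather than per uploaded file, which is what distinguishes this from treating a whole track as an opaque attachment.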

Pragmatic approximations

The simple JBrowse.TWikiPlugin makes a game effort at being a wiki with genome-browsable attachments, but takes only baby steps toward bioinformatic granularity: an attachment is essentially a track, and individual features cannot be edited or operated on except with an external text editor. It is similarly limited in database infrastructure: it is filesystem-based, though the JBrowse part does use some cool dataset-indexing structures.

Genboree scores well on most points. It is much more evolved than the simple JBrowse plugin in terms of transactional & user infrastructure, though arguably less fluid than JBrowse's minimalistic interface. It is perhaps also less portable, in the sense of being unusual or difficult to install on a new system, since it appears designed to run mostly from one central site; this is not so much a disadvantage as a different operations model (cf. Blogger vs. WordPress).

Not to be confused with

The name "genome wiki" is ambiguous. Others have interpreted it to mean a "gene wiki": a wiki-fied version of a page-based web database.

This could be a wiki whose pages discuss genes in an individual genome (cf. the FlyBase gene pages) or entries in a database (such as Pfam families, or Gene Ontology terms).

For example:

In fact this seems like a fair subset of what Salzberg was discussing ("work out the best name for each gene") and illustrates the importance of interoperability between such web applications, due to the need to switch between multiple views on the data.

In the idealized genome wiki discussed here, a genome browser (with the ability to drag tracks around and compare different sources of data) should be a bit more central to the interface than it is in, say, Wikipedia.

Many genome browsers (UCSC, GBrowse) allow transient uploading of one's own annotations for comparison with the reference annotations, but this is different from persistent upload, where the tracks can be published or privately shared with others. A "wiki" implies persistence of uploads and edits.
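
As a rough illustration of the difference, the sketch below stores uploaded tracks in an on-disk SQLite file, so that they outlive the session that created them and can be published or shared with named users. This is only one possible design under stated assumptions: the TrackStore class, its methods and the table layout are hypothetical, and nothing here describes how UCSC, GBrowse or any other browser actually stores uploads.

```python
import os
import sqlite3
import tempfile
from typing import Dict, List


class TrackStore:
    """Hypothetical persistent store for uploaded tracks, with simple sharing."""

    def __init__(self, path: str) -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tracks ("
            "name TEXT PRIMARY KEY, owner TEXT NOT NULL, "
            "is_public INTEGER NOT NULL, data TEXT NOT NULL)")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS shares ("
            "track TEXT NOT NULL, username TEXT NOT NULL, "
            "PRIMARY KEY (track, username))")
        self.db.commit()

    def upload(self, name: str, owner: str, data: str,
               public: bool = False) -> None:
        # Written to disk, so the track survives after this session ends.
        self.db.execute("INSERT OR REPLACE INTO tracks VALUES (?, ?, ?, ?)",
                        (name, owner, int(public), data))
        self.db.commit()

    def share(self, name: str, username: str) -> None:
        """Privately share a track with one other user."""
        self.db.execute("INSERT OR IGNORE INTO shares VALUES (?, ?)",
                        (name, username))
        self.db.commit()

    def visible_to(self, username: str) -> List[str]:
        """Tracks a user may see: public, owned, or explicitly shared."""
        rows = self.db.execute(
            "SELECT name FROM tracks WHERE is_public = 1 OR owner = ? "
            "OR name IN (SELECT track FROM shares WHERE username = ?) "
            "ORDER BY name", (username, username))
        return [row[0] for row in rows]

    def close(self) -> None:
        self.db.close()


def demo() -> Dict[str, List[str]]:
    """Upload in one 'session', then reopen the store and list visible tracks."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tracks.db")

        store = TrackStore(path)
        store.upload("my_genes", owner="alice", data="chr1\t999\t2000\tgene1\n")
        store.upload("reference", owner="curator",
                     data="chr1\t0\t5000\tref\n", public=True)
        store.share("my_genes", "bob")
        store.close()  # the uploading session ends here

        reopened = TrackStore(path)  # a later session, possibly another user
        views = {user: reopened.visible_to(user)
                 for user in ("alice", "bob", "carol")}
        reopened.close()
        return views
```

A transient upload would correspond to keeping the same data only in memory for the lifetime of one browser session; the persistent version adds an owner, a visibility flag and a sharing list, which is where the wiki-style access controls listed above would attach.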

The UCSC genome browser wiki is another, more literal, interpretation of the phrase "genome wiki":

-- Ian Holmes - 20 Mar 2009 - with thanks to --


Andrew Su points out that connections to existing databases are critical to build momentum:

the biggest thing is integration with an existing critical mass of users (because growing one from scratch is tough). So, how about links to/from the UCSC and Ensembl genome browsers? -- Andrew Su (via friendfeed), 23 Mar 2009

I agree, and have added this explicitly next to "existing web apps" and "existing databases". - Ian