Notes on our AJAX GBrowse RECOMB 2007 poster
EDITING OF THE POSTER CONTENT IS NOW CLOSED
FINAL POSTER:
We presented at Session A:
- Sunday, April 22: 5:30 pm – 7:30 pm
- Monday, April 23: 7:30 am – 5:00 pm
See also
Poster final draft (FOR ARCHIVING ONLY, DO NOT CHANGE; edits will have no effect because the poster is already being printed)
HEADER
- AJAX GBrowse: Community Genome Annotation Made Easy
- Mitchell E. Skinner1, Andrew V. Uzilov1, Chris J. Mungall2, Lincoln D. Stein3, and Ian Holmes1
- 1 Department of Bioengineering, University of California - Berkeley, Berkeley, CA, USA
- 2 Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- 3 Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- (mitch_skinner@berkeley.edu, andrew_uzilov@berkeley.edu, cjm@fruitfly.org, lstein@cshl.org, ihh@berkeley.edu)
INTRODUCTION/BACKGROUND
The continued and growing popularity of web-based genome browsers such as the UCSC Genome Browser [1], Ensembl [2], and GBrowse [3] indicates a community need for tools that aid in accessing and visualizing vast amounts of genomic annotation data. Part of their popularity is their accessibility; a web interface does not require users to install or configure local software. However, traditional CGI-based web applications suffer from significant user interface limitations that impede their usability. We describe here a version of the GBrowse genome browser that overcomes these limitations using modern web technologies, while at the same time encouraging users to contribute their own annotations to genomes in a community-driven fashion.
User Interface Enhancements
Problems with current web-accessible genome browsers:
- Limitations of page-based user interface
- lag between user requesting an action (e.g. moving to a new genome location) and the browser performing this action
- entire page reloads after every action
- A lot of server overhead: entire page, including all the graphics, re-rendered by the server even after minor actions/changes
Desired improvements:
- The user interface needs to be smoother and faster. Only those parts of the page that change should be reloaded.
- The client should behave as a dynamic application rather than a passive displayer of static pages.
- As much work as possible should be offloaded onto the client in order to relieve the server of computational overhead.
The philosophy we bring to these improvements is the recently-popularized AJAX (Asynchronous JavaScript And XML) approach.
Community Annotation
Recently, flaws with the rigid nature of annotation repositories such as GenBank have been pointed out [4,5]:
- They do not provide a framework for collaborating on constantly changing data; instead they act as a library that is hard to update with new material.
- Only the original submitter is allowed to change the data, making fixing mistakes found by others difficult.
- When community annotation is possible (e.g. GeneRIF [6]), the community annotations are often ghettoized, marginalized, or otherwise deprecated.
We prefer a genome wiki which encourages collaborative genome annotation, with abilities to:
- Upload your own features/annotations (and have that data persistently remain).
- Modify the annotations of others.
- Add/modify information about annotations/features.
- Link the genome browser to a wiki for each of the features.
Our goal is to create such a genome wiki framework that will use an AJAX user interface.
CURRENT STATUS
We have implemented a prototype AJAX genome browser by extending the Generic Model Organism Database (GMOD) Project’s GBrowse open-source genome browser framework [3].
Implemented features:
- All genome views are pre-rendered and stored as images on the server, eliminating live rendering overhead and delay.
- There is much less server overhead during user interaction.
- The client (JavaScript code running in the user's web browser) provides the following, all without page reloading:
- dynamic dragging, scrolling, and zooming of genomic views
- toggling and rearranging of feature tracks
- switching of chromosomes/scaffolds
- other user interface controls that operate without causing a page refresh
- genomic features can be clicked to open a menu containing detailed feature information
- toggling feature tracks to feature density plots if the features are too dense
- Feature upload capability - users can upload a GFF file with feature data, which renders and appears as a new track.
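As a rough illustration of the density toggle in the list above, the sketch below decides whether a region should be drawn as individual features or as a density plot, and bins features for the plot. Measuring density as features per kilobase, and the names `choose_display`, `density_histogram`, `max_per_kb` and `bin_size`, are assumptions made for this example; they are not taken from the AJAX GBrowse code.

```python
def choose_display(feature_starts, region_start, region_end, max_per_kb):
    """Return "density" when features per kilobase exceed the cutoff, else "features".

    The region is the half-open interval [region_start, region_end) in bases.
    """
    if region_end <= region_start:
        raise ValueError("region must have positive length")
    length_kb = (region_end - region_start) / 1000
    in_region = sum(1 for s in feature_starts if region_start <= s < region_end)
    return "density" if in_region / length_kb > max_per_kb else "features"


def density_histogram(feature_starts, region_start, region_end, bin_size):
    """Count feature start positions per fixed-size bin across the region."""
    span = region_end - region_start
    n_bins = -(-span // bin_size)  # ceiling division
    counts = [0] * n_bins
    for s in feature_starts:
        if region_start <= s < region_end:
            counts[(s - region_start) // bin_size] += 1
    return counts
```

With a cutoff chosen at rendering time, `choose_display` would pick the mode for each track and `density_histogram` would supply the values to plot.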
It is possible to pre-render large genomes in a realistic timespan on common hardware.
For example, 16 tracks of annotation (at all zoom levels) of the 28-megabase Drosophila melanogaster chromosome 3R take less than 4 hours to pre-render on a single 2.2 GHz AMD Opteron, using less than 800 MB RAM.
Rendering time scales linearly with the size of the chromosome; memory usage scales roughly linearly with the number of features being rendered, but the 800 MB space requirement above includes a substantial constant factor.
It is also possible to parallelize the rendering by dispatching each track/zoom level combination as a separate job on a cluster.
When the work was split across a cluster of 16 processors, the rendering time for chromosome 3R dropped to about 1.5 hours (the slowest single track/zoom level combination takes 43 minutes to render).
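The job split described above can be sketched as follows. This is a minimal illustration, not our dispatch code: `make_jobs` enumerates one job per track/zoom level pair, and `estimate_wall_time` uses a simple longest-job-first greedy assignment to estimate the total time on a given number of processors. The function names and any durations passed in are hypothetical.

```python
import heapq
from itertools import product


def make_jobs(tracks, zoom_levels):
    """One independent rendering job per (track, zoom level) combination."""
    return list(product(tracks, zoom_levels))


def estimate_wall_time(durations, workers):
    """Estimate wall-clock time when jobs are assigned greedily, longest first.

    Each job goes to whichever worker currently has the least work queued.
    The result can never be lower than the duration of the slowest job.
    """
    loads = [0.0] * workers
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + d)
    return max(loads)
```

The estimate makes the lower bound visible: however many processors are added, the wall-clock time cannot drop below the slowest single job.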
AVAILABILITY
IMPLEMENTATION
Pre-rendering
- BioPerl's Bio::Graphics libraries do genome-wide layout of the track's features.
- We iterate through the track in small tiles, rendering for each tile only the graphics primitives that overlap it.
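A minimal sketch of the tile iteration, assuming each graphics primitive is reduced to a horizontal pixel span on the virtual canvas. The `Primitive` tuple and the `overlaps` and `iter_tiles` functions are hypothetical names for illustration; the real code works with Bio::Graphics and GD objects in Perl.

```python
from collections import namedtuple

# Horizontal extent of one graphics primitive, in canvas pixels (inclusive).
Primitive = namedtuple("Primitive", ["name", "x_min", "x_max"])


def overlaps(prim, tile_start, tile_end):
    """True if the primitive touches the half-open pixel range [tile_start, tile_end)."""
    return prim.x_min < tile_end and prim.x_max >= tile_start


def iter_tiles(canvas_width, tile_width, primitives):
    """Yield (tile index, primitives to draw) for each tile, left to right."""
    for tile_start in range(0, canvas_width, tile_width):
        tile_end = min(tile_start + tile_width, canvas_width)
        to_draw = [p for p in primitives if overlaps(p, tile_start, tile_end)]
        yield tile_start // tile_width, to_draw
```

A primitive that straddles a tile boundary appears in the draw list of every tile it touches, so each tile can be rendered independently of its neighbours.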
An alternative approach to avoid pre-rendering
Since pre-rendering may require a non-trivial amount of up-front processing, we have explored the possibility of rendering tiles on demand:
- Features are laid out using Bio::Graphics as before.
- We intercept all function calls to render GD graphics primitives and store them in a database.
- When the client requests a tile, the graphics primitives overlapping that tile are fetched and the tile is rendered on demand.
This approach slows down the user experience compared to complete pre-rendering, but the rendered tiles are saved for later, so there is no delay on a subsequent viewing.
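The on-demand approach can be sketched as below. This is an illustration under stated assumptions: an in-memory SQLite table stands in for the primitives database, `_render` is a placeholder that returns the fetched rows instead of drawing an image, and the class name, table schema and `render_count` counter are all hypothetical.

```python
import sqlite3


class OnDemandTileStore:
    """Stores recorded primitives and renders each tile the first time it is requested."""

    def __init__(self, tile_width):
        self.tile_width = tile_width
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE primitive (kind TEXT, x_min INTEGER, x_max INTEGER)"
        )
        self.cache = {}        # tile index -> rendered tile, kept for later requests
        self.render_count = 0  # how many tiles were actually rendered

    def record_primitive(self, kind, x_min, x_max):
        """Store one intercepted drawing call (pixel span is inclusive)."""
        self.db.execute("INSERT INTO primitive VALUES (?, ?, ?)", (kind, x_min, x_max))

    def get_tile(self, index):
        """Return the tile, rendering it only if it has not been rendered before."""
        if index not in self.cache:
            start = index * self.tile_width
            end = start + self.tile_width
            rows = self.db.execute(
                "SELECT kind, x_min, x_max FROM primitive "
                "WHERE x_min < ? AND x_max >= ? ORDER BY x_min",
                (end, start),
            ).fetchall()
            self.render_count += 1
            self.cache[index] = self._render(rows)
        return self.cache[index]

    @staticmethod
    def _render(rows):
        # Placeholder: a real implementation would replay the primitives onto an image.
        return tuple(rows)
```

The first request for a tile pays the rendering cost; repeat requests are served from the cache, which matches the behaviour described above.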
Feature/track upload by users
- Uploaded tracks are persistent (reside on the server indefinitely).
- Updates of feature data can be performed by re-uploading to the same track.
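For orientation, uploaded data arrives as GFF, which has nine tab-separated columns per feature line. The sketch below parses one such line; the `Feature` class and `parse_gff_line` function are hypothetical names, the attributes column is left unparsed, and lines are assumed to carry all nine columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    seqid: str
    source: str
    type: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: str
    attributes: str  # column 9, kept as raw text


def parse_gff_line(line: str) -> Optional[Feature]:
    """Parse one GFF line; return None for blank lines and '#' comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, _score, strand, _phase, attrs = cols
    return Feature(seqid, source, ftype, int(start), int(end), strand, attrs)
```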
The rendering process:
- The user uploads new data in GFF format.
- The server stores the data, creates a new browser track for it, and begins rendering the track's tiles.
- Once the furthest-out zoom level (which is the fastest to render) is finished, the server redirects the user back to the genome browser, which displays the new track along with the existing tracks.
- The server then continues rendering the remaining zoom levels in the background; the user can view whatever data is rendered without having to wait for all of the zoom levels to be done.
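The ordering of this process can be sketched as follows: render the furthest-out zoom level first and block until it is done, then hand the remaining levels to a background worker. Representing a zoom level as a bases-per-pixel number, and the names `render_uploaded_track` and `render_zoom`, are assumptions for this example only.

```python
import threading


def render_uploaded_track(zoom_levels, render_zoom):
    """Render the furthest-out zoom level now and the rest in the background.

    zoom_levels: bases-per-pixel values; a larger value means zoomed further out.
    render_zoom: callable that renders all tiles for one zoom level.
    Returns the background thread (or None if there was nothing to render).
    """
    ordered = sorted(zoom_levels, reverse=True)
    if not ordered:
        return None

    # Blocking step: the user is redirected to the browser once this returns.
    render_zoom(ordered[0])

    def render_rest():
        for zoom in ordered[1:]:
            render_zoom(zoom)

    worker = threading.Thread(target=render_rest, daemon=True)
    worker.start()
    return worker
```

The caller can redirect the user as soon as the function returns, and the returned thread can be joined if something needs to wait for the full render.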
Libraries and frameworks used by AJAX GBrowse
FUTURE GOALS
To build a complete framework providing web services for collaborative, democratic genome annotation by the biology community.
- Port all current GBrowse features (such as search, bookmarking, and richer user interface options) to our framework.
- Improve feature upload and management of uploaded features (e.g. allow users to merge, subtract and compute intersections across uploaded sets)
- "Point and click" addition of new features to community-annotation tracks, to allow for spontaneous annotations.
- Integrate our genome browser with a wiki framework, such that each feature is linked to a wiki page that allows collaborative discussion about it.
REFERENCES
- Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, and Haussler D. The Human Genome Browser at UCSC. Genome Res. 2002, 12:996-1006.
- Hubbard TJ, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Ouverdin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E. Ensembl 2007. Nucleic Acids Res. 2007, 35:D610-617.
- Stein LD, Mungall C, Shu S, Caudy M, Mangone M, Day A, Nickerson E, Stajich JE, Harris TW, Arva A, Lewis S. The generic genome browser: a building block for a model organism system database. Genome Res. 2002, 12:1599-1610.
- Wang K. Gene-function wiki would let biologists pool worldwide resources. Nature 2006, 439:534.
- Salzberg SL. Genome re-annotation: a wiki solution? Genome Biology 2007, 8:102.
- GeneRIF - Gene Reference Into Function. http://www.ncbi.nlm.nih.gov/projects/GeneRIF
- Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JG, Korf I, Lapp H, Lehvaslaiho H, Matsalla C, Mungall CJ, Osborne BI, Pocock MR, Schattner P, Senger M, Stein LD, Stupka E, Wilkinson MD, Birney E. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 2002, 12:1610-1618.
ACKNOWLEDGEMENTS
IH, AVU and MES were funded in part by NIH/NHGRI grant 1R01GM076705-01.
FIGURES
- Fig 1: Navigation using the "classic" CGI-based GBrowse interface versus the AJAX GBrowse interface. Clicking on a scroll button to change the view causes a delay in the CGI-based GBrowse because the server has to render a new HTML page and graphics on demand. In contrast, the AJAX GBrowse approach fetches and caches surrounding tiles, allowing the user to drag the view without delay or page refresh.
- 4-panel
- top shows "before", bottom shows "after"
- an hourglass for the "classic GBrowse" denotes delay going from top to bottom
- left: original GBrowse
- right: AJAX GBrowse screenshot, with the "grabbing hand" icon indicating dragging
- Fig 2: Feature upload interface in action.
- feature upload in action - a prominent arrow will go from "before track upload" to the "after track upload" panels
- 3 panels:
- left panel: AJAX GBrowse before a new track is added
- middle panel: track upload screen with file selection box open
- right panel: AJAX GBrowse with the new track
- the sole point of this figure is to have an attractor to the short caption underneath that will describe the feature upload in brief... it is much more likely a passing viewer will read figures and captions rather than the text, so we should try to summarize our key points in the captions
- Fig 3: Client components and the drop-down menu for clickable features. Each tile has a PNG file containing its graphics and an HTML file storing feature information for the drop-down menu.
- shows a screenshot of the client with drop-down menu popped open
- graphically defines:
- tile
- track
- feature
- clickable feature box
- Fig 4 [MAYBE - space and time permitting]: Feature density toggle. If the features in a track become too dense, the track is switched to a feature density plot instead. The density cutoff is user-configurable during rendering time.
- shows before-and-after describing feature density plot toggle
- Fig 5: Components of the AJAX GBrowse framework. Green: modules specific to AJAX GBrowse. Brown: GMOD's GBrowse and BioPerl modules. Blue: other programs/frameworks. Tan: documents/files.
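The tile fetching and caching mentioned in the Fig 1 caption can be sketched as below. The real client is JavaScript; Python is used here only for brevity. The one-tile prefetch margin and the names `visible_tiles`, `TileCache` and `fetch` are assumptions for illustration.

```python
def visible_tiles(view_left, view_width, tile_width, n_tiles, margin=1):
    """Indices of tiles covering the viewport, plus `margin` tiles on each side."""
    first = view_left // tile_width
    last = (view_left + view_width - 1) // tile_width
    lo = max(0, first - margin)
    hi = min(n_tiles - 1, last + margin)
    return list(range(lo, hi + 1))


class TileCache:
    """Remembers fetched tiles so that dragging only requests tiles not seen before."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable: tile index -> tile data
        self.tiles = {}

    def ensure(self, indices):
        """Fetch any missing tiles; return the indices that had to be fetched."""
        fetched = []
        for i in indices:
            if i not in self.tiles:
                self.tiles[i] = self.fetch(i)
                fetched.append(i)
        return fetched
```

Because the neighbouring tiles are already cached, a short drag usually needs at most one new tile per side, which is what makes the motion appear continuous.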
What we submitted/got accepted
This is what appears on the RECOMB 2007 poster presenters page:
Enhancements to the GBrowse Genome Browser
Andrew Uzilov, Mitchell Skinner, Chris Mungall, and Ian Holmes
Keywords: genome browser, genome annotation, AJAX, genomics
1. Introduction.
The continuing increase in genomic data requires computational tools that biologists can use for visualizing this data. Genome browsers that provide a Web interface (such as the UCSC Genome Browser, Ensembl, GBrowse, and others) are a popular example of such tools. However, there are some problems with the user interface of current Web-based genome browsers that diminish their usability. For example, with all current CGI-based browsers, there is a lag between the user requesting an action (such as moving to a new genome location, zooming in/out, changing view properties, etc.) and the browser performing this action. This delay is caused by the server load involved in re-rendering the entire page, which must occur even if only a small component is changed.
It is desirable to improve the user interface to make it smoother and faster, such that no page reloads are required; i.e. the genome browser behaves as a dynamic application rather than a sequence of static pages. The client should be made to do as much work as possible, relieving the server of computational overhead and maintaining state, except to provide snippets of data asynchronously when requested by the client. This is the recently-popularized AJAX (Asynchronous JavaScript And XML) approach.
We have extended the GMOD (Generic Model Organism Database) Project’s GBrowse [1] open-source genome browser framework to pre-render all genome views, storing them as static images on the server and eliminating the live rendering overhead, and completely rewrote the client in JavaScript to have a dynamic, draggable interface that enables continuous motion without delay to the user. The pre-rendering is accomplished by using the BioPerl [2] Bio::Graphics framework used by GBrowse, writing entire chromosomes to a large virtual canvas that is broken up into tiles of a manageable size. We therefore re-use the sizeable and thriving GBrowse codebase for glyph layout (which provides solutions to issues such as “bumping”), semantic zooming, rendering, and other features.
We are continuing work on porting all existing GBrowse features to this new implementation, including the ability for users to upload their own data to a public server. We are also augmenting the current feature set by adding the ability to do community annotation. Users will be able to add, tag, modify, and otherwise collaborate on biological data in a public forum using this graphical interface, thus providing a “genome wiki.” The client will be aware of the features it is displaying and will be capable of querying multiple databases for rich feature information.
2. Software and files.
The demo of our implementation can be found at http://genome.biowiki.org. The source code is a part of the publicly available GMOD code base, downloadable from a SourceForge repository (http://sourceforge.net/projects/gmod/, see the ajax subdirectory). This is a collaborative, open-source project that welcomes the comments, suggestions, and participation of everyone. Discussion and additional details can be found on our wiki page (http://biowiki.org/view/GBrowse/WebHome).
3. Figures and tables.
None.
4. References and bibliography.
[1] Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., Lewis, S. 2002. The generic genome browser: a building block for a model organism system database. Genome Research 12: 1599-1610.
[2] Stajich, J.E., Block, D., Boulez, K., Brenner, S.E., Chervitz, S.A., Dagdigian, C., Fuellen, G., Gilbert, J.G., Korf, I., Lapp, H., Lehvaslaiho, H., Matsalla, C., Mungall, C.J., Osborne, B.I., Pocock, M.R., Schattner, P., Senger, M., Stein, L.D., Stupka, E., Wilkinson, M.D., Birney, E. 2002. The Bioperl toolkit: Perl modules for the life sciences. Genome Research 12: 1610-1618.