Quick JBrowse tutorial

This quick guide to setting up JBrowse should be regarded as experimental.

install pre-requisites

  • BioPerl, JSON (e.g. from CPAN command-line client)
    • or skip to next step and get an Amazon EC2 machine image

Installing BioPerl on an Apple OS X machine can be problematic; if the regular installation fails repeatedly, you may need to do a force install from CPAN.
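Before moving on, it can help to confirm that Perl can actually load the prerequisites. The following is a minimal sketch and not part of JBrowse: the helper name have_perl_module is made up for illustration, and it assumes that BioPerl can be detected through its Bio::Perl module and that the JSON prerequisite is the CPAN module named JSON.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with JBrowse): succeeds if Perl can load
# the named module, fails otherwise (including when perl itself is missing).
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Assumed module names: Bio::Perl (from BioPerl) and JSON.
for module in Bio::Perl JSON; do
    if have_perl_module "$module"; then
        echo "$module: found"
    else
        echo "$module: missing (try installing it from CPAN)"
    fi
done
```

If a module is reported missing and the regular install keeps failing, the force install mentioned above can be attempted with the standard cpan client, for example cpan -f -i Bio::Perl.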

download the source

Either download and unpack a source tarball, or clone the public git repository. The git clone syntax is

git clone git://github.com/jbrowse/jbrowse.git

To run git clone, you need to have Git installed.

After git clone (or tar -xvzf), enter the top-level jbrowse directory

cd jbrowse

Everything from now on assumes you are sitting in the root directory of the cloned (or otherwise downloaded) repository.
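Since every later command depends on the working directory, a quick guard can save some confusion. This is only a sketch: the function name in_jbrowse_root is hypothetical, and it assumes that the presence of index.html and a bin/ directory is enough to recognise the top level of the checkout.

```shell
#!/bin/sh
# Hypothetical guard: succeeds if the given directory (default: the current
# directory) contains index.html and a bin/ directory, as the root of a
# JBrowse checkout is expected to.
in_jbrowse_root() {
    dir=${1:-.}
    [ -f "$dir/index.html" ] && [ -d "$dir/bin" ]
}

if in_jbrowse_root; then
    echo "OK: this looks like the jbrowse root directory"
else
    echo "Not in the jbrowse root; cd into the cloned directory first"
fi
```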

compile binaries

To build the C++ parts (only WIG tracks currently require this, for the wig2png program), type

make

On an Apple OS X machine, you may need to add libpng and png.h to your compiler's library and include paths, either by editing the Makefile or by overriding its variables on the command line.
  • e.g. make GCC_LIB_ARGS=-L/opt/local/lib GCC_INC_ARGS=-I/opt/local/include
    • or somewhere like /usr/X11R6
    • Try locate libpng
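The search for libpng can be scripted roughly as follows. This is a sketch under assumptions: the helper find_png_prefix is hypothetical, and it assumes the usual layout where a prefix holds include/png.h and lib/libpng*. It only prints the make command, so nothing is built until you have checked the paths.

```shell
#!/bin/sh
# Hypothetical helper: print the first prefix that contains include/png.h and
# at least one lib/libpng* file; fail if no candidate prefix qualifies.
find_png_prefix() {
    for prefix in "$@"; do
        if [ -f "$prefix/include/png.h" ] &&
           ls "$prefix"/lib/libpng* >/dev/null 2>&1; then
            echo "$prefix"
            return 0
        fi
    done
    return 1
}

# Candidate prefixes taken from the text above; adjust for your machine.
if prefix=$(find_png_prefix /opt/local /usr/X11R6); then
    echo "make GCC_LIB_ARGS=-L$prefix/lib GCC_INC_ARGS=-I$prefix/include"
else
    echo "libpng not found in the candidate prefixes; try: locate libpng"
fi
```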

three easy steps

Prepare (i) reference sequences, (ii) tracks, (iii) name index.

See workflow diagram for overview.

stick yer annotations here

run some scripts

We will now walk through each preparation step individually.

In what follows, a number of Perl scripts in the bin/ directory are referred to (e.g. bin/prepare-refseqs.pl).

Running any of these scripts without any options should display a short help message.
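To see all of the help messages in one go, something like the following could be used. It is a sketch, not a supported tool: the function show_script_help is hypothetical, the script names are the ones mentioned in this tutorial, and it relies on the behaviour described above (no options means a help message).

```shell
#!/bin/sh
# Hypothetical helper: run each preparation script with no options so that it
# prints its help message. The directory defaults to bin/.
show_script_help() {
    dir=${1:-bin}
    for script in prepare-refseqs.pl flatfile-to-json.pl wig-to-json.pl \
                  biodb-to-json.pl generate-names.pl; do
        if [ -f "$dir/$script" ]; then
            echo "== $script =="
            perl "$dir/$script"
        else
            echo "$dir/$script: not found"
        fi
    done
}

show_script_help bin
```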

sequences

First you have to break the reference sequences, specified in your FASTA file, into chunks, using bin/prepare-refseqs.pl.

tracks

Next you have to prepare your annotation tracks.

These can include co-ordinates of genes and other finite elements, specified in GFF files and processed with bin/flatfile-to-json.pl...

...or they can include quantitative tracks (histograms), specified in WIG files and processed with bin/wig-to-json.pl.

A variety of command-line arguments to these two scripts can be used to change the appearance and layout of tracks.

The scripts can be run repeatedly to add multiple tracks from different data sources.

names

Your feature co-ordinate tracks will be navigable by feature name, through a text box.

After generating the tracks, you have to build a name index with bin/generate-names.pl.

On Linux machines with SELinux enabled (Fedora, recent CentOS and Red Hat), you need to mark the files generated by generate-names.pl as servable by Apache. To see if SELinux is enabled, run the getenforce command. If it replies "Enforcing", you need to run an additional command after generate-names.pl:

chcon -R -t httpd_sys_content_t names/

If getenforce replies with something else, or if there's no getenforce command on your machine, then you don't need to worry about it.
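The SELinux check described above can be wrapped up as a small script. This is a sketch: the function names selinux_mode and needs_chcon are hypothetical, and the script assumes it is run from the directory that contains names/.

```shell
#!/bin/sh
# Hypothetical helper: report the SELinux mode, or "absent" when there is no
# getenforce command on this machine.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "absent"
    fi
}

# Relabelling is needed only when the mode is exactly "Enforcing".
needs_chcon() {
    [ "$1" = "Enforcing" ]
}

mode=$(selinux_mode)
if needs_chcon "$mode"; then
    if [ -d names ]; then
        chcon -R -t httpd_sys_content_t names/
    else
        echo "names/ not found; run generate-names.pl first"
    fi
else
    echo "SELinux mode is '$mode': nothing to relabel"
fi
```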

point yr browser here

This assumes that Apache can serve the directory you are working in.

For (rather fragile) reference/debugging purposes, here are the files you should expect to be able to access

Downloaded with git repository

  • index.html
  • genome.css
  • js/
  • jslib/

Generated by preparation scripts (& JsonGenerator, NCList, LazyPatricia, ... packages in lib/)

  • data/
    • refSeqs.js
    • trackInfo.js
    • tracks/
    • tiles/
  • names/
    • root.json

(Fragile, because this stuff may change as the code matures. But potentially useful for alpha-test troubleshooting)
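For troubleshooting, the two lists above can be turned into a quick check. This is a sketch with a hypothetical helper, missing_paths, and it is just as fragile as the lists themselves: it only tests that each path exists, not that its contents are valid.

```shell
#!/bin/sh
# Hypothetical helper: print every expected path that is missing under the
# given directory, one per line. Prints nothing when all are present.
missing_paths() {
    root=$1
    for path in index.html genome.css js jslib \
                data/refSeqs.js data/trackInfo.js data/tracks data/tiles \
                names/root.json; do
        [ -e "$root/$path" ] || echo "$path"
    done
}

missing=$(missing_paths .)
if [ -z "$missing" ]; then
    echo "All expected files are present"
else
    echo "Missing:"
    echo "$missing"
fi
```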

additional sections

the makefile

An example Makefile is provided (as part of the TWiki plugin) which uses suffix rules to implement the workflow shown in the diagram more-or-less exactly, with a few typical GNU make kludges (e.g. symlinking files into temporary directories, touching empty targets, etc.)

Please note that the Makefile is provided to illustrate script usage, and possibly to help write plugins (such as the TWiki plugin); it's NOT intended as a supported serverside interface to the preparation scripts.

To run all preparation scripts on default (basic) settings, assuming your annotation files all have the correct filename suffixes and reside in the current directory, just type:

make jbrowse -f twiki/JBrowsePlugin/Makefile.jbrowse

This runs all three preparation steps for you (sequences, tracks & feature names).

The default Makefile rules recognize .fasta, .gff, .bed, and/or .wig as valid (case-sensitive) filename suffixes and associated file types.

The effective "annotation data source" (cf. the workflow diagram) is the current working directory.

The Makefile also recognizes an optional track configuration file, config.js.

In theory you should now be able to point your browser at this directory and start browsing annotations.

In practice you will typically want to hand-walk the process a little more than this.

file formats & suffix rules

The following file types are automatically recognized by the Makefile, via the filename suffix:

Filename suffix   File format   Contents
.fa, .fasta       FASTA         Sequence data
.gff              GFF           Gene, feature co-ordinates
.bed              BED           Simple feature co-ordinates
.wig              WIG           Quantitative data
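To preview which files in a directory would be picked up, the suffix table can be mirrored in a few lines of shell. This is an illustration only, not the Makefile's own logic: the function format_for is hypothetical, and it follows the table above, with case-sensitive matching.

```shell
#!/bin/sh
# Hypothetical helper mirroring the suffix table: print the file format for a
# filename, or fail if the suffix is not recognised. Matching is
# case-sensitive.
format_for() {
    case "$1" in
        *.fa|*.fasta) echo "FASTA" ;;
        *.gff)        echo "GFF" ;;
        *.bed)        echo "BED" ;;
        *.wig)        echo "WIG" ;;
        *)            return 1 ;;
    esac
}

# Report the recognised files in the current directory.
for file in *; do
    if format=$(format_for "$file"); then
        echo "$file: $format"
    fi
done
```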

example "genome wiki" using makefile

This is a simple "JBrowse plugin" for the TWiki wiki management system. It runs the JBrowse Makefile on uploaded data files whenever a user attaches a file to a wiki page.

build from database

The power tool of the JBrowse scripts is bin/biodb-to-json.pl, a more flexible alternative to flatfile-to-json.pl.

the config file

The JBrowse config file, read by biodb-to-json.pl, controls the rendering of feature tracks from a single data source.

The data source can be Chado, or it can be another Bioperl-compatible database -- including, for example, a simple GFF file.

The full documentation for the config file is at docs/config.html in the git repository.

using config files with make

It's possible to use snippets of JBrowse config files with the included Makefile, to control generation of feature tracks from GFF files in the current directory, but this functionality is currently limited.

If a config.js file is present in the current working directory, containing a list of tracks to make from a GFF file, the Makefile will run biodb-to-json.pl on every GFF file instead of flatfile-to-json.pl. NB the config.js file should include only the track descriptions from the full JBrowse config file; that is, where the JBrowse config file has this...

{
  "description": "blah",
  "db_adaptor": "blah blah",
  "db_args": { ... },
  "TRACK DEFAULTS": { ... },

  "tracks" : [ X, Y, Z ]
}

...the config.js file should have only this...

X, Y, Z

The rest is autogenerated by the Makefile.
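To make the relationship between the two files concrete, here is one way a track-list fragment could be wrapped into a complete config. This is purely a sketch and is not taken from the Makefile: the function wrap_tracks is hypothetical, the description, db_adaptor and db_args values are placeholders supplied by the caller, and the "TRACK DEFAULTS" entry is left out for brevity.

```shell
#!/bin/sh
# Hypothetical sketch: wrap a track-list fragment (the contents of config.js)
# in the outer structure of a full config file. Arguments: fragment file,
# description, db_adaptor, db_args (a JSON object given as text).
wrap_tracks() {
    fragment_file=$1 description=$2 adaptor=$3 args=$4
    printf '{\n'
    printf '  "description": "%s",\n' "$description"
    printf '  "db_adaptor": "%s",\n' "$adaptor"
    printf '  "db_args": %s,\n' "$args"
    printf '  "tracks" : [ %s ]\n' "$(cat "$fragment_file")"
    printf '}\n'
}

# Placeholder values only; substitute the real settings for your data source.
if [ -f config.js ]; then
    wrap_tracks config.js "blah" "blah blah" '{}'
fi
```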

It's probably just as easy to write a complete JBrowse config file (see above) and run biodb-to-json.pl directly.

building from chado

more to follow here, we hope... or add your own comments on serving JBrowse from Chado...

example: flybase

WRITE ME!

workflow diagram

Boxes indicate scripts; circles are data files (inputs & outputs). The outputs of the key preparation steps are bold ovals.

Dashed lines show alternate/optional steps.

biodb-to-json can replace flatfile-to-json for a richer set of configuration options (e.g. working directly from a Bioperl database).

This workflow is more-or-less exactly implemented by the example Makefile and the TWikiPlugin.

digraph G {

subgraph InputSide { InputDatabase -> FASTA; InputDatabase -> BED [style=dashed]; InputDatabase -> GFF; InputDatabase -> WIG; }

subgraph ClientSide { RefSeqs -> Client; TrackInfo -> Client; PatriciaTrie -> Client; }

subgraph Makefile { FASTA -> PrepareRefseqs -> RefSeqs; PrepareRefseqs -> TrackInfo; BED -> FlatfileToJson [style=dashed]; GFF -> FlatfileToJson -> TrackInfo; WIG -> WigToJson -> TrackInfo;

InputDatabase -> PrepareRefseqs [style=dashed]; InputDatabase -> BioDBToJson [style=dashed]; Config -> PrepareRefseqs [style=dashed]; Config -> BioDBToJson [style=dashed]; GFF -> BioDBToJson [style=dashed]; BioDBToJson -> TrackInfo [style=dashed]; RefSeqs -> BioDBToJson [style=dashed]; BioDBToJson -> FeatureNames [style=dashed];

RefSeqs -> FlatfileToJson; RefSeqs -> WigToJson; FlatfileToJson -> FeatureNames; FeatureNames -> GenerateNames -> PatriciaTrie; }

FASTA [label="FASTA file"]; GFF [label="GFF file(s)"]; BED [style=dashed,label="BED file(s)"]; WIG [label="WIG file(s)"]; Config [style=dashed,label="JBrowse config file"];

PrepareRefseqs [shape=box,label="prepare-refseqs.pl"]; BioDBToJson [shape=box,style=dashed,label="biodb-to-json.pl"]; FlatfileToJson [shape=box,label="flatfile-to-json.pl"]; WigToJson [shape=box,label="wig-to-json.pl"]; GenerateNames [shape=box,label="generate-names.pl"];

RefSeqs [style=bold,label="Reference sequence info"]; TrackInfo [style=bold,label="Track info"]; FeatureNames [label="Feature names"]; PatriciaTrie [style=bold,label="Name index"];

InputDatabase [shape=diamond,label="Annotation data source"]; Client [shape=diamond,label="Javascript client"];

label="JBrowse server workflow"; }

Like winter's bare branch
functions and statements sloughed off
tree-structured json

-- ChrisMungall, MitchSkinner, IanHolmes (c) EvolutionarySoftwareFoundation - 20 Mar 2009

Copyright © 2008-2013 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.