windoweater.pl
A layer between windowlicker.pl and xrate in our pipeline; it filters out windows whose conservation falls below a threshold, so that they are never passed to xrate.
Located in:
/nfs/projects/pipeline/perl/windoweater.pl
windoweater.pl is part of the pipeline CVS project;
see http://biowiki.org/HowToRunPipeline#Check_out_the_project
the Point
See also: How To Run Pipeline
We want to filter windows based on conservation.
After some discussion, we decided that a suitably generic way to implement this would be to:
- annotate each alignment segment with a conservation annotation (e.g. #=GC CONS)
- have windowlicker.pl invoke a "fake xrate" process instead of the real xrate
- the "fake xrate" process (i.e. windoweater.pl) then receives the windows made by windowlicker.pl, decides whether each one is conserved enough, and passes only the conserved ones to xrate, reducing runtime
- windows that aren't passed to xrate get rapidly annotated as "all intergenic" by windoweater.pl
- as a result, the windowlicker.pl GFF output will not include an ncRNA annotation for non-conserved windows, as there will be no non-intergenic annotation in them
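The filtering decision described in the list above can be sketched roughly as follows. This is an illustration in Python, not the actual Perl code in windoweater.pl. It assumes that the #=GC CONS line holds one digit (0-9) per alignment column, that a column counts as conserved when its digit is at least the cutoff, and that a window passes when the fraction of conserved columns is at least mincols. Those semantics, and the function names parse_gc_cons and is_conserved, are assumptions made for the example; check perldoc windoweater.pl for the real behaviour.

```python
def parse_gc_cons(stockholm_text):
    """Return the '#=GC CONS' annotation of a Stockholm block as one string.

    Pieces from several '#=GC CONS' lines (interleaved format) are concatenated.
    """
    parts = []
    for line in stockholm_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "#=GC" and fields[1] == "CONS":
            parts.append(fields[2])
    return "".join(parts)


def is_conserved(cons, cutoff=5, mincols=0.65):
    """Decide whether a window is conserved enough to be sent to xrate.

    ASSUMPTION (not taken from windoweater.pl): each character of `cons` is a
    per-column score digit; a column is conserved when its digit >= cutoff,
    and the window passes when the conserved fraction >= mincols.
    Non-digit characters (e.g. gap symbols) count as not conserved.
    """
    if not cons:
        return False  # no annotation: treat the window as not conserved
    conserved = sum(1 for ch in cons if ch in "0123456789" and int(ch) >= cutoff)
    return conserved / len(cons) >= mincols


if __name__ == "__main__":
    window = "# STOCKHOLM 1.0\nseq1 ACGU\nseq2 ACGA\n#=GC CONS 9992\n//\n"
    print(is_conserved(parse_gc_cons(window)))
```

In this sketch a window that fails the test would be the one that windoweater.pl annotates as "all intergenic" without calling xrate.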
how to use
You can get a usage message (describing all the options) via:
./windoweater.pl --help
Alternatively, use perldoc (looks prettier):
perldoc windoweater.pl
Here is an example of how to use the program in the Xrate Pipeline:
/nfs/src/dart/perl/windowlicker.pl \
  -w 100 \
  -d 50 \
  -gff segment.gff \
  -x /nfs/projects/pipeline/perl/windoweater.pl \
  segment.annot.stock \
  -- \
  --eater_verbose \
  --eater_cutoff 5 \
  --eater_mincols 0.65 \
  --eater_xrate /nfs/src/dart/bin/xrate \
  -e /nfs/src/dart/grammars/jukescantor.eg \
  -g /nfs/projects/caf1screen/grammars/ncRnaDualStrand_v15.eg
Important points about the above example:
- we invoke windoweater.pl instead of xrate with the -x option to windowlicker.pl
- the options after -- normally all go to xrate; however, in this case, options starting with --eater_ will go to windoweater.pl, and the remaining options after -- will go to xrate
- the path to the xrate binary is now specified using the --eater_xrate arg
- the numbers (i.e. window/step size, conservation cutoff) are all made up and probably aren't what you want in the real pipeline
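The way the options after -- are divided between windoweater.pl and xrate can be sketched as below. This is a Python illustration only, not the script's Perl implementation. The name split_args and the EATER_FLAGS set are hypothetical, and the sketch assumes that --eater_verbose is a bare flag while every other --eater_ option takes exactly one value, which matches the example above but is not confirmed against the real option list.

```python
EATER_PREFIX = "--eater_"

# HYPOTHETICAL: eater options that are bare flags and take no value.
EATER_FLAGS = {"--eater_verbose"}


def split_args(argv):
    """Split the arguments after '--' into (eater_args, xrate_args).

    Options starting with '--eater_' (plus their value, if any) go to the
    first list; everything else is kept, in order, for xrate.
    """
    eater, xrate = [], []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith(EATER_PREFIX):
            eater.append(arg)
            # ASSUMPTION: any eater option that is not a known flag takes one value.
            if arg not in EATER_FLAGS and i + 1 < len(argv):
                i += 1
                eater.append(argv[i])
        else:
            xrate.append(arg)
        i += 1
    return eater, xrate


if __name__ == "__main__":
    args = ["--eater_verbose", "--eater_cutoff", "5",
            "--eater_xrate", "/nfs/src/dart/bin/xrate",
            "-e", "/nfs/src/dart/grammars/jukescantor.eg"]
    eater_args, xrate_args = split_args(args)
    print(eater_args)
    print(xrate_args)
```

Keeping the xrate arguments in their original order means they can be handed to the real xrate binary unchanged once the window has passed the conservation check.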
The above is based on the segment.xrate rule in the Avu Tests CVS project.
-- Created by: AndrewUzilov on 18 Dec 2007