Biology of Genomes 2007: Noncoding RNA Poster
The submitted abstract:
PREDICTION AND EVOLUTIONARY ANALYSIS OF NONCODING RNA
Andrew Uzilov, Ian Holmes
Department of Bioengineering, University of California, Berkeley
Noncoding RNA genes and regulatory elements (ncRNAs) constitute a significant fraction of eukaryotic genomes. The secondary structure of these features is often conserved throughout evolution. Alignments of ncRNAs typically display a characteristic pattern of nested, covariant substitutions, corresponding to conserved stem helices.
We have predicted thousands of putative conserved ncRNAs in D. melanogaster by analyzing multiple genome alignments of twelve Drosophila species and scanning these alignments for covariation. We compare these predictions to previously published genome-wide surveys of noncoding DNA transcription and intra-species population genetics that suggest widespread conservation of noncoding DNA in Drosophila.
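As a rough illustration of what a covariation signal looks like, the sketch below computes the mutual information between two alignment columns. This is a simple, tree-unaware stand-in for illustration only; it is not the phylo-grammar method used for the predictions above. The toy alignment and the function name `mutual_information` are made up for this example.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Rows with a gap ('-') in either column are skipped.
    """
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    freq_i = Counter(a for a, _ in pairs)
    freq_j = Counter(b for _, b in pairs)
    freq_ij = Counter(pairs)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Hypothetical toy alignment, one row per species (not real data).
# Columns 1 and 5 change together while staying complementary;
# columns 0 and 6 are perfectly conserved.
alignment = [
    "GCAUAGC",
    "GGAUACC",
    "GAAUAUC",
    "GUAUAAC",
]
columns = list(zip(*alignment))

covarying = mutual_information(columns[1], columns[5])
conserved = mutual_information(columns[0], columns[6])
print(f"covarying pair: {covarying:.2f} bits")
print(f"conserved pair: {conserved:.2f} bits")
```

Mutual information rewards columns that vary together but gives no credit to a perfectly conserved pair, and it ignores the phylogeny relating the sequences, which is one reason tree-aware models are preferred for real genome scans.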
We complement these genome-wide studies with focused evolutionary analysis of particular ncRNA alignments, starting with those whose function is well studied. Measurements of the evolutionary substitution rate in loop vs. stem regions of these ncRNA alignments highlight those ncRNAs with stem-located motifs complementary to trans-targets, such as microRNAs: the stem regions in these genes evolve slowly relative to other ncRNAs. Within individual ncRNA families, it is also possible to identify loops and stems that contain protein-binding sites, based on their slower evolutionary rate. We can also detect changes in selection pressure throughout the phylogenetic tree. These results extend popular molecular evolutionary tools for protein-coding sequences, such as ratios of synonymous/nonsynonymous rates, to noncoding RNA.
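The stem vs. loop comparison can be sketched with a crude proxy: the mean pairwise difference per column, averaged separately over paired and unpaired positions of a known structure. This is an assumption-laden illustration, not the rate estimation described above, which fits substitution rates on a phylogenetic tree. The alignment, the dot-bracket structure, and the function names are hypothetical.

```python
from itertools import combinations


def mean_pairwise_difference(column):
    """Fraction of sequence pairs that differ at this alignment column."""
    pairs = list(combinations(column, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


def stem_and_loop_divergence(alignment, structure):
    """Average per-column divergence over stem and loop positions.

    `structure` is a dot-bracket string: '(' and ')' mark paired (stem)
    columns, '.' marks unpaired (loop) columns.
    """
    columns = list(zip(*alignment))
    stem = [mean_pairwise_difference(c)
            for c, s in zip(columns, structure) if s in "()"]
    loop = [mean_pairwise_difference(c)
            for c, s in zip(columns, structure) if s == "."]
    return sum(stem) / len(stem), sum(loop) / len(loop)


# Hypothetical toy data: a conserved stem flanking a variable loop.
structure = "((...))"
alignment = [
    "GCAUAGC",
    "GCCUAGC",
    "GCGAAGC",
    "GCUAUGC",
]

stem_div, loop_div = stem_and_loop_divergence(alignment, structure)
print(f"stem divergence: {stem_div:.3f}")
print(f"loop divergence: {loop_div:.3f}")
```

A stem that diverges much more slowly than the loop, as in this toy case, is the kind of pattern that would flag a stem-located functional motif. In practice the comparison would use model-based rate estimates rather than raw pairwise differences.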
Our sequence analysis uses a new bioinformatics tool called XRATE that implements exact or close equivalents of all previously described phylo-HMM and phylo-grammar models (including, e.g., PFOLD, Exoniphy, DLESS, PhastCons and EvoFold). This is done within a single framework, facilitating consistency of annotation for exons, ncRNAs and other conserved features. Computational biologists can prototype new phylo-grammars by editing a model specification file; the model can be immediately trained on fresh data without the need for coding. The trained phylo-grammar can be used to annotate genome alignments, or to investigate evolutionary rates in annotated alignments. The software is open source and freely available at http://biowiki.org/dart.
XRATE has also been used to scan vertebrate genomes for pseudogenes that recently became nonfunctional, to measure evolutionary rates of ancient repeats in ENCODE regions, and to probabilistically reconstruct genotypes ancestral to present-day sequences.
-- Created by: Andrew Uzilov on 17 Apr 2007