StockholmFormat

StockholmFormat is a flatfile format for databases of annotated multiple sequence alignments.

It is used by, among others, the Pfam and Rfam databases, which contain alignments of protein and RNA families, respectively.

Erik Sonnhammer's group's page has the format spec (AlexCoventry - 28 Feb 2005).

Here is our mirror of that page (IanHolmes).

Stockholm shows per-column alignment annotations, such as RNA secondary structure, in a compact and (if appropriately indented) human-readable way. For example (pairwise alignment of purine riboswitches):

# STOCKHOLM 1.0
#=GC SS_cons       .................<<<<<<<<...<<<<<<<........>>>>>>>..
AP001509.1         UUAAUCGAGCUCAACACUCUUCGUAUAUCCUC-UCAAUAUGG-GAUGAGGGU
#=GR AP001509.1 SS -----------------<<<<<<<<---..<<-<<-------->>->>..--
AE007476.1         AAAAUUGAAUAUCGUUUUACUUGUUUAU-GUCGUGAAU-UGG-CACGA-CGU
#=GR AE007476.1 SS -----------------<<<<<<<<-----<<.<<-------->>.>>----

#=GC SS_cons       ......<<<<<<<.......>>>>>>>..>>>>>>>>...............
AP001509.1         CUCUAC-AGGUA-CCGUAAA-UACCUAGCUACGAAAAGAAUGCAGUUAAUGU
#=GR AP001509.1 SS -------<<<<<--------->>>>>--->>>>>>>>---------------
AE007476.1         UUCUACAAGGUG-CCGG-AA-CACCUAACAAUAAGUAAGUCAGCAGUGAGAU
#=GR AE007476.1 SS ------.<<<<<--------->>>>>.-->>>>>>>>---------------
//
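
As a rough illustration of how a per-column structure annotation such as the SS_cons line can be read, the sketch below pairs each `<` with its matching `>` and reports the paired column indices (0-based). The function name `base_pairs` is hypothetical, and the sketch assumes only the `<` and `>` characters mark paired columns, as in the example above; every other character is treated as unpaired. When an alignment is split into blocks, the per-block annotation strings would need to be concatenated before calling it.

```python
def base_pairs(ss):
    """Return sorted (open_col, close_col) pairs from a bracket string.

    Assumption: only '<' and '>' denote paired columns; all other
    characters ('.', '-', etc.) are treated as unpaired.
    """
    stack = []   # columns of '<' characters still waiting for a partner
    pairs = []
    for col, ch in enumerate(ss):
        if ch == "<":
            stack.append(col)
        elif ch == ">":
            if not stack:
                raise ValueError(f"unmatched '>' at column {col}")
            pairs.append((stack.pop(), col))
    if stack:
        raise ValueError(f"unmatched '<' at column {stack[-1]}")
    return sorted(pairs)


print(base_pairs("..<<..>>."))  # [(2, 7), (3, 6)]
```

A stack is enough here because the brackets are properly nested: the most recently opened column is always the one closed next.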

Stockholm allows sequences to be split over multiple lines (as in the above example), though this is "discouraged" in the spec.
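
A reader for such files has to join the pieces of each sequence and annotation across blocks. The following is a minimal sketch of that idea, not a full implementation of the spec: the function name `parse_stockholm` is hypothetical, it handles a single alignment, and it simply skips any `#` line other than `#=GC` and `#=GR`.

```python
from collections import defaultdict


def parse_stockholm(text):
    """Parse one Stockholm alignment, joining rows split across blocks.

    Returns three dicts: sequences by name, #=GC annotations by feature,
    and #=GR annotations keyed by (sequence name, feature).
    """
    seqs = defaultdict(str)
    gc = defaultdict(str)
    gr = defaultdict(str)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("# STOCKHOLM"):
            continue                      # blank line or format header
        if line == "//":
            break                         # end of the alignment
        if line.startswith("#=GC"):
            _, feature, value = line.split(None, 2)
            gc[feature] += value          # append this block's columns
        elif line.startswith("#=GR"):
            _, name, feature, value = line.split(None, 3)
            gr[(name, feature)] += value
        elif line.startswith("#"):
            continue                      # other markup is ignored in this sketch
        else:
            name, value = line.split(None, 1)
            seqs[name] += value
    return dict(seqs), dict(gc), dict(gr)


example = """# STOCKHOLM 1.0
#=GC SS_cons  <<..
seq1          ACGU
#=GR seq1 SS  <<--

#=GC SS_cons  ..>>
seq1          UUGU
#=GR seq1 SS  -->>
//
"""

seqs, gc, gr = parse_stockholm(example)
print(seqs["seq1"])         # ACGUUUGU
print(gc["SS_cons"])        # <<....>>
print(gr[("seq1", "SS")])   # <<---->>
```

Appending to a per-name string means the same code handles both single-block files and files split into several blocks.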

StockholmTools

See StockholmTools for a list of tools for working with StockholmFormat.

See also the BioPerl Stockholm class.

Topic revision: r37 - 2011-08-16 - IanHolmes
 
