Stockholm Format is a flatfile format for databases of annotated multiple sequence alignments.
It is the format used e.g. by the Pfam and Rfam databases, containing alignments of protein and RNA families, respectively.
Erik Sonnhammer's group's page has the format spec
(Alex Coventry - 28 Feb 2005).
Here is our mirror of that page (Ian Holmes).
Stockholm now has a Wikipedia page.
Stockholm shows by-column alignment annotations, such as RNA secondary structure, in a compact
and (if appropriately indented) human-readable way.
For example (pairwise alignment of purine riboswitches):
# STOCKHOLM 1.0
#=GC SS_cons       .................<<<<<<<<...<<<<<<<........>>>>>>>..
AP001509.1         UUAAUCGAGCUCAACACUCUUCGUAUAUCCUC-UCAAUAUGG-GAUGAGGGU
#=GR AP001509.1 SS -----------------<<<<<<<<---..<<-<<-------->>->>..--
AE007476.1         AAAAUUGAAUAUCGUUUUACUUGUUUAU-GUCGUGAAU-UGG-CACGA-CGU
#=GR AE007476.1 SS -----------------<<<<<<<<-----<<.<<-------->>.>>----
#=GC SS_cons       ......<<<<<<<.......>>>>>>>..>>>>>>>>...............
AP001509.1         CUCUAC-AGGUA-CCGUAAA-UACCUAGCUACGAAAAGAAUGCAGUUAAUGU
#=GR AP001509.1 SS -------<<<<<--------->>>>>--->>>>>>>>---------------
AE007476.1         UUCUACAAGGUG-CCGG-AA-CACCUAACAAUAAGUAAGUCAGCAGUGAGAU
#=GR AE007476.1 SS ------.<<<<<--------->>>>>.-->>>>>>>>---------------
//
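To make the per-column structure annotation concrete, here is a minimal Python sketch that turns a bracket string such as SS_cons into a list of base-paired column indices. It assumes that only `<` and `>` mark pairs, as in the example, and treats every other character as unpaired; real annotations may use further bracket characters. The function name `paired_columns` is illustrative and not part of any Stockholm library.

```python
def paired_columns(ss):
    """Return sorted (open_col, close_col) pairs, 0-based, from a bracket string."""
    stack = []   # columns of '<' still waiting for a partner
    pairs = []
    for col, ch in enumerate(ss):
        if ch == "<":
            stack.append(col)
        elif ch == ">":
            if not stack:
                raise ValueError(f"unmatched '>' at column {col}")
            pairs.append((stack.pop(), col))
        # any other character ('.', '-', ...) is treated as unpaired
    if stack:
        raise ValueError(f"unmatched '<' at column {stack[-1]}")
    return sorted(pairs)


# Toy bracket string, not taken from a real alignment
print(paired_columns("..<<..<.>.>>."))   # [(2, 11), (3, 10), (6, 8)]
```

Note that a helix may open in one block of a multi-line alignment and close in a later one, so the string passed in should be the full annotation for all columns, not a single block's portion.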
Stockholm allows sequences to be split over multiple lines (as in the riboswitch alignment above),
though this is "discouraged" in the spec.
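A reader therefore has to join the pieces of each row before using them. The following Python sketch shows one way to do that: rows with the same name (or the same `#=GC` / `#=GR` key) are concatenated in order of appearance. It is a simplified illustration, not a full implementation of the spec; the name `parse_stockholm` is hypothetical, `#=GF` and `#=GS` lines are simply skipped, and the input is made-up toy data.

```python
from collections import defaultdict


def parse_stockholm(text):
    """Parse one Stockholm alignment, joining rows split across blocks."""
    seqs = defaultdict(str)     # sequence name -> aligned sequence
    gc = defaultdict(str)       # feature -> per-column annotation
    gr = defaultdict(str)       # (sequence name, feature) -> annotation
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("# STOCKHOLM"):
            continue
        if line == "//":        # end-of-alignment marker
            break
        fields = line.split()
        if fields[0] == "#=GC" and len(fields) == 3:
            gc[fields[1]] += fields[2]
        elif fields[0] == "#=GR" and len(fields) == 4:
            gr[(fields[1], fields[2])] += fields[3]
        elif line.startswith("#"):
            continue            # #=GF, #=GS and comments are skipped here
        elif len(fields) == 2:
            seqs[fields[0]] += fields[1]
    return dict(seqs), dict(gc), dict(gr)


# Toy two-block alignment (invented for illustration)
example = """# STOCKHOLM 1.0
#=GC SS_cons <<<..
seqA         GGCAA
seqB         GGC-A
#=GC SS_cons .>>>.
seqA         UGCCU
seqB         UGCCA
//
"""

seqs, gc, gr = parse_stockholm(example)
print(seqs["seqA"])     # GGCAAUGCCU
print(gc["SS_cons"])    # <<<...>>>.
```

Because fields are split on any run of whitespace, the sketch accepts both padded (column-aligned) and single-space layouts.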
See Stockholm Tools for a list of tools for working with Stockholm Format.
Also see the Bioperl Stockholm class.