-- JacobVogan - 24 Nov 2011

Homework #8a - Bayesian Analysis of DNA Sequence Origin

  • ORF Bayesian Perl script - outputs the posterior probability of the sequence origin

Features:

  • Reads a FASTA-formatted sequence file as input
  • Computes the probability distribution, P(x), of each nucleotide "x" in sequence "S"
  • Outputs the posterior probability P(G=1|S) and the log of the ratio P(G=1|S) / P(G=0|S)
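
The first two features can be sketched as follows. This is a minimal Python illustration, not the attached Perl script; the function names `read_fasta` and `nucleotide_distribution` are hypothetical, and the sketch assumes P(x) is estimated as the relative frequency of each nucleotide in S, ignoring any character other than A, C, G, or T.

```python
from collections import Counter


def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; the header is the text after '>'.
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}


def nucleotide_distribution(seq):
    """Estimate P(x) for x in A, C, G, T as relative frequencies in seq."""
    counts = Counter(c for c in seq.upper() if c in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {x: counts[x] / total for x in "ACGT"}
```

For a file on disk, `read_fasta(open("file.fasta"))` would yield one entry per sequence, and each sequence can then be passed to `nucleotide_distribution`.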


  • perl file.fasta        # Outputs the posterior probability and log ratio of posterior probabilities
  • perl file.fasta -h        # Prints program information
  • perl file.fasta -move [#]        # Shifts the reading frame by # nucleotides (e.g. -move 1 shifts the reading frame by 1 nucleotide)
  • perl file.fasta -stop [X]        # Specifies a custom STOP codon (e.g. TGG)
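
The effect of the reading-frame and STOP-codon options can be illustrated with a short Python sketch. It is not the attached Perl script: the names `codons` and `has_stop` are hypothetical, and the sketch assumes that a custom STOP codon replaces the three standard ones (TAA, TAG, TGA) rather than being added to them.

```python
DEFAULT_STOPS = ("TAA", "TAG", "TGA")  # the standard STOP codons


def codons(seq, move=0):
    """Split seq into complete codons, starting `move` nucleotides in."""
    seq = seq.upper()
    # Stop at len(seq) - 2 so a trailing partial codon is dropped.
    return [seq[i:i + 3] for i in range(move, len(seq) - 2, 3)]


def has_stop(seq, move=0, stops=DEFAULT_STOPS):
    """Return True if any codon in the chosen reading frame is a STOP."""
    stop_set = {s.upper() for s in stops}
    return any(c in stop_set for c in codons(seq, move))
```

With the sequence `ATGAAATAG`, frame 0 reads ATG, AAA, TAG and frame 1 reads TGA, AAT, so both contain a STOP codon, while frame 2 (GAA, ATA) does not.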


Background on sequence origin probabilities P(G=0) and P(G=1) can be found here.

This program first computes the posterior probability P(G=0|S) from the likelihood P(S|G=0). The likelihood P(S|G=0) is the probability that the G=0 model would not generate a STOP codon over the given sequence length (in codons) and reading frame. It depends on P(x=T), P(x=A), and P(x=G), because the standard STOP codons TAA, TAG, and TGA contain only these three nucleotides. From this, P(G=0|S) is computed by:
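
The standard form of this step is Bayes' theorem over the two hypotheses G=0 and G=1. In the sketch below, n (the number of codons in the reading frame) and p_stop (the probability that a single codon is a STOP) are illustrative symbols, and both the independence between nucleotide positions and the use of the three standard STOP codons are assumptions rather than details confirmed by this page:

```latex
\begin{align}
p_{\text{stop}} &= P(T)P(A)P(A) + P(T)P(A)P(G) + P(T)P(G)P(A) \\
P(S \mid G=0) &= \left(1 - p_{\text{stop}}\right)^{n} \\
P(G=0 \mid S) &= \frac{P(S \mid G=0)\,P(G=0)}
                      {P(S \mid G=0)\,P(G=0) + P(S \mid G=1)\,P(G=1)}
\end{align}
```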


The posterior probability P(G=1|S) is then:
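
Because G takes only the values 0 and 1, the two posteriors sum to one, so P(G=1|S) = 1 - P(G=0|S). A minimal Python sketch of the whole calculation follows. It is not the attached Perl script; the function names are hypothetical, and the default values `lik_g1=1.0` (a G=1 sequence never contains a STOP codon) and `prior_g1=0.5` are assumptions chosen for illustration only.

```python
import math


def likelihood_no_stop(p, n_codons, stops=("TAA", "TAG", "TGA")):
    """P(S|G=0): chance that n_codons independent codons contain no STOP.

    p maps each nucleotide to its probability P(x).
    """
    # Probability that one codon is a STOP, assuming independent positions.
    p_stop = sum(p[s[0]] * p[s[1]] * p[s[2]] for s in stops)
    return (1.0 - p_stop) ** n_codons


def posteriors(lik_g0, lik_g1=1.0, prior_g1=0.5):
    """Return (P(G=0|S), P(G=1|S)) by Bayes' theorem.

    lik_g1 and prior_g1 are assumed values, not taken from the source.
    """
    prior_g0 = 1.0 - prior_g1
    evidence = lik_g0 * prior_g0 + lik_g1 * prior_g1
    post_g0 = lik_g0 * prior_g0 / evidence
    return post_g0, 1.0 - post_g0


def log_ratio(post_g1, post_g0):
    """Natural log of P(G=1|S) / P(G=0|S), with infinities at the extremes."""
    if post_g0 == 0:
        return math.inf
    if post_g1 == 0:
        return -math.inf
    return math.log(post_g1 / post_g0)
```

A positive log ratio favors G=1 and a negative one favors G=0. Under these assumptions, longer STOP-free reading frames shrink P(S|G=0) and therefore push the ratio upward.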


Terminal Example of

      For a test.fasta file with sequences:



Attachment: ORF Bayesian Perl script - outputs posterior probability on sequence origin (txt, revision r4, 7.0 K, 2011-11-30 02:04, JacobVogan)