-- ConorMcclune - 16 Oct 2011

RNA folding Homework

1. Hammerhead ribozyme Yes gate:

sequence: GGGCGACCCUGAUGAGCUUGAGUUUAGCUCGUCACUGUCCAGGUUCAAUCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC

OFF position:

I opened Figure 2 of the Breaker & Penchovsky paper and typed the sequence of their YES gate into a file, which I named hammerYes. I then ran RNAfold with the following command:

$ RNAfold -p < hammerYes > hammerYesStruct

This feeds the sequence in hammerYes to RNAfold and writes the dot-bracket notation of the MFE structure and its free energy (in kcal/mol) to hammerYesStruct. The MFE structure and energy are:

((((((((((((((((((((.(..(((.......))).)))))))).))))).....(((((....))))).)))))))) (-35.10)
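A result line like the one above can be split into its structure and energy with a short regular expression. This is only a minimal sketch: the subroutine name parse_fold_line is made up for illustration, and it assumes the line has the form "structure (energy)" as shown above.

```perl
use strict;
use warnings;

# Split one RNAfold result line into its dot-bracket structure and energy.
# Returns an empty list if the line does not look like a result line.
# (parse_fold_line is a hypothetical helper, not part of RNAfold.)
sub parse_fold_line {
    my ($line) = @_;
    if ( $line =~ /^([().]+)\s+\(\s*(-?\d+(?:\.\d+)?)\)\s*$/ ) {
        return ( $1, $2 + 0 );
    }
    return;
}

my ( $struct, $energy ) = parse_fold_line(
    "((((((((((((((((((((.(..(((.......))).)))))))).))))).....(((((....))))).)))))))) (-35.10)"
);
printf "%d positions, %d base pairs, %.2f kcal/mol\n",
    length($struct), ( $struct =~ tr/(// ), $energy;
```

Counting the "(" characters gives the number of base pairs, which is a quick sanity check when comparing against a published figure.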

RNAfold also writes a PostScript (.ps) file visualizing the folded structure. The -p option instructs the program to also write a dot plot in .ps format. You can convert both PostScript files to PDFs with the command:

$ convert name.ps name.pdf

These PDFs can then be viewed in Adobe Reader (command: $ acroread name.pdf) and compared to the structure and dot plot of the OFF-state YES gate in the Breaker & Penchovsky paper. Both the dot plot and the structure prediction match the paper (see attached files 1-HammerYesOff-dot.pdf and 1-HammerYesOff-struct.pdf). In the OFF state, the cleavage site and stems I and III are available, but the OBS pairs with nucleotides required to form stem II. Thus, the active hammerhead ribozyme structure does not form.

ON position:

To predict the structure of the ON state of the hammerhead YES switch, we need to make the oligonucleotide binding sequence (OBS) unavailable for pairing with the rest of the structure, because it will be tightly bound to its complementary oligonucleotide. RNAfold has a convenient feature for this: the -C option lets you introduce folding constraints. I opened the sequence file and replaced the 22 nucleotides of the OBS with x's, indicating to RNAfold -C that these positions will not be involved in any interactions.

hammerYesOn: GGGCGACCCUGAUGAGCUUGAGUUUxxxxxxxxxxxxxxxxxxxxxxAUCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC

I then ran the following command:

$ RNAfold -p -C < hammerYesOn > hammerYesStructOn

Once again, I extracted the MFE structure and energy from the output file (hammerYesStructOn):

((((((((.......((((((...........................))))))...(((((....))))).)))))))) (-28.53)

As before, RNAfold wrote PostScript files for the RNA structure and the dot plot (because I had included the -p option). Again, I converted these files to PDFs to view them in Adobe Reader (see attached files 1-HammerYesON-dot.pdf and 1-HammerYesON-struct.pdf). The structure and plot once again matched those in Figure 2 of the paper. In this case, all three stems form and the cleavage site is available, meaning the ribozyme would cleave.

An interesting observation is that the stem structures that differ between the two states (in the 35-55 nt region) are each faintly visible on the other state's McCaskill dot plot. This means that when the complementary oligonucleotide is absent and the gate is (mostly) in the OFF structure, it will still spend a very small fraction of its time in the ON structure (and vice versa). This ultimately allows the complementary oligonucleotide to bind when it becomes present, because the OBS is occasionally exposed.
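To get a rough feel for how small that fraction is, the two energies reported above can be turned into a relative Boltzmann weight. This is only a back-of-the-envelope sketch: it assumes the constrained energy (-28.53) can be read as the energy of the ON-like structure of the free RNA, it compares two single structures rather than whole ensembles, and the subroutine name boltzmann_ratio is made up for illustration.

```perl
use strict;
use warnings;

# Relative Boltzmann weight of a structure that lies $ddg kcal/mol above
# another one, at temperature $temp_c (degrees Celsius).
# (boltzmann_ratio is a hypothetical helper for this estimate.)
sub boltzmann_ratio {
    my ( $ddg, $temp_c ) = @_;
    my $R  = 0.0019872;                    # gas constant in kcal/(mol K)
    my $RT = $R * ( $temp_c + 273.15 );
    return exp( -$ddg / $RT );
}

my $e_off = -35.10;    # MFE of the unconstrained (OFF) fold, from above
my $e_on  = -28.53;    # energy of the constrained (ON-like) fold, from above
printf "ON-like : OFF weight ratio = %.1e\n",
    boltzmann_ratio( $e_on - $e_off, 37 );
```

With a gap of about 6.6 kcal/mol at 37 °C the ratio comes out on the order of 1e-5, which is consistent with the ON-like stems being only barely visible in the OFF dot plot.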

2. Software for verifying YES gate

This program would first create two sequence variables: $offseq is the input RNA sequence and $onseq is the same sequence with the OBS nucleotides replaced by x's. $onseq can be constructed easily by using the OBS coordinates, given as the second input, to take two substrings of the input sequence (before and after the OBS) and concatenating a string of x's between them.
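The substring-and-concatenate step could look like the sketch below. The subroutine name mask_obs is hypothetical, and the coordinates 26-47 (1-based, inclusive) are inferred by comparing the original sequence from part 1 with the x-masked hammerYesOn sequence.

```perl
use strict;
use warnings;

# Replace the OBS (1-based, inclusive coordinates) with x's.
# (mask_obs is a hypothetical helper name.)
sub mask_obs {
    my ( $seq, $obs_start, $obs_end ) = @_;
    my $len    = $obs_end - $obs_start + 1;
    my $before = substr( $seq, 0, $obs_start - 1 );    # everything 5' of the OBS
    my $after  = substr( $seq, $obs_end );             # everything 3' of the OBS
    return $before . ( "x" x $len ) . $after;
}

my $offseq = "GGGCGACCCUGAUGAGCUUGAGUUUAGCUCGUCACUGUCCAGGUUCAAUCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC";
my $onseq  = mask_obs( $offseq, 26, 47 );
print "$onseq\n";
```

The masked string has the same length as the input, so positions in the two dot-bracket predictions stay directly comparable.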

The next step would be to use RNAfold to generate dot-bracket MFE predictions for both sequences. This could be done with a file handle, as demonstrated in the RNA lab:

open(my $fold_off, "|-", "RNAfold > Temp_off") or die "Cannot start RNAfold: $!";
print $fold_off "$offseq\n";
close($fold_off);
# note: -C is used here so the x's are counted as unpaired nucleotides
open(my $fold_on, "|-", "RNAfold -C > Temp_on") or die "Cannot start RNAfold: $!";
print $fold_on "$onseq\n";
close($fold_on);

Each temporary file consists of the sequence on the first line and the dot-bracket MFE structure and its energy on the second line. Now we want to look for correctly folded hammerhead structures. For each temporary file, skip to line 2, then run a subroutine that scans for the correct ribozyme secondary structure (explained below):

sub find_ribostruct {
    my ($dbstruct) = @_;                                # dot-bracket structure is the input
    my $hammerA = "((((((((.......((((((";              # first part of hammerhead structure
    my $hammerB = "))))))...(((((....))))).))))))))";   # second part of hammerhead structure
    # \Q...\E makes the brackets and dots match literally
    my $pattern = qr/\Q$hammerA\E(\S+)\Q$hammerB\E/;
    for my $attempt (1, 2) {
        if ( $dbstruct =~ $pattern ) {
            my ($middle, $start, $end) = ($1, $-[1], $+[1]);
            return ($start, $end) if brack_count($middle);
        }
        # second attempt: read the structure in the opposite direction
        # (the offsets returned then refer to the reversed string)
        $dbstruct = reverse $dbstruct;
        $dbstruct =~ tr/()/)(/;
    }
    return ("no", "ribozyme");
}

This subroutine uses a regular expression to search for the two pieces of the dot-bracket string that make up a hammerhead ribozyme. It must also verify that the dot-bracket string between $hammerA and $hammerB has an equal number of open and close brackets; a second subroutine (brack_count) does this and returns 1 if the counts are equal and 0 if they are not. If the subroutine finds both ribozyme pieces bounding a region with equal numbers of "(" and ")", it returns an index array containing the start and end location of the region between the two pieces. (This is the region we do not care about, so long as it does not disrupt the ribozyme structure. The OBS is probably somewhere in it.) If it fails to find the ribozyme structure, it reverses the dot-bracket string, swaps the bracket directions, and tries again. If that also fails, the index array returned is simply ("no","ribozyme").
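One possible brack_count is sketched below. It is slightly stricter than a plain count comparison, and that extra strictness is an assumption on my part: besides requiring equal numbers of "(" and ")", it rejects fragments in which a ")" appears before its partner, since such a bracket would have to pair with something outside the region.

```perl
use strict;
use warnings;

# Return 1 if the dot-bracket fragment is balanced on its own, else 0.
sub brack_count {
    my ($fragment) = @_;
    my $depth = 0;
    for my $char ( split //, $fragment ) {
        $depth++ if $char eq '(';
        $depth-- if $char eq ')';
        return 0 if $depth < 0;    # a ')' with no partner inside the fragment
    }
    return $depth == 0 ? 1 : 0;
}

print brack_count("...((...))..."), "\n";    # balanced fragment
print brack_count("..((...)..."),   "\n";    # one '(' left open
```

A fragment made only of dots, such as the 27 unpaired positions in the ON structure from part 1, counts as balanced.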

After generating an index array for both the ON sequence and the OFF sequence, these indexes can be used to check the nucleotide sequences against the known sequence of the ribozyme. Even if a sequence has a structure identical to that of the hammerhead ribozyme, it will not be functional unless it also has the correct nucleotides.
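A minimal sketch of such a sequence check follows. The subroutine name flanks_match is hypothetical, and the reference flanks are simply copied from the YES gate sequence in part 1 for illustration; a real check would probably test only the conserved core positions rather than the whole flanks.

```perl
use strict;
use warnings;

# Compare the nucleotides outside the variable region with reference flanks.
# ($start, $end) are the 0-based offsets returned by find_ribostruct.
# (flanks_match is a hypothetical helper name.)
sub flanks_match {
    my ( $seq, $start, $end, $ref5, $ref3 ) = @_;
    my $flank5 = substr( $seq, 0, $start );    # sequence 5' of the variable region
    my $flank3 = substr( $seq, $end );         # sequence 3' of the variable region
    return ( uc($flank5) eq uc($ref5) and uc($flank3) eq uc($ref3) ) ? 1 : 0;
}

my $seq  = "GGGCGACCCUGAUGAGCUUGAGUUUAGCUCGUCACUGUCCAGGUUCAAUCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC";
my $ref5 = "GGGCGACCCUGAUGAGCUUGA";               # illustrative reference, taken from part 1
my $ref3 = "UCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC";    # illustrative reference, taken from part 1
# In the ON structure the variable region spans offsets 21 to 48.
print flanks_match( $seq, 21, 48, $ref5, $ref3 ), "\n";
```

The offsets 21 and 48 follow from the lengths of the first hammerhead piece (21 characters) and the 27 unpaired positions between the two pieces in the ON structure.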

Now the program knows whether the ON and OFF forms of this sequence contain functional ribozymes, and it can output a truth table. If both the proper hammerhead structure and sequence were found in the OFF sequence, $offRibo will be assigned 1. Otherwise, it will be assigned 0. The same will be done for $onRibo. The output will simply be "0 -> $offRibo \n 1 -> $onRibo". For a system similar to the Breaker & Penchovsky YES gates, the truth table will be:

0 -> 0
1 -> 1

However, this method of checking will also allow you to see NOT gates (output = opposite of input), constitutive ON gates and constitutive OFF gates.
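The four possible outcomes can be named with a small lookup, sketched here. The subroutine name gate_type is hypothetical; $offRibo and $onRibo are the flags described above.

```perl
use strict;
use warnings;

# Name the gate from the ribozyme flags without (input 0) and with
# (input 1) the oligonucleotide. (gate_type is a hypothetical helper name.)
sub gate_type {
    my ( $offRibo, $onRibo ) = @_;
    return "YES gate"        if !$offRibo &&  $onRibo;
    return "NOT gate"        if  $offRibo && !$onRibo;
    return "constitutive ON" if  $offRibo &&  $onRibo;
    return "constitutive OFF";
}

for my $pair ( [ 0, 1 ], [ 1, 0 ], [ 1, 1 ], [ 0, 0 ] ) {
    my ( $off, $on ) = @$pair;
    print "0 -> $off\n1 -> $on\n=> ", gate_type( $off, $on ), "\n\n";
}
```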

3. Hammerhead ribozyme structure

I copied the following hammerhead ribozyme sequence from GenBank (accession AF404053.1, nucleotides 70-186) into a text file:

ctttccctgaagagacgaagtgatcaagagatcgaagacgagtgaactaattttttttaataaaaagttcaccacgactcctccttctctcacaagtcgaaactcagagtcggcaag

I then passed this file through the RNAfold program with the following command:

$ RNAfold -p < hammerhead.txt > hammerheadStruct.txt

The resulting MFE structure and energy are as follows:

...........((((.((((.((((....)))).....((.(((((((...((((......)))))))))))..)).......))))))))....(((((.........)))))... (-24.60)

I converted the PostScript files of the structure and dot plot to PDFs (see attached files 3-hammerhead-dot.pdf and 3-hammerhead-struct.pdf).

However, when compared to the secondary structures stored for the Hammerhead ribozyme (type I) family in the Rfam database, the RNAfold prediction does not match the documented structure. This may be because the sequence has multiple structures with similar energies: the dot plot shows many more mid-range dots than in the previous examples, especially near the center of the sequence.
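The impression of "many mid-range dots" could be made quantitative by counting them in the dot plot file. The sketch below assumes that each probability entry in the PostScript dot plot is a line of the form "i j sqrt(p) ubox" (with "lbox" lines marking the MFE structure); the subroutine name midrange_pairs, the probability window 0.1 to 0.9, and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Collect base pairs whose pairing probability lies between $lo and $hi.
# Assumes each probability entry looks like "i j sqrt(p) ubox".
# (midrange_pairs is a hypothetical helper name.)
sub midrange_pairs {
    my ( $lines, $lo, $hi ) = @_;
    my @hits;
    for my $line (@$lines) {
        next unless $line =~ /^\s*(\d+)\s+(\d+)\s+(\d*\.?\d+(?:[eE][-+]?\d+)?)\s+ubox\s*$/;
        my ( $i, $j, $p ) = ( $1, $2, $3**2 );    # square to recover p
        push @hits, [ $i, $j, $p ] if $p >= $lo && $p <= $hi;
    }
    return @hits;
}

# Made-up sample lines, not taken from the real dot plot.
my @sample = (
    "12 91 0.9950 ubox",
    "40 75 0.5000 ubox",
    "41 74 0.4000 ubox",
    "12 91 0.9500 lbox",
);
my @mid = midrange_pairs( \@sample, 0.1, 0.9 );
printf "%d mid-range pairs\n", scalar @mid;
```

Running the same count on the dot plots from parts 1 and 3 would give a number to compare instead of a visual impression.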

It is also possible that a subsequence of our sequence would produce an MFE structure closer to the Rfam structure. The sequence we are analyzing is much longer than the sequences in the Rfam structures, and the extra nucleotides may form favorable interactions with nucleotides that should instead form the hammerhead ribozyme structure.
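One way to test this idea would be to generate trimmed versions of the sequence and fold each of them. The sketch below only produces the candidate subsequences in FASTA format; the subroutine name trimmed and the trimming steps (0, 10 and 20 nt from each end) are arbitrary choices for illustration, not values taken from Rfam.

```perl
use strict;
use warnings;

# Return $seq with $trim5 nucleotides removed from the 5' end and
# $trim3 from the 3' end (empty string if nothing would be left).
# (trimmed is a hypothetical helper name.)
sub trimmed {
    my ( $seq, $trim5, $trim3 ) = @_;
    my $keep = length($seq) - $trim5 - $trim3;
    return "" if $keep <= 0;
    return substr( $seq, $trim5, $keep );
}

my $seq = "ctttccctgaagagacgaagtgatcaagagatcgaagacgagtgaactaattttttttaataaaaagttcaccacgactcctccttctctcacaagtcgaaactcagagtcggcaag";

# Print each candidate as a FASTA record; the step sizes are arbitrary.
for my $t5 ( 0, 10, 20 ) {
    for my $t3 ( 0, 10, 20 ) {
        print ">trim5_${t5}_trim3_${t3}\n", trimmed( $seq, $t5, $t3 ), "\n";
    }
}
```

The output can be redirected to a file and passed to RNAfold in the same way as hammerhead.txt above.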

Attachments:

| Attachment | Size | Date | Who |
| 1-HammerYesON-dot.pdf | 8.0 K | 2011-10-16 04:36 | ConorMcclune |
| 1-HammerYesON-struct.pdf | 7.1 K | 2011-10-16 04:36 | ConorMcclune |
| 1-HammerYesOff-dot.pdf | 13.1 K | 2011-10-16 04:36 | ConorMcclune |
| 1-HammerYesOff-struct.pdf | 7.2 K | 2011-10-16 04:36 | ConorMcclune |
| 3-hammerhead-dot.pdf | 20.4 K | 2011-10-16 06:59 | ConorMcclune |
| 3-hammerhead-struct.pdf | 7.9 K | 2011-10-16 06:59 | ConorMcclune |