Final Project for David Tulga

Introduction

      Computationally designable genetic circuits and logic gates are a goal pursued across many fields. However, many of the synthetic systems and genetic circuits in use today are unpredictable and difficult to design, and most devices and parts can be used only once in a given cellular system. Creating cellular logic devices that are predictable and orthogonal could therefore have a broad impact on the fields of Synthetic Biology, Computational Biology, and Genetics. Allosteric RNA molecules are a promising basis for designing biological logic gates: their structures are predictable, they can be constructed to be orthogonal, and they provide a high dynamic range. Hammerhead ribozymes are a particularly useful motif, as these self-cleaving RNA molecules generally show >1,000-fold activation and can bind specific nucleotide sequences. [1] In addition, computational methods such as the partition function and the structure prediction algorithms implemented in RNAfold allow these gates to be designed reliably and their functionality to be determined. [2] In support of this goal, I have implemented a computational RNA logic gate generator that produces a variety of candidate YES gates. This type of logic gate is useful in signal transduction, converting a DNA input into an RNA or other molecular output, and in signal amplification, as one DNA effector molecule could potentially trigger many ribozyme self-cleavage reactions. This computational methodology is also useful as a reduction-to-practice system, as it exemplifies many common RNA design principles and could be readily adapted to generate more complex gates and systems.

Results

      My application generates YES gate candidates with a random oligonucleotide binding site (OBS) of 16-22 nt. Each gate takes as input a DNA effector molecule whose sequence is complementary to the gate's OBS sequence. The truth table of a YES gate is simple: when the effector is absent the gate is inactive, and when the effector is present the gate is active. When the effector DNA is absent, the gate is in the OFF state and adopts a conformation in which stems I, II, and IV are formed and stem III is absent, as seen in Figure 1. This prevents the hammerhead ribozyme’s self-cleavage reaction, as the necessary catalytic nucleotides are sequestered inside stem IV. Upon binding of the DNA effector, the hammerhead ribozyme adopts a conformation in which stems I, II, and III form and the catalytic nucleotides are exposed, as seen in Figure 2. The gate is now in the ON state and proceeds to undergo self-cleavage. Based on the experimental results of Penchovsky and Breaker, [3] the gates would be expected to self-cleave in the ON state at a rate similar to the gates tested there, and to maintain a stable OFF state over many hours. These gates are also expected to be highly sequence-specific, as Penchovsky and Breaker reported that as few as 2 mismatches dramatically reduce the self-cleavage rate.
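      The relationship between an OBS and its effector can be illustrated with a minimal Python sketch. It is not taken from the original application; the function names and the example OBS sequence are hypothetical. It assumes the effector is the DNA reverse complement of the RNA OBS, with both strands written 5' to 3', and it models the YES gate truth table as a simple identity function.

```python
# Hypothetical helpers for illustration only; not part of the original application.

# Watson-Crick partner in DNA for each RNA base of the OBS.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def effector_for(obs_rna: str) -> str:
    """Return the DNA effector (5'->3') complementary to an RNA OBS (5'->3')."""
    # Reverse the OBS so that the returned strand is also written 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(obs_rna.upper()))


def yes_gate_active(effector_present: bool) -> bool:
    """YES gate truth table: the gate is active only when the effector is present."""
    return effector_present


example_obs = "GAUUACAGAUUACAGAUU"  # made-up 18 nt OBS, not a real candidate
print(effector_for(example_obs))
print(yes_gate_active(False), yes_gate_active(True))
```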

Figure 1. OFF State of the Example Gate

Figure 2. ON State of the Example Gate

Analysis

      To analyze the RNA logic gates generated by my program, I used RNA analysis software from the Vienna RNA package and performed information content analysis of the OBS sequence, both with WebLogo [4,5] and with my Mutual Information Content Homework program. First, I analyzed the gates' thermodynamic stability in the OFF state, both with RNAfold at different temperatures and with RNAheat. The gates were in general stable throughout the 20-40°C range and did not melt until much higher temperatures, as seen in Figure 3. On average, the candidate OBS sequences had a nucleotide distribution of 24.8% A, 23.2% U, 25.6% G, and 26.4% C, for a total GC content of 52.0%. This distribution was observed with uniform sampling probabilities for the four nucleotides and indicates a slight GC bias in the candidate OBS sequences, likely due to the high GC content of the hammerhead motif, which is 56.9%. The OBS sequences had an average length of about 19 nt, with the 18 nt length being relatively uncommon. I then built a multiple sequence alignment of 35 candidate sequences and ran it through both my mutual information homework application and WebLogo. The mutual information analysis indicated which columns of the OBS generally base-paired with each other. These columns tended either to cluster close together, when those parts of the OBS are near the end of stem IV, or to lie far apart, when the beginning and end of the OBS base-paired with each other. The WebLogo analysis displayed the information content of each base at each position, indicating whether a particular base was favored at any given location. WebLogo indicated a higher-than-usual significance for C nucleotides and a lower likelihood of G nucleotides at positions 6 and 24, indicating both favorable base pairs and specifically unfavorable nucleotides, as seen in Figure 4.
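      The column-pair analysis described above can be sketched as follows. This is a minimal illustration of the standard mutual information formula, MI(i,j) = sum over (x,y) of p(x,y) * log2(p(x,y) / (p(x) * p(y))), and is not the homework program itself. The function name and the toy alignment are hypothetical, and the sketch assumes gap characters are simply counted as a fifth symbol.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Assumes all sequences have equal length; a gap character, if present,
    is treated like any other symbol.
    """
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (x, y), c in count_ij.items():
        p_xy = c / n
        p_x = count_i[x] / n
        p_y = count_j[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Toy alignment (made up): columns 0 and 2 always form a complementary pair,
# while column 1 never varies.
toy_alignment = ["GAC", "CAG", "AAU", "UAA"]
print(mutual_information(toy_alignment, 0, 2))
print(mutual_information(toy_alignment, 0, 1))
```

      Columns that covary, as base-paired positions do, score high, while a conserved column carries no mutual information with any other column.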

Figure 3. RNAheat Specific Heat plot of the Example Gate

Figure 4. WebLogo Information Content Analysis of a Multiple Alignment of 35 Candidate Sequences

      Overall, my results matched those of the Penchovsky and Breaker paper, [3] with some noticeable differences. The gates generated by my application appear to be more stable on average than the gates tested in that paper, which may be due to the ensemble diversity cutoff in my application. It is set to 9 units as described in their methodology, but some of their gates have ensemble diversities exceeding 17. [3] My application was computationally efficient and was limited primarily by the speed at which RNAfold could be initialized. Re-implementing the RNAfold source code would likely increase the speed of my application dramatically. I also attempted to implement a method for designing multiple orthogonal logic gates with the same secondary structure and properties, but found that the RNAinverse algorithm could not generate a large enough sample of sequences to find valid candidates. Re-implementing the RNAinverse algorithm, or using a similar method of computational evolution of the candidate logic gates, would likely be necessary for this goal to succeed. Finally, the structure prediction software I used does not take three-dimensional folding into account, so some structures may have unfavorable conformations that were not predicted.

Discussion

      The ability to computationally design orthogonal biological logic gates and predict their functionality could be useful in constructing cellular and in vitro logic systems, performing genetic analysis, and treating disease. These gates would be useful for in vitro logic systems because they are easier to manipulate and synthesize than comparable protein logic gates and do not require chaperones to fold. They also require constant input to maintain an output signal. This could be useful in preventing constitutive expression, as well as in building a second layer of control into the circuit by modulating how many logic gates are produced. The gates could provide a means to detect specific RNA or DNA sequences rapidly, and could be used in genetic analysis, possibly by generating RNAi molecules. [6] They would also be useful in a variety of synthetic cellular systems, as they could provide many orthogonal pairs that could be used simultaneously. [7] In addition, these gates would be useful for many therapeutic applications, such as treating abnormal RNA sequences. [8] They could also be used in gene therapy to engineer T-cells that are immune to HIV. In general, this computational methodology, combined with further experimental analysis, could help advance the field of RNA structure prediction and enable the design of more complex systems.

Methods

      My application designs RNA logic gates using an approach similar to the methodology described in the Penchovsky and Breaker paper on the computational design of allosteric ribozyme logic gates. [3] It also allows customization of the GC% content of the OBS sequence, as well as the number of candidate sequences to generate. First, it generates a random nucleotide sequence of length 16-22 for the OBS region. It then inserts this sequence into the predefined hammerhead ribozyme motif GGGCGACCCUGAUGAGCUUGAGUUU(X)16-22AUCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC, where (X)16-22 marks the 16-22 OBS nucleotides. This candidate sequence is then folded with RNAfold, which calculates the partition function, the minimum free energy structure, and the thermodynamic stability. The folding is done both in the OFF state, where the DNA effector is not bound to the OBS, and in the ON state, where the DNA effector is bound.
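      The candidate construction step can be sketched in Python as follows. This is an illustrative sketch rather than the original Perl application: the function names, the `gc_fraction` parameter, and the way GC content is biased (choosing G/C versus A/U per position) are assumptions. The two flanking sequences are the motif halves quoted above. Folding with RNAfold is not shown.

```python
import random

# The two halves of the hammerhead motif quoted in the text; the OBS goes between them.
HAMMERHEAD_5P = "GGGCGACCCUGAUGAGCUUGAGUUU"
HAMMERHEAD_3P = "AUCAGGCGAAACGGUGAAAGCCGUAGGUUGCCC"


def random_obs(rng, min_len=16, max_len=22, gc_fraction=0.5):
    """Draw a random OBS of 16-22 nt (assumed scheme: per-position G/C vs A/U choice)."""
    length = rng.randint(min_len, max_len)  # inclusive on both ends
    bases = []
    for _ in range(length):
        if rng.random() < gc_fraction:
            bases.append(rng.choice("GC"))
        else:
            bases.append(rng.choice("AU"))
    return "".join(bases)


def build_candidate(obs):
    """Insert the OBS into the predefined hammerhead motif."""
    return HAMMERHEAD_5P + obs + HAMMERHEAD_3P


def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
obs = random_obs(rng)
candidate = build_candidate(obs)
print(obs, len(candidate))
# GC content of the fixed motif alone, in percent
print(round(100 * gc_content(HAMMERHEAD_5P + HAMMERHEAD_3P), 1))
```

      With `gc_fraction=0.5` the four nucleotides are sampled uniformly. The fixed motif contributes 58 nt, so each candidate is 74-80 nt long.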

      Next, the candidate is analyzed to determine its effectiveness as a YES gate. The application first verifies that the catalytic nucleotides are bound in a helix when the gate is in the OFF state, and that stem IV, not stem III, has formed. It then ensures that in the ON state, when the DNA effector has bound, all three of stems I, II, and III have formed. It then analyzes the OFF structure to ensure that it is neither too stable nor too weak. The ideal stability is achieved when between 30% and 70% of the OBS nucleotides participate in base-pairing in the OFF state. It then examines the ensemble diversity of the OFF state and ensures that it is less than or equal to 9 units, as a higher value can indicate high secondary structure variability and gate instability. It then compares the free energy of the OFF and ON states, as an approximation of the thermodynamic cost of switching between the two. This helps ensure a stable OFF state and a fast transition to the ON state. The ideal gap is a free energy difference between -6 and -10 kcal/mol. Finally, it verifies stability throughout the temperature range from 20 to 40°C, to ensure correct operation at 23°C and 37°C and to verify that the gate will not begin to melt as the temperature is raised. This analysis is done by calculating the RNA molecule’s specific heat with the program RNAheat. If a candidate satisfies all these requirements, it is registered and saved for further experimental analysis.
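      Three of the numeric filters above can be sketched as a single acceptance check. This is a minimal illustration, not the original application. It assumes the OFF-state structure is available as a dot-bracket string (the notation RNAfold prints), that the ensemble diversity and the free energy difference have already been computed elsewhere, and that the difference is signed so that the acceptable window is -10 to -6 kcal/mol. The function names and the example values are hypothetical. The stem checks and the RNAheat temperature check are omitted.

```python
def paired_fraction(structure, start, end):
    """Fraction of positions in structure[start:end] that are base-paired.

    `structure` is a dot-bracket string: '(' and ')' mark paired bases,
    '.' marks unpaired bases.
    """
    region = structure[start:end]
    return sum(ch in "()" for ch in region) / len(region)


def passes_numeric_filters(off_structure, obs_start, obs_end,
                           ensemble_diversity, energy_gap):
    """Apply the 30-70% pairing, diversity <= 9, and -10..-6 kcal/mol gap filters.

    The sign convention of `energy_gap` is an assumption of this sketch.
    """
    frac = paired_fraction(off_structure, obs_start, obs_end)
    return (
        0.30 <= frac <= 0.70
        and ensemble_diversity <= 9.0
        and -10.0 <= energy_gap <= -6.0
    )


# Made-up toy structure and OBS coordinates, for illustration only.
toy_structure = "((((....))))"
print(passes_numeric_filters(toy_structure, 2, 10, ensemble_diversity=5.0, energy_gap=-8.0))
```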

Citations

  1. Soukup GA, Breaker RR. "Allosteric nucleic acid catalysts." Curr. Opin. Struct. Biol. 10 (2000): 318-325.
  2. McCaskill J. "The equilibrium partition function and base pair binding probabilities for RNA secondary structure." Biopolymers 29 (1990): 1105-1119.
  3. Penchovsky R, Breaker RR. "Computational design and experimental validation of oligonucleotide-sensing allosteric ribozymes." Nat. Biotechnol. 23(11) (2005): 1424-1433. Epub 2005 Oct 23.
  4. Crooks GE, Hon G, Chandonia JM, Brenner SE. "WebLogo: A sequence logo generator." Genome Research 14 (2004): 1188-1190.
  5. Schneider TD, Stephens RM. "Sequence Logos: A New Way to Display Consensus Sequences." Nucleic Acids Res. 18 (1990): 6097-6100.
  6. Hean J, Weinberg MS. "The Hammerhead Ribozyme Revisited: New Biological Insights for the Development of Therapeutic Agents and for Reverse Genomics Applications." In: RNA and the Regulation of Gene Expression: A Hidden Layer of Complexity. Caister Academic Press (2008). ISBN 978-1-904455-25-7.
  7. Win MN, Smolke CD. "Higher-Order Cellular Information Processing with Synthetic RNA Devices." Science 322(5900) (2008): 456-460. DOI: 10.1126/science.1160311.
  8. Citti L, Rainaldi G. "Synthetic hammerhead ribozymes as therapeutic tools to control disease genes." Current Gene Therapy 5(1) (2005): 11-24. PMID 15638708.

Supplementary Information

The application is available as davidtulga_final_gate_constructor.pl.

Two sample candidate files, each with 10 candidates generated, are given as candidates.txt and candidates2.txt.

The example gate demonstrated is candidate 7 in the first sample candidate file.

The multiple alignment of the 35 candidate sequences is given as gate_alignment.stock.

Also, the mutual information analysis results are given as gate_information_analysis.txt.

 

Attachment                                Size   Date                Who     Comment
gate_alignment.stock                      1.7 K  2008-12-17 - 06:54  DavidT  Multiple Alignment of 35 Candidate Sequences
candidates.txt                            4.8 K  2008-12-16 - 19:16  DavidT  Sample Candidate File 1
candidates2.txt                           4.8 K  2008-12-16 - 19:17  DavidT  Sample Candidate File 2
davidtulga_final_gate_constructor.pl.txt  4.8 K  2008-12-16 - 19:21  DavidT  Gate Constructor Application
gate_information_analysis.txt             2.0 K  2008-12-17 - 06:55  DavidT  Mutual Information Analysis Results
Copyright © 2008-2013 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.