banner by Sarah Burns

How to Make Your Dragon

by Jesse Kaminsky

Part I:

Dragons do not exist. Although they appear in the mythology of many isolated cultures throughout history, there is no fossil evidence, nor any evolutionary pointer, that indicates their existence. There are also purely practical reasons the classical dragon could not exist. First, with our current biology, a creature the size of the dragons of ancient folklore would require enormous wings to support its body in flight. Second, there are no known biological mechanisms within our current understanding that would produce that whole fire-breathing thing. Why are these processes impossible? Dragons (if they existed), like every other life form on Earth, would have to be descended from a common ancestor. This ancestor first evolved the framework that determines which codon within our genes refers to which amino acid. Not only this, but it evolved the ability to encode only 20 amino acids into our proteins.

With these 20 amino acids as a root, Earth’s diverse tree of life grew forth. Unfortunately, evolution could not find a way to produce a creature that can breathe fire using just these 20 amino acids. Fire-breathing and flying reptiles may well have evolved had a larger pool of building blocks been available for manipulation. Now, however, we are at a point where we may soon discover a way to introduce diversity never before seen into Earth’s tree of life that may cause an influx of evolutionary activity. We are learning how to implement unnatural amino acids.

I’m not saying that just because we are working on adding new amino acids to the genetic code, we will be able to just create true flying, fire-breathing dragons. But imagine, given time, what kind of creatures could develop from evolution with the involvement of every possible amino acid. For those who still don’t believe a dragon could evolve from nature, just look at nature’s track record. Starting with just single-celled organisms, nature has developed reptiles, egg-laying mammals, and glowing, water-breathing creatures, all from just 20 genetically encoded amino acids. Imagine what would have come about if there had been just one more amino acid.

To put this in perspective, consider the number of combinations of amino acids currently permitted by our genetic code. This number is simply 20ⁿ, where n represents how many amino acids comprise the protein. Let’s be cautious and consider a small protein with 75 amino acids in total. The number of combinations, which ideally corresponds to the number of proteins that can be formed, is 20⁷⁵, or about 3.78×10⁹⁷. As a frame of reference, consider that there are roughly 7.5×10¹⁸ grains of sand on Earth, and this is only a small protein. If we add one unnatural amino acid to the ribosomal factory, we then calculate 21⁷⁵. This number is approximately 1.47×10⁹⁹; the number of combinations we can form from 20 amino acids is just 2.6% of those made from 21. By adding a single unnatural amino acid, even with a protein as small as 75 amino acids, we potentially gain a massive number of new proteins.
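For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The function name `sequence_count` is just an illustrative choice, and the calculation assumes every position in the protein can hold any amino acid in the pool, which is the same simplification the paragraph makes.

```python
def sequence_count(pool_size, protein_length):
    """Number of distinct amino acid sequences of a given length."""
    return pool_size ** protein_length

PROTEIN_LENGTH = 75  # the small protein used in the text

natural = sequence_count(20, PROTEIN_LENGTH)   # 20 encoded amino acids
expanded = sequence_count(21, PROTEIN_LENGTH)  # one unnatural amino acid added

# Share of the 21-letter sequence space already reachable with 20 letters.
share = natural / expanded

print(f"20^75 = {natural:.2e}")   # 3.78e+97
print(f"21^75 = {expanded:.2e}")  # 1.47e+99
print(f"share = {share:.1%}")     # 2.6%
```

Put another way, well over 97% of the sequences available with 21 amino acids contain the new amino acid at least once.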

How exactly do we implement these unnatural amino acids? The key to adding new amino acids lies in tRNA molecules and aminoacyl tRNA synthetase (hereafter referred to as synthetase). The latter is the enzyme responsible for attaching a tRNA anticodon’s respective amino acid to tRNA molecules. Within the last ten to fifteen years, a rising field of research has uncovered an astounding evolutionary possibility. Researchers have developed a method for mutating naturally occurring synthetases to develop a cavity at the site of aminoacylation. This modification allows the addition of unnatural amino acids to similarly mutated tRNAs. In this way, researchers can assign new amino acids to preexisting codons, effectively adding a new letter to the language of proteins.

As with any radical leap forward in the field of biology, this is easier said than done. Codons are used throughout the body and already refer to existing amino acids. If we were to change their function, every other gene that contains the given codon could produce a different protein; the consequences would likely be lethal. Therefore, in order to prevent unintended side effects, the new amino acid requires its own unique codon to correlate with. Another challenge facing researchers and dragon enthusiasts is the fact that there can be absolutely no interaction between the orthogonal (synthetic) pair and any of the endogenous (natural) pairs. The consequences of a new synthetase recognizing a natural tRNA or a new tRNA binding to a natural synthetase would likely be disastrous and widespread throughout the human body. It would be like if you were crying to form an English sencence and every other letcer ‘T’ was replaced by anocher letcer.

Many researchers have tried to come up with a solution to this difficulty. One lab paired a stop codon with the orthogonal tRNA and synthetase. As they were experimenting on the bacterium E. coli, they used the UAG stop codon since it is used far less often than any other codon throughout the genome. This type of codon normally signals the end of a protein’s synthesis. Normally you would expect cell death if gene translation were prolonged past its normal stopping point, but E. coli has the ability to survive the suppression of this particular codon due to its rarity, albeit with reduced fitness. This study found that it is possible to safely and efficiently induce E. coli to incorporate various unnatural amino acids in a specifically chosen protein site.
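To make stop codon suppression concrete, here is a small Python sketch of a toy translator. It uses the standard genetic code table, but everything else is an assumption for illustration: the mRNA is made up, translation simply starts at the first nucleotide, and the letter "X" stands in for whichever unnatural amino acid the orthogonal pair delivers.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order U, C, A, G.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical expanded code: UAG no longer stops translation but inserts "X",
# a placeholder for the unnatural amino acid.
SUPPRESSED_CODE = {**STANDARD_CODE, "UAG": "X"}

def translate(mrna, code):
    """Read an mRNA three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = code[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

TOY_MRNA = "AUGGCUUAGGGCUAA"  # invented sequence: AUG GCU UAG GGC UAA

print(translate(TOY_MRNA, STANDARD_CODE))    # MA
print(translate(TOY_MRNA, SUPPRESSED_CODE))  # MAXG
```

With the natural code the ribosome halts at UAG. With the suppressed code it reads straight through, inserts the placeholder, and only stops at the next untouched stop codon, which is why every gene that normally ends in UAG comes out longer.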

As in E. coli, it would theoretically be possible to reassign a non-stop codon without side effects if that organism had evolved to only use the codon where the researchers want to introduce the unnatural amino acid. Some studies have already shown it is possible to purposefully breed a bacterial strain with the elimination of specific codons as an evolutionarily advantageous trait.

These attempts at the reassignment of preexisting codons to new unnatural amino acids would likely never work in any complex organism, let alone a human being, due to all the biological functions that our genome is a part of. To simplify the problem, other labs have taken even more creative approaches. Some successfully mutated their specific orthogonal pair to recognize a quadruplet codon (a codon that contains four nucleotides as opposed to the normal three) without any major phenotypical side effects. The difficulty in this approach is that the ribosome is not necessarily equipped to deal with tRNAs like this. It would also mean that anywhere else throughout the genome where these four nucleotides appear, the unnatural amino acid risks incorporation.

The foolproof way to prevent unintended unnatural amino acid addition is to add an unnatural nucleotide as well. This allows many more unique codon implementations and therefore the introduction of more than one unnatural amino acid at a time. It also allows extreme precision in where the amino acid is placed in the protein as the unnatural nucleotide would have to be added into the genome. Scientists could choose where to place the nucleotide, and therefore choose the one and only location that the tRNA’s anticodon would be utilized. Unfortunately, this means not only do we have to modify the entire framework surrounding the translation process, but also all the mechanisms involved in the transcription and reproduction process.
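A quick count shows how much room an extra nucleotide would open up. The sketch below assumes codons stay three letters long and that a single new letter can appear at any position in the codon; both are simplifying assumptions, and `codon_count` is just an illustrative name.

```python
def codon_count(alphabet_size, codon_length=3):
    """Number of distinct codons for a given nucleotide alphabet."""
    return alphabet_size ** codon_length

natural_codons = codon_count(4)   # A, C, G, U
expanded_codons = codon_count(5)  # the four natural letters plus one unnatural letter

# Codons that contain the unnatural letter at least once.
new_codons = expanded_codons - natural_codons

print(natural_codons)   # 64
print(expanded_codons)  # 125
print(new_codons)       # 61
```

None of those 61 extra codons can occur anywhere in a natural genome, so each one is free to be assigned to a new amino acid without touching an existing gene.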

These strategies will be difficult to perfect, but the potential applications of the successful introduction of novel amino acids are more than worth it. Some synthetic amino acids allow a cell to create fluorescent proteins that provide a valuable tool in tracing important biological processes. One study modified a growth hormone to allow easier in vitro storage, which would benefit treatment for those with a deficiency in said hormone. This could be applied to any number of medications for bodily deficiencies. Unnatural amino acids also allow more accurate and consistent methods for chemical production.

Whether it is just the first step of the thousand-mile journey of evolution, or a tool to provide extensive and efficient treatment for both rare and common diseases, expanding the genetic code would prove an extremely rewarding challenge. Practically, the modification of our protein synthesis framework may never be fully applied to the human body or other complex life forms. However, it could still revitalize the pharmaceutical and biosynthetic industries through conceptually redefining our definition of life.

Part II:

AUG

The concept of translation has existed far longer than formal human languages. Whenever you see something, that image is turned into neural impulses. This is fundamentally a translation from visual stimulus to something your brain relays to your body to recognize. Of course the more traditional sense of the term, changing one language into another, is probably the most apt and simplistic application, but that doesn’t mean two languages are required for translation to be possible.

If we go back even further into the history of protein translation, the process of producing proteins from RNA, we reach a point where RNA first started being read into amino acids. The biological code of equivalences was set in stone by evolution a long time ago and has barely changed since. To modify the genetic code, one would have to interfere with the act of translation itself. Adding unnatural amino acids is an example of this, one that would initiate an era of novel biodiversity and innovative pharmaceutical progress. Many labs have successfully used a variety of strategies for this process, all involving the introduction of mutant aminoacyl tRNA synthetase (hereafter referred to as “synthetase”). Ultimately, synthetase is the structure that dictates which amino acid pairs with each codon (the defining element of the genetic code) by connecting them through the middleman molecule tRNA.

Unfortunately, cells using extraneous, unnatural (i.e. orthogonal) synthetases cannot undergo evolutionary processes. If a lab develops and inserts an orthogonal synthetase and tRNA, the bacteria will start incorporating unnatural amino acids in places specified by the corresponding codon; this only affects the current organism and does not carry over to the next generation. Because the synthetases were merely given to the organism, there is nothing in their genome that will continue the process after reproduction. In other words, the genes of the current organism are being acted upon by the expansion and are not responsible for the expansion.

So what is responsible for the change? For a trait like an expanded genetic code to be heritable, the genes for the synthetase need to be modified. If one could specify what about the synthetase needs to be changed to allow space for a new amino acid in terms of its own amino acid sequence, the gene could be altered accordingly. If a lab could sequence an orthogonal synthetase and compare it to the original synthetase, they would be able to create an organism with the 21 amino acid genetic code trait (21-AGCT) directly wired into its genome.

One lab at the Scripps Research Institute has done this. They created a bacterium that suppresses the UAG stop codon and instead inserts the unnatural amino acid p-aminophenylalanine. They did so by identifying a synthetase that works with this amino acid, sequencing it and reverse engineering its gene. Additionally, the lab introduced genes that code for proteins capable of building the amino acid itself from preexisting parts. In this way, they successfully added the 21-AGCT trait to E. coli genetically. If the bacteria were to reproduce, the offspring would also produce p-aminophenylalanine and incorporate it into every protein at any UAG codon in its corresponding mRNA.

Now evolution can finally step in and work its magic. Every gene in E. coli that contains a UAG codon will now be modified, providing fertile ground for mutation and natural selection to develop new traits; the equivalent changes in mammalian cells could potentially be as exotic as the creation of a fire-breathing dragon. However, there is one large flaw in the strategy of UAG suppression. If the presence of synthetase encoding genes is what allows an organism to express 21-AGCT, what would happen if the gene for the synthetase ended with a UAG stop codon?

Now imagine that this article isn’t just data, but also a set of instructions on how to interpret said data. Our synthetase UAG predicament is similar to using an instruction to read an instruction:

“Read any “.” as an O.” (The lab’s initial modification.)

You (the ribosome) would read (translate) this sentence (gene) and come away with the instruction (protein). You would then float around in the cell for a while reading other sentences using the new instruction. Then, if you were to come back to the sentence to read it again, you would keep in mind what you learned and instead see it as:

“Read any “O” as an OO” (The second translation.)

Once again you do as you’re told by the previous instruction and create a new instruction. And if you were to read it again:

“Read any “OO” as an OOOO” (The third translation.)

Every time you read the sentence, it tells you to read it in a different way; every time the ribosome creates a new synthetase to expand the genetic code, the ribosome looks differently at the synthetase’s gene. What if this happened to result in a synthetase with a new amino acid specificity? As each synthetase produced wouldn’t disappear just because a new one appears, the cell would continuously expand its genetic code:

20-AGCT→21-AGCT→22-AGCT→23-AGCT→24-AGCT→…

(AGCT’s Genetically Continuing to Transition or AGCT²)
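The sentence analogy above can be played out mechanically. The Python sketch below is only a toy: the names `parse_rule` and `reread` are made up, and each pass applies the instruction found in the most recent reading to that same reading, which is how the three quoted sentences follow from one another.

```python
import re

# Pulls the instruction out of a sentence of the form: Read any "<target>" as an <O's>
RULE = re.compile(r'Read any "(.+?)" as an (O+)')

def parse_rule(sentence):
    """Return (target, replacement) as stated by the sentence itself."""
    match = RULE.search(sentence)
    if match is None:
        raise ValueError("sentence does not contain a readable instruction")
    return match.group(1), match.group(2)

def reread(sentence, passes):
    """Re-read the sentence repeatedly, each time obeying its latest instruction."""
    readings = [sentence]
    for _ in range(passes):
        target, replacement = parse_rule(readings[-1])
        readings.append(readings[-1].replace(target, replacement))
    return readings

for reading in reread('Read any "." as an O.', 3):
    print(reading)
# Read any "." as an O.
# Read any "O" as an OO
# Read any "OO" as an OOOO
# Read any "OOOO" as an OOOOOOOO
```

The instruction doubles on every pass and never settles, which is the runaway behavior the AGCT² chain describes.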

Thankfully, the Scripps lab’s synthetase gene did not end with a UAG codon; if it had, it probably would have killed the organism. Replacing one amino acid with another likely wouldn’t have the miraculous result of a new amino acid specificity. That said, the number of mutations in the Scripps lab’s orthogonal synthetase relative to its natural cousin is only five, suggesting that the modification of only a few codons in a synthetase gene could successfully expand the genetic code. Careful genetic modification could also come in handy to resolve this dilemma. If we could figure out how an unnatural amino acid impacts a synthetase, we could craftily develop a genetic sequence that takes each modification into account. Another issue is that each orthogonal synthetase would be operating on the same tRNA (unless the tRNA gene ended with a UAG of course!). This means that every time that tRNA’s corresponding codon shows up, a number of different amino acids could result.

These are just two of the many obstacles that need to be overcome before AGCT² can begin, but the results would be worth it. As biology currently sits, organisms do not change their own genome within their lifespan and diversity is mostly a result of sexual reproduction and mutation. However, if AGCT² were taking place in an organism that could survive, it would allow genetic variance within a single generation without physically altering its genome. With millions of these organisms all in the same environment, many will produce genetic codes that won’t work, making it easier for organisms with a working expanded genetic code to reproduce. Unfortunately, while this allows for real-time adaptation to environmental pressures through what seems to be genetic modification, in reality it won’t be inherited by the next generation. As I stated above, the genomes are not actually changing; what’s changing is the way the bacterial ribosomes look at the genome.

When the bacteria reproduce, the offspring will start at the very beginning of the cycle, with only the originally modified synthetase gene. This means real-time monitoring would be required to identify advantageous traits. It would be especially interesting to watch several cultures of bacteria grow with different AGCT² starting points. Researchers could observe which ones survive and identify the synthetases responsible for the advantage. Scientists could then introduce them to a fresh organism using the Scripps method, making slight modifications to prevent further cycles, and produce an organism that can pass on an advantageous 21-AGCT.

This could be applied to any number of evolutionary pressures. Bacteria with AGCT² could be grown on plastic to see which develop proteins capable of metabolizing it. Similarly, bacteria with AGCT² could be grown on some virus or malicious bacteria, so that a strain with positive and preventative effects on the human body could be selected for. Any issue that can be framed as an evolutionary pressure can also potentially be solved by this strategy.

If the process is truly mastered, it could be applied to the human body as well. Specialized immune cells could be designed to use a number of different unnatural amino acids in combating infection. Specialized stomach cells could be designed to help mitigate the negative effects of eating particularly unpleasant foods like grass, plastic, or crayons. I hesitate to propose specialized nerve cells that allow the development of psychic powers using a variety of sophisticated AGCTs (don’t get your hopes up, this is just the sci-fi lover in me dreaming). The Scripps lab has taken the first step and created a 21-AGCT life form that can survive unnatural amino acid incorporation. Translation is always happening, be it you reading this article or all the trillions of cells in your body producing proteins. Who knows, maybe one day, nature will produce a synthetase and tRNA without our involvement that suppresses UAG.