Saturday, March 06, 2010

More About the Genetic Code

Sometimes I go back to one of my blog articles to correct minor errors, and a few times I have made major additions. The disadvantage is that people who have already read the original article will probably not go back to read it again.

A little more than a year ago, I wrote The Genetic Code - how to read the DNA record, and recently added some details to a paragraph and expanded the conclusion of the article. So here is the amended paragraph and the expanded conclusion.

The original article gave the impression that only the transfer RNA (tRNA) molecules define the genetic code. Actually, other, larger, molecules are also involved. So the amended paragraph clarifies this:

The key elements of translation are small transfer RNA (tRNA) molecules. Each kind of tRNA molecule has a region called the anticodon that can recognize and attach to a particular codon of a messenger RNA (mRNA) molecule. The tRNA molecule has another region called the "3' terminal" that attaches to a particular amino acid. This attachment is aided by molecules called aminoacyl-tRNA synthetases, of which there is generally one kind for each kind of amino acid. There are even helper molecules that provide a proofreading function to detect and correct any translation errors.

(Actually, there are some variations of this, but discussing these would be distracting. There are also many other types of complex molecules that control the code-translation process but do not define the genetic code -- another subject.)

Then I expanded the conclusion:

Where does the genetic code come from? It is not the result of chemistry or any laws of physics. It is determined by the set of tRNA molecule types, and aminoacyl-tRNA synthetase types, which are constructed according to DNA information, which encodes not only the building materials and the building plans, but also the building tools and the building methods. In other words, the genetic code is just information that has always been there since life began.

The number of possible genetic codes is a huge number, 85 digits long:

1,510,109,515,792,918,244,116,781,339,315,785,081,841,294,607,960,614,956,302,330,123,544,242,628,820,336,640,000

and all of these many codes would work equally well. But all of life uses just one genetic code, about 280 bits of information, the one that scientists deciphered in the 1960s, after Watson and Crick described the structure of DNA in 1953, but that was there since creation. The theory of evolution has no explanation for how the genetic code began, because it can't explain how information can arise from no information. Nor can it explain why there is only one genetic code (out of such a huge number of equally workable codes), even though there is extreme variation of everything else. The mechanism of the present genetic code is very complex, and evolutionary theory supposes that it randomly evolved from a simpler, smaller code. But because there are so many equally viable genetic codes, random evolution should have produced species with many different codes. The evolutionary explanation is far less likely than dumping a bucketful of dice on the floor and expecting them to all land with the same number up.

The creationist explanation is that the universal genetic code is like a signature of the creator, who chose a uniform code for all of the designs of life. A short story will illustrate the principle:

During the Cold War, Russia was suspected of stealing American technology. Proof came when some Russian war equipment given to a third country was captured and examined. It contained an integrated circuit that was identical to an American design. It is theoretically possible that the Russians had the same design concept, leading to a similar design. But digital circuits have thousands of component parts connected by thousands of wires. There are trillions of ways to position the parts on the chip and trillions of ways to route the connecting wires that work equally well. It would be impossible for the Russians to independently produce the same positions and routings even if the logical design were identical. But examination showed the details were identical, even details left over from correcting wiring errors. In effect, there was an American 'signature' in the copied design.

For the Math fans, I'll add a footnote on how that 85-digit number was calculated:

That big number counts the number of ways that the 64 codons can be mapped to 21 interpretations, or interpreted as 21 'messages'. One message is to start with a Methionine (or add a Methionine if already started); one is to stop, and the other 19 messages are to add one of the other 19 amino acids [to the peptide chain that will fold into a protein molecule]. This 64-to-21 mapping can be enumerated in two steps:

First, we count the number of partitions of a set of 64 items into 21 non-empty, pair-wise disjoint subsets. In plain language, this means that:
  • Together, the 21 subsets must contain all of the 64 codons.
  • Each codon must be assigned to only one subset.
  • None of the subsets can be empty; each must contain at least one codon.
This count is calculated by a mathematical function called the Stirling number of the second kind, which is S(64, 21) in this case.

Second, we need to count the number of ways that the 21 subsets can be mapped to the 21 messages. This is the number of permutations of 21 things, which is 21 factorial, written 21!

So the desired number is S(64, 21) times 21! But typical computer hardware cannot directly compute numbers that large. Special software that partitions a big number into slices small enough for the hardware is needed. When I was designing special hardware for very large integers (for public key cryptography; I have two patents, #4,658,094 and #5,289,397, for that), I wrote such software so that I could test and verify my designs. So I used my 'BigInt' software to do the arithmetic.
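For readers who want to repeat the arithmetic without special software, here is a minimal sketch in Python, whose built-in integers have arbitrary precision. It is not the 'BigInt' software mentioned above. The function names stirling2 and surjections are made up for this illustration. The first uses the standard recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1). The second counts the same mappings directly by inclusion-exclusion, as an independent cross-check.

```python
from math import comb, factorial


def stirling2(n, k):
    """Stirling number of the second kind, S(n, k).

    Built row by row from the recurrence
    S(n, k) = k * S(n-1, k) + S(n-1, k-1), starting at S(0, 0) = 1.
    (Illustrative helper; the name is an assumption, not from the article.)
    """
    row = [1] + [0] * k  # row for n = 0
    for _ in range(n):
        new = [0] * (k + 1)
        for j in range(1, k + 1):
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[k]


def surjections(n, k):
    """Count mappings of n items onto k labels with every label used.

    Inclusion-exclusion: sum over i of (-1)^i * C(k, i) * (k - i)^n.
    This should equal S(n, k) * k!.
    """
    return sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))


# Step one times step two: S(64, 21) * 21!
codes = stirling2(64, 21) * factorial(21)

print(f"{codes:,}")                     # the full number, with commas
print(len(str(codes)), "digits")        # the article reports 85 digits
print(codes.bit_length(), "bits")       # the article reports about 280 bits
print(codes == surjections(64, 21))     # the two counting methods should agree
```

A small case makes the two steps easy to check by hand: 4 items split into 2 non-empty groups in S(4, 2) = 7 ways, and the 2 groups can be labeled in 2! = 2 ways, for 14 mappings in all.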

1 comment:

JC said...

To verify that huge number, you can go to the site and enter the formula:
S2(64, 21) times 21!