Saturday, July 18, 2009

Comparing Technologies

I heard that Wolfram Alpha was finally available, and I wanted to try it out. Wolfram Alpha is designed to be more than a search engine -- it's an answer engine. A search engine tries to find Web documents that contain information you want. But Wolfram Alpha will try to calculate an answer for you from data that it can access.

For example, if you want to know the "weight of the earth in pounds", it figures that (1) by "weight" you really meant mass, (2) the earth mass is available in a table of data about the planets of the solar system (although in metric units), (3) a table of conversion factors is available, and (4) a formula for converting units is available. Moreover, it has the 'smarts' to know that this is the data needed to get the answer, and it knows how to find and combine the details to get the answer.
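The lookup-and-convert steps described above can be sketched in a few lines. This is only an illustration of the idea, not how Wolfram Alpha works internally: the table `PLANET_MASS_KG` and the function `mass_in_pounds` are hypothetical names, and the Earth mass is the commonly cited value of about 5.9722*10^24 kg.

```python
# Hypothetical data table: planet masses in kilograms (metric units).
PLANET_MASS_KG = {"earth": 5.9722e24}

# Conversion factor: the international pound is defined as exactly 0.45359237 kg.
KG_PER_LB = 0.45359237


def mass_in_pounds(planet: str) -> float:
    """Look up a planet's mass in kilograms and convert it to pounds."""
    return PLANET_MASS_KG[planet] / KG_PER_LB


earth_lb = mass_in_pounds("earth")
print(f"{earth_lb:.4e} lb")
```

The result is on the order of 10^25 pounds.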

Now, what problem would I use to try out this new answer engine? Well, I recall reading that DNA is an incredibly dense data storage and retrieval system, but I didn't have any number for the data density in, say, bytes per pound. So, I tried to get the number from Wolfram Alpha. But "DNA in pounds" was not precise enough. How much DNA? Just one 'base pair' (one unit of the chain), or an entire chromosome? And if a chromosome, which kind? (because they have different lengths)

DNA is a chain of information units called nucleotides. The chain is shaped like a twisted ladder, with each rung a pair of nucleotides that encodes two bits of information. Each nucleotide carries one of four kinds of base, so I began by asking for the mass of each kind, using their chemical names:

adenine mass in pounds: 4.9468*10^-25 lb
guanine mass in pounds: 5.53252*10^-25 lb
thymine mass in pounds: 4.51683*10^-25 lb
cytosine mass in pounds: 4.06729*10^-25 lb

I also needed the mass of the 'backbone' unit, for the 'sides' of the ladder:

deoxyribose mass in pounds: 4.45458*10^-25 lb

Then, assuming that the four nucleotide types are used equally, I could now compute the data density of DNA:

1.084547*10^24 bytes per pound
(That's about a one followed by 24 zeros.)
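The arithmetic is not spelled out, so here is a minimal sketch of one reading that reproduces the quoted figure. It assumes the mean of the four masses plus one deoxyribose is the mass of one unit, and that phosphate groups are ignored. The variable names are illustrative. Under this reading the figure counts one information unit per base-plus-sugar. At two bits per rung of two units, that is one bit per unit, so a strict eight-bit-byte figure would be eight times smaller. The sketch prints both.

```python
# Masses in pounds, as quoted above.
BASE_MASS_LB = {
    "adenine": 4.9468e-25,
    "guanine": 5.53252e-25,
    "thymine": 4.51683e-25,
    "cytosine": 4.06729e-25,
}
DEOXYRIBOSE_MASS_LB = 4.45458e-25

# Assumption: the four kinds are used equally, so take their mean mass.
mean_base_lb = sum(BASE_MASS_LB.values()) / len(BASE_MASS_LB)

# Assumption: one unit = one base plus one deoxyribose (phosphate ignored).
unit_mass_lb = mean_base_lb + DEOXYRIBOSE_MASS_LB
units_per_lb = 1.0 / unit_mass_lb

# A rung is two units and encodes two bits, i.e. one bit per unit.
bits_per_lb = units_per_lb
bytes_per_lb_at_8_bits = bits_per_lb / 8

print(f"units per pound:      {units_per_lb:.6e}")
print(f"bytes per pound (/8): {bytes_per_lb_at_8_bits:.6e}")
```

Either way, the result stays within an order of magnitude of 10^24, which is what matters for the comparison that follows.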

Now, what man-made data storage and retrieval system could I compare this to? I have an 8 GB thumb drive that weighs a quarter of an ounce, which may not be the densest available, but it's denser than a DVD or a hard drive. I calculated its data density to be:

5.5*10^11 bytes per pound

That means that DNA is about two trillion times denser than the thumb drive. That is, the data capacity of a quarter of an ounce of DNA is equal to that of about two trillion 8 GB thumb drives! Engineers would love to be able to design a data storage and retrieval system with the density of DNA, but they don't know how.
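The thumb-drive figure and the ratio can be checked with a short sketch. It assumes "8 GB" means 8*2^30 bytes, which reproduces the 5.5*10^11 figure (a decimal 8*10^9 bytes would give about 5.1*10^11 instead), and it takes the DNA density as quoted above. The constant names are illustrative.

```python
DNA_DENSITY_PER_LB = 1.084547e24   # DNA figure as quoted above
DRIVE_BYTES = 8 * 2**30            # assumption: "8 GB" read as 8 GiB
OUNCES_PER_POUND = 16
DRIVE_MASS_LB = 0.25 / OUNCES_PER_POUND  # a quarter of an ounce

# Density of the thumb drive in bytes per pound.
drive_density = DRIVE_BYTES / DRIVE_MASS_LB

# How many times denser the DNA figure is than the thumb drive.
ratio = DNA_DENSITY_PER_LB / drive_density

print(f"thumb drive: {drive_density:.2e} bytes per pound")
print(f"ratio:       {ratio:.2e}")
```

The ratio comes out at roughly 2*10^12, the "two trillion" quoted above.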

Yet there are atheistic scientists who believe that mindless evolution accidentally created DNA millions of years ago. I have two reactions to this evolutionary belief:

First, as an engineer, I feel insulted that people actually think that a random process can out-do what none of my engineering colleagues can accomplish.

Second, it is clear to me that I don't have enough faith to be an atheist.

4 comments:

Anonymous said...

DNA of course has a very high data density, but comparing it to a USB drive is a bit of an apples-to-oranges comparison. First, a USB flash device isn't the highest density storage device currently available. Also, most of the mass of a typical thumb drive is the housing - the actual flash memory is much lighter. The mass of pure DNA doesn't include any of the machinery needed to read it, or the mass of the containing cell, so the artificial memory storage device should be weighed only based on its storage components.

Also - DNA is essentially a read-only storage medium, and it isn't really random access (though it can be semi-randomly accessed). Flash of course is read-write.

Probably a better comparison would be between DNA and dynamic RAM or ROM memory (not sure which is higher-density in practice). Be sure to only include the mass of the silicon chip and not the containing package. I'm guessing that ROM would actually start to approach DNA in storage density. The ROM would probably also be much faster in terms of access times (both random and sequential).

Note also that DNA as-is cannot be used to store data with 100% reliability - if you allowed for a modest error rate I wouldn't be surprised if you could cram more ROM into a given space.

Even so, this is still an interesting exercise.

Mike said...

Hello JC, I am really enjoying your blog (especially the biology lessons). I have studied the whole evolution vs. creation debate fairly well, but I thought of something that I have never heard mentioned. How did a venom-pumping stinger evolve on a bumble bee? I mean, natural selection is only good for "breeding" lifeforms, right? From my understanding, not all the bees in a hive breed, and even if they did, the benefits of having a venom-pumping detachable stinger would not be able to pass to the next generation (the bee dies after stinging).
Hopefully you can follow my reasoning (I don't have the best writing skills).
Have you ever heard an evolutionist try to explain this?

thanks for reading,
mike d

JC said...

Thanks, Rich, for the additional analysis. I thought of most of the points you raised, but I didn't want to complicate the discussion. In this kind of estimation, the goal is to estimate the number of decimal places between the most-significant digit and the decimal point, not the actual digits of the ratio. If we account for the issues you raised, we might shift the decimal point by perhaps three places, in which case the ratio would be two billion instead of two trillion. Still, it would remain a huge difference.

DNA is read by 'randomly' (that is, in any order needed) reading information blocks called genes, via temporary copies in the form of messenger RNA (mRNA). Then ribosomes and transfer RNA read the mRNA sequentially. Likewise, flash memory is read in randomly selected large blocks that are temporarily copied to a block of RAM on the flash chip. Then the computer reads the RAM block (or part of it) sequentially. An interesting parallel, no?

It is interesting that DNA is read-only, copied only from other DNA. It prompts the question: Where did the information come from? As I point out elsewhere, the idea that information can come from randomness is nonsense. Attempts to argue for this idea confuse information with patterns.

JC said...

Izzy, your comments are encouraging. Others have also made the argument that it is hard for evolution to explain the development of suicidal behavior (bees aren't the only ones). Sometimes the reaction by evolutionists and atheists is rambling profanity, as in the comments to this YouTube video: http://www.youtube.com/watch?v=uv0CfdMeFnw . But in this link, the explanation offered is that the queen bee, which is the reproducer, survives: http://answers.yahoo.com/question/index?qid=20080611210938AAVbQAl .

I think that a more powerful argument is this: Evolution proposes that new features are acquired by incremental changes selected from random changes, and each incremental change must itself be an improvement. But complex designs cannot be built up this way.

The stinger has many component parts, such as the venom chemistry, including its production, the pump, and the hollow dart; and none of these would provide any advantage without the other parts. (And the chemistry and pump are probably made of other basic parts.) Designs require planning (working toward a goal), but evolution is supposed to be a mindless series of accidents.