Saturday, February 09, 2008

Multiprocessing, or Distributed Computing

When I worked for ITT, sometimes I investigated new methods of sending military radio signals to improve resistance to enemy interference. To evaluate these methods without needing to build new radios, I simulated the communication process with software. It took a huge amount of computer time to try many different design values and to average the results of many experiments.

To get more experimental results faster, I decided to take advantage of two facts: (1) the company had hundreds of computers connected by a local area network, and (2) most of the time, these computers were either idle or running at only a fraction of full capacity. So I devised a system whereby the idle time of most of these computers could be used to work on my research project.

I organized my software so that the work could be done in small tasks, each needing half an hour to two hours of computer time, and so that the results of these tasks could be consolidated to complete the work. One part of the system doled out these tasks to computers that volunteered to do work. Another part collected, checked, and reported the results, and tracked how much work each volunteer computer did.
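The scheme above can be sketched as a toy task server. All the class, method, and volunteer names here are my own invention for illustration, not part of the original system, and the "sanity check" stands in for whatever result validation the real system did:

```python
class TaskServer:
    """Toy model of the scheme: dole out small tasks to volunteer
    computers, then collect, check, and consolidate their results."""

    def __init__(self, tasks):
        self.pending = list(tasks)   # tasks not yet handed out
        self.results = {}            # task id -> reported result
        self.contributions = {}      # volunteer name -> tasks completed

    def request_task(self):
        """A volunteer asks for work; returns a task id, or None if done."""
        return self.pending.pop() if self.pending else None

    def submit_result(self, volunteer, task_id, result):
        """Check and record a result, crediting the volunteer."""
        if not isinstance(result, float):   # crude stand-in for a real check
            return False
        self.results[task_id] = result
        self.contributions[volunteer] = self.contributions.get(volunteer, 0) + 1
        return True

    def consolidate(self):
        """Average the per-task results into one final answer."""
        return sum(self.results.values()) / len(self.results)


# Demo: ten tasks, two volunteers taking turns pulling work.
server = TaskServer(tasks=list(range(10)))
volunteers = ["alice", "bob"]   # hypothetical volunteer names
turn = 0
while True:
    task = server.request_task()
    if task is None:
        break
    # each toy "task" just reports its own id as a float result
    server.submit_result(volunteers[turn % 2], task, float(task))
    turn += 1
average = server.consolidate()
```

In the real system the volunteers were separate machines polling over the network rather than a loop in one process, but the division of labor is the same: a dispatcher hands out independent tasks, and a collector validates results and keeps per-volunteer tallies.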

To promote the project and solicit volunteers, I sent out emails and provided a web page. The emails explained how easy it was to volunteer (just one click), and how each volunteer could enable and disable the contribution of his computer whenever he wanted. The web page displayed project progress and volunteer contributions, and provided answers to frequently asked questions. Since origami is one of my hobbies, I offered an origami prize for the biggest contributor.

I ran several projects this way, and usually got about 100 volunteers, with about 60 to 80 computers working at one time. A project generally ran for three to four weeks. So each project was about four or five years of computing for one computer (roughly 70 computers × 25 days ≈ 1,750 computer-days, or about 4.8 years).

Most volunteers let their computer run the project overnight, and some would volunteer other computers when colleagues left on vacation or a business trip. The 'multiprocessing' (as I called it) had to shut down every Friday night, however, because that was when the entire computer network was shut down for data backup. I fixed it so that the multiprocessing would automatically restart on Saturday mornings.

I created my 'multiprocessing' system in 1998 and used it for a few years. Another ITT division, which wanted to do something similar to evaluate weather satellite data processing, asked my advice on setting up their own system.

But this kind of computing is now done worldwide on the Internet, on a much larger scale, and it's called 'distributed computing'. United Devices established their distributed computing system in 2001. The University of California, Berkeley launched BOINC (Berkeley Open Infrastructure for Network Computing) in 2003. BOINC has over 540,000 active computers worldwide working on hundreds of projects.

Now my computer runs BOINC in its spare time, supporting four projects:

PrimeGrid: prime number research

Cosmology at Home: cosmological research

Rosetta at Home: protein folding research

World Community Grid: drug research

Most BOINC projects are non-profit, and there is no monetary compensation for those who volunteer their computers' run-time. But all BOINC projects issue credits, proportional to the amount of work contributed, as a kind of thank-you. And there are web sites that provide statistics and graphics summarizing a contributor's credits and ranking.
