The LHC as a massive grid computer

From one angle, the Large Hadron Collider is a particle collider; from another, it's a massive grid computer with the collider as its CPU, according to a rich and highly readable overview posted by Tim O'Brien on the O'Reilly Media Web site:

When the LHC is turned on, it will be more than just a 27-km wide particle accelerator buried 100m deep in Geneva colliding protons. When the LHC is running it will be colliding millions of protons per second, and all of this data will need to be captured and processed by a world-wide grid of computing resources. Correction, by a massively awe-inspiring and mind-numbingly vast array of computing resources grouped into various tiers so large that it is measured in units like Petabytes and MSI2K.

What's a MSI2K?

"Mega SPECint 2000". SPECint 2000 is a standard measure of the power of a CPU. For an in depth explanation see Wikipedia. If we assume a 2 x 3.0 GHz Xeon CPU is 2.3 KSI2K, then it would take about 430 of those CPUs to equal 1 MSI2K. 4.6 MSI2K is going to involve thousands of CPUs dedicated to data extraction and analysis.
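For readers who want to check that arithmetic, here is a rough sketch in Python. It uses only the figures quoted above (2.3 KSI2K per dual 3.0 GHz Xeon machine, and 1 MSI2K taken as 1,000 KSI2K); the function name `machines_needed` is made up for illustration and does not come from O'Brien's article.

```python
# Back-of-the-envelope check of the MSI2K figures quoted above.
# Assumption: one "2 x 3.0 GHz Xeon" machine is rated at 2.3 KSI2K,
# as stated in the quoted passage. The helper name is hypothetical.

KSI2K_PER_DUAL_XEON = 2.3   # rating of one dual-Xeon machine, from the quote
KSI2K_PER_MSI2K = 1000      # "mega" versus "kilo" SPECint 2000


def machines_needed(msi2k: float) -> float:
    """Return how many dual-Xeon machines add up to `msi2k` MSI2K."""
    return msi2k * KSI2K_PER_MSI2K / KSI2K_PER_DUAL_XEON


one_msi2k = round(machines_needed(1.0))    # about 435, i.e. "about 430"
quoted_4_6 = round(machines_needed(4.6))   # about 2000 dual-Xeon machines

print(one_msi2k, quoted_4_6)
```

At two processors per machine, roughly 2,000 machines means roughly 4,000 processors, which squares with the "thousands of CPUs" in the quote.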

Surprisingly, the main analysis package used for the collider at CERN, the European particle physics lab, is 10 years old and freely available under an open-source license, O'Brien says. He digs into the analysis software and the multiple tiers of computing, scattered among more than 100 data centers around the world, that will process about two gigabytes of data from the LHC every 10 seconds.
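To get a feel for what two gigabytes every 10 seconds adds up to, here is a quick sketch of the sum in Python. It assumes the rate is sustained around the clock, which the article does not claim, and it uses decimal units (1 TB = 1,000 GB); the variable names are illustrative only.

```python
# Rough daily data volume implied by "two gigabytes every 10 seconds".
# Assumptions (not from the article): the rate holds 24 hours a day,
# and 1 TB = 1000 GB.

GB_PER_INTERVAL = 2          # from the article
SECONDS_PER_INTERVAL = 10    # from the article
SECONDS_PER_DAY = 24 * 60 * 60

# Integer arithmetic keeps the result exact.
gb_per_day = GB_PER_INTERVAL * SECONDS_PER_DAY // SECONDS_PER_INTERVAL
tb_per_day = gb_per_day / 1000

print(f"{gb_per_day} GB per day, or about {tb_per_day:.1f} TB per day")
```

Under those assumptions the grid would be handling a little over 17 terabytes a day, which is an upper bound if the collider is not running continuously.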

Special bonus: O'Brien interviews Brian Cox, a physicist at the University of Manchester and one of the most articulate explainers of the LHC, its physics and, in this case, its computing; download the audio and read a transcript here. Cox's talk at the March 2008 TED conference is also worth a look.