Calculating the Universe

February 1, 2012 | 9:42 am

This story appeared today in iSGTW.

This image shows over a million luminous galaxies at redshifts indicating times when the universe was between seven and eleven billion years old, from which the sample in the current studies was selected. Image by David Kirkby of the University of California at Irvine and the SDSS collaboration.

Since 2000, the three Sloan Digital Sky Surveys (SDSS I, II, and III) have surveyed well over a quarter of the night sky, producing the biggest 3-D color map of the Universe ever made. Now, scientists have used this visual information for the most accurate computation yet of how matter clumped together – from a time when the universe was only half its present age until now.

“The way galaxies cluster together over vast expanses of the sky tells us how both ordinary visible matter and underlying invisible dark matter are distributed, across space and back in time,” said Shirley Ho, an astrophysicist at Lawrence Berkeley National Laboratory and Carnegie Mellon University who led the work. “The distribution gives us cosmic rulers to measure how the universe has expanded, and a basis for calculating what’s in it: how much dark matter, how much dark energy, even the mass of the hard-to-see neutrinos it contains. What’s left over is the ordinary matter and energy we’re familiar with.”

For the present study, Ho and her colleagues first selected 900,000 luminous galaxies from among over 1.5 million such galaxies gathered by the Baryon Oscillation Spectroscopic Survey, or BOSS, the largest component of the still-ongoing SDSS III. Most of these are ancient red galaxies, which contain only red stars because all their faster-burning stars are long gone, and which are exceptionally bright and visible at great distances. The galaxies chosen for this study populate the largest volume of space ever used for galaxy clustering measurements. Their brightness was measured in five different colors, allowing the redshift of each to be estimated.

“By covering such a large area of sky and working at such large distances, these measurements are able to probe the clustering of galaxies on incredibly vast scales, giving us unprecedented constraints on the expansion history, contents, and evolution of the universe,” said Berkeley Lab’s Martin White, chair of the BOSS science survey teams. “The clustering we’re now measuring on the largest scales also contains vital information about the origin of the structure we see in our maps, all the way back to the epoch of inflation, and it helps us to constrain – or rule out – models of the very early universe.”

After augmenting their study with information from other data sets, the team derived a number of such cosmological constraints (measurements of the universe’s contents based on different cosmological models). Among the results: in the most widely accepted model, the researchers found – to less than two percent uncertainty – that dark energy accounts for 73 percent of the density of the universe.

The team’s results were presented on 11 January at the annual meeting of the American Astronomical Society in Austin, Texas, and have been submitted to the Astrophysical Journal. They are currently available online at http://arxiv.org/abs/1201.2137.

Read on at isgtw.org.

- Paul Preuss

Guest author


The Tevatron’s enduring computing legacy

January 18, 2012 | 3:53 pm

This story appeared Dec. 21 in iSGTW.

"The Great Wall" of 8mm tape drives at the Tagged Photon Laboratory, circa 1990 - from the days before tape robots. Photo by Reidar Hahn, Fermilab.

This is the first part of a two-part series on the contribution Tevatron-related computing has made to the world of computing. This part begins in 1981, when the Tevatron was under construction, and brings us up to recent times. The second part will focus on the most recent years, and look ahead to future analysis.

Few laypeople think of computing innovation in connection with the Tevatron particle accelerator, which shut down earlier this year. Mention of the Tevatron inspires images of majestic machinery, or thoughts of immense energies and groundbreaking physics research, not circuit boards, hardware, networks, and software.

Yet over the course of more than three decades of planning and operation, a tremendous amount of computing innovation was necessary to keep the data flowing and physics results coming. In fact, computing continues to do its work. Although the proton and antiproton beams no longer brighten the Tevatron’s tunnel, physicists expect to be using computing to continue analyzing a vast quantity of collected data for several years to come.

When all that data is analyzed, when all the physics results are published, the Tevatron will leave behind an enduring legacy. Not just a physics legacy, but also a computing legacy.

In the beginning: The fixed-target experiments

1981. The first Indiana Jones movie is released. Ronald Reagan is the U.S. President. Prince Charles makes Diana a Princess. And the first personal computers are introduced by IBM, setting the stage for a burst of computing innovation.

Meanwhile, at the Fermi National Accelerator Laboratory in Batavia, Illinois, the Tevatron has been under development for two years. And in 1982, the Advanced Computer Program formed to confront key particle physics computing problems. ACP tried something new in high performance computing: building custom systems using commercial components, which were rapidly dropping in price thanks to the introduction of personal computers. For a fraction of the cost, the resulting 100-node system doubled the processing power of Fermilab’s contemporary mainframe-style supercomputers.

“The use of farms of parallel computers based upon commercially available processors is largely an invention of the ACP,” said Mark Fischler, a Fermilab researcher who was part of the ACP. “This is an innovation which laid the philosophical foundation for the rise of high throughput computing, which is an industry standard in our field.”

The Tevatron fixed-target program, in which protons were accelerated to record-setting speeds before striking a stationary target, launched in 1983 with five separate experiments. When ACP’s system went online in 1986, the experiments were able to rapidly work through an accumulated three years of data in a fraction of that time.

Entering the collider era: Protons and antiprotons and run one

1985. NSFNET (National Science Foundation Network), one of the precursors to the modern Internet, is launched. And the Tevatron’s CDF detector sees its first proton-antiproton collisions, although the Tevatron’s official collider run one won’t begin until 1992.

The experiment’s central computing architecture filtered incoming data by running Fortran-77 algorithms on ACP’s 32-bit processors. But for run one, they needed more powerful computing systems.

By that time, commercial workstation prices had dropped so low that networking them together was simply more cost-effective than a new ACP system. ACP had one more major contribution to make, however: the Cooperative Processes Software.

CPS divided a computational task into a set of processes and distributed them across a processor farm – a collection of networked workstations. Although the term “high throughput computing” was not coined until 1996, CPS fits the HTC mold. As with modern HTC, farms using CPS are not supercomputer replacements. They are designed to be cost-effective platforms for solving specific compute-intensive problems in which each byte of data read requires 500-2000 machine instructions.
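The farm idea can be illustrated with a minimal Python sketch. It is not CPS itself: the `reconstruct` function, the event values, and the worker count are hypothetical stand-ins, and a thread pool on one machine takes the place of networked workstations. What it shows is the pattern described above, in which independent pieces of work, each needing far more computation than data, are handed out to a pool of workers.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event):
    """Hypothetical stand-in for compute-heavy work on one event."""
    # Many arithmetic operations per input value, mimicking a job in
    # which computation dominates over the amount of data read.
    total = 0
    for _ in range(1000):
        total = (total * 31 + event) % 1_000_003
    return total


def run_farm(events, workers=4):
    """Distribute independent events across a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with events.
        return list(pool.map(reconstruct, events))


if __name__ == "__main__":
    print(run_farm(list(range(8))))
```

Because each event is processed independently, adding workers raises throughput without changing the per-event code. A real farm would use separate processes or machines rather than threads.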

CPS went into production-level use at Fermilab in 1989; by 1992 it was being used by nine Fermilab experiments as well as a number of other groups worldwide.

1992 was also the year that the Tevatron’s second detector experiment, DZero, saw its first collisions. DZero launched with 50 traditional compute nodes running in parallel, connected to the detector electronics; the nodes executed filtering software written in Fortran, E-Pascal, and C.

The high-tech tape robot used today. Photo by Reidar Hahn, Fermilab.

Gearing up for run two

1990. CERN’s Tim Berners-Lee launches the first publicly accessible World Wide Web server using his URL and HTML standards. One year later, Linus Torvalds releases Linux to several Usenet newsgroups. And both DZero and CDF begin planning for the Tevatron’s collider run two.

Between the end of collider run one in 1996 and the beginning of run two in 2001, the accelerator and detectors were scheduled for substantial upgrades. Physicists anticipated more particle collisions at higher energies, and multiple interactions that were difficult to analyze and untangle. That translated into managing and storing 20 times the data from run one, and a growing need for computing resources for data analysis.

Enter the Run Two Computing Project (R2CP), in which representatives from both experiments collaborated with Fermilab’s Computing Division to find common solutions in areas ranging from visualization and physics analysis software to data access and storage management.

R2CP officially launched in 1996. It was the early days of the dot com era. eBay had existed for a year, and Google was still under development. IBM’s Deep Blue defeated chess master Garry Kasparov. And Linux was well-established as a reliable open-source operating system. The stage was set for experiments to get wired and start transferring their irreplaceable data to storage via Ethernet.

“It was a big leap of faith that it could be done over the network rather than putting tapes in a car and driving them from one location to another on the site,” said Stephen Wolbers, head of the scientific computing facilities in Fermilab’s computing sector. He added ruefully, “It seems obvious now.”

The R2CP’s philosophy was to use commercial technologies wherever possible. In the realm of data storage and management, however, none of the existing commercial software met their needs. To fill the gap, teams within the R2CP created Enstore and the Sequential Access Model (SAM, which later stood for Sequential Access through Meta-data). Enstore interfaces with the data tapes stored in automated tape robots, while SAM provides distributed data access and flexible dataset history and management.

By the time the Tevatron’s run two began in 2001, DZero was using both Enstore and SAM, and by 2003, CDF was also up and running on both systems.

Linux comes into play

The R2CP’s PC Farm Project targeted the issue of computing power for data analysis. Between 1997 and 1998, the project team successfully ported CPS and CDF’s analysis software to Linux. To take the next step and deploy the system more widely for CDF, however, they needed their own version of Red Hat Enterprise Linux. Fermi Linux was born, offering improved security and a customized installer; CDF migrated to the PC Farm model in 1998.

Fermi Linux enjoyed limited adoption outside of Fermilab until 2003, when Red Hat Enterprise Linux ceased to be free. The Fermi Linux team rebuilt Red Hat Enterprise Linux into the prototype of Scientific Linux, and formed partnerships with colleagues at CERN in Geneva, Switzerland, as well as a number of other institutions; Scientific Linux was designed for site customizations, so that in supporting it they also supported Scientific Linux Fermi and Scientific Linux CERN.

Today, Scientific Linux is ranked 16th among open source operating systems; the latest version was downloaded over 3.5 million times in the first month following its release. It is used at government laboratories, universities, and even corporations all over the world.

“When we started Scientific Linux, we didn’t anticipate such widespread success,” said Connie Sieh, a Fermilab researcher and one of the leads on the Scientific Linux project. “We’re proud, though, that our work allows researchers across so many fields of study to keep on doing their science.”

A wide-angle view of the modern Grid Computing Center at Fermilab. Today, the GCC provides computing to the Tevatron experiments as well as the Open Science Grid and the Worldwide Large Hadron Collider Computing Grid. Photo by Reidar Hahn, Fermilab.

Grid computing takes over

As both CDF and DZero datasets grew, so did the need for computing power. Dedicated computing farms reconstructed data, and users analyzed it using separate computing systems.

“As we moved into run two, people realized that we just couldn’t scale the system up to larger sizes,” Wolbers said. “We realized that there was really an opportunity here to use the same computer farms that we were using for reconstructing data, for user analysis.”

Today, the concept of opportunistic computing is closely linked to grid computing. But in 1996 the term “grid computing” had yet to be coined. The Condor Project had been developing tools for opportunistic computing since 1988. In 1998, the first Globus Toolkit was released. Experimental grid infrastructures were popping up everywhere, and in 2003, Fermilab researchers, led by DZero, partnered with the US Particle Physics Data Grid, the UK’s GridPP, CDF, the Condor team, the Globus team, and others to create the Job and Information Management system, JIM. Combining JIM with SAM resulted in a grid-enabled version of SAM: SAMGrid.

“A pioneering idea of SAMGrid was to use the Condor Match-Making service as a decision making broker for routing of jobs, a concept that was later adopted by other grids,” said Fermilab-based DZero scientist Adam Lyon. “This is an example of the DZero experiment contributing to the development of the core Grid technologies.”
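The brokering idea Lyon describes can be sketched in a few lines of Python. This is an illustration only, not Condor's matchmaking service or SAMGrid code: the dictionary fields, site names, and ranking rule are all assumptions made for the example. It shows the two steps a broker of this kind performs: filter the resources that satisfy a job's requirements, then rank the survivors and route the job to the best one.

```python
def match(job, machines):
    """Return the name of the best machine for a job, or None."""
    # Step 1: keep only machines that satisfy the job's requirements.
    candidates = [
        m for m in machines
        if m["free_cpus"] >= job["cpus"] and m["memory_gb"] >= job["memory_gb"]
    ]
    if not candidates:
        return None
    # Step 2: rank the survivors; here, prefer the most free CPUs.
    best = max(candidates, key=lambda m: m["free_cpus"])
    return best["name"]


# Hypothetical sites and job, invented for this example.
machines = [
    {"name": "site-a", "free_cpus": 2, "memory_gb": 4},
    {"name": "site-b", "free_cpus": 8, "memory_gb": 16},
    {"name": "site-c", "free_cpus": 4, "memory_gb": 2},
]
job = {"cpus": 2, "memory_gb": 4}

print(match(job, machines))
```

Keeping the requirements and the ranking rule separate from the job itself is what lets a single broker serve many sites and many kinds of jobs.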

By April 2003, the SAMGrid prototype was running on six clusters across two continents, setting the stage for the transition to the Open Science Grid in 2006.

From the Tevatron to the LHC – and beyond

Throughout run two, researchers continued to improve the computing infrastructure for both experiments. A number of computing innovations emerged before the run ended in September 2011. Among these was CDF’s GlideCAF, a system that used the Condor glide-in system and Generic Connection Brokering to provide an avenue through which CDF could submit jobs to the Open Science Grid. GlideCAF served as the starting point for the subsequent development of a more generic glidein Work Management System. Today glideinWMS is used by a wide variety of research projects across diverse research disciplines.

Another notable contribution was the Frontier system, which was originally designed by CDF to distribute data from central databases to numerous clients around the world. Frontier is optimized for applications where there are large numbers of widely distributed clients that read the same data at about the same time. Today, Frontier is used by CMS and ATLAS at the LHC.
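One general way to serve many clients that read the same data at about the same time is a read-through cache placed in front of the central database. The Python sketch below illustrates that pattern only; it does not describe how Frontier is implemented, and the class, the function names, and the `"run-1"` key are hypothetical.

```python
class ReadThroughCache:
    """Answer repeated reads locally; ask the backend only on a miss."""

    def __init__(self, fetch):
        self._fetch = fetch      # function that queries the central database
        self._store = {}         # local copies of values already fetched
        self.backend_calls = 0   # how often the database was actually hit

    def get(self, key):
        if key not in self._store:
            self.backend_calls += 1
            self._store[key] = self._fetch(key)
        return self._store[key]


def central_database(key):
    """Hypothetical stand-in for an expensive central database query."""
    return f"calibration-for-{key}"


cache = ReadThroughCache(central_database)
# A thousand clients ask for the same record at about the same time.
results = [cache.get("run-1") for _ in range(1000)]
print(cache.backend_calls)
```

The central database sees a single query however many clients request the record, which is why the pattern suits widely distributed readers of identical data.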

“By the time the Tevatron shut down, DZero was processing collision events in near real-time and CDF was not far behind,” said Patricia McBride, the head of scientific programs in Fermilab’s computing sector. “We’ve come a long way; a few decades ago the fixed-target experiments would wait months before they could conduct the most basic data analysis.”

One of the key outcomes of computing at the Tevatron was the expertise developed at Fermilab over the years. Today, the Fermilab computing sector has become a worldwide leader in scientific computing for particle physics, astrophysics, and other related fields. Some of the field’s top experts worked on computing for the Tevatron. Some of those experts have moved on to work elsewhere, while others remain at Fermilab where work continues on Tevatron data analysis, a variety of Fermilab experiments, and of course the LHC.

The accomplishments of the many contributors to Tevatron-related computing are noteworthy. But there is a larger picture here.

“Whether in the form of concepts, or software, over the years the Tevatron has exerted an undeniable influence on the field of scientific computing,” said Ruth Pordes, Fermilab’s head of grids and outreach. “We’re very proud of the computing legacy we’ve left behind for the broader world of science.”

Miriam Boon


Happy 10th Birthday, WLCG!

December 21, 2011 | 4:00 pm

Computing center, photo courtesy of CERN.

This story appeared today in iSGTW.

Amid all the hype and excitement of the new physics being announced from experiments at the Large Hadron Collider in 2011, there was another, little known, cause for celebration: the anniversary of the Worldwide LHC Computing Grid (WLCG).

It was 10 years ago, in September 2001, that the huge computing grid was conceived of and approved by the CERN council, in order to handle the large volumes of data expected by the LHC. By March 2002, a plan of action had formed.

And, now that the LHC is up and running, “the biggest achievement,” said Ian Bird, the head of the WLCG at CERN, in Geneva, Switzerland, “is that it works so well and so early in the life of LHC.”

Data pours out of each of the four detectors at a ripping pace – the ATLAS detector alone produces about one petabyte per second (or 1,000,000 GB per second), and a farm of processors pares back the data, filtering out the majority of it, until 300 MB per second is chosen to be stored on the grid. One copy of the data is kept at CERN (Tier 0), while another copy of the data is transferred and shared between the 11 major computing centers (Tier 1).
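To put those two rates side by side, the short Python calculation below works out how much of the ATLAS data stream survives filtering. It assumes decimal units (1 petabyte = 10^15 bytes and 1 megabyte = 10^6 bytes), which matches the "1,000,000 GB" conversion quoted above; the variable names are chosen for this example.

```python
PETABYTE = 10**15  # bytes, decimal units (assumption)
MEGABYTE = 10**6   # bytes, decimal units (assumption)

raw_rate = 1 * PETABYTE       # bytes per second leaving the detector
stored_rate = 300 * MEGABYTE  # bytes per second kept on the grid

reduction = raw_rate / stored_rate      # how many bytes in per byte kept
kept_fraction = stored_rate / raw_rate  # share of the raw stream stored

print(f"reduction factor: about {reduction:,.0f} to 1")
print(f"fraction kept: {kept_fraction:.1e}")
```

Under these assumptions the filters discard all but roughly one byte in every 3.3 million.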

“The most amazing thing is that we can actually handle this kind of data,” said Bird. “Data rates today are much higher than anything we ever planned for during a normal year of data taking.”

As well as the sheer volume of data, the WLCG has also faced the unique challenges of computationally intensive simulations, and the fact that the 8,000 or so physicists involved in the projects must be able to access the data from their home institutions around the world.

“The grid is pretty much the only way that the masses of data produced by the collider can be processed. Without it, the LHC would be an elaborate performance art project on the French-Swiss border,” wrote Geoff Brumfiel in a Nature blog, following his valiant attempt to follow a single piece of data through the grid (“Down the Petabyte Highway” published January 2011).

The WLCG is successfully handling the data today, but 10 years ago, while preparations for the Large Hadron Collider were well underway, there was a hole in the funding bucket. The computing resources required to handle the avalanche of LHC data had been left behind as preparations were made for the collider.

A hole in the funding bucket

“Computing wasn’t included in the original costs of the LHC,” Les Robertson, who was the head of the computing grid from 2002 – 2008, told iSGTW in 2008.

This decision left a big hole in funding for IT crucial to the ultimate success of the LHC. “We clearly required computing,” said Robertson, “but the original idea was that it could be handled by other people.”

But by 2001, these “other people” had not stepped forward. “There was no funding at CERN or elsewhere,” Robertson said. “A single organization could never find the money to do it.”

“Early on, it became evident that, for various reasons, placing all of the computing and storage power at CERN to satisfy all the [computing] needs would not be possible. First, the infrastructure of the CERN computing facility could not scale to the required level without significant investments in a power and cooling infrastructure many times larger than what was available at the time,” Ian Bird said.

And in 2001, “CERN’s dramatic advance in computing capacity is urgent,” the press release read.

A dramatic advance in computing capacity

There were two phases to the WLCG. From 2002 to 2005, staff at CERN and collaborating institutes around the world developed prototypes, which would eventually be incorporated into the system. Then, from 2006, the LHC Computing Grid formally became a worldwide collaboration (the Worldwide LCG – WLCG), and computing centers around the world were connected to CERN to help store data and provide computing power. Throughout its lifetime, the WLCG has worked closely with large-scale grid projects such as EGEE (Enabling Grids for E-sciencE), and more recently EGI (European Grid Initiative), funded by the European Commission, and OSG (Open Science Grid), funded by the National Science Foundation in the USA. Today, EGI and OSG support not only high energy physics but also a variety of other science experiments and simulations.

Using the grid for real computation began as early as 2003, with the experiments using it to run simulations. And from 2004 a series of data and service challenges was performed (see timeline, below) to test things such as the reliability of data transfers.

“The grid’s performance during the first two years of LHC running has been impressive and has enabled very rapid production of physics results,” Bird said.

Data flow and hybrid clouds

The first model of distributed computing, proposed in 1999 and called the MONARC model, was the one on which all the experiments originally based their own computing models. But this model was much more complicated than it has to be today, according to Bird. The complexity was added because it was thought that the weakest link in the chain would be the networks linking together all the computing centers and allowing for fast, reliable data transfer. Today, computing models are more likely to see extensive data flows between Tier 2 and Tier 3 centers.

As well as a new model for data flow, other future challenges include the use of multicore and other CPU types, the replacement of certain components of grid middleware with more standard software, and the use of virtualization. And the use of cloud computing is a matter of “when, not if,” Bird said.

“The LHC computing environment will outlive the accelerator itself, but it will evolve along with technology and is likely to become very different over the next few years,” Bird said.

– Jaqui Hayes

Guest author


SLAC physicists using physics simulation tool to make cancer therapy safer

October 24, 2011 | 4:45 pm

This story first ran in SLAC Today on Oct. 20, 2011.

Joseph Perl described how he and his colleagues are turning the simulation toolkit Geant4 into a powerful application for medical physicists. Image by Helen Shen.

Tiny particles are making a big difference in the world of cancer therapy. And SLAC physicists—experts in particle transport—are using computer simulations to make those therapies safer.

At the Oct. 10 SLAC Colloquium, the lab’s own Joseph Perl described how he and his colleagues are turning the simulation toolkit Geant4 into a powerful application for medical physicists. Originally designed to track subatomic particles in high-energy physics experiments, Geant4 can also map proton paths through patients’ bodies during radiation treatment.

In radiation treatment, subatomic particles inflict DNA damage on dividing cells—both healthy and cancerous—causing them to commit suicide. The technique works because rapidly growing cancer cells are more likely to be dividing at any given time, and thus are more likely to be killed; but a smaller proportion of healthy cells are also susceptible to damage.

Minimizing collateral damage is a tough problem for medical physicists who design radiation treatments.

“To perfect this stuff, what we have to understand really is where are the particles going?” Perl said. “We have to understand particle transport when we’re designing the medical linacs,” the accelerators that deliver the particles to patients. “We have to understand particle transport when we’re talking about how the beams actually penetrate the body.”

Computer simulation tools such as EGS4, developed at SLAC in the 1970s, have helped medical physicists predict the behavior of X-rays and gamma rays. Now, Geant4 offers the ability to model proton beams, too. While Perl is SLAC’s only Geant4 team member working on medical physics applications, he works with four other partners on more general Geant4 capabilities. Together they constitute the second-largest team in the international Geant4 collaboration.

In contrast to the X-ray beams used in traditional therapies, which go all the way through the body, proton beams dump their energy at a specific depth. Medical physicists can target a tumor at one depth and avoid deeper and shallower tissues by tweaking the energy levels of one or more beams. Proton beam therapy may be particularly useful in children, for whom stray radiation can stunt growth and cause secondary cancers in adulthood.

Geant4 relies on a technique called Monte Carlo simulation, which models each proton moving through the body in a series of random steps. At each step, the program essentially casts a die to guess where the particle will move next. Over many steps, the program shows where protons are most likely to end up.
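The random-step idea can be shown with a toy Python sketch. It is not Geant4 and contains no real physics: the starting energy, step lengths, and energy-loss distribution are arbitrary assumptions picked for illustration. Each simulated particle advances in random steps and loses a random amount of energy at each one; tallying where many particles stop gives an estimate of the most likely stopping depth.

```python
import random


def simulate_depth(rng, energy=100.0, mean_loss=1.0):
    """Follow one particle until its energy runs out; return its depth."""
    depth = 0.0
    while energy > 0:
        # Random step length in arbitrary units: the "cast of the die".
        depth += rng.uniform(0.5, 1.5)
        # Random energy loss per step, drawn around a hypothetical mean.
        energy -= rng.expovariate(1.0 / mean_loss)
    return depth


def depth_histogram(n_particles=10_000, bin_width=10.0, seed=42):
    """Count how many particles stop in each depth bin."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counts = {}
    for _ in range(n_particles):
        bin_start = int(simulate_depth(rng) // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for bin_start, count in depth_histogram().items():
        print(f"{bin_start:6.0f} {count}")
```

No single particle's path is predictable, but the histogram over thousands of particles settles into a stable shape, which is the sense in which the program "shows where protons are most likely to end up."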

Unlike many other tools, Geant4 can also simulate the effect of tissues, such as the rib bones, that may move in and out of the proton beam as a patient breathes. Such obstructions can block some radiation from the intended target, while simultaneously allowing some tissue to soak up unnecessary radiation. Geant4 can potentially help medical physicists program beams to track a moving target and deliver a constant dosage to the tumor.

Geant4 is freely available to anyone who wants to use it, but in its current form may be challenging to some novices. “It’s a really fancy techno-Lego kit,” said Perl, but the box does not come with any ready-made toys.

To address this problem, SLAC’s Geant4 team has joined Massachusetts General Hospital and the University of California-San Francisco in a four-year collaboration funded by the National Institutes of Health. The project, headed by Perl, will help medical physicists customize their simulations without disrupting the program’s innermost workings.

“If we can make it easier for people to use,” Perl said, “the more likely they are to use things right.”

- Helen Shen

Symmetry Intern


Developers create virtual CERN

September 19, 2011 | 9:45 am

Neng Xu sits in the cafeteria, where he was inspired to create a virtual CERN. Photo credit: Amy Dusto

Neng Xu, a software engineer for the University of Wisconsin-Madison working on the ATLAS experiment, sat drinking coffee in a sunny corner of CERN’s cafeteria when he thought of a challenge. Could he create a virtual version of what he saw out the window: a lawn with cafe tables and a building across the street?

He could, he found, and more. With the help of a colleague, Xu is now growing his virtual CERN into an interactive app that will work across multiple platforms and is slated for a public beta version sometime this fall. Fans of digital art got a preview of the tool at this month’s Ars Electronica festival.

For years Xu has been a video game enthusiast. He’d toyed with the idea of making his own but ultimately decided, “Whatever I do, I can’t make better games than game companies.” Then he realized he could use the same type of software to do something different.

Xu used a free online game development tool called Unity to begin construction of the virtual cafeteria lawn and the building across the street, where in reality many physicists working on the ATLAS and CMS detectors have offices. Completing the project took two months of after-hours work. He’d included so much detail that the final file was huge, about 200 megabytes, and took nearly an hour to open.

When he moved on to his next idea, making a virtual model of the ATLAS detector, Xu cut down on the amount of information he included. That project took only half of a month to complete. After finishing it, Xu planted the virtual detector on the virtual lawn in front of the cafeteria to give a sense of the experiment’s enormity.

The ATLAS detector, parked in an unusual spot. Image courtesy of Neng Xu

The resources coordinator for ATLAS saw the model. He asked Xu to make something similar that people outside of CERN could access online. Xu agreed. Although he was still working on this project beyond his day job, ATLAS began to provide him with some support. Besides software supplies, perhaps the most valuable asset Xu received was his introduction to fellow modeling innovator Joao Pequenao.

For the last decade, Pequenao has been working on multimedia, images and simulations for the ATLAS outreach office. While an undergraduate physics student in Lisbon, Portugal, he passionately pursued a hobby in graphic design. As the autodidact said, “Some kids play soccer, some kids go home to study [graphics] tools.”

Over the years, Pequenao’s work has gained worldwide attention. The logo for this year’s Ars Electronica festival was his 2006 visualization of a proton collision forming a microscopic black hole.

“There was a niche in the market,” he said. “I’m at the intersection of physicist, computer scientist and designer.”

Xu and Pequenao realized they could help one another. Along with Pequenao’s student, Henrique Carvalho, they teamed up to become the first ATLAS group working on interactive multimedia.

Their new project, the ATLAS Virtual Interactive Online Navigator, or AVION, is a Web-based application that will soon be accessible to anyone with a laptop, smartphone, game console or other device connected to the Internet. Still in the alpha phase, AVION allows explorers to take guided tours of the experiment that start in the parking lot and delve underground into the LHC where collisions are occurring, or to examine individual pieces of the detector. Keeping up with Hollywood, the whole thing takes on an extra dimension with a pair of 3D glasses.

AVION takes the tour of CERN underground. Image courtesy of Neng Xu

AVION is designed to operate in a Web browser from any part of the world. Xu and Pequenao have visions of interactive games in the future in which players can build ATLAS and do their own physics analysis. And since the Unity engine is free online, the source files will be available to anyone who wants to download them, allowing users to modify AVION and create their own virtual CERN world.

“Add dinosaurs if you want,” said Xu.

Both he and Pequenao see AVION as a potentially valuable education tool for students and the public. “If you really have to build ATLAS, you will learn a lot,” Xu said.

The team expects AVION will be revealed to the world in the coming months via the ATLAS website.

Amy Dusto


CERN brings hardware into the open

July 15, 2011 | 3:55 pm

The Open Hardware Repository was inspired by the success of open-source software. (Image courtesy CERN.)

Hardware and software go hand in hand – one doesn’t work without the other. Despite being so closely linked, the two industries operate very differently. For the most part, hardware is produced in isolation and product designs are concealed by manufacturers, while software is created in a largely open and collaborative environment, available for anyone to use.

Javier Serrano, a hardware designer for accelerator systems at CERN, set out to change that. Three years ago, his software design colleagues were developing device drivers – the interface between a piece of hardware and software applications – with the Linux open-source operating system. Serrano noticed that they enjoyed being part of a community where they had access to high-quality products and could seek help whenever they needed it.

“I’m a hardware designer but I would love to work in that kind of environment,” Serrano said. He wanted to mirror the open software culture by creating a place where hardware designs could be shared, critiqued and modified by anyone, in a way that benefitted designers, manufacturers and consumers alike.

A year later the Open Hardware Repository was born. While it is not the first attempt at providing a space for hardware designers to share efforts, the repository is unique in that it is geared toward professional designers, primarily from publicly funded laboratories or academic settings, who create hardware for physics experiments. In the two years since its launch, the repository has hosted more than 40 projects, both from within CERN and from external designers.

Alan Langman, from the University of Cape Town in South Africa, was among the first to submit a design to the open hardware repository. His group’s project, dubbed Rhino, is a circuit board for radio-based technologies such as radar and radio astronomy. The repository will allow the technology designed for his group’s needs to be used by other experiments all over the world.

“We received excellent, multinational technical review support and design guidance,” Langman said. “As a result, the first version of our board worked with only minor errors. The opportunity to consult with people from CERN and other top institutions has been priceless.”

Peer review is just one benefit of open hardware, Serrano says. Sharing information also cuts down on multiple efforts to create similar devices or solve the same problems. This leaves room for designers to focus on what they know best, rather than starting each project from scratch and debugging along the way.

“We have limited resources and a tight schedule,” said Joseph De Long, a technology architect at Brookhaven National Laboratory. De Long designs accelerator systems for the future National Synchrotron Light Source (NSLS-II) and is an avid proponent of open hardware. “This initiative creates a group of developers with a common goal: improve the hardware that gets released,” he said.

This is not only a boon for designers, but an incentive for industry to become involved. Not spending resources on development lowers a manufacturer’s entry cost into a particular product. While one manufacturer can’t have exclusive production rights for an open hardware design, they can drive revenue by providing testing, support and guarantees.

A circuit board designed within the context of the Open Hardware Repository bears the CERN logo. The reverse side includes the words, "Licensed under CERN OHL." (Image courtesy CERN.)

“There are many opportunities for industrial partners to advance our innovations and profit from the value they add,” De Long said. Just last month, CERN accepted the first bid from a manufacturer to produce one of its open hardware designs.

The openness doesn’t stop there. Many devices designed at CERN are licensed under the newly created Open Hardware License, which requires manufacturers to include all design and production documentation with shipments of open hardware products. The license ensures that anyone who buys open hardware devices can, in principle, modify or manufacture the design on their own. If someone modifies an open hardware design, they must include these changes in the repository and release them under the same license.

“The open hardware repository was a very specific idea. Although we were quite familiar with open source software, when [Serrano] brought the idea to us, it was the first time we looked into the open hardware movement,” said Myriam Ayass, legal advisor for the Knowledge Transfer group at CERN and author of the CERN Open Hardware License. “We were very interested in the idea, and in doing it in a way that meets our mission and constraints,” she said. The open hardware movement fits in with the KT mission to foster collaboration and disseminate CERN-developed technologies as widely as possible.

In the end, the spirit of open source comes full circle. Software engineers – whose community inspired the open hardware repository – can write more efficient programs if they understand the guts of the hardware. Both sides benefit.

“Open hardware can learn a lot from open software,” De Long said. “The goal of open hardware is to have a place to retrieve solutions to our design challenges. We hope to ‘pay it forward’ and help create a source of tried and true designs that can be leveraged in new projects. In the future, we can focus our resources on the technology requirements unique to new machines while drawing from past experience.”

Read the CERN press release

Lauren Rugani

Testing out the new and improved internet today

June 8, 2011 | 10:16 am

Today, June 8, major organizations around the world are participating in World IPv6 Day, a 24-hour test run to try out a new, expanded version of the Internet. Fermilab is among the organizations that will participate in the event created by the Internet Society to promote a new version of the Internet protocol called IPv6. Fermilab Today published the following story about their role in today’s global event.

The new Internet is coming and Fermilab wants to be ready.

Today the laboratory will take part in World IPv6 Day, a global effort to test out a new, expanded version of the Internet.

This new 24-hour event was created by the Internet Society to promote awareness of a new version of the Internet protocol called IPv6. Fermilab, along with Google, Facebook and other organizations that rely heavily on Internet usage, is actively participating by creating websites with IPv6 addresses and testing access to them.

“IPv6 represents the future of the Internet. Efforts are already under way to make sure we are ready for that future,” said the Computing Division’s Phil Demar, who is in charge of organizing Fermilab’s participation in IPv6 Day.

Computers and other networked devices use unique IP addresses to access the Internet. The older version of IP, called IPv4, uses a 32-bit address space, written as a familiar set of four 8-bit numbers (e.g. 131.225.103.37). IPv6 uses a 128-bit address space. Its addresses are much longer and are represented with hexadecimal numbers (e.g. 2001:400:2410:50:3d8e:e20a:bf50:39e2). IPv6 websites look and behave like IPv4 websites. The IP address length will be the only noticeable change.
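To make the two formats concrete, here is a minimal sketch that parses the two example addresses quoted above with Python's standard `ipaddress` module. The variable names are only illustrative.

```python
import ipaddress

# The two example addresses quoted in the article.
v4 = ipaddress.ip_address("131.225.103.37")
v6 = ipaddress.ip_address("2001:400:2410:50:3d8e:e20a:bf50:39e2")

for addr in (v4, v6):
    # max_prefixlen is the address width in bits: 32 for IPv4, 128 for IPv6.
    print(f"{addr}  version={addr.version}  bits={addr.max_prefixlen}")

# An IPv4 address is four 8-bit fields written in decimal.
v4_fields = [int(part) for part in str(v4).split(".")]

# An IPv6 address is eight 16-bit fields written in hexadecimal.
# `exploded` restores the leading zeros that the short form drops.
v6_fields = [int(group, 16) for group in v6.exploded.split(":")]

print(v4_fields)
print([hex(field) for field in v6_fields])
```

Four fields of 8 bits give the 32-bit IPv4 address, and eight fields of 16 bits give the 128-bit IPv6 address.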

IPv4 supports about 4 billion possible addresses. The last blocks of free IPv4 addresses were assigned in February. The new IPv6 address format provides roughly 79 octillion (2^96) times the existing IPv4 address space. Most computers purchased today are capable of using both IPv4 and IPv6 addresses.
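The size comparison follows directly from the address widths. The short calculation below is a sketch of that arithmetic, using the short-scale meaning of "octillion" (10^27).

```python
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS   # every possible 32-bit address
ipv6_total = 2 ** IPV6_BITS   # every possible 128-bit address
ratio = ipv6_total // ipv4_total  # equals 2 ** 96

OCTILLION = 10 ** 27  # short-scale octillion

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
print(f"IPv6 / IPv4:    {ratio / OCTILLION:.1f} octillion")
```

These are counts of possible addresses. The number actually usable is smaller in both protocols because some ranges are reserved.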

Fermilab is currently working to ensure its public websites and email services will provide IPv6 accessibility by September 2012. The Computing Division plans to have IPv6 support for other laboratory computing systems by September 2014. A sample Web page is available for visitors to check if they have an IPv6 address for their computer or device.

For more information on World IPv6 Day, please visit the official Internet Society website.

— Kimberly Myles

 

Guest author

The case of the missing proton spin

June 7, 2011 | 10:29 am

This story first appeared in iSGTW on June 1, 2011.

Researchers use the cloud to shed light on a longstanding mystery

It’s been nearly 25 years since the European Muon Collaboration made a startling discovery: only a portion of a proton’s spin comes from the quarks that make up the proton.

The STAR Detector. Image courtesy of Brookhaven National Laboratory.

The revelation was a bit of a shock for physicists who had believed that the spin of a proton could be calculated simply by adding the spin states of the three constituent quarks. This is often described as the “proton spin crisis.”

“At that time people realized protons are not just a sum of three quarks stuck together like Lego-blocks,” said Jan Balewski, an MIT-based member of the Solenoidal Tracker At RHIC (STAR) experiment. “Protons are dynamic systems of interacting constituent quarks, gluons, and sea quarks.”

Gluons are massless spin 1 particles that “glue” the parts of a proton together; in this case, those parts would be two up quarks and one down quark. Sea quarks are quark-antiquark pairs that pop into existence and then annihilate each other almost immediately; their presence can contribute to the proton spin, making them a factor worth taking into consideration.

It has been postulated that the spin of the proton includes contributions not only from the three quarks from which it is built, but also from sea quarks and gluons. In fact, for a long time, physicists suspected that the remaining spin came from gluons. But as with the quark spin, experiments have shown that gluon spin can only account for a small fraction of the missing proton spin. The remaining proton spin should come from the orbital motion of the quarks, gluons, and sea quarks – and at the moment, the only direct measurements scientists know how to make are of the contribution from the sea quarks.
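This bookkeeping is often summarized as a spin sum rule. The form below is one common decomposition from the physics literature, not something stated in the article, and the symbols are introduced here only for illustration: ΔΣ is the combined spin of quarks and antiquarks (including sea quarks), ΔG is the gluon spin, and L_q and L_g are the orbital angular momenta of quarks and gluons.

```latex
\frac{1}{2} \;=\; \frac{1}{2}\,\Delta\Sigma \;+\; \Delta G \;+\; L_q \;+\; L_g
```

In this language, the "spin crisis" is the finding that the first term is much smaller than the full 1/2, so the other terms must make up the difference.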

“Since previous experiments could not distinguish between quark and antiquark contributions, part of the RHIC/STAR spin program was set to unravel this puzzle,” said Balewski.

Unpacking quark spin contributions

The question was: how is it that the quark spins’ contribution to the proton spin is only a small fraction of what was expected? To answer that, we need to learn more about where the quark spin contribution is coming from.

The three concurrent Relativistic Heavy Ion Collider experiments are ideally suited to answer that question, Balewski explained. RHIC, situated at Brookhaven National Laboratory, is the only collider in the world that creates polarized proton beams, in which the spin state of the majority of the protons is aligned with the direction of the beam. This allows the physicists to study the correlation between the spin orientation of the proton and its constituents.

The STAR collaboration consists of approximately 550 researchers at 55 institutions interested in exploring properties of the proton and also characterizing the quark-gluon plasma produced in collisions of heavier ions.

“I’m exploring with a group of spin-researchers at STAR the properties of W-boson events produced in about 1% of data recorded by the STAR detector from proton-proton collisions during this year’s data taking period,” Balewski said.

There are two kinds of W bosons. A W- boson is created when an up antiquark and a down quark from two colliding protons interact; conversely, when a down antiquark and an up quark interact, a W+ boson is created. Since the only antiquarks in a proton are sea quarks, and sea quarks always occur in quark-antiquark pairs, analyzing the W boson events can tell researchers how much of a proton’s spin comes from up and down sea quarks.
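The pairing of quark and antiquark with each W boson follows from charge conservation. The sketch below checks the arithmetic using the standard quark charges (+2/3 for up, -1/3 for down, in units of the elementary charge); the helper function is only illustrative.

```python
from fractions import Fraction

# Electric charges in units of the elementary charge e (standard values).
QUARK_CHARGE = {"up": Fraction(2, 3), "down": Fraction(-1, 3)}


def charge(flavor: str, anti: bool = False) -> Fraction:
    """Return the charge of a quark, or of its antiquark if anti is True."""
    q = QUARK_CHARGE[flavor]
    return -q if anti else q


# Up antiquark + down quark: total charge of the W- boson.
w_minus = charge("up", anti=True) + charge("down")

# Down antiquark + up quark: total charge of the W+ boson.
w_plus = charge("down", anti=True) + charge("up")

print(w_minus, w_plus)
```

Because the antiquark in each pair must come from the sea, the charge of the W boson tags which flavor of sea antiquark took part in the collision.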

Although there are four other types of sea quarks (strange, charm, top, and bottom) which this measurement doesn’t account for, they all occur less frequently than the up and down quarks, with the strange quark being the next most common. As a result, some uncertainty about the composition of the quark spin contribution will remain. Nonetheless, what we do learn from these experiments remains valuable. Spin is central to a variety of scientific concepts and technologies, including the Magnetic Resonance Imaging machines that are used in hospitals around the world.

“The visible matter of the universe consists predominantly of proton-like particles,” Balewski said. “If the results of our experiment cause a revision of our understanding of the proton makeup this will impact how we describe visible matter in the universe.”

From data to results

With the possibility of such a payoff – not just for the W experiment, but for other STAR experiments as well – it’s only natural that STAR researchers are eager to analyze their data and find out what it shows. But after five months of data taking, they typically must wait another ten months to complete detector calibration, reconstruction, and analysis.

That’s just one of the reasons why the STAR software team has been eager to explore how cloud computing might enable STAR experiments to elastically vary the computing resources they are using.

“What was more important for STAR was that almost-real time event processing would be achieved and the analysis of the W events provided one opportunity for feedback to the experiment,” Balewski said. “We can see certain expected characteristics of measured W events and tell the crew taking STAR data that all detector components work well, or direct their attention to those which need to be fixed.”

Unfortunately, real-time processing of all of the STAR data would require continuous access to about 10,000 cores. Given that the entire STAR collaboration shares a cluster of only 2000 dual-core machines, this simply wouldn’t be possible.
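A back-of-the-envelope check of those figures, assuming (generously) that every core in the shared cluster could be devoted to real-time processing:

```python
MACHINES = 2000          # machines in the shared STAR cluster
CORES_PER_MACHINE = 2    # dual-core
CORES_NEEDED = 10_000    # estimate for real-time processing of all STAR data

available = MACHINES * CORES_PER_MACHINE
shortfall = CORES_NEEDED - available
fraction = available / CORES_NEEDED

print(f"available cores: {available}")
print(f"shortfall:       {shortfall}")
print(f"fraction of need covered: {fraction:.0%}")
```

Even in that best case the cluster covers well under half of the requirement, and in practice it is shared among all STAR analyses.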

To explore the opportunities the cloud presents, an MIT-based computing team led by Balewski adapted the W boson workflow to take advantage of the Magellan cloud.

Magellan consists of two government-funded cloud computing testbeds. One, Magellan at the National Energy Research Scientific Computing Center (NERSC) in Berkeley, California is based on Eucalyptus, a widely used open source cloud platform. The second, based at Argonne National Laboratory near Chicago, Illinois, hosts two clouds; one runs the OpenStack software while the other uses the Nimbus toolkit.

The result of the team’s efforts was a real-time cloud-based data processing system that functions as a self-adjusting assembly line and handles variable throughput. No human intervention is needed, and there is no supervisor process that orchestrates the entire data flow. Every stage of the process is governed by local rules designed to handle time-outs and refusals from other elements by waiting a few minutes and then starting over.
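The "local rule" idea can be sketched in a few lines. This is not the STAR team's code; it is a minimal illustration in which the function name, the three-minute wait, and the exception types standing in for "time-outs and refusals" are all assumptions.

```python
import time


def run_stage(step, wait_seconds=180.0, sleep=time.sleep):
    """Call `step` until it succeeds, pausing after each time-out or refusal.

    There is no supervisor: the stage only reacts to what it sees locally.
    Returns the result of `step` and the number of attempts it took.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return step(), attempts
        except (TimeoutError, ConnectionRefusedError):
            # Local rule: wait a few minutes, then start over.
            sleep(wait_seconds)


# Demo: a step that times out, is refused, and then succeeds.
outcomes = iter([TimeoutError(), ConnectionRefusedError(), "file-0001"])


def flaky_step():
    item = next(outcomes)
    if isinstance(item, Exception):
        raise item
    return item


# The sleep is stubbed out so the demo finishes immediately.
result, attempts = run_stage(flaky_step, sleep=lambda seconds: None)
print(result, attempts)
```

Because every stage follows the same kind of rule, a slow or unavailable neighbor only delays work rather than breaking the pipeline.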

Two independent processes on the compute cluster at Brookhaven check every half hour for new event files and use Globus Online to transfer those they find to the scratch disk reserved at NERSC. Every two hours, a third independent process takes a snapshot of the calibration data stored at Brookhaven, which changes much less rapidly than the event data.

At NERSC, 20 eight-core virtual machines (VMs) are running the STAR analysis software. Once every 24 hours, at a fixed time chosen at random, a cron job running on each VM pulls the most recent calibration snapshot from the cluster back at Brookhaven. Then the local copy of the calibration data on each VM is replaced; since each VM initiates this process at a different time, this ensures that the VMs always have a fairly recent copy.
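One way to get "a fixed time chosen at random" is to derive the schedule from the VM's name, so that each VM picks a different time of day but keeps it from one day to the next. The sketch below generates a crontab line on that assumption; the VM names and the script path are hypothetical.

```python
import random


def daily_cron_line(vm_name: str, command: str) -> str:
    """Build a crontab entry that runs `command` once a day.

    The hour and minute are random but seeded by the VM name, so the
    same VM always gets the same slot while different VMs are spread out.
    """
    rng = random.Random(vm_name)
    hour = rng.randrange(24)
    minute = rng.randrange(60)
    # crontab fields: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


# Hypothetical script path, used only for illustration.
PULL_COMMAND = "/opt/star/pull_calibration.sh"

cron_lines = {
    vm: daily_cron_line(vm, PULL_COMMAND)
    for vm in ("vm01", "vm02", "vm03")
}
for vm, line in cron_lines.items():
    print(vm, "->", line)
```

Spreading the pulls across the day also keeps the VMs from all hitting the Brookhaven cluster at the same moment.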

Meanwhile, each VM can run eight jobs at a time to occupy all its cores. When a VM “notices” that it is running fewer than eight jobs, it requests a new raw event file from the scratch disk. (This request specifies the last valid timestamp for which the VM has calibration data; the scratch disk is then searched for a file that meets that criterion.)
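The two checks described here, "do I have a free core?" and "which files can I process with the calibration data I hold?", can be sketched as follows. The `RawFile` record, its fields, and the file names are assumptions made for illustration.

```python
from dataclasses import dataclass

CORES_PER_VM = 8  # each VM runs up to eight jobs at once


@dataclass
class RawFile:
    name: str
    timestamp: int  # hypothetical: when the events were recorded (epoch seconds)


def free_slots(running_jobs: int) -> int:
    """Number of additional jobs the VM can start right now."""
    return max(0, CORES_PER_VM - running_jobs)


def eligible(files, calib_valid_until):
    """Files recorded no later than the VM's last valid calibration timestamp."""
    return [f for f in files if f.timestamp <= calib_valid_until]


scratch = [
    RawFile("run_a.daq", 1000),
    RawFile("run_b.daq", 2000),
    RawFile("run_c.daq", 3000),
]

# A VM running six jobs, with calibration data valid up to t = 2500.
slots = free_slots(running_jobs=6)
candidates = eligible(scratch, calib_valid_until=2500)[:slots]
print(slots, [f.name for f in candidates])
```

Files newer than the VM's calibration snapshot simply wait until a VM with fresher calibration data asks for them.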

“The main challenge was to preserve independent, unsupervised raw file reconstruction on different VMs without processing the same file multiple times,” Balewski said.

They did this by using an atomic rename operation, which renames a selected raw event file and passes the new name to the VM that requested a new file. If multiple VMs try to claim the same file at the same time, only one of the atomic renames will succeed. The remaining VMs will continue to request files periodically until their request succeeds; the result is that eventually, either all of the VMs will be analyzing data on all cores or the pool of events on the scratch disk will be empty.
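The claiming trick can be illustrated with an ordinary file rename. This is a minimal sketch rather than the team's implementation: the naming scheme and function name are invented here, and it assumes a POSIX-style filesystem where a rename within one filesystem is atomic (shared or network filesystems may behave differently).

```python
import os
import tempfile
from pathlib import Path


def try_claim(raw_file: Path, vm_id: str):
    """Try to claim `raw_file` for `vm_id` by renaming it.

    Returns the new path on success, or None if another VM got there
    first (the original name no longer exists).
    """
    claimed = raw_file.with_name(f"{raw_file.name}.claimed-by-{vm_id}")
    try:
        os.rename(raw_file, claimed)
    except FileNotFoundError:
        return None  # someone else already renamed it
    return claimed


# Demo: two VMs go after the same file; only the first rename succeeds.
with tempfile.TemporaryDirectory() as scratch:
    raw = Path(scratch) / "run123.daq"
    raw.write_bytes(b"")
    first = try_claim(raw, "vm01")
    second = try_claim(raw, "vm02")
    first_name = first.name

print(first_name, second)
```

The rename acts as the lock. Once the original name is gone, every later attempt fails cleanly, so no file is processed twice and no central coordinator is needed.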

The analyzed events are sent back to Brookhaven via Globus Online, where they are archived and available for researchers to access.

The result

Over the last two months, the team has expanded this system. Today, they run a coherent cluster of over 100 VMs from three Magellan resource pools – Eucalyptus at NERSC, Nimbus at ANL, and OpenStack at ANL. The total number of cores has exceeded 800, and they expect to cross the threshold of 1000 parallel jobs soon.

If everything goes according to the new timeline, the W boson results should be ready to present at conferences six months earlier than in previous years. But, as noted earlier, the benefits go much further than that.

“The immediate access to reconstructed data has a significant psychological aspect,” Balewski said. “We can discuss how many Ws we measured last week, check if they look the same as those measured two weeks ago, and conclude that the detector is stable. We can also work immediately on improving and fine-tuning the W-finding algorithm and clean up the results while data are being taken. This accelerates analysis by many months.”

Said Balewski, “Everybody wants his results to be shown at conferences as soon as possible. Using cloud computing provides new means to accomplish it.”

Miriam Boon

Fermilab releases a new version of Scientific Linux

March 3, 2011 | 10:37 am

This story first appeared in Fermilab Today on March 3.

The Linux operating system produced at Fermilab enabled the laboratory and other high-energy physics institutions to build large physics data analysis clusters using affordable, commercially available computers. The photo shows computer clusters in the laboratory's Grid Computing Center.

For more than 12 years, Fermilab has supplied thousands of individuals in the scientific community with the operating system that forms the foundation for their exploration of the universe’s secrets. The Linux operating system produced at Fermilab enabled the laboratory and other high-energy physics institutions to build large physics data analysis clusters using affordable, commercially available computers.

The newest version of Scientific Linux is now available.

Fermilab began packaging and distributing Scientific Linux in 2004 to the broad high-energy physics community. At that time, it was used on only 1,500 machines. Today, Scientific Linux is run on tens of thousands of machines and is the operating system that powers some of the world’s largest physics experiments, including some experiments at the Large Hadron Collider. The newest version, Scientific Linux 6, is put together by the Fermilab Computing Division, specifically the Fermilab Experiments Facilities Department, and by DESY, CERN and other laboratories and universities across the world.

“This version of Scientific Linux continues a tradition of technical excellence,” said Jason Allen, head of Fermilab Experiments Facilities Department in the laboratory’s Computing Division. “This product is the result of users worldwide who have contributed, tested and provided feedback for this release.”

Fermilab modifies Scientific Linux, the base product, to include security measures and other laboratory-specific elements to create Scientific Linux Fermi. The newest version, Scientific Linux Fermi 6, will be released at Fermilab later this year.

– Kimberly Myles and Edward Simmonds

Guest author
