GRIDS AND COLLABORATION AT TACC

The Texas Advanced Computing Center (TACC) has long provided researchers at The University of Texas at Austin with access to high-end computing hardware. Current high-end systems include Lonestar, a massively parallel Cray/Dell 1024-processor computer delivering more than 6 teraflops of power, and Maverick, a shared-memory Sun E25K machine with 16 graphics elements to facilitate advanced scientific visualization and data analysis, plus numerous additional computing, visualization, and storage resources. TACC has also long provided expert consultation and assistance with the computational projects of university researchers.

In the last few years, TACC has become much more than an advanced computing facility. TACC has initiated an extensive R&D program spanning high-performance computing, scientific visualization, data-intensive computing, and distributed and grid computing. "Grid computing has emerged as a technology that enables researchers to share resources, data, and knowledge in new ways, on new scales," says Dr. Jay Boisseau, Director of TACC. "TACC has emerged as one of the national leaders in grid computing through its role in the TeraGrid and its leadership of projects such as UT Grid and the GridPort and GridShell software technologies."

Boisseau also notes that a large number of discipline-specific grid efforts have emerged in recent years, and some of them are even merging into national grid efforts. To keep our community informed about these efforts, TACC recently invited Professor Paul Avery of the Physics Department of the University of Florida to give a talk about the grid efforts in which he has been involved and the general coalescence of such efforts nationally and internationally. The talk was cosponsored by the university's Physics Department, and we print the following summary/interview with Dr. Avery for the benefit of those unable to attend.

Summary of a Talk by the University of Florida's Paul Avery

"There is no better way to build a scientific collaboration than to do it from within a grid structure," says Paul Avery.

In a special seminar on May 5, jointly sponsored by TACC and the Physics Department at The University of Texas at Austin, Avery, of the University of Florida, described new grid developments that have taken place over the past several years. He has been the lead investigator behind a number of computational grid efforts in high-energy physics, and he has helped to expand the efforts to embrace new disciplines and new tasks, including education and outreach.

"Disciplinary grids began by connecting computational resources, but they have become much more than high-speed connections to shared machinery," Avery says. They unite the members of widespread collaborative efforts. In high-energy physics, a large-scale collider experiment may involve hundreds of investigators at scores of institutions. The new Large Hadron Collider (LHC) at CERN in Geneva, scheduled to come online in 2007 together with its four main collision-detecting experiments, already promises, for just one of those experiments, to involve several thousand investigators and their groups at several hundred institutions worldwide. "There is no way to deal with the magnitude, the level of data flow, analysis, visualization, discussion, and collaboration involved, in an ordinary way. Such an experiment is nearly inconceivable outside of a well-connected Grid that functions on many levels," Avery says.

Some Grid History

Avery is particularly well qualified to speak on this topic, since his own efforts in high-energy physics have led him over the past six years into a fascinating landscape of grid efforts. He is Director of the GriPhyN Project, a $12 million effort funded by NSF, begun in 2000, which involves a dozen universities, the San Diego Supercomputer Center, and three national laboratories. He also leads the International Virtual Data Grid Laboratory (iVDGL), funded at $14 million by NSF, which links eighteen universities, four national laboratories, and a large group of foreign partner institutions. The iVDGL was born in 2001. Avery is also a member of the Particle Physics Data Grid (PPDG), funded by the Department of Energy in 1999. "All of these efforts, which involve about 150 investigators, are now coordinated internally to meet broad goals," Avery explains, "and in fact they are now linked as the Trillium Grid Partnership in the United States, acting as one entity when coordinating internationally."

In addition to connecting resources, participants in Trillium have worked to enhance resource interoperability and link storage, networking, and compute/visualization hardware around the world. One of the main products is the Virtual Data Toolkit (VDT). VDT works with the NSF Middleware Initiative (NMI) releases and enables working groups to use planning and scheduling tools and execution and management tools to drive work and data flows.

An example is the Sloan Digital Sky Survey, a major effort within the astronomical community to map one-sixth (and now one-fourth) of the visible contents of the night sky, which uses tools developed in GriPhyN to calculate the size and distribution of clusters of galaxies. These distributions are then used within the National Virtual Observatory effort (an NSF Information Technology Research project) as part of another effort to create a complete mosaic of sky images at all wavelengths of the electromagnetic spectrum.

One of the main scientific drivers of the grid efforts is the LHC at CERN. The collider will begin operating in 2007 in a tunnel 27 km in circumference under France and Switzerland. Two of the major experiments around the ring are ATLAS and CMS (the "compact" muon solenoid, a smaller detector than ATLAS that is nevertheless several stories tall), both of which will detect particle collision events in their search for the origins of mass--new fundamental forces, symmetries, and particles that constitute the building blocks of the known universe. Already the research teams are collaborating, in an effort called Grid3, to simulate the collider events and determine what percentage of the events need to be inspected closely to find new physics.

Grid Infrastructure
"It would be impossible," says Paul Avery, "to link large-scale scientific efforts worldwide without this kind of grid infrastructure."

What Avery finds amazing is that the very process of building such a grid, involving all of the stakeholders in an area of science, seems to be the best--perhaps the only--way to incorporate everything from the concerns of individual investigators in remote places to the need to educate students at all levels to the broadest cross-collaborative management issues.

Open Science Grid

The scientific concerns of teams at the various Grid3 institutions have become very broad, even including biological research. This began with biomolecular analyses based on crystallographic research done with the synchrotron radiation from particle accelerator beams, and it now extends to the genomic analyses devolving from the biomolecular determinations. For these reasons, Avery and the Grid3 leadership are transforming their effort into a new one, which they expect to launch before summer 2005, called the Open Science Grid (OSG).

"We've come to a number of general conclusions that we think will make OSG a very successful effort," Avery says. "One is that grids cannot come up and down at the whim of particular disciplines or experiments; we need a persistent, viable, and general infrastructure to support these kinds of research.

"Another is that an OSG must operate as a facility, able to supply differing research groups with resources as needed and able to redirect resources to high-priority projects. This means a standard set of tools, services, error recovery procedures, documentation, and, most important, the human resources behind such a facility--all of which have been developed in the efforts to date."

Avery adds that testbeds play a vital role. "We need to try out new tools and services on mini-versions of our grids before we risk bringing down the entire enterprise owing to problems we could have anticipated," he says. "Finally, we need to be adaptable, able to take on new data-intensive projects arising from new scientific discoveries. We want to take testbed-tested and validated tools and applications and scale them to the sizes needed for cutting-edge research."

Open Science Grid Structure
The Open Science Grid structure emphasizes education and outreach via internships and summer schools like the one held at South Padre Island, Texas, in 2004. Another is planned for the summer of 2005.

A National and International Research Infrastructure

"Grid and advanced networking projects in North and South America, Europe and Asia are providing new capabilities that will allow researchers in remote or poor regions to participate effectively in leading edge international research," Avery says. He helped to organize a conference on Grids and the Digital Divide held in Rio de Janeiro in February 2004, and a second workshop that took place in Daegu, Korea, at the end of May. "Grids have a dual character: they distribute, but they also centralize," he notes. "By virtue of these two directions of activity, the members of grid collaborations are able to pull in the have-nots--the students of poorly connected regions--to skip generations of development and become full participants in worldwide research activity."

Avery spent part of his visit to the university discussing ways in which university scientists can connect to grid efforts. Avery is looking forward to working with TACC to share expertise in grid computing R&D. "TACC has a large, talented Distributed and Grid Computing group," he says, "with successes in deploying TeraGrid and UT Grid and a portfolio of interesting R&D projects." These include work on portals, interfaces, information services, and scheduling prediction services. "All of this would be valuable to OSG, along with TACC's experience in providing very large scale resources to computational researchers." Finally, Avery noted that, as a member of TeraGrid and potentially of OSG, TACC could help in efforts to bridge these two large-scale grids, while its UT Grid technologies and experiences, integrated with OSG, might provide a model for university researchers at other campuses.

--Merry Maisel

Research Feature - June 8, 2005