EMERGING LIFE SCIENCES I
Chair: Barbara Horner-Miller, Arctic Region Supercomputing Center
Time: 10:30-Noon
Room A207/209
Applications in Computational
Biology and Computational Chemistry: Similarities and Differences
Eamonn O'Toole, Compaq
Computers
Computation is rapidly
assuming a central role in biology. Such recent feats as the completion
of the draft human genomes by Celera Genomics and the International
Human Genome Sequencing Consortium and their annotation would not
have been possible without significant computational resources.
Some of the largest computers in private and public hands are now
devoted to biological problems. Chemists have made extensive use
of computation for considerably longer than biologists, and computational
chemists are responsible for some of the largest and most important
scientific applications in current use.
Chemists and biologists
tackle many problems that are closely related. In addition, some
large computer installations are shared between chemists and biologists.
We will outline some of our experiences benchmarking and analyzing
performance of a number of applications in computational chemistry
and biology. We are particularly interested in similarities and
differences between these applications, their behavior and the stresses
that they place on systems. This work is part of an on-going program
to better understand these applications.
Computing Requirements
for the Bioinformatics Revolution: Genomics, Proteomics and Cellular
Machinery
Bill Camp, Sandia National
Laboratory
Biology and medicine
are entering a new age dominated by high-throughput experimentation
and computation. How quickly the door to that future is opening
is signalled by the staggering progress in developing a map of the
human genome as well as those of several other species. Those efforts
have required unprecedented increases in experimental output made
possible in part by the development of shotgun sequencing and
the subsequent automation of DNA sequencing. The key to using these
high throughput methods has been the ability to reconstruct the
entire sequence from a redundant set of small fragment sequences.
This assembly is in turn enabled by the introduction of extreme
computing into biology by Celera Genomics and its attendant use
by both Celera and the Public Genome Project. This first stage of
genomic biology utilized terascale computing for the first time
in biology. It was dominated by informatics: the searching,
comparison and manipulation of large alphanumeric datasets. Little
or no floating point computation was required; and the work was
essentially embarrassingly parallel and dominated by unusually intense
data I/O. On-going refinements of the genomics computing methodologies
include reducing I/O requirements by replacing I/O with parallel
computing techniques.
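The reconstruction step described above, rebuilding a sequence from redundant overlapping fragments, can be illustrated with a toy greedy overlap merger. This is only a minimal sketch: the function names, the minimum overlap length and the example fragments are illustrative assumptions, and production assemblers (including those used for the human genome) rely on far more elaborate methods that tolerate sequencing errors and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest suffix/prefix overlap."""
    # Drop exact duplicates and fragments fully contained in another fragment.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return frags  # no remaining overlaps: these are the assembled contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)


# Hypothetical reads covering one short sequence
contigs = greedy_assemble(["ATGCGTAC", "GTACGTTA", "CGTTAGC"])
```

The all-pairs overlap search is quadratic in the number of fragments. At genome scale, comparisons of this kind dominate the workload, which is consistent with the abstract's point that the work is integer- and I/O-heavy rather than floating-point heavy.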
In the next stage, early
stage proteomics, the calculations will be even larger than those
required for genomics. However, they still will have a strong informatics
flavor. Since the calculations will be larger, there will be even
more pressure on I/O systems which will in turn drive additional
emphasis on true parallel computing methods.
Later stages in proteomics
which include elucidation of protein structure and function and
protein-protein interactions will see an orders-of-magnitude
increase in computing requirements with increased emphasis on floating-point
operations. The long term goals for computation in biology and medicine
include gene-based species modification (e.g. for agriculture or
environmental remediation) and intervention strategies for medicine
and counter-terrorism. In all cases, being able to understand protein
structure and function will be a critical step. Further refinement
will involve understanding signalling and metabolic pathways, enzymatic
catalysis, protein target identification, as well as the design of
interventions. These longer-term goals pose computing challenges
which are floating-point intensive and, by current estimates, will
involve sustained, affordable petascale processing.
Finally, possibly the
most exotic challenge posed by biology is that of simulating the
biological cell at a system level. This involves far more than adapting
the simulation methods of physics, chemistry and engineering to
biological systems, although that will be a critical component
of the solution. The cell is so much more complex than anything
we have attempted to simulate in the past; and so many of the underlying
processes will remain under-characterized that radically new simulation
methodologies will be necessitated: for example, our inability to
characterize underlying details (e.g. a myriad of reaction rates)
may require inherently non-deterministic simulation methods. Of
course, a detailed atomistic simulation of even the simplest cell
is not only computationally infeasible for the foreseeable future
but also would be so complex as to challenge interpretability.
I will discuss architectural
strategies for meeting the computing challenges of the revolution
in biological sciences, including scaling needs for computation,
I/O and networking.
EMERGING LIFE SCIENCES II
Chair: Ty Rabe, Compaq Computer Corporation
Time: 1:30-3:00 PM
Room A207/209
The Computational
Impact of Genomics on Biotechnology R&D
Scooter Morris, Genentech
The excitement surrounding
the completion of the first draft sequence of the Human Genome has created
significant interest in genomics, the study of entire genomes, and has
greatly influenced how the pharmaceutical industry discovers and characterizes
new medicinal compounds. Two major areas of impact are the way in which
we approach the search for new proteins of interest and the computational
tools that we use to perform that search. Increasingly, the search for
new proteins or lead compounds has moved from the laboratory to the
computer. This migration has not been sudden, and is not a direct result
of the completion of the Human Genome sequencing effort, but rather
is a result of the increasing capacity and performance of modern computing
systems, and the incredible increase in the amount of available sequence
data. This talk will discuss both of these evolutions, and conclude
with a survey of the current computing systems and architectures that
are used in the biotech industry for primary sequence discovery and
other research activities.
New Wine in Old Bottles:
The Use of Vector Processors and Fine-Grained Parallelism in Genomic
Analysis
Stanley K. Burt, Advanced Biomedical Computing Center, National Cancer
Institute
[Authors: Jack Collins, Robert Stephens, and Stanley K. Burt]
The common assumption is
that biological data analysis problems are suitable for parallel computing,
particularly by cluster computing. However, certain problems, in which
very large amounts of data are involved, can be approached by other
computer techniques. We demonstrate in a new methodology that certain
techniques used for cryptography can be useful for pattern recognition
in biological research, such as finding tandem repeats in DNA sequences.
This new method takes advantage of special hardware capabilities of
the Cray computer architecture (vector registers, large shared memory,
and fine-grained parallelism) and gains additional speedup from sequence
compression.
The identification of simple
tandem repeats within DNA sequences is an extremely powerful tool for
exploring genomes. These specific repeat elements (or microsatellites)
are frequently polymorphic and thus can be used for many purposes, ranging
from diagnostic primers that increase the marker density surrounding
regions of interest to gene mapping and forensic science. We report
here the development of a new, extremely rapid tandem repeat finder
that exhaustively determines all possible repetitive elements up to
16 bases in length.
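For readers unfamiliar with the problem, the following minimal sketch shows what exhaustively finding simple tandem repeats with units of up to 16 bases means. It is a plain scalar scan written for clarity, not the authors' vectorized, compression-based Cray method. The function names, the `min_copies` threshold and the example sequence are illustrative assumptions.

```python
def is_primitive(unit: str) -> bool:
    """True if `unit` is not itself a repetition of a shorter string (e.g. 'CACA' is not)."""
    n = len(unit)
    return not any(n % d == 0 and unit[:d] * (n // d) == unit for d in range(1, n))


def find_tandem_repeats(seq: str, max_unit: int = 16, min_copies: int = 3):
    """Return (start, unit, copies) for each maximal run of a tandemly repeated unit."""
    hits = []
    n = len(seq)
    for p in range(1, max_unit + 1):  # p is the candidate unit length
        i = 0
        while i + p < n:
            # Extend the run while the sequence matches itself shifted by p bases.
            j = i
            while j + p < n and seq[j] == seq[j + p]:
                j += 1
            run_length = j - i + p  # bases covered by the region with period p
            copies = run_length // p
            unit = seq[i:i + p]
            if copies >= min_copies and is_primitive(unit):
                hits.append((i, unit, copies))
            i = j + 1
    return hits


# Hypothetical example: a (CA)4 microsatellite embedded in flanking sequence
repeats = find_tandem_repeats("GGCACACACATT")
```

The scan compares the sequence with a copy of itself shifted by each unit length, so the work grows with sequence length times the maximum unit length. Comparisons of this regular, data-parallel kind are the sort of operation that vector hardware handles well.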
We describe and demonstrate
the utility of the method in the identification of simple tandem repeats
within the entire human genome. By focusing on known coding regions,
we find many repeats, possibly linked to diseases, previously not described.
The data has been assembled into a relational database that is web-accessible
and allows searching for elements based on genomic region or gene. Beyond
this particular application, this methodology will allow analyses that
previously were beyond current computational capabilities.
EMERGING LIFE SCIENCES III
Chair: Ellen Roder, Cray, Inc.
Time: 3:30-5:00 PM
Room A207/209
Computing
Challenges for Structure-based Drug Design on a Genomic Scale
Tod M. Klingler, Structural GenomiX
At Structural GenomiX we are integrating experimental approaches
for protein structure determination with computational modeling
methods, including comparative modeling, ab initio prediction and
molecular dynamics, to produce the most comprehensive and accurate
view of protein structure space. Using this view of protein structure
space as a starting point, large-scale structure-based drug design
will be used to greatly improve the drug development process. Computational
techniques for docking chemical structures to protein structures
are the core of this platform. The required algorithms for protein
modeling and chemical docking are compute-intensive and often require
specific tuning. In this talk I will describe several of these computational
approaches, their integration, and some of the automation and high-throughput
computing challenges we are facing in developing this new platform
for drug discovery.
National Digital Mammography
Archive
Robert Hollebeek, University of Pennsylvania
The National Digital Mammography Archive is funded by the National
Library of Medicine to design and implement a secure digital archive for mammography and
associated reports using Next Generation Internet technologies, including high
bandwidth optical networks, quality of service, scalable systems, and scalable applications.
Images and reports will be rapidly available wherever needed for medical or
educational purposes thus improving screening, diagnosis and ultimately,
patient care. Researchers from the Universities of Pennsylvania, Chicago,
North Carolina and Toronto, team with advanced computing groups from the
University of Pennsylvania (NSCP) and BWXY (Oak Ridge ACT),
to develop integrated systems for high-speed networking, distributed archiving,
and secure applications. The talk will demonstrate how images and patient data
can be securely moved between hospitals and an archive and how the applications,
including computer assisted diagnosis (CAD), data mining, and teacher
training collections, could be used for clinical and research purposes.
WEDNESDAY, NOVEMBER 14
HPC COMPUTING INFRASTRUCTURE I
Chair: Will Murray, CISCO
Time: 10:30-Noon
Room A207/209
StarLight:
Optical Switching for the Global Grid
Tom DeFanti, University of Illinois
STAR TAP, a persistent infrastructure to facilitate the long-term
interconnection and interoperability of advanced international networking,
has demonstrated the importance of providing advanced digital communication
services to a worldwide scientific research community. However,
there are clear indications that 21st-century grid-intensive "e-Science"
applications will require a networked "cyber-infrastructure"
and set of services that are more sophisticated, with much higher
capacity potential and substantially higher performance. The University
of Illinois at Chicago, in collaboration with Northwestern University
and Argonne National Laboratory, and in partnership with CANARIE
(Canada) and SURFnet (Holland), is now creating the Optical STAR
TAP, or StarLight (www.startap.net/starlight).
StarLight is an advanced optical infrastructure and proving ground
for network services optimized for high-performance applications.
The StarLight facility, operational in the summer of 2001, is located
on Northwestern University's downtown campus at 710 N. Lake Shore
Drive in Chicago. StarLight provides the applications-centric network
research community with a Chicago-based co-location facility with
enough space, power, air conditioning and fiber to engage in next-generation
optical network and application research and development activities.
StarLight's architecture is designed to be distributable among opportune
carrier points of presence, university campuses, and carrier meet
points. And, because optical networks allow for a far greater degree
of network configuration flexibility than existing networks, StarLight
will provide the required tools and techniques for (university and
government laboratory) customer-controlled 10 Gigabit network flows
to be switched and routed to research networks and commercial networks,
empowering applications to dynamically adjust and optimize network
resources. StarLight welcomes the academic and commercial communities
to work with us to create a global proving ground in support of
grid-intensive e-Science applications, network performance measurement
and analysis, and computing and networking technology evaluations.
Evolution of Supercomputing Networks- from Kilobits to Terabits
Charlie Catlett, Argonne National Laboratory
Just over 15 years ago the National Science Foundation created
NSFNET, a 56 Kb/s backbone network that connected a half dozen supercomputer
centers. Ten years ago, the US Gigabit Testbeds Initiative was unveiling
prototype networks running at between 600 Mb/s and 1.2 Gb/s, with
the most interesting application. Within five years, supercomputer
centers were connected at 155 Mb/s with networks such as vBNS and
ESnet. Today there are backbone networks running at 2.5 Gb/s with
many talking of upgrading to 10 Gb/s. Catlett will talk about two
projects attempting to push beyond the 10 Gb/s barrier. First is
NSF's [proposed] Distributed Terascale Facility (DTF) interconnect,
which involves a partnership between Qwest Communications, Argonne
National Laboratory, NCSA, Caltech, SDSC, and the Internet2 project.
The [proposed] DTF interconnect will couple the four high performance
computing centers at 40 Gb/s in early 2002. Second is the State
of Illinois "Illinois Wired/Wireless Infrastructure for Research
and Education," or I-WIRE. I-WIRE is an optical network that
provides both dark fiber and lambda services between six institutions
in Illinois (including Argonne and NCSA), providing optical connectivity
for the Starlight project as well as connectivity to multiple carrier
exchange points in Chicago.
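To put the link speeds quoted above in perspective, the short sketch below computes how long an ideal transfer would take at each rate. The 1-terabyte dataset is a hypothetical example, and the calculation assumes the full line rate is available with no protocol overhead, so real transfers would take longer.

```python
# Link rates quoted in the abstract, in bits per second
RATES_BITS_PER_S = {
    "NSFNET backbone (56 Kb/s)": 56e3,
    "vBNS / ESnet (155 Mb/s)": 155e6,
    "Gigabit testbeds (1.2 Gb/s)": 1.2e9,
    "Current backbones (2.5 Gb/s)": 2.5e9,
    "Planned upgrades (10 Gb/s)": 10e9,
    "Proposed DTF interconnect (40 Gb/s)": 40e9,
}


def transfer_seconds(num_bytes: float, bits_per_second: float) -> float:
    """Ideal transfer time: payload size in bits divided by the line rate."""
    return num_bytes * 8 / bits_per_second


if __name__ == "__main__":
    dataset_bytes = 1e12  # hypothetical 1 TB dataset (assumption, not from the abstract)
    for name, rate in RATES_BITS_PER_S.items():
        hours = transfer_seconds(dataset_bytes, rate) / 3600
        print(f"{name}: {hours:,.2f} hours")
```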
HPC COMPUTING INFRASTRUCTURE II
Chair: David Culler, University of California, Berkeley
Time: 1:30-3:00 PM
Room A207/209
Bringing I/O Scalability
and Availability to Linux and AIX Clusters
Lyle Gayne, IBM
As parallel and large scale computing has moved from specialized "Supercomputers"
(with their traditionally exotic technology and commensurate price)
to more flexible, cost-effective cluster environments, the scaling
of practical compute capability has created demands for comparable
scalability of I/O performance and capacity. The aggregation of
large numbers of not completely reliable modular components (be
they processors, network or disk) has simultaneously forced software
failure survivability into the same domain. A reasonable degree
of success in these two domains uncovers and forces further issues
to the fore. This technical presentation will discuss IBM's efforts
to meet this evolving set of Cluster I/O challenges in both Linux
and AIX cluster environments, highlighting the technical issues,
progress to date, and the still impending challenges.
Bringing Linux Clusters into the Enterprise
Jamshed Mirza, IBM
Linux is rapidly making inroads today in its traditional areas of strength: Appliances,
Web Serving, and High Performance Computing. But Linux also has the potential
to be a key technology for the next generation of e-business - a potential
that will only be reached if real and perceived limitations, technical
and otherwise, in Linux and Linux Clusters today are removed. To that
end, IBM and others are working with the Linux community to make Linux
and Linux clusters more enterprise-capable, and are working with customers
and ISVs to encourage its wider use within the enterprise. This talk will
discuss the potentially strategic importance of Linux, position its capability
today relative to other mainstream Unix solutions for HPC, and investigate
possible scenarios of its evolution over time that will determine its
ability to attain its potential as a strategic e-business technology.
TIME MIGRATION IN THE OIL INDUSTRY
Chair: Ray Paden, IBM
Time: 3:30-5:00 PM
Scalability Analysis of Distributed 3D Prestack Time Migration
Kevin Hellman, Aliant Geophysical
3D prestack time migration is a seismic imaging application which is
well suited for parallel computation in distributed memory clustered
computer environments. The basic algorithm involves data aggregation
at a volume of output locations, using (potentially) all of the input
data at each of these output locations. Parallelization may be designed
in either output or input domains. Since the majority of the processing
time is spent in the summation kernel, time migration is often thought
of as "embarrassingly" parallel, and not much importance is
attached to the parallelization scheme. For seismic surveys of actual
exploration size, however, the details of parallelization can have a
dramatic impact on the scalability, and hence the runtime, of prestack
time migration. Simple timing models for three common approaches to
parallelization will be introduced which characterize the total throughput
time and parallel efficiency of the process with respect to machine
size, cpu performance, and speed of data movement. The turnaround time
of production sized jobs turns out to be highly dependent on the choice
of parallel algorithm, and the choice itself will change with the parameters
of the project.
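The abstract does not give the timing models themselves. The sketch below shows the general shape such a model might take for the two decompositions mentioned above (output domain and input domain). Every formula and parameter here is an illustrative assumption, not the speaker's model: compute time is taken to scale perfectly with node count, output-domain parallelism is assumed to stream the full input past every node, and input-domain parallelism is assumed to end with a tree reduction of full-size output volumes.

```python
import math


def output_domain_time(work, nodes, flops, input_bytes, bandwidth):
    """Each node owns part of the output volume, but all input data must reach every node."""
    compute = work / (nodes * flops)
    data_movement = input_bytes / bandwidth  # assumed: does not shrink as nodes are added
    return compute + data_movement


def input_domain_time(work, nodes, flops, input_bytes, output_bytes, bandwidth):
    """Each node migrates part of the input into a full output copy, then copies are summed."""
    compute = work / (nodes * flops)
    read = input_bytes / (nodes * bandwidth)
    # Assumed tree reduction: ceil(log2(nodes)) exchanges of a full output volume
    reduction = output_bytes * math.ceil(math.log2(nodes)) / bandwidth if nodes > 1 else 0.0
    return compute + read + reduction


def parallel_efficiency(time_one_node, time_n_nodes, nodes):
    """Speedup divided by node count."""
    return time_one_node / (nodes * time_n_nodes)


# Hypothetical job: 1e12 operations, 1 GB of input, 1e9 ops/s per node, 100 MB/s data movement
t1 = output_domain_time(1e12, 1, 1e9, 1e9, 1e8)
t64 = output_domain_time(1e12, 64, 1e9, 1e9, 1e8)
eff64 = parallel_efficiency(t1, t64, 64)
```

Even in this crude form, the model reproduces the qualitative point of the abstract: once compute time has been divided across enough nodes, the data-movement term dominates, and which decomposition wins depends on the ratio of input size to output size and on the machine parameters.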
Computational Elements, Requirements and Tradeoffs for Imaging
Normal-Incidence Seismic Data
Jim McClean, PGS Research
[Authors: Jim McClean and Steve Kelly, PGS Research]
Exploration seismic recordings are often processed to simulate an
experiment in which the source and receiver are coincident at the same surface
location. We outline an algorithm for imaging preprocessed recordings
of this type using an approximate form of the scalar wave equation.
This outline will consist of a description of the various
approximations used to reduce the computational cost while retaining acceptable
accuracy.
In general, huge datasets are handled with this method. Additional
constraints include available disk and memory capacities, I/O speed
and the underlying computational requirements of the algorithm. We
discuss the impact of these constraints upon our processing methodology.
We also comment on the style of parallelization that is most effective
for the algorithm, as well as its scalability.
THURSDAY, NOVEMBER 15
HPC IN ENTERTAINMENT
Chair: Steve Briggs, Compaq Computer Corporation
Time: 10:30-12:00
Room A207/209
Computational Challenges
in Computer Animation at Blue Sky Studios
John Turner, Blue Sky Studios
Since its inception in 1987,
Blue Sky Studios has used ray tracing for virtually all the images
it has produced, and it remains the only computer animation studio
to use this computationally intensive technique so extensively in
production. When one considers the hardware available to a small studio
in 1987, it's understandable that many in the industry questioned
Blue Sky's approach. However, the principal architects of the original
system, Carl Ludwig and Eugene Troubetskoy, believed from the outset
that ray tracing produces the best images and that "compute-intensive"
is better than "human-intensive." Indeed, the drive toward
more complexity and photorealism has made it increasingly difficult
to obtain the desired results using non physically-based techniques
without inordinate human effort. An example is soft shadows, which
are achieved naturally with ray tracing but require special techniques
with scanline, first-surface approaches.
Carl has been known to
say that "at Blue Sky we write software for the computers of
tomorrow." While that was certainly true in 1987, advances
in computer hardware have brought tomorrow closer than ever.
In addition to our renderer,
other computationally-intensive aspects of computer animation, such
as fluid dynamics and cloth simulation will also be discussed.
Computational Challenges
in Creating Volume Rendered Galactic Animations
Jon Genetti, University of Alaska Fairbanks/Arctic Region Supercomputing
Center
The San Diego Supercomputer
Center collaborated with the American Museum of Natural History
to produce a visualization of the Orion Nebula for the new Hayden
Planetarium. During the Space Show, viewers are transported 1500
light years to the heart of the nebula on a 67-foot digital dome
consisting of 9 million pixels. This 2 1/2 minute animation required
over 31,000 1280x1024 images and was rendered on SDSC's IBM SP using
over 900 processors during a single 12-hour period. Under a less
demanding time schedule, a second version was produced for high-resolution
flat displays and was shown in the Electronic Theater at Siggraph
2000. This sequence consisted of 4500 6400x3072 images and was rendered
on SDSC's SUN E10000s using backfill CPU cycles over a 4 month period.
In this presentation,
I will give an overview of the new Hayden Planetarium, the Orion
fly-thru development process, the importance of using HPC resources,
the tradeoffs/compromises made during development and the rationale
for the final modeling/rendering decisions. I will also give a preview
of the next collaboration that plans to generate and render a time-varying
volume dataset that will be several terabytes and require state-of-the-art
data handling and computation resources.
SCXY AS A MASTERWORK
Chair: Barbara Kucera, National Center for Supercomputing Applications
Time: 1:30-3:00 PM
Room A207/209
SC Global: Celebrating
Global Science
Ian Foster, Argonne National Laboratory & The University of Chicago
Science and engineering are evolving into increasingly collaborative,
distributed, multi-institutional, and often international activities.
The technologies that we use to practice science and engineering,
to communicate research advances, and to discuss future directions
must evolve also. The SC Global event at SC'2001 celebrates and
showcases this parallel evolution of work and technology. On the
one hand, it represents a technical tour de force, with advanced
collaboration and networking technologies used to link hundreds
of people at tens of sites on six continents; on the other, it incorporates
technical sessions that communicate recent progress on some of the
most interesting collaborative and international science projects
currently in progress. In this talk, I both explain the technologies
that underlie the SC Global event and review the technical goals
and current status of some of the projects presented on SC Global,
including GriPhyN and NEESgrid.
SCinet: The Annual Convergence
of Advanced Networking and High Performance Computing
Steve Corbato, Backbone Network Infrastructure
For a period of approximately six months each year, a dedicated
team of network architects, engineers, and fiber experts, drawn
from the leading national research centers and the national research
and education networks, reconvenes to design, build, and operate
SCinet. This state-of-the-art network supports both the advanced
demonstration and varied general connectivity needs at each SCxy.
While this network quickly springs into existence and then carries
hundreds of Terabytes of data over its short lifetime of less than
a week, it has come to symbolize the increasing convergence of advanced
networking and high performance computing, as evidenced, for
one, by the recent TeraGrid design.
Over the years, SCinet has grown to include significant efforts
in establishing high-performance wide area connectivity, deploying
both fiber-based and wireless networks throughout the venue, probing
and characterizing network performance, enabling the Bandwidth Challenge
for innovative applications, and demonstrating bleeding-edge network
technologies through Xnet. In this presentation, I will provide
a glimpse into the truly collaborative and often Herculean process
that creates this network and will highlight several implications
for the evolving field of distributed high-performance computing.
FRIDAY, NOVEMBER 16
VIRTUAL PRODUCT DEVELOPMENT WITH CAE I
Chair: Ed Turkel, Compaq Computer Corporation
Time: 8:30-10:00 AM
Room A102/104/106
Modeling Approaches
in FLUENT for the Solution of Industrial CFD Applications on High-Performance
Computing Systems
Tom Tysinger, Fluent
FLUENT is a widely used
commercial software package for modeling fluid flow and heat transfer
in complex geometries. It is capable of solving flows in both the
incompressible and compressible regimes. FLUENT is used by engineering
analysts and designers to reduce design time, improve product quality
and optimize performance. The solution of real-world fluid flow problems
requires both large memory and lengthy computation times. This has
driven the implementation of FLUENT on parallel computers, reducing
the turnaround time from days to hours, or from hours to minutes,
and allowing larger and more complex problems to be modeled with greater
fidelity. This presentation will describe some of the challenges involved
in making a variety of diverse physical models and flow solvers perform
efficiently on contemporary HPC architectures.
High Performance Simulation
and Visualization in Engineering Systems
Kamal Jaffrey, Delta Search Labs
[Authors: Ahmed Ghoniem and Kamal Jaffrey, MIT and Delta Search
Labs]
The increasing complexity
of products and the systems they comprise makes traditional design,
development and testing difficult. High performance simulation (HPS) of
engineering designs, in which complex physics, chemistry and dynamics
interact over a wide range of length and time scales and contribute
equally to the performance of the system, is becoming possible thanks
to recent advances in massively parallel and cluster computing,
immersive visualization, and numerical algorithms. Among the many HPS
"Grand Challenge" applications, the so-called multi-scale, multi-physics
phenomena, combustion has received much attention due to its
critical role in many applications including power generation, transportation
and propulsion. Combustion simulation, where computational fluid dynamics
methods must be further refined to capture the fine scales of multi-species
transport, and extended to incorporate chemical reactions,
is extremely demanding; it has been estimated that simulating an IC
engine operation accurately requires many days on a 50+ teraflop machine!
The conflicting demands on a combustion system, including high efficiency,
safety and stability, high power density and extremely low emissions,
make simulation-based optimization over a range of conditions necessary
and very attractive to designers. Progress in simulation approaches,
including grid-free and Lagrangian methods, adaptive, moving and multigrid
methods, hybrid Eulerian-Lagrangian methods, fast chemistry
reduction algorithms, etc., is bringing this goal closer. The talk
will review progress in the field and summarize many of the remaining
challenges.
VIRTUAL PRODUCT DEVELOPMENT WITH CAE II
Chair: Mary Kay Bunde, Etnus Corporation
Time: 10:30-Noon
Room A102/104/106
Accuracy and Precision of Distributed Memory
Crash Simulation
Clemens-August Thole, GMD National Research Center for Information Technology
[Authors: Clemens-August Thole, Juergen Bendisch, Otto Kolp, Mei
Liquan, Hartmuth von Trotha, Fraunhofer Institute for Algorithms
and Scientific Computing]
Numerical crash simulation on parallel machines shows non-deterministic
results for certain car models. For a specific BMW car model,
the node positions of the crashed model may differ by
up to 14 cm between several executions on a parallel machine
for the same input deck.
Detailed investigations have shown that these effects are not
a result of the parallel execution or a wrong implementation.
The indeterminism of the parallel execution is a direct result
of numerical bifurcations. These bifurcations are caused either
by the numerical algorithms or are a feature of the car design.
In the case of the BMW car model, the scatter of the simulation
results is a direct consequence of buckling of the motor carrier.
A slight redesign of this motor carrier resulted in stable simulation
results for parallel machines.
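The mechanism described above, a bifurcation amplifying tiny run-to-run differences into visibly different outcomes, can be illustrated with a toy model. The sketch below is not crash-simulation code. It integrates a one-variable pitchfork system, a standard textbook stand-in for buckling, from two starting values that differ only at the level of floating-point rounding. The perturbation size, the parameters and the function name are illustrative assumptions.

```python
def settle(x0: float, r: float = 1.0, dt: float = 0.1, steps: int = 1000) -> float:
    """Explicit Euler integration of dx/dt = r*x - x**3, a pitchfork (buckling-like) model.

    For r > 0 the state x = 0 is unstable, and the system settles at +sqrt(r) or -sqrt(r)
    depending only on the sign of the initial perturbation.
    """
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x ** 3)
    return x


# Two "runs" whose initial states differ by about one rounding error. In a parallel code,
# a different summation order between executions could produce a difference of this size.
buckled_left = settle(-1e-16)
buckled_right = settle(+1e-16)
```

The two runs end near -1 and +1 respectively, even though their inputs agree to sixteen digits. This matches the abstract's conclusion: the scatter reflects a physical instability in the design rather than a fault in the parallel implementation, and removing the instability (here, choosing r below zero) makes the outcome insensitive to rounding.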
Systems and Software Technology for Automotive
Crash Simulation
Ed Turkel, Compaq Computers
Jean-Pierre Bobineau, Radioss Consulting Corporation
[Authors: Francis Arnaudeau, Eric Lequiniou, MECALOG SARL; Ed Turkel,
Compaq Computer Corporation; Martin Walker, Compaq Computer EMEA;
Jean-Pierre Bobineau, Radioss Consulting Corporation]
As a result of increasing consumer and regulatory demands, the
automotive industry is investing heavily in simulation technology
to improve the crash-worthiness of their vehicles. Recent studies
have shown that crash simulation is the single largest consumer
of system resources in the engineering computing centers of the
major automotive manufacturers worldwide.
The trends in the use of crash simulation include:
Increasing the accuracy of simulation by increasing the amount
of detail in vehicle models while increasing the model's resolution,
resulting in much larger models.
Increasing demand for more use of crash simulation to facilitate
vehicle design decisions, including the use of statistical techniques
to optimize vehicle designs.
Increasing pressure to reduce the cost of vehicle development,
resulting in greater use of simulation in virtual prototyping,
while also raising the visibility of IT costs, putting pressure
to lower the cost of computing.
The result of these trends is increasing use and size of
simulations, together with pressure to reduce their cost. MECALOG
and COMPAQ are collaborating to develop crash simulation solutions
that utilize parallel processing on distributed memory systems to
provide highly scalable and accurate simulations, while driving
the cost of simulation down. The authors will discuss the parallel
processing technology used in MECALOG's RADIOSS-Crash combined with
the systems technology in COMPAQ's distributed-memory/clustered
Tru64 UNIX and LINUX-based Alpha systems, that enable significant
improvements in simulation performance and accuracy while driving
the cost of simulation down.