Highlights - November 2008

TOP Highlights

  • The slightly enhanced Roadrunner system, which broke the petaflop/s barrier last June, held on to its No. 1 spot. It is still one of the most energy-efficient systems on the TOP500.
  • Seven U.S. DOE systems dominate the TOP10.
  • Intel dominates the high-end processor market with 75.8 percent of all systems and 87.5 percent of quad-core based systems.
  • Quad-core processors are used in 67 percent of the systems. Their use accelerates performance growth at all levels.
  • The most powerful system outside the U.S. is the Chinese-built Dawning 5000A at the Shanghai Supercomputer Center. It is the largest system which can be operated with Windows HPC 2008.
  • Hewlett-Packard wrested the lead in market share by total systems from IBM, but IBM still stays ahead by overall installed performance.
  • Cray’s XT system series is very popular with big customers: 10 systems in the TOP50 (20 percent).

Power consumption of supercomputers

  • TOP500 now tracks the actual power consumption of supercomputers in a consistent fashion.
  • The most energy-efficient supercomputers are based on:
    • IBM QS22 Cell processor blades (up to 536 Mflop/Watt),
    • IBM BlueGene/P systems (up to 372 Mflop/Watt).
  • Intel Harpertown quad-core blades are catching up fast:
    • SGI Altix ICE 8200EX Xeon quad-core nodes (up to 240 Mflop/Watt),
    • Hewlett-Packard Cluster Platform 3000 BL2x220 (up to 227 Mflop/Watt),
    • IBM BladeCenter HS21 (up to 265 Mflop/Watt).
  • Cray’s XT4/5 systems with AMD quad-cores are in the same region (up to 232 Mflop/Watt).
  • These quad-core systems are already ahead of BlueGene/L (up to 210 Mflop/Watt).
  • Average power consumption of a TOP10 system is 2.48 MWatt and average power efficiency is 228 Mflop/Watt.
  • Only 14 systems on the list are confirmed to use more than 1 MWatt of power.
  • Average power consumption of a TOP50 system is 1.08 MWatt and average power efficiency is 193 Mflop/Watt.
  • Average power consumption of a TOP500 system is 358 kWatt and average power efficiency is 132 Mflop/Watt.
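
As a rough illustration of the efficiency metric quoted above, the following Python sketch divides a Linpack result by power draw to get Mflop/s per watt. The function name and the sample system are assumptions made for illustration only. The list does not say how its averages were formed (a per-system mean or an aggregate ratio), so this should not be read as the TOP500 methodology.

```python
def mflops_per_watt(rmax_tflops: float, power_kw: float) -> float:
    """Return Linpack performance per watt, in Mflop/s per watt.

    rmax_tflops: measured Linpack performance in Tflop/s
    power_kw:    measured power consumption in kW
    """
    rmax_mflops = rmax_tflops * 1e6  # 1 Tflop/s = 10^6 Mflop/s
    power_watts = power_kw * 1e3     # 1 kW = 10^3 W
    return rmax_mflops / power_watts


# Hypothetical system (not taken from the list): 100 Tflop/s at 500 kW
print(mflops_per_watt(100.0, 500.0))
```

Under these assumed inputs the result is 200 Mflop/Watt, which would place such a system between the TOP50 and TOP10 averages given above.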

Highlights from the Top 10:

  • The Roadrunner system at DOE’s Los Alamos National Laboratory (LANL) was built by IBM and in June was the first system ever to break the petaflop/s Linpack barrier. Since then, Roadrunner was slightly enlarged and held on to its number 1 spot with 1.105 petaflop/s. Roadrunner is based on the IBM QS22 blades, which are built with advanced versions of the processor in the Sony PlayStation 3. These nodes are connected with a commodity InfiniBand network.
  • Roadrunner was almost surpassed by the second petaflop/s system ever - the Jaguar system installed at the DOE’s Oak Ridge National Laboratory. Jaguar reached 1.059 petaflop/s shortly after its installation and might pass Roadrunner in the near future. It is an XT5 system manufactured by Cray.
  • The TOP10 features seven new or upgraded systems.
  • The No. 1, 2, 4, 5, 7, 8 and 9 systems are all installed at U.S. DOE laboratories and the No. 1 to 9 systems are all in the U.S.
  • The No. 3 system, called Pleiades, is a new SGI Altix ICE system installed at NASA Ames in Moffett Field, Calif. It grabbed this spot by a narrow margin with 487 Tflop/s.
  • The No. 4 system is DOE’s IBM BlueGene/L system, installed at DOE’s Lawrence Livermore National Laboratory (LLNL) with a Linpack performance of 478.2 Tflop/s.
  • At No. 5 is a newer version of the same type of IBM system. It is a BlueGene/P system installed at DOE’s Argonne National Laboratory and it achieved 450.3 Tflop/s.
  • The No. 6 system is installed at the Texas Advanced Computing Center (TACC) at the University of Texas and recently received faster processors. Called Ranger, it is built by Sun using SunBlade x6420 servers and achieved 433.2 Tflop/s.
  • The No. 7 system, called Franklin, is a Cray XT4 system. It is installed at DOE’s NERSC center at the Lawrence Berkeley National Laboratory and achieved 266.3 Tflop/s.
  • The No. 8 system is a Cray XT4 system installed at DOE’s Oak Ridge National Laboratory. It achieved a Linpack performance of 205 Tflop/s.
  • The No. 9 system is the enlarged Sandia/Cray Red Storm system. It is installed at DOE’s Sandia National Laboratories and achieved 204.2 Tflop/s.
  • The No. 10 system is the first system on the list outside the U.S. It was built by the Chinese company Dawning and is installed at the Shanghai Supercomputer Center. It is the largest system in the TOP500 which can be operated with the Windows HPC 2008 operating system.

General highlights from the Top 500 since the last edition:

  • Quad-core processor based systems have taken over the TOP500 quite rapidly. Already 336 systems are using them, 153 systems are using dual-core processors, and only four systems still use single-core processors. Seven systems already use IBM’s advanced version of the Sony PlayStation 3 processor with 9 cores. The Linpack benchmark can utilize multi-core processors very well, which led to above-average performance increases across the whole list.
  • The entry level to the list moved up to the 12.64 Tflop/s mark on the Linpack benchmark, compared to 9.0 Tflop/s six months ago.
  • The last system on the newest list would have been listed at position 267 in the previous TOP500 just six months ago. This turnover rate is just above average, after the TOP500 recorded the highest turnover in its history six months ago.
  • Total combined performance of all 500 systems has grown to 16.95 Pflop/s, compared to 11.7 Pflop/s six months ago and 6.97 Pflop/s one year ago.
  • The entry point for the top 100 increased in six months from 18.8 Tflop/s to 27.37 Tflop/s.
  • The average concurrency level in the TOP500 is 6,240 cores per system, up from 4,850 six months ago.
  • A total of 379 systems (75.8 percent) are now using Intel processors. This is virtually unchanged from six months ago (375 systems, 75 percent). Intel continues to provide the processors for the largest share of TOP500 systems.
  • The IBM Power processors and the AMD Opteron family are almost tied as the second most common processor family, with 60 and 59 systems respectively (12 percent and 11.8 percent). Both had only minor changes from six months ago.
  • Multi-core processors are the dominant chip architecture. The most impressive growth was in the number of systems using the Intel Harpertown and Clovertown quad-core chips, which grew from 102 last November to 252 systems in June and now 293 systems.
  • Most of the remaining systems use dual-core processors.
  • 410 systems are labeled as clusters, making this the most common architecture in the TOP500 with a stable share of 82 percent.
  • Gigabit Ethernet is still the most-used internal system interconnect technology (282 systems), due to its widespread use at industrial customers, followed by InfiniBand technology with 141 systems.
  • IBM and Hewlett-Packard continue to sell the bulk of systems at all performance levels of the TOP500.
  • HP took over the lead in systems with 209 systems (41.8 percent) over IBM with 188 systems (37.6 percent). IBM had 210 systems (42.0 percent) six months ago, compared to HP with 183 systems (36.6 percent).
  • IBM remains the clear leader in the TOP500 list in performance with 38 percent of installed total performance (down from 48 percent), compared to HP with 24.7 percent (up from 22.4 percent).
  • In the system category, Cray, Dell, and SGI follow with 4.4 percent, 4.2 percent and 3.4 percent respectively.
  • In the performance category, the manufacturers with more than 5 percent are: Cray (14.7 percent of performance) and SGI (7.2 percent), each of which benefits from large systems in the TOP10.
  • HP (188) and IBM (113) together sold 301 out of 306 systems at commercial and industrial customers and have had this important market segment clearly cornered for some time now.
  • The U.S. is clearly the leading consumer of HPC systems with 291 of the 500 systems (up from 257). The European share (151 systems, down from 184) is settling down after having risen for some time, but is still substantially larger than the Asian share (47 systems, unchanged).
  • Dominant countries in Asia are Japan with 18 systems (down from 22), China with 16 systems (up from 12), and India with 8 systems (up from 6).
  • In Europe, the UK remains No. 1 with 45 systems (53 six months ago). Germany fell steeply but is still in the No. 2 spot with 24 systems (46 six months ago).
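
The shares and changes quoted in this list are simple ratios, and a short Python sketch makes the arithmetic easy to re-check. The helper names below are assumptions introduced for illustration; the input figures are the counts and performance totals given in the bullets above.

```python
TOTAL_SYSTEMS = 500  # size of the TOP500 list


def share_percent(count: int, total: int = TOTAL_SYSTEMS) -> float:
    """Share of the list held by `count` systems, in percent (one decimal)."""
    return round(100.0 * count / total, 1)


def growth_percent(new: float, old: float) -> float:
    """Relative change from `old` to `new`, in percent (one decimal)."""
    return round(100.0 * (new - old) / old, 1)


# System counts quoted above: Intel, HP, IBM
for vendor, count in [("Intel", 379), ("HP", 209), ("IBM", 188)]:
    print(vendor, share_percent(count))

# Six-month changes: total performance (Pflop/s) and entry level (Tflop/s)
print("total performance growth:", growth_percent(16.95, 11.7))
print("entry level growth:", growth_percent(12.64, 9.0))
```

The three vendor shares reproduce the 75.8, 41.8 and 37.6 percent figures stated in the list. The growth rates are derived values that the list itself does not quote.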

Highlights from the Top 50:

  • The entry level into the TOP50 is at 50.55 Tflop/s.
  • The U.S. has about the same percentage of systems (58 percent) in the TOP50 as in the TOP500.
  • The dominant architectures are custom-built massively parallel systems (MPPs) with 66 percent, ahead of commodity clusters with 3 percent.
  • IBM leads the TOP50 with 40 percent of systems and 42 percent of performance.
  • No. 2 is Cray with 20 percent of systems and 27 percent of performance.
  • SGI is third with 14 percent of systems and 12.5 percent of performance.
  • HP has 4 percent of systems and 4.9 percent of performance.
  • 56 percent of systems are installed at research labs and 32 percent at universities.
  • There is no system using Gigabit Ethernet in the TOP50.
  • Cray’s XT is the most-used system family with 10 systems (20 percent), followed by IBM’s BlueGene with 8 systems (16 percent).
  • Intel processors are used in 28 percent of systems, behind IBM’s Power processors (40 percent) and AMD processors (32 percent).
  • The average concurrency level is 30,490 cores per system, up from 24,400 six months ago.

All changes are from June 2008 to November 2008. 

The TOP500 list is compiled by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of NERSC/Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee, Knoxville.