Highlights - November 2013

Tianhe-2 (Milky Way-2), a system developed by China’s National University of Defense Technology (NUDT) and to be deployed at the National Supercomputer Center in Guangzhou, China, remains the No. 1 system with 33.86 petaflop/s on the Linpack benchmark. The system currently has 16,000 nodes, each with two Intel Xeon Ivy Bridge processors and three Xeon Phi processors, for a combined total of 3,120,000 computing cores. It features a number of Chinese-developed components, including the TH Express-2 interconnect network, front-end processors, operating system, and software tools. Tianhe-2 uses the Kylin Linux operating system. Officially approved for use in 2006, Kylin was developed by NUDT, is compatible with other mainstream operating systems, and supports multiple microprocessors and computers of different architectures. In addition, NUDT developed OpenMC, a directive-based, intra-node programming model similar to OpenMP and to CUDA, OpenACC, or OpenCL. Tianhe-2 has a front-end system composed of 4,096 Galaxy FT-1500 CPUs, designed and developed at NUDT. The FT-1500 has 16 cores and is based on the SPARC V9 architecture. Its performance is 144 Gflop/s, and each chip draws 65 watts. By comparison, the Intel Ivy Bridge has 12 cores with a peak performance of 211 Gflop/s. The power consumption of Tianhe-2 while running Linpack was 17.8 MW.
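
The node and core counts quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch rather than an official breakdown: it assumes all 3,120,000 cores are spread evenly across the 16,000 nodes and that each Ivy Bridge processor contributes the 12 cores mentioned above. The function name and constants are illustrative only.

```python
# Figures quoted in the paragraph above.
NODES = 16_000
XEONS_PER_NODE = 2
XEON_CORES = 12            # Ivy Bridge cores per chip, as stated above
PHIS_PER_NODE = 3
TOTAL_CORES = 3_120_000


def implied_phi_cores(total_cores: int = TOTAL_CORES) -> int:
    """Return the cores per Xeon Phi implied by the quoted totals.

    Assumes an even distribution of cores across nodes (an assumption,
    not something the list itself states).
    """
    cores_per_node, rest = divmod(total_cores, NODES)
    if rest:
        raise ValueError("total cores do not divide evenly across nodes")
    phi_cores_per_node = cores_per_node - XEONS_PER_NODE * XEON_CORES
    per_phi, rest = divmod(phi_cores_per_node, PHIS_PER_NODE)
    if rest:
        raise ValueError("Phi cores do not divide evenly across chips")
    return per_phi


if __name__ == "__main__":
    print("Implied cores per Xeon Phi:", implied_phi_cores())
```

Under these assumptions each node carries 195 cores, of which 24 belong to the two Xeons, leaving 57 cores per Xeon Phi.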

Other highlights from the Top 10:

  • Titan, a Cray XK7 system installed at the Department of Energy’s (DOE) Oak Ridge National Laboratory, remains the No. 2 system. It achieved 17.59 petaflop/s on the Linpack benchmark using 261,632 of its NVIDIA K20x accelerator cores. Titan is one of the most energy-efficient systems on the list, consuming a total of 8.21 MW and delivering 2.143 Gflops/W.
  • Sequoia, an IBM BlueGene/Q system installed at DOE’s Lawrence Livermore National Laboratory, is again the No. 3 system. It was first delivered in 2011 and has achieved 17.17 petaflop/s on the Linpack benchmark using 1,572,864 cores.
  • Fujitsu’s K computer installed at the RIKEN Advanced Institute for Computational Science (AICS) in Kobe, Japan, is the No. 4 system with 10.51 Pflop/s on the Linpack benchmark using 705,024 SPARC64 processing cores.
  • Mira, a BlueGene/Q system installed at DOE’s Argonne National Laboratory, is No. 5 with 8.59 Pflop/s on the Linpack benchmark using 786,432 cores.
  • At No. 6 is Piz Daint, a Cray XC30 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland, and now the most powerful system in Europe. Piz Daint achieved 6.27 Pflop/s on the Linpack benchmark using 73,808 NVIDIA K20x accelerator cores. Piz Daint is also the most energy-efficient system in the TOP10, consuming a total of 2.33 MW and delivering 2.7 Gflops/W.
  • Stampede, a Dell PowerEdge C8220 system installed at the Texas Advanced Computing Center of the University of Texas, Austin, slipped to No. 7. It also uses Intel Xeon Phi processors (previously known as MIC) to achieve its 5.17 Pflop/s.
  • The second system in Europe dropped to No. 8. JUQUEEN, also a BlueGene/Q system, is installed at the Forschungszentrum Juelich in Germany and is listed with 5.01 Pflop/s.
  • No. 9 is taken by Vulcan, another IBM BlueGene/Q system at Lawrence Livermore National Laboratory. It was temporarily combined with the No. 3 system but is now operated independently. It achieved 4.29 Pflop/s.
  • At No. 10 is the third system in Europe, SuperMUC, an IBM iDataPlex system with Intel Sandy Bridge processors installed at Leibniz Rechenzentrum in Germany, with 2.9 Pflop/s.
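
The energy-efficiency figures quoted in the list above follow from dividing Linpack performance by power draw. As a rough sketch (the helper name and the dictionary layout are illustrative, not part of the TOP500 methodology), the quoted values can be reproduced from the numbers given for Tianhe-2, Titan, and Piz Daint:

```python
def gflops_per_watt(rmax_pflops: float, power_mw: float) -> float:
    """Energy efficiency in Gflops/W.

    1 Pflop/s = 1e6 Gflop/s and 1 MW = 1e6 W, so the scale factors
    cancel and the ratio of the two quoted figures is already Gflops/W.
    """
    return rmax_pflops / power_mw


# (Linpack Pflop/s, power in MW) as quoted in the text above.
SYSTEMS = {
    "Tianhe-2": (33.86, 17.8),
    "Titan": (17.59, 8.21),
    "Piz Daint": (6.27, 2.33),
}

if __name__ == "__main__":
    for name, (rmax, power) in SYSTEMS.items():
        print(f"{name}: {gflops_per_watt(rmax, power):.2f} Gflops/W")
```

The results for Titan and Piz Daint agree with the quoted 2.143 and 2.7 Gflops/W to within rounding. By the same calculation, Tianhe-2 comes out at roughly 1.9 Gflops/W.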

Highlights from the Overall List

  • There are 31 systems with performance greater than a petaflop/s on the list, up from 26 six months ago.
  • The No. 1 system, Tianhe-2, and the No. 7 system, Stampede, use Intel Xeon Phi processors to speed up their computational rate. The No. 2 system Titan and the No. 6 system Piz Daint are using NVIDIA GPUs to accelerate computation.
  • A total of 53 systems on the list are using accelerator/co-processor technology, unchanged from June 2013. Thirty-eight (38) of these use NVIDIA chips, two use ATI Radeon, and there are now 13 systems with Intel MIC technology (Xeon Phi).
  • Intel continues to provide the processors for the largest share (82.4 percent) of TOP500 systems.
  • Ninety-four percent of the systems use processors with six or more cores and 75 percent use eight or more cores.
  • IBM’s BlueGene/Q is still the most popular system in the TOP10 with four entries including the No. 3, 5, 8 and 9 systems.
  • The number of systems installed in China has now stabilized at 63, compared to 65 on the last list. China occupies the No. 2 position as a user of HPC, ahead of Japan, the UK, France, and Germany. Due to Tianhe-2, China this year also took the No. 2 position in the performance share, ahead of Japan.

General highlights from the TOP500 since the June 2013 edition:

  • The entry level to the list moved up to the 117.8 Tflop/s mark on the Linpack benchmark, compared to 96.6 Tflop/s six months ago.
  • The last system on the newest list was listed at position 363 in the previous TOP500.
  • Total combined performance of all 500 systems has grown to 250 Pflop/s, compared to 223 Pflop/s six months ago and 162 Pflop/s one year ago.
  • The entry point for the TOP100 increased in six months from 290 Tflop/s to 327 Tflop/s.
  • The average concurrency level in the TOP500 is 41,434 cores per system, up from 38,700 six months ago and 29,796 one year ago.
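
To put the changes listed above on a common scale, each can be expressed as a percentage increase over the earlier value. This is a small illustrative sketch using only the figures quoted in the list. The function name and the dictionary labels are made up for the example.

```python
def pct_growth(new: float, old: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


# (current value, value six months ago) as quoted in the list above.
SIX_MONTH_CHANGES = {
    "Entry level (Tflop/s)": (117.8, 96.6),
    "Total performance (Pflop/s)": (250.0, 223.0),
    "TOP100 entry point (Tflop/s)": (327.0, 290.0),
    "Average cores per system": (41_434, 38_700),
}

if __name__ == "__main__":
    for label, (new, old) in SIX_MONTH_CHANGES.items():
        print(f"{label}: +{pct_growth(new, old):.1f}% in six months")
    # Year-over-year change in total performance (162 Pflop/s one year ago).
    print(f"Total performance: +{pct_growth(250.0, 162.0):.1f}% in one year")
```

On these numbers, total performance grew by about 12 percent in six months and about 54 percent over the year.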

Vendor Trends

  • A total of 412 systems (82 percent) are now using Intel processors, slightly up from 80 percent six months ago.
  • Intel is followed by the AMD Opteron family with 43 systems (9 percent), slightly down from 10 percent on the previous list.
  • The share of IBM Power processors is at 40 systems (8 percent).
  • InfiniBand technology is now found on 207 systems, up from 203 systems, and is the most-used internal system interconnect technology. Gigabit Ethernet is now found on 212 systems, down from 216 systems, in large part thanks to 77 systems now using 10G interfaces.
  • IBM and Hewlett-Packard continue to sell the bulk of the systems at all performance levels of the TOP500.
  • HP won the lead in systems and now has 195 systems (39 percent) compared to IBM with 166 systems (33 percent). HP is up from 189 systems (38 percent) six months ago, compared to IBM with 160 systems (32 percent) six months ago. In the system category, Cray remains third with 10 percent. 

Performance Trends

  • IBM remains the clear leader in the TOP500 list in performance, with a 32 percent share of installed total performance (down from 33 percent).
  • Thanks to Tianhe-2 and Tianhe-1A, NUDT contributes 15 percent of the total performance of the list, down from 16.8 percent.
  • Cray’s share in performance is now at 16.7 percent, up from 15.3 percent.
  • HP is now fourth, even though it increased its share to 15.3 percent from 14.1 percent.

Geographical observations

  • The U.S. is clearly the leading consumer of HPC systems with 265 of the 500 systems (253 last time). The European share (102 systems compared to 112 last time) is still lower than the Asian share (115 systems, down from 118 last time).
  • Dominant countries in Asia are China with 63 systems (down from 65) and Japan with 28 systems (down from 30).
  • In Europe, the UK, France, and Germany are almost equal with 23, 22, and 20 systems, respectively.

About the TOP500 List

The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be on to something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.

The TOP500 list is compiled by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee, Knoxville.