Highlights - November 2014

Tianhe-2 (Milky Way-2), a system developed by China’s National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou, China, remains the No. 1 system with 33.86 petaflop/s (Pflop/s) on the Linpack benchmark. The system currently has 16,000 nodes, each with two Intel Xeon Ivy Bridge processors and three Xeon Phi processors, for a combined total of 3,120,000 computing cores. It features a number of Chinese-developed components, including the TH Express-2 interconnect network, front-end processors, operating system and software tools. Tianhe-2 uses the Kylin Linux operating system. Its power consumption while running Linpack was 17.8 MW.

Other highlights from the Top 10:

  • Titan, a Cray XK7 system installed at the Department of Energy’s (DOE) Oak Ridge National Laboratory, remains the No. 2 system. It achieved 17.59 Pflop/s on the Linpack benchmark using 261,632 of its NVIDIA K20x accelerator cores. Titan is one of the most energy-efficient systems on the list, consuming a total of 8.21 MW and delivering 2.143 Gflops/W.

  • Sequoia, an IBM BlueGene/Q system installed at DOE’s Lawrence Livermore National Laboratory, is again the No. 3 system. It was first delivered in 2011 and has achieved 17.17 Pflop/s on the Linpack benchmark using 1,572,864 cores.

  • Fujitsu’s K computer installed at the RIKEN Advanced Institute for Computational Science (AICS) in Kobe, Japan, is the No. 4 system with 10.51 Pflop/s on the Linpack benchmark using 705,024 SPARC64 processing cores.

  • Mira, a BlueGene/Q system installed at DOE’s Argonne National Laboratory, is No. 5 with 8.59 Pflop/s on the Linpack benchmark using 786,432 cores.

  • At No. 6 is Piz Daint, a Cray XC30 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland, and the most powerful system in Europe. Piz Daint achieved 6.27 Pflop/s on the Linpack benchmark using 73,808 NVIDIA K20x accelerator cores. Piz Daint is also the most energy-efficient system in the TOP10, consuming a total of 2.33 MW and delivering 2.7 Gflops/W.

  • Stampede, a Dell PowerEdge C8220 system installed at the Texas Advanced Computing Center of the University of Texas, Austin, is at No. 7. It also uses Intel Xeon Phi processors (previously known as MIC) to achieve its 5.17 Pflop/s.

  • The second most powerful system in Europe is at No. 8. It is also a BlueGene/Q system, called JUQUEEN, installed at the Forschungszentrum Juelich in Germany and listed with 5.01 Pflop/s.

  • No. 9 is taken by Vulcan, another IBM BlueGene/Q system at Lawrence Livermore National Laboratory. It was temporarily combined with the No. 3 system but is now operated independently. It achieved 4.29 Pflop/s.

  • At No. 10 is the only new system in the Top 10, a Cray CS-Storm installed at a Government location in the USA with 3.57 Pflop/s.

  • The #10 system is the only Top 10 system installed in 2014. The #1 system is the only system installed in 2013. The remaining eight systems were installed in 2012 or 2011. Such an old population of systems in the Top 10 is unprecedented.
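
The energy-efficiency figures quoted for the Top 10 follow from dividing Linpack performance by power draw. The following is a minimal Python sketch of that arithmetic, using only the performance and power values quoted above. The function name `gflops_per_watt` is illustrative and not part of any TOP500 tooling.

```python
def gflops_per_watt(rmax_pflops: float, power_mw: float) -> float:
    """Return Linpack efficiency in Gflop/s per watt.

    1 Pflop/s = 1e6 Gflop/s and 1 MW = 1e6 W, so the unit factors cancel;
    they are kept explicit here to make the conversion visible.
    """
    return (rmax_pflops * 1e6) / (power_mw * 1e6)


# (Linpack Pflop/s, power in MW) as quoted in the Top 10 highlights.
systems = {
    "Tianhe-2": (33.86, 17.8),
    "Titan": (17.59, 8.21),
    "Piz Daint": (6.27, 2.33),
}

efficiency = {name: gflops_per_watt(r, p) for name, (r, p) in systems.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.3f} Gflops/W")
```

With these inputs, Titan comes out at roughly 2.143 Gflops/W and Piz Daint at roughly 2.7 Gflops/W, in line with the values quoted in the list. The Tianhe-2 result (about 1.9 Gflops/W) is derived here from the quoted figures and is not itself stated in the list.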

Highlights from the Overall List

  • The overall list-by-list growth rate of performance has remained at historically low values for the last two years.

  • The performance of the last system on the list (#500) has systematically lagged behind historical trends for the last six years and is now clearly on a different growth trajectory than before. From 1994 to 2008 it grew by 90% per year. Since 2008 it has grown by only 55% per year.

  • The growth of the average performance of all systems on the list has slowed as well, but it has lagged behind historical averages only for the last two lists. This average is noticeably influenced by the very large systems at the top of the list. Installations of very large systems up to June 2013 counteracted the reduced growth rate at the bottom of the list. This indicates that the market for the very largest systems might behave differently from the market for mid-sized and smaller supercomputers.

  • There are 50 systems with performance greater than a Pflop/s on the list, up from 37 six months ago.

  • In the Top 10, the No. 1 system, Tianhe-2, and the No. 7 system, Stampede, use Intel Xeon Phi processors to speed up their computational rate. The No. 2 system Titan, the No. 6 system Piz Daint, and the new Cray CS-Storm system at #10 are using NVIDIA GPUs to accelerate computation.

  • A total of 75 systems on the list are using accelerator/co-processor technology, up from 62 in June 2014. Fifty (50) of these use NVIDIA chips, three use ATI Radeon, and there are now 25 systems with Intel MIC technology (Xeon Phi). Four systems use a combination of NVIDIA and Intel Xeon Phi accelerators/co-processors.

  • The average number of accelerator cores for these 75 systems is 73,605 cores/system.

  • Intel continues to provide the processors for the largest share (85.8 percent) of TOP500 systems.

  • Ninety-six percent of the systems use processors with six or more cores, eighty-five percent use eight or more cores, and thirty-nine percent ten or more cores.

  • IBM’s BlueGene/Q is still the most popular system in the TOP10, with four entries: the No. 3, 5, 8 and 9 systems.

  • The number of systems installed in the USA has fallen to 231, down from 233 six months ago. This is close to the lowest level ever recorded (226 in the early 2000s).

  • The number of systems installed in China has fallen to 61, compared to 76 on the last list.  China occupies the No. 2 position as a user of HPC, ahead of Japan, UK, France, and Germany.  Due to Tianhe-2, China is also holding the No. 2 position in the performance share, ahead of Japan.
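
The change in the growth of the #500 system's performance, from 90% per year before 2008 to 55% per year since, is easier to judge as a doubling time. The sketch below assumes steady compound growth at the quoted annual rates. The helper names are illustrative only.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double under steady compound growth.

    annual_growth is a fraction, e.g. 0.90 for 90% per year.
    """
    return math.log(2) / math.log(1 + annual_growth)


def growth_factor(annual_growth: float, years: float) -> float:
    """Total multiplication of performance after the given number of years."""
    return (1 + annual_growth) ** years


before_2008 = doubling_time_years(0.90)  # rate quoted for 1994 to 2008
since_2008 = doubling_time_years(0.55)   # rate quoted since 2008

print(f"Doubling time at 90%/year: {before_2008:.2f} years")
print(f"Doubling time at 55%/year: {since_2008:.2f} years")
print(f"Six-year factor at 90%/year: {growth_factor(0.90, 6):.1f}x")
print(f"Six-year factor at 55%/year: {growth_factor(0.55, 6):.1f}x")
```

Under this assumption the doubling time lengthens from a little over one year to a little over a year and a half, which compounds into a large gap over a six-year span.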

General highlights from the TOP500 since the June 2014 edition:

  • The entry level to the list moved up to the 153.4 Tflop/s mark on the Linpack benchmark, compared to 133.7 Tflop/s six months ago.

  • The last system on the newest list was listed at position 422 in the previous TOP500.  This represents the lowest turnover rate in the list in two decades.

  • Total combined performance of all 500 systems has grown to 309 Pflop/s, compared to 274 Pflop/s six months ago and 250 Pflop/s one year ago. This increase in installed performance also exhibits a noticeable slowdown in growth compared to the previous long-term trend.

  • The entry point for the TOP100 increased in six months to 496 Tflop/s from 390 Tflop/s.

  • The average concurrency level in the TOP500 is 46,288 cores per system, up from 43,301 six months ago and 41,434 one year ago.
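
The list-to-list comparisons above can be put on a common footing as percentage changes. This is a small Python sketch using only the figures quoted in this section. The function names are illustrative, and annualizing a six-month change by compounding it twice is a simplifying assumption, not a TOP500 convention.

```python
def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def annualized(six_month_percent: float) -> float:
    """Compound a six-month percentage change over two periods (assumption)."""
    return ((1 + six_month_percent / 100.0) ** 2 - 1) * 100.0


# Figures quoted above: (current value, value six months ago).
six_month = {
    "Entry level (Tflop/s)": (153.4, 133.7),
    "Total performance (Pflop/s)": (309.0, 274.0),
    "TOP100 entry point (Tflop/s)": (496.0, 390.0),
    "Average cores per system": (46288.0, 43301.0),
}

changes = {name: percent_change(new, old) for name, (new, old) in six_month.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}% in six months, {annualized(pct):+.1f}% annualized")

# Year-over-year change in total performance (309 vs. 250 Pflop/s).
total_yoy = percent_change(309.0, 250.0)
print(f"Total performance, one year: {total_yoy:+.1f}%")
```

On these numbers, total installed performance grew by about 12.8% over six months and 23.6% over the year.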

Vendor Trends

  • A total of 429 systems (85.8 percent) are now using Intel processors, slightly up from 85 percent six months ago.

  • The share of IBM Power processors is stable at 38 systems (8 percent).

  • The AMD Opteron family is used in 26 systems (5.2 percent), down from 6 percent on the previous list.

  • InfiniBand technology is now found on 225 systems, up from 221 systems, and is the most-used internal system interconnect technology. Gigabit Ethernet has fallen to 187 systems, down from 202 systems, in large part because 88 systems now use 10G interfaces.

  • IBM and Hewlett-Packard continue to sell the bulk of the systems at all performance levels of the TOP500.

  • HP has the lead in systems and now has 179 systems (36 percent) compared to IBM with 153 systems (30.6 percent). HP had 182 systems (36.4 percent) six months ago, and IBM had 176 systems (35.2 percent) six months ago. In the system category, Cray remains third with 12.4 percent (62 systems).
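
The percentage shares in this section are system counts divided by the 500 entries on the list. The following Python sketch recomputes them from the counts quoted above. The helper `share_percent` is an illustrative name.

```python
TOTAL_SYSTEMS = 500  # the list always has 500 entries


def share_percent(count: int, total: int = TOTAL_SYSTEMS) -> float:
    """Share of the list held by `count` systems, in percent (one decimal)."""
    return round(count / total * 100, 1)


# System counts quoted in the Vendor Trends bullets.
vendor_counts = {"HP": 179, "IBM": 153, "Cray": 62}
processor_counts = {"Intel": 429, "IBM Power": 38, "AMD Opteron": 26}

vendor_shares = {name: share_percent(n) for name, n in vendor_counts.items()}
processor_shares = {name: share_percent(n) for name, n in processor_counts.items()}

for name, pct in {**vendor_shares, **processor_shares}.items():
    print(f"{name}: {pct}%")
```

Most results match the quoted percentages exactly. HP (35.8%) and IBM Power (7.6%) appear in the text rounded to 36 and 8 percent.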

Performance Trends

  • IBM remains the clear leader in the TOP500 list in performance and has a considerable lead with a 28 percent share of installed total performance (down from 32 percent).

  • Thanks to Tianhe-2 and Tianhe-1A, NUDT contributes 12.7 percent of the total performance of the list, down from 13.7 percent.

  • Cray’s share in performance is now at 22 percent, up from 18.2 percent.

  • HP is again fourth, even though it increased its share to 15.6 percent from 15.3 percent.

Geographical observations

  • The U.S. is clearly the leading consumer of HPC systems with 231 of the 500 systems (233 last time), although its share has been dropping close to its all-time low. The European share (130 systems compared to 116 last time) has surpassed the Asian share (120 systems, down from 132 last time).

  • Dominant countries in Asia are China with 61 systems (down from 76) and Japan with 32 systems (up from 30).  

  • In Europe, the UK, France, and Germany are almost equal with 30, 30, and 26 systems, respectively.

About the TOP500 List

The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be on to something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.

The TOP500 list is compiled by Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; Jack Dongarra of the University of Tennessee, Knoxville; and Martin Meuer of Prometeus, Germany.