Highlights - November 2021

This is the 58th edition of the TOP500.

The only new entry in the TOP10 is at No. 10: Voyager-EUS2, a Microsoft Azure system installed at Microsoft in the U.S. The machine achieved 30.05 Pflop/s on the HPL benchmark. Its architecture is based on AMD EPYC processors with 48 cores running at 2.45 GHz, working together with NVIDIA A100 GPUs with 80 GB of memory and using Mellanox HDR InfiniBand for data transfer.

The only other change in the TOP10 is that Perlmutter, the No. 5 system at NERSC at the DOE Lawrence Berkeley National Laboratory, improved its performance to 70.9 Pflop/s. This did not change its position on the list.

Supercomputer Fugaku, a system based on Fujitsu’s custom ARM A64FX processor, remains at No. 1. It is installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan, the location of the former K Computer. It was co-developed in close partnership by RIKEN and Fujitsu and uses Fujitsu’s Tofu D interconnect to transfer data between nodes. Its HPL benchmark score of 442 Pflop/s is about 3x that of the No. 2 system, Summit. In single or further reduced precision, which is often used in machine learning and AI applications, its peak performance is above 1,000 Pflop/s (= 1 Exaflop/s). Because of this, it is often introduced as the first ‘Exascale’ supercomputer. Fugaku has already demonstrated this new level of performance on the new HPL-AI benchmark with 2 Exaflop/s. https://www.r-ccs.riken.jp/en/

The new Frontier system, built by HPE/Cray/AMD and currently being installed at the Oak Ridge National Laboratory, is widely expected to break the Exascale barrier in full 64-bit floating point precision. However, it was not able to submit such a result before the deadline for this edition of the TOP500. Over the last year there were also reports about several Chinese systems reaching Exaflop-level performance, but none of these systems submitted an HPL result to the TOP500.

Here is a brief summary of the systems in the TOP10:

Fugaku remains the No. 1 system. It has 7,630,848 cores, which allowed it to attain an HPL benchmark score of 442 Pflop/s. This puts it about 3x ahead of the No. 2 system on the list.
Summit, an IBM-built system at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, remains the fastest system in the U.S. and holds the No. 2 spot worldwide with a performance of 148.6 Pflop/s on the HPL benchmark, which is used to rank the TOP500 list. Summit has 4,356 nodes, each housing two Power9 CPUs with 22 cores each and six NVIDIA Tesla V100 GPUs, each with 80 streaming multiprocessors (SM). The nodes are linked together with a Mellanox dual-rail EDR InfiniBand network.
Sierra, a system at the Lawrence Livermore National Laboratory, CA, USA, is at No. 3. Its architecture is very similar to that of the No. 2 system, Summit. It is built with 4,320 nodes, each with two Power9 CPUs and four NVIDIA Tesla V100 GPUs. Sierra achieved 94.6 Pflop/s.
Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC) and installed at the National Supercomputing Center in Wuxi, in China's Jiangsu province, is listed at the No. 4 position with 93 Pflop/s.
Perlmutter at No. 5 was newly listed in the TOP10 last June. It is based on the HPE Cray “Shasta” platform and is a heterogeneous system with AMD EPYC based nodes and 1,536 NVIDIA A100 accelerated nodes. Perlmutter improved its performance to 70.9 Pflop/s.

Rank	Site	System	Cores	Rmax (PFlop/s)	Rpeak (PFlop/s)	Power (kW)
1	RIKEN Center for Computational Science Japan	Supercomputer Fugaku - Supercomputer Fugaku, A64FX 48C 2.2GHz, Tofu interconnect D Fujitsu	7,630,848	442.01	537.21	29,899
2	DOE/SC/Oak Ridge National Laboratory United States	Summit - IBM Power System AC922, IBM POWER9 22C 3.07GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband IBM	2,414,592	148.60	200.79	10,096
3	DOE/NNSA/LLNL United States	Sierra - IBM Power System AC922, IBM POWER9 22C 3.1GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband IBM / NVIDIA / Mellanox	1,572,480	94.64	125.71	7,438
4	National Supercomputing Center in Wuxi China	Sunway TaihuLight - Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway NRCPC	10,649,600	93.01	125.44	15,371
5	DOE/SC/LBNL/NERSC United States	Perlmutter - HPE Cray EX235n, AMD EPYC 7763 64C 2.45GHz, NVIDIA A100 SXM4 40 GB, Slingshot-10 HPE	761,856	70.87	93.75	2,589
6	NVIDIA Corporation United States	Selene - NVIDIA DGX A100, AMD EPYC 7742 64C 2.25GHz, NVIDIA A100, Mellanox HDR Infiniband Nvidia	555,520	63.46	79.22	2,646
7	National Super Computer Center in Guangzhou China	Tianhe-2A - TH-IVB-FEP Cluster, Intel Xeon E5-2692v2 12C 2.2GHz, TH Express-2, Matrix-2000 NUDT	4,981,760	61.44	100.68	18,482
8	Forschungszentrum Juelich (FZJ) Germany	JUWELS Booster Module - Bull Sequana XH2000 , AMD EPYC 7402 24C 2.8GHz, NVIDIA A100, Mellanox HDR InfiniBand/ParTec ParaStation ClusterSuite EVIDEN	449,280	44.12	70.98	1,764
9	Eni S.p.A. Italy	HPC5 - PowerEdge C4140, Xeon Gold 6252 24C 2.1GHz, NVIDIA Tesla V100, Mellanox HDR Infiniband DELL	669,760	35.45	51.72	2,252
10	Microsoft Azure United States	Voyager-EUS2 - ND96amsr_A100_v4, AMD EPYC 7V12 48C 2.45GHz, NVIDIA A100 80GB, Mellanox HDR Infiniband Microsoft Azure	253,440	30.05	39.53

Selene, now at No. 6, is an NVIDIA DGX A100 SuperPOD installed in-house at NVIDIA in the USA. The system is based on AMD EPYC processors with NVIDIA A100 GPUs for acceleration and Mellanox HDR InfiniBand as the network, and achieved 63.4 Pflop/s.
Tianhe-2A (Milky Way-2A), a system developed by China’s National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou, China, is now listed as the No. 7 system with 61.4 Pflop/s.
A system called “JUWELS Booster Module” is at No. 8. The BullSequana system, built by Atos, is installed at the Forschungszentrum Juelich (FZJ) in Germany. Like Selene, it uses AMD EPYC processors with NVIDIA A100 GPUs for acceleration and Mellanox HDR InfiniBand as the network. It is the most powerful system in Europe with 44.1 Pflop/s.
HPC5 at No. 9 is a PowerEdge system built by Dell and installed by the Italian company Eni S.p.A. It achieves a performance of 35.5 Pflop/s using NVIDIA Tesla V100 GPUs as accelerators and Mellanox HDR InfiniBand as the network.
Voyager-EUS2, a Microsoft Azure system installed at Microsoft in the U.S., is the only new system in the TOP10. It achieved 30.05 Pflop/s and is listed at No. 10. Its architecture is based on AMD EPYC processors with 48 cores running at 2.45 GHz, working together with NVIDIA A100 GPUs with 80 GB of memory and using Mellanox HDR InfiniBand for data transfer.
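The "about 3x" lead of Fugaku over Summit, and how close each machine gets to its theoretical peak, can be checked with a little arithmetic. The following is a minimal sketch, not part of the TOP500 methodology: the figures are copied as printed in the TOP10 table above and read on a PFlop/s scale to match the prose, and the names `top10` and `hpl_efficiency` are illustrative only.

```python
# Rmax and Rpeak as printed in the TOP10 table, read as PFlop/s.
top10 = {
    "Fugaku": (442.01, 537.21),
    "Summit": (148.60, 200.79),
    "Sierra": (94.64, 125.71),
    "Sunway TaihuLight": (93.01, 125.44),
    "Perlmutter": (70.87, 93.75),
}


def hpl_efficiency(rmax, rpeak):
    """Fraction of theoretical peak (Rpeak) achieved on HPL (Rmax)."""
    return rmax / rpeak


# Ratio of the No. 1 HPL score to the No. 2 HPL score.
fugaku_vs_summit = top10["Fugaku"][0] / top10["Summit"][0]

# HPL efficiency for each listed system.
efficiencies = {name: hpl_efficiency(*values) for name, values in top10.items()}

print(f"Fugaku / Summit Rmax ratio: {fugaku_vs_summit:.2f}x")
for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} of Rpeak")
```

The ratio comes out slightly below 3, which is why the text is best read as "roughly 3x" rather than an exact multiple.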

Highlights from the List

A total of 151 systems on the list are using accelerator/co-processor technology, up from 147 six months ago. 0 of these use 18 chips, 43 use NVIDIA Ampere, and 84 use NVIDIA Volta.
Intel continues to provide the processors for the largest share (81.60 percent) of TOP500 systems, down from 86.40 percent six months ago. 73 (14.60 percent) of the systems in the current list use AMD processors, up from 9.60 percent six months ago.
Supercomputer Fugaku maintains the lead in HPCG performance, followed by the DOE systems Summit and Perlmutter in the No. 2 and No. 3 spots.
The entry level to the list moved up to the 1.65 Pflop/s mark on the Linpack benchmark.
The last system on the newest list was listed at position 433 in the previous TOP500.
The total combined performance of all 500 systems now stands at 3.04 Eflop/s, up from 2.79 Eflop/s six months ago and well above the Exaflop barrier.
The entry point for the TOP100 increased to 4.79 Pflop/s.
The average concurrency level in the TOP500 is 162,520 cores per system, up from 153,852 six months ago.
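The shares and averages quoted in this list follow from simple ratios over the 500 systems. The sketch below reproduces a few of them from the numbers stated above; the variable names are illustrative, and the derived totals are only as precise as the rounded inputs.

```python
TOTAL_SYSTEMS = 500

# Figures quoted in the highlights above.
amd_systems = 73
accelerated_systems = 151
intel_share = 0.8160
avg_cores_per_system = 162_520
total_eflops_now = 3.04
total_eflops_prev = 2.79

# Share of systems using AMD processors (quoted as 14.60 percent).
amd_share = amd_systems / TOTAL_SYSTEMS

# Share of systems using accelerators or co-processors.
accelerated_share = accelerated_systems / TOTAL_SYSTEMS

# Number of systems implied by Intel's quoted share.
intel_systems = round(intel_share * TOTAL_SYSTEMS)

# Approximate total core count implied by the average concurrency.
total_cores = avg_cores_per_system * TOTAL_SYSTEMS

# Growth of the aggregate performance over six months.
aggregate_growth = total_eflops_now / total_eflops_prev - 1

print(f"AMD share: {amd_share:.2%}")
print(f"Accelerated share: {accelerated_share:.2%}")
print(f"Intel systems: {intel_systems}")
print(f"Approx. total cores: {total_cores:,}")
print(f"Aggregate growth: {aggregate_growth:.1%}")
```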

General Trends

Installations by countries/regions:

HPC manufacturer:

Interconnect Technologies:

Processor Technologies:

Green500

The data collection and curation of the Green500 project has been integrated with the TOP500 project. This allows submissions of all data through a single webpage at http://top500.org/submit

Rank	TOP500 Rank	System	Cores	Rmax (PFlop/s)	Power (kW)	Energy Efficiency (GFlops/watt)
1	301	MN-3 - MN-Core Server, Xeon Platinum 8260M 24C 2.4GHz, Preferred Networks MN-Core, MN-Core DirectConnect , Preferred Networks Preferred Networks Japan	1,664	2.18	55	39.379
2	291	SSC-21 Scalable Module - Apollo 6500 Gen10 plus, AMD EPYC 7543 32C 2.8GHz, NVIDIA A100 80GB, Infiniband HDR200 , HPE Samsung Electronics South Korea	16,704	2.27	103	33.983
3	295	Tethys - NVIDIA DGX A100 Liquid Cooled Prototype, AMD EPYC 7742 64C 2.25GHz, NVIDIA A100 80GB, Infiniband HDR , Nvidia NVIDIA Corporation United States	19,840	2.25	72	31.538
4	280	Wilkes-3 - PowerEdge XE8545, AMD EPYC 7763 64C 2.45GHz, NVIDIA A100 80GB, Infiniband HDR200 dual rail , DELL University of Cambridge United Kingdom	26,880	2.29	74	30.797
5	30	HiPerGator AI - NVIDIA DGX A100, AMD EPYC 7742 64C 2.25GHz, NVIDIA A100, Infiniband HDR , Nvidia University of Florida United States	138,880	17.20	583	29.521
6	403	Snellius Phase 1 GPU - ThinkSystem SD650-N V2, Xeon Platinum 8360Y 36C 2.4GHz, NVIDIA A100 SXM4 40 GB, Infiniband HDR , Lenovo SURF Netherlands	6,480	1.82	63	29.046
7	5	Perlmutter - HPE Cray EX235n, AMD EPYC 7763 64C 2.45GHz, NVIDIA A100 SXM4 40 GB, Slingshot-10 , HPE DOE/SC/LBNL/NERSC United States	761,856	70.87	2,589	27.374
8	71	Karolina, GPU partition - Apollo 6500, AMD EPYC 7763 64C 2.45GHz, NVIDIA A100 SXM4 40 GB, Infiniband HDR200 , HPE IT4Innovations National Supercomputing Center, VSB-Technical University of Ostrava Czechia	71,424	6.75	311	27.213
9	45	MeluXina - Accelerator Module - BullSequana XH2000, AMD EPYC 7452 32C 2.35GHz, NVIDIA A100 40GB, Mellanox HDR InfiniBand/ParTec ParaStation ClusterSuite , EVIDEN LuxProvide Luxembourg	99,200	10.52	390	26.957
10	262	NVIDIA DGX SuperPOD - NVIDIA DGX A100, AMD EPYC 7742 64C 2.25GHz, NVIDIA A100, Mellanox HDR Infiniband , Nvidia NVIDIA Corporation United States	19,840	2.36	90	26.195

The system to claim the No. 1 spot on the Green500 was MN-3 from Preferred Networks in Japan. Relying on the MN-Core chip, an accelerator optimized for matrix arithmetic, this machine achieved a power efficiency of 39.38 gigaflops/watt. It delivered 29.7 gigaflops/watt on the last list, so this is a substantial improvement. It also improved its standing on the TOP500 list, moving from No. 337 to No. 302.
The new SSC-21 Scalable Module, an HPE Apollo 6500 system installed at Samsung Electronics in South Korea, achieved 33.98 gigaflops/watt. It did so by submitting a power-optimized run of the HPL benchmark. It is listed at position 292 in the TOP500.
NVIDIA installed a new liquid-cooled DGX A100 prototype system called Tethys. With a power-optimized HPL run, Tethys achieved 31.5 gigaflops/watt and garnered the No. 3 spot on the Green500. It is listed at position 296 in the TOP500.
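The energy-efficiency column is, in essence, HPL performance divided by power draw. As a rough illustration, the sketch below recomputes it for three rows of the Green500 table above, reading the printed Rmax values as PFlop/s; the helper name `gflops_per_watt` is an assumption made for this example. The results only approximate the listed values because the table inputs are rounded. For some entries (SSC-21, for instance) the table's Rmax and Power do not reproduce the listed efficiency at all, presumably because that figure comes from the separate power-optimized run mentioned in the text.

```python
def gflops_per_watt(rmax_pflops, power_kw):
    """Energy efficiency from Rmax in PFlop/s and power in kW."""
    gflops = rmax_pflops * 1e6  # 1 PFlop/s = 1e6 GFlop/s
    watts = power_kw * 1e3      # 1 kW = 1e3 W
    return gflops / watts


# (Rmax as printed, Power in kW, listed GFlops/watt) from the Green500 table.
green500_rows = {
    "Tethys": (2.25, 72, 31.538),
    "HiPerGator AI": (17.20, 583, 29.521),
    "Perlmutter": (70.87, 2589, 27.374),
}

computed = {
    name: gflops_per_watt(rmax, power)
    for name, (rmax, power, _listed) in green500_rows.items()
}

for name, (_rmax, _power, listed) in green500_rows.items():
    print(f"{name}: computed {computed[name]:.2f}, listed {listed:.2f} GFlops/watt")
```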

HPCG Results

The Top500 list now includes the High-Performance Conjugate Gradient (HPCG) Benchmark results.

Rank	TOP500 Rank	System	Cores	Rmax (PFlop/s)	HPCG (TFlop/s)
1	1	Supercomputer Fugaku - Supercomputer Fugaku, A64FX 48C 2.2GHz, Tofu interconnect D , RIKEN Center for Computational Science Japan	7,630,848	442.01	16004.50
2	2	Summit - IBM Power System AC922, IBM POWER9 22C 3.07GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband , DOE/SC/Oak Ridge National Laboratory United States	2,414,592	148.60	2925.75
3	5	Perlmutter - HPE Cray EX235n, AMD EPYC 7763 64C 2.45GHz, NVIDIA A100 SXM4 40 GB, Slingshot-10 , DOE/SC/LBNL/NERSC United States	761,856	70.87	1905.44
4	3	Sierra - IBM Power System AC922, IBM POWER9 22C 3.1GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband , DOE/NNSA/LLNL United States	1,572,480	94.64	1795.67
5	6	Selene - NVIDIA DGX A100, AMD EPYC 7742 64C 2.25GHz, NVIDIA A100, Mellanox HDR Infiniband , NVIDIA Corporation United States	555,520	63.46	1622.51
6	8	JUWELS Booster Module - Bull Sequana XH2000 , AMD EPYC 7402 24C 2.8GHz, NVIDIA A100, Mellanox HDR InfiniBand/ParTec ParaStation ClusterSuite , Forschungszentrum Juelich (FZJ) Germany	449,280	44.12	1275.36
7	15	Dammam-7 - Cray CS-Storm, Xeon Gold 6248 20C 2.5GHz, NVIDIA Tesla V100 SXM2, InfiniBand HDR 100 , Saudi Aramco Saudi Arabia	672,520	22.40	881.40
8	9	HPC5 - PowerEdge C4140, Xeon Gold 6252 24C 2.1GHz, NVIDIA Tesla V100, Mellanox HDR Infiniband , Eni S.p.A. Italy	669,760	35.45	860.32
9	17	Wisteria/BDEC-01 (Odyssey) - PRIMEHPC FX1000, A64FX 48C 2.2GHz, Tofu interconnect D , Information Technology Center, The University of Tokyo Japan	368,640	22.12	817.58
10	48	Earth Simulator -SX-Aurora TSUBASA - SX-Aurora TSUBASA B401-8, Vector Engine Type20B 8C 1.6GHz, Infiniband HDR200 , Japan Agency for Marine-Earth Science and Technology Japan	43,776	9.99	747.80

Supercomputer Fugaku remains the leader on the HPCG benchmark with 16.0 HPCG-Pflop/s.
The DOE system Summit at ORNL remains in second position with 2.93 HPCG-Pflop/s.
The third position was captured by the new Perlmutter system at NERSC/LBNL with 1.91 HPCG-Pflop/s.
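HPCG scores are much lower than HPL scores for the same machine, and the size of that gap is easy to quantify from the table above. The sketch below is a hedged illustration, reading the printed Rmax values as PFlop/s and the HPCG values as TFlop/s; `hpcg_fraction` is a name chosen for this example only.

```python
def hpcg_fraction(rmax_pflops, hpcg_tflops):
    """HPCG result as a fraction of the HPL result (both converted to TFlop/s)."""
    rmax_tflops = rmax_pflops * 1e3  # 1 PFlop/s = 1e3 TFlop/s
    return hpcg_tflops / rmax_tflops


# (Rmax as printed, HPCG in TFlop/s) for the top three HPCG systems.
hpcg_rows = {
    "Fugaku": (442.01, 16004.50),
    "Summit": (148.60, 2925.75),
    "Perlmutter": (70.87, 1905.44),
}

fractions = {name: hpcg_fraction(*values) for name, values in hpcg_rows.items()}

for name, frac in fractions.items():
    print(f"{name}: HPCG is {frac:.2%} of HPL")
```

For all three systems the HPCG result is only a few percent of the HPL result, which is why the two rankings can differ, as with Perlmutter and Sierra here.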

About the TOP500 List

The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be onto something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.
