May 8, 2018
By: Michael Feldman
Cavium has released the ThunderX2 processor for general availability, paving the way for the first generation of ARM-powered high performance computing.
Although ThunderX2 sounds like it’s the successor to Cavium’s initial ThunderX processor, its actual origin is Broadcom’s Vulcan processor. In 2016, Cavium acquired the intellectual property associated with Vulcan and rebranded it as ThunderX2. Cavium soft-launched the processor in May of that year and has been collecting OEM and ODM partners ever since.
As it stands today, ThunderX2 is supported by a fairly broad array of server vendors, including OEMs such as Cray, HPE, Atos, and Penguin Computing. The initial ThunderX2 offerings consist of 40 different SKUs, ranging from 16 to 32 cores, with clock speeds up to 2.5 GHz, and as much as 3.0 GHz in turbo mode. Multithreading is supported at various levels, with one, two or four threads per core. The company has positioned the chip family to go head-to-head against Intel’s newest Xeon Skylake products and wants to compete across the entire application space, from HPC to cloud and enterprise computing.
Although Cavium hasn’t released specific performance numbers, the company maintains the processor has “core and socket level performance comparable to the highest end Xeon Skylake Platinum CPUs.” However, that’s not likely to be the case when you’re just talking about peak flops. The fastest Skylake chips will deliver about 2,000 gigaflops, while the ThunderX2 tops out at about 560 gigaflops.
That latter figure is based on the 72 teraflops of performance anticipated for each of the 64-node ThunderX2-based Apollo 70 clusters to be installed by HPE at three UK universities. Those systems are slated to get the top-end 32-core Cavium processors. A 32-node Apollo 70 cluster ordered by Argonne National Lab will be similarly configured.
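The per-socket figure can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes each Apollo 70 node holds two ThunderX2 sockets, which is not stated explicitly for these clusters, and the function name is purely illustrative.

```python
def per_socket_gflops(cluster_tflops, nodes, sockets_per_node=2):
    """Divide a cluster's peak teraflops evenly across its sockets.

    sockets_per_node=2 is an assumption (dual-socket nodes).
    """
    total_sockets = nodes * sockets_per_node
    return cluster_tflops * 1000 / total_sockets  # teraflops -> gigaflops


# 72 teraflops anticipated for a 64-node cluster
per_socket = per_socket_gflops(cluster_tflops=72, nodes=64)
print(f"{per_socket:.1f} gigaflops per socket")
```

Under that assumption the result lands at roughly 560 gigaflops per processor, in line with the figure quoted above.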
Cray is set to deliver what will probably be the most powerful supercomputer to employ these new Cavium chips this year. That system, known as Isambard, is an XC50 supercomputer destined for the Great Western 4 (GW4) Alliance, a research consortium of Bristol, Bath, Cardiff and Exeter universities. The machine will be composed of more than 10,000 ThunderX2 cores, so it’s probably going to deliver something on the order of 175 teraflops.
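A rough version of that estimate can be sketched as follows. It assumes the machine uses 32-core parts throughout and that each socket delivers the roughly 560 gigaflops peak cited for the top-end chip. Both are assumptions for illustration, and the actual configuration may differ.

```python
CORES_PER_SOCKET = 32      # assumption: top-end 32-core parts throughout
GFLOPS_PER_SOCKET = 560    # approximate per-socket peak cited in this article


def system_peak_tflops(total_cores):
    """Estimate peak teraflops from a core count, under the assumptions above."""
    sockets = total_cores / CORES_PER_SOCKET
    return sockets * GFLOPS_PER_SOCKET / 1000  # gigaflops -> teraflops


isambard_tflops = system_peak_tflops(10_000)
print(f"about {isambard_tflops:.0f} teraflops")
```

Since the machine is described as having more than 10,000 cores, this is best read as a lower-bound estimate of peak performance, not a measured result.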
Pure number crunching is not really the ThunderX2’s forte, though. Cavium designed the processor with a much greater focus on memory bandwidth and capacity. Its eight DDR4 memory controllers support up to 16 DIMMs per socket for a maximum of 2 TB, or 4 TB per dual-socket node. As a result, the chipmaker is claiming 33 percent higher memory bandwidth and memory capacity compared to the top-of-the-line Xeon Skylake chips.
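The 33 percent bandwidth figure is consistent with a simple channel count comparison. The sketch below assumes both processors run DDR4 at 2,666 megatransfers per second over 64-bit (8-byte) channels, and that the Skylake part has six memory channels per socket. None of those details appear in Cavium's claim as reported here, so treat this as an illustration and not the company's own methodology.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s=2666, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth per socket in GB/s.

    The DDR4-2666 speed and 8-byte channel width are assumptions.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


thunderx2 = peak_bandwidth_gbs(channels=8)  # eight controllers, per the article
skylake = peak_bandwidth_gbs(channels=6)    # assumption: six channels per socket
advantage_pct = (thunderx2 / skylake - 1) * 100

print(f"ThunderX2: {thunderx2:.1f} GB/s, Skylake: {skylake:.1f} GB/s")
print(f"Advantage: {advantage_pct:.0f} percent")
```

Because the memory speed cancels out of the ratio, the advantage depends only on the eight-to-six channel count under these assumptions.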
That said, the real message that Cavium is pushing is about price-performance. In quantity, ThunderX2 chips list from $800 to $1,795, which is well under what Intel is charging for comparably performing Xeon processors. Cavium is claiming two to four times better performance per dollar compared to Intel’s latest silicon.
Besides the aforementioned installations in the UK and at Argonne, ThunderX2 servers will also be deployed by Sandia National Labs, the Mont-Blanc Project, and Microsoft (Azure). Presumably now that the Cavium chips are rolling out of the fabs, all these systems can be built and deployed. None of them, not even Isambard, will attain a spot on the TOP500 list though. But if these initial machines are successfully employed, they could lay the groundwork for much larger systems in the not-too-distant future.