News

Fujitsu Switches Horses for Post-K Supercomputer, Will Ride ARM into Exascale

June 23, 2016

By: Michael Feldman

ARM has been something of a stealth architecture in the battle to unseat the x86 as the dominant platform for high performance computing systems. That lower profile changed this week at the ISC 2016 conference, where Fujitsu announced it would develop an ARM processor for its Post-K exascale supercomputer. But the effort promises to have a much wider impact on the HPC landscape than just a single system.

The announcement was made at the ISC 2016 conference during one of the Vendor Showdown sessions by Toshiyuki Shimizu, who heads up Fujitsu’s Next Generation Technical Computing Unit. As part of his presentation, he revealed that the company’s exascale supercomputer, scheduled for completion in 2020, will be powered by 64-bit ARM processors designed to run HPC applications. RIKEN, Japan’s largest and most prestigious scientific research institute, will be the recipient of the future system. The HPC processor design work will be done in collaboration with ARM Holdings, the firm that develops and licenses ARM intellectual property.

This Post-K system represents the fourth generation of Fujitsu’s “K” supercomputing line, which up until now was based on SPARC64 processors. The original K computer, installed five years ago, was an 8-petaflop supercomputer, which in 2011 was the number one system on the TOP500 list. It used the SPARC64 VIIIfx processor as the CPU, which was designed and developed specifically for supercomputing. The K computer was subsequently updated to over 11 petaflops, and is currently perched in the number five slot on the list.

Two later versions of that processor, the SPARC64 IXfx and the SPARC64 XIfx, were developed for the FX10 and FX100 supercomputer lines, respectively. It’s unclear how many total HPC systems Fujitsu built with these SPARC64 chips, but currently there are only seven on the TOP500 list, including the original K. Assuming these represent most or all of the deployments, Fujitsu didn’t get much payback from those development efforts.

With that in mind, it’s not all that surprising that the company decided to forgo the SPARC64 in favor of a more mainstream architecture. Fujitsu never joined the OpenPower consortium, so a Power CPU was not in the cards. The Post-K developers could have opted for x86, using both Xeons and Xeon Phi processors (or Tesla GPUs, for that matter), but Fujitsu already provides those options in its PRIMERGY line. PRIMERGY is the company’s HPC cluster offering, although these systems can get quite large, as in the 25-petaflop Oakforest-PACS supercomputer, which is scheduled to boot up later this year.

The Post-K system for RIKEN was always intended to be a more custom endeavor, not just in the processor department, but also with regard to the system network, which in this case will be Fujitsu’s Tofu interconnect that will be inherited from the FX10 and FX100. At some point, we assume the Post-K design will be productized (FX1000?), but it’s probably a little too early to be thinking about exascale systems that sell by the dozens.

In retrospect, selecting ARM for their exascale systems makes a good deal of sense. Fujitsu will be able to leverage the very large ecosystem that surrounds that architecture, while still being able to provide differentiation based on their implementation of the chip. The intent here is to build an ARM architecture geared to HPC workloads, with the level of floating point performance that that implies. That means the chip will offer much more powerful vector processing capabilities than are currently available in ARMv8, as well as other features associated with muscular architectures like the SPARC64.

One could speculate that ARM was also chosen for its energy-efficiency cred. Designing a manycore CPU, which the Post-K processor will almost certainly be, with a simpler RISC core as its base, is inherently more efficient than trying to do that with a more complex architecture like SPARC64. But by the time you add all the HPC goodies into the mix, the result is probably not going to be all that different from, say, a Xeon Phi. Nevertheless, there is a great deal of experience in the ARM community on how to optimize performance per watt.

Significantly, Fujitsu is not alone in this effort. It will work with ARM Holdings to develop the HPC extensions to the ARMv8 specification, which presumably will be available to any licensee. There may be other vendors involved as well. While Fujitsu is spearheading this work, the company is characterizing its role as “the lead partner of the ARM HPC extension effort.”

Cavium, which makes ARM ThunderX server processors, seems to be quite excited to see this roadmap move forward. While at ISC, company representatives told TOP500 News that the HPC ARM effort will not only provide a common specification for the technology, but also a lot of the critical system software, which will be developed as a result of the related exascale project for RIKEN. And this being HPC, a lot of that software will end up in open source repositories for everyone to use.

Any new architecture that results from the ARMv8 upgrade won’t affect Cavium’s ThunderX2 chip, which is already in the pipeline, but maybe their 3rd-generation chip will be able to incorporate the new specification. Other ARM chip vendors with an eye on the HPC market – AppliedMicro and AMD come to mind – could also be interested in such projects.

At this point, the timeline for the HPC ARM specification is unknown, as are the specifics of its features and capabilities. But not for long. On August 22nd, at the Hot Chips Conference in Cupertino, California, all of these questions should be answered. Nigel Stephens, Lead ISA Architect and ARM Fellow, will present the new architecture in a talk titled, “ARMv8-A Next Generation Vector Architecture for HPC.”

In any case, the HPC processor story just got a whole lot more interesting. Having a vanilla 64-bit ARM platform for generic datacenter work and technical codes that are easy to parallelize is one thing; providing a performance-oriented design in an architecture that can be licensed by anyone is a potential game-changer.