March 6, 2017
By: Michael Feldman
Japanese computer maker Fujitsu has announced it will build a deep learning supercomputer for RIKEN that will be used to spur research and development of AI technology. The new machine, which is scheduled to go into operation in April, will be a blend of NVIDIA DGX-1 and Fujitsu PRIMERGY RX2530 M2 servers.
The yet-to-be-named system will be used at the Center for Advanced Intelligence Project (AIP), a group established by RIKEN in 2016 that specializes in R&D related to AI, big data, IoT and cybersecurity. The mission statement summarizes their work as follows: “Our center aims to achieve scientific breakthrough and to contribute to the welfare of society and humanity through developing innovative technologies. We also conduct research on ethical, legal and social issues caused by the spread of AI technology and develop human resources.”
Its intended user base will be AI researchers at universities and other institutions in Japan, as well as practitioners in healthcare, manufacturing, and other commercial domains. Of particular interest are AI technologies that can help solve domestic issues especially relevant to Japan, such as healthcare in aging populations, response strategies to natural disasters, regenerative medicine, and robotics-based manufacturing.
According to Fujitsu, the new system will deliver four half-precision (16-bit floating point) petaflops, essentially all of which are derived from the DGX-1 servers. Each server houses eight Tesla P100 GPUs, representing 170 peak teraflops at half-precision. Fujitsu’s contribution will be integrating the 24 DGX-1 boxes with 32 of its own PRIMERGY RX2530 M2 servers, along with a Fujitsu-made storage system. The latter is made up of six PRIMERGY RX2540 M2 PC servers, which will run FEFS, a parallel file system developed by Fujitsu. The storage itself will consist of eight ETERNUS DX200 S3 units and one ETERNUS DX100 S3 unit.
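The quoted performance figures can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the constants come from the numbers cited above, while the function names are hypothetical and the results are peak, not sustained, throughput.

```python
import math

# Peak half-precision figures quoted in the article.
TFLOPS_PER_DGX1 = 170.0  # one DGX-1 server
GPUS_PER_DGX1 = 8        # Tesla P100 GPUs per server
TARGET_PFLOPS = 4.0      # quoted total for the system


def per_gpu_tflops(server_tflops, gpus):
    """Peak teraflops contributed by each GPU in a server."""
    return server_tflops / gpus


def servers_for_target(target_pflops, server_tflops):
    """Smallest whole number of servers whose combined peak reaches the target."""
    return math.ceil(target_pflops * 1000.0 / server_tflops)


def aggregate_pflops(n_servers, server_tflops):
    """Combined peak petaflops of n identical servers."""
    return n_servers * server_tflops / 1000.0


if __name__ == "__main__":
    n = servers_for_target(TARGET_PFLOPS, TFLOPS_PER_DGX1)
    print("Per-GPU peak (TF):", per_gpu_tflops(TFLOPS_PER_DGX1, GPUS_PER_DGX1))
    print("DGX-1 servers needed for the target:", n)
    print("Aggregate peak of those servers (PF):", aggregate_pflops(n, TFLOPS_PER_DGX1))
```

At 170 teraflops per server, a total of roughly four petaflops works out to about two dozen DGX-1 boxes, with each P100 contributing a little over 21 teraflops.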
This is the third supercomputer unveiled in Japan within the last six months that has been significantly influenced by AI requirements. The first, known as the AI Bridging Cloud Infrastructure (ABCI), was announced by the National Institute of Advanced Industrial Science and Technology (AIST) at SC16 in November. When completed in late 2017, this 130-petaflop (half-precision) system will be used to help support commercial AI deployment in Japan. The second system, TSUBAME 3.0, will be Tokyo Tech’s attempt to bring a lot of AI capability into the next generation of this lineage. This system is expected to deliver 47 half-precision petaflops when installed later this summer.
Both ABCI and TSUBAME 3.0 will fulfill the role of a general-purpose supercomputer, running conventional HPC applications alongside deep learning workloads. Unlike those two systems, the RIKEN machine, besides being quite a bit smaller, also looks to be completely devoted to running deep learning applications.