Jan. 4, 2017
By: Michael Feldman
In June 2016, China leapfrogged the HPC competition with its 93-petaflop Sunway TaihuLight supercomputer. Then in November it reached parity with the US on the TOP500 list in total number of systems and aggregate performance. But China’s supercomputing capabilities are in many respects still a work in progress.
China’s newly won respect in high performance computing is largely based on the top petascale machinery the country has been churning out over the past five years. TaihuLight, Tianhe-2, Tianhe-1A, and Nebulae have, at various times, held the number one or number two spots in the TOP500 rankings. The first two, TaihuLight and Tianhe-2, are currently the top two systems on the list. That’s a notable achievement for a country whose most powerful supercomputer just a decade ago didn’t crack the top 50.
However, much of China’s current supercomputing capability is concentrated in these top two systems. Together they represent more than half of the aggregate Linpack performance of all the country’s TOP500 systems -- 126.9 petaflops out of 223.6 petaflops. That’s because the majority of Chinese machines reside in the bottom half of the list. While both the US and China have the same number of supercomputers in the TOP500 (171), China’s median rank is 316, compared to 227 for the US.
More importantly, most of the Chinese systems on the list are probably not being used to run HPC workloads. Of the 171 total systems, 114 are generic x86/Ethernet clusters installed at internet companies, cloud service providers, telecom firms, and electricity companies. Another 25 or so are installed at unnamed government installations. A couple dozen of these generic systems are equipped with NVIDIA Tesla GPUs, suggesting they are being used for some sort of HPC work (for example, neural net training), but the vast majority look like large clusters running web-based or back office applications. Only a handful of systems are installed at commercial sites, mainly in the financial services sector.
Liu Jun, Inspur's HPC general manager, admits as much. In a China Daily article published in November, he noted that most US-based supercomputers are installed at national labs, universities and research institutes. “But many of our supercomputers are just Internet data centers,” he said. “In the US, they are not considered supercomputers. Universities and research institutes in China need this infrastructure, but our investments for them are small, so much more work needs to be done.”
A more visible problem for the Chinese HPC community is its lack of software expertise, both in the application realm and in system software. The latter is especially critical because China is committed to using home-grown processors and interconnects (like the ShenWei 26010 chip that powers TaihuLight and the TH Express-2 network in Tianhe-2) for at least some of its top-flight supercomputers. That reliance on non-standard hardware also makes developing applications and algorithms all the more difficult.
That didn’t prevent a team of developers from using TaihuLight to capture the Gordon Bell Prize in November. The developers were able to scale their weather research application to 10 million cores (8 petaflops) on the reigning TOP500 champ. It represented the first time a team from China had been awarded the Gordon Bell Prize, and it did so on highly customized hardware.
Not every Chinese supercomputer is the recipient of such devoted attention, however. A recent article in the South China Morning Post noted that the powerful Tianhe-1A supercomputer located in Tianjin is not able to contribute much to forecasting smog levels in Beijing and the northern parts of the country. Instead, meteorologists there rely on an older, much less powerful IBM Flex System p460 to perform the needed simulations. The article stated that, according to state media reports, when the Tianhe system was launched, it was supposed to provide forecasts for the entire mainland. (The machine is somewhat notorious for its lack of application support.) It’s noteworthy that Tianhe-1A is based on standard Intel Xeon processors and NVIDIA GPUs.
The larger problem for China, and one that is shared with other nations, is that there is no de facto standard platform for supercomputers going forward. According to James Lin, Vice Director of the HPC Center at Shanghai Jiao Tong University, three pre-exascale systems are scheduled to be deployed in China later this year, each based on a different design. The first is a National University of Defense Technology (NUDT) system using ARM processors of some kind, which will be deployed at the National Supercomputing Center of Tianjin. The second is a Sugon x86-based platform to be installed at the Shanghai Supercomputer Center and the National Supercomputing Center in Shenzhen. (The technology will supposedly be licensed from AMD). The third system will be powered by the next generation of the ShenWei processor and be built for the National Supercomputing Center in Jinan. In July, Lin tweeted that “the winner will be chosen to build the ‘exascale system’ in peak performance by 2020.”
In a sense, China is unlucky that it picked a time of great architectural flux to build up its supercomputing credentials. Another complication is that, to some extent, the US government forced China’s hand to develop some of its technology domestically. In 2015, the US slapped export restrictions on China, blocking Intel and other chipmakers from selling high-end processors to certain government-run supercomputing sites that were suspected of using the technology to develop nuclear capabilities. Whether China ends up relying completely on home-grown designs or on those licensed from ARM, AMD, OpenPOWER, or whoever, remains to be seen.
Ultimately though, China’s supercomputing prowess is more likely to be driven by the depth of its demand for HPC than by its ability to supply the technology. And in this regard, the country’s size and aggressive industrial policy provide some real advantages. A Wall Street Journal report published last month documents China’s ambitions to move to the forefront across a range of 21st-century industries – artificial intelligence, robotics/drones, internet infrastructure, and quantum communications, among others.
To accomplish this, China is ramping up its federal R&D spending significantly. According to the Journal report, funding for science in China rose to $10.1 billion in 2015, a five-fold increase from 2010. That trajectory enabled China to overtake Japan in 2009 and Europe in 2013. At the current pace, it is expected to eclipse US R&D spending by 2020.
Conveniently, that would be just in time for the first deployments of exascale supercomputers. But more significantly, world-leading R&D spending would jumpstart the fledgling HPC vendor community in China. Lenovo, Sugon, Inspur and Huawei all have aspirations to bring their HPC products to a global market. Once demand for Chinese HPC takes off, these companies will vie with the likes of HPE and Dell for market leadership. For China, the best is yet to come.