Nov. 8, 2016
By: Michael Feldman
The prospect of FPGA-powered supercomputing has never looked brighter. The availability of more capable chips, the maturation of the OpenCL toolchain, Intel’s acquisition of Altera, and Microsoft’s deployment of the world’s largest fleet of FPGAs in the datacenter all suggest that reconfigurable computing may finally fulfill its promise as a major technology for high performance computing.
For decades, this technology has been touted as a nimble and highly energy-efficient architecture for throughput computing. Because an FPGA can be reconfigured, it can essentially become a soft ASIC optimized to run a specific workload. What has held the technology back is programmability, which until recently relied on hardware description languages and other low-level tools beyond the reach of the average application developer.
The current situation harkens back to GPGPU computing ten years ago, when NVIDIA embarked on its CUDA architecture. As recently as a couple of years ago, the biggest FPGA players, Altera and Xilinx, were only marginally interested in pushing their wares into the datacenter. A handful of small FPGA system providers for the HPC market -- BittWare, Gidel, Nallatech, DRC, and a few others -- struggled to move the technology into the mainstream, but were too tiny to develop any critical mass. That has changed dramatically over the last two years.
In particular, the involvement by Intel (the world’s largest chipmaker) and Microsoft (the world’s largest software provider) is a sign that these chips are likely going to be accelerating a lot more workloads in the datacenter in the near future. These include application areas such as machine learning, encryption/decryption, data compression, network acceleration, and scientific computing, among others. Widening the application aperture will spur a more complete and robust toolset for FPGAs, which, in turn, will encourage their adoption by even more users, including those running traditional HPC codes.
With the backing of these two major players, the software and hardware ecosystem for this technology should ramp up significantly over the next few years. For its part, Microsoft has already deployed Altera FPGAs across its Azure cloud to accelerate deep learning applications in web-based image and voice recognition, search, and a variety of other services. Thanks to that effort, the company says it now has an exaflop’s worth of performance in Azure, which would easily make it the world’s largest deployment of server-based FPGAs.
Baidu seems to be following in Microsoft’s footsteps with its own FPGA build-out. In October, the company revealed it was using Xilinx FPGAs to accelerate machine learning applications running in its cloud. The exact size of the current deployment is unknown at this point, but if it’s anything approaching the scale of what Microsoft has done, one could argue the two companies have kick-started a new market.
Xilinx, by the way, has also hooked up with IBM as part of the OpenPOWER alliance to become Big Blue’s go-to provider of FPGA hardware. A year ago, the two formed a multi-year strategic partnership to bring the Power and Xilinx FPGA ecosystems together. Part of that partnership taps IBM’s system software group to fill out the system software and middleware stack for FPGAs. IBM will also be designing Power-based motherboards equipped with Xilinx chips. Target applications include machine learning, network functions virtualization (NFV), genomics, high performance computing, and big data analytics.
Meanwhile, Intel is planning to start shipping Xeon CPUs paired with FPGA logic as early as this quarter. In October, the company launched Stratix 10, its first FPGA derived from Altera designs. Stratix 10 offers the more conventional ARM-FPGA integration, along with hardened floating-point DSP blocks that deliver up to 10 teraflops of single-precision number-crunching on-chip. Intel has also taken up the cause of OpenCL for FPGAs, and will undoubtedly devote considerable resources to a more complete developer toolset and system software stack for these devices as it rolls out more products. Intel execs have stated they expect FPGAs to be used in 30 percent of datacenter servers by 2020.
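To give a sense of what that OpenCL path looks like, here is a minimal, hypothetical kernel sketch; the kernel name, arguments, and single work-item loop style are illustrative rather than drawn from any vendor example. FPGA OpenCL compilers, such as the Altera SDK for OpenCL covered in the SC16 tutorial mentioned below, typically turn a loop like this into a deep hardware pipeline rather than scheduling it across fixed cores.

    /* Illustrative OpenCL C kernel (saxpy-style), written in the
       single work-item style commonly recommended for FPGA targets. */
    __kernel void saxpy_fpga(__global const float * restrict x,
                             __global const float * restrict y,
                             __global float * restrict out,
                             const float alpha,
                             const int n)
    {
        /* On an FPGA, the compiler pipelines this loop, so once the
           pipeline fills, a new result can be produced every clock cycle. */
        for (int i = 0; i < n; i++)
            out[i] = alpha * x[i] + y[i];
    }

The main practical difference from GPU OpenCL is that a kernel like this is compiled offline into an FPGA configuration ahead of time rather than just-in-time at runtime, which is where much of the remaining tooling work lies.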
If hyperscale cloud providers are the first big users of reconfigurable computing, HPC users are likely to be the second. That wouldn’t be apparent from looking at deployments today: there is not a single machine on the TOP500 list equipped with FPGAs, and the technology sees only limited use in traditional supercomputing, in areas like financial portfolio analysis and genome analysis.
At this week’s supercomputing conference (SC16) in Salt Lake City, the future of FPGAs will be on display throughout the week. On Monday morning, the Second International Workshop on Heterogeneous Computing with Reconfigurable Logic will be held. The workshop will feature Xilinx’s Michaela Blott along with a number of FPGA practitioners, who will talk about the latest advances in FPGA technology for the high performance crowd. On Tuesday, a birds-of-a-feather (BoF) session titled Reconfigurable Supercomputing will provide a discussion of different architectures and development tools. And on Thursday, another BoF, Use Cases of Reconfigurable Computing Architectures for HPC, will present some FPGA proof points, as well as the challenges hindering more widespread adoption. That session will be hosted by Intel’s Marie-Christine Sawley and Hans-Christian Hoppe and Berkeley Lab’s John Shalf. There is also a tutorial on Altera’s software development kit for OpenCL, which will highlight some FPGA use cases and discuss coding techniques for the application programmer.
There’s a decent chance we’ll see an announcement or two at SC16 regarding this topic, so stay tuned to TOP500 News and our This Week in HPC podcast for updates.