News

HPC Power Compiler, Accelerator Support in the Works

June 13, 2016

By: Michael Feldman

OpenACC, the accelerator standards body, announced some new developments this week that should make GPU computing aficionados happy. Perhaps the most significant news is that a new OpenACC compiler, which will target Power-based HPC clusters equipped with NVIDIA’s Tesla GPUs, is nearing its commercial release. The compiler, under development by the Portland Group (PGI), is now being tested with some well-connected customers and will be released as a public beta in August.

The other OpenACC news has mostly to do with organizational dealings, including the addition of three new members: the University of Illinois at Urbana-Champaign (UIUC), Stony Brook University, and Brookhaven National Laboratory. They join 20 other organizations that have already signed on to support the OpenACC standard. There was also an update on hackathon schedules, training, and workshops, the details of which are all provided on the OpenACC website.

There was also a nice little success story with NekCEM, a computational electromagnetics package, which was ported to GPUs using OpenACC. The developers claim they achieved a 2.5x speedup compared to a highly tuned CPU-only version, further claiming that a GPU used just 39 percent of the energy needed by 16 CPUs to perform the equivalent computation in the same amount of time.
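
For those unfamiliar with the directive-based approach, a port like this largely amounts to annotating existing loops with OpenACC pragmas and letting the compiler generate the GPU code. The snippet below is a minimal, hypothetical sketch in C; the function and its arguments are invented for illustration, not drawn from NekCEM itself:

    /* Scale a field array on the accelerator; removing the pragma leaves an
       ordinary CPU loop, which is the essence of a directive-based port. */
    void scale_field(int n, double alpha, const double *restrict field,
                     double *restrict out)
    {
        #pragma acc parallel loop copyin(field[0:n]) copyout(out[0:n])
        for (int i = 0; i < n; i++)
            out[i] = alpha * field[i];
    }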

According to OpenACC president Duncan Poole, whose day job is Director of Platform Alliances for Accelerated Computing at NVIDIA, the expanding membership and application set are evidence of the standard’s growing acceptance and success. “The user community has taken on a life of its own,” he said.

Getting back to the upcoming OpenACC Power compiler, NVIDIA and PGI announced their intention to provide such a capability more than a year and a half ago as part of the greater effort to support the OpenPOWER consortium. The goal is to allow developers to retarget existing applications to Power-based GPU clusters with just a simple recompile of the source code. According to PGI lead engineer Michael Wolfe, the Power compiler will have feature parity with PGI’s other compilers, supporting OpenMP for CPUs and OpenACC for GPUs, along with CUDA Fortran. It uses a PGI front-end and an LLVM back-end for code generation.
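
In practice, that combination means a single source file can use OpenMP directives for host-side parallelism and OpenACC directives for GPU offload, selected by compiler flags at build time. Here is a rough sketch; the compile line borrows flags from PGI’s existing x86 compilers (-mp for OpenMP, -acc -ta=tesla for OpenACC targeting Tesla GPUs), and whether the Power beta exposes exactly the same switches remains to be seen:

    /* Illustrative build command, borrowing PGI's existing x86 flags:
           pgcc -mp -acc -ta=tesla -Minfo=accel vec_ops.c
       The Power compiler is assumed, not confirmed, to use the same switches. */
    #include <math.h>

    /* CPU-side reduction parallelized with OpenMP */
    double host_norm(int n, const double *x)
    {
        double sum = 0.0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += x[i] * x[i];
        return sqrt(sum);
    }

    /* GPU-offloaded vector update expressed with OpenACC */
    void device_axpy(int n, double a, const double *restrict x, double *restrict y)
    {
        #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }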

The driving force behind development of the Power compiler is the CORAL effort for the US Department of Energy. Funding for CORAL, which stands for Collaboration of Oak Ridge, Argonne, and Lawrence Livermore, will be used to build and install a series of pre-exascale systems at the three national labs. Two of these systems, Summit and Sierra, which are headed to Oak Ridge and Lawrence Livermore, respectively, will be IBM Power9-based machines, accelerated by NVIDIA Volta GPUs. According to Wolfe, the OpenACC Power compiler was a line item in that particular contract. IBM and NVIDIA, of course, are hoping Power-GPU clusters, like the CORAL supercomputers, become more commonplace, both before and after the systems come online in 2018.

Wolfe said PGI would also like to support Intel’s manycore Xeon Phi as an OpenACC target, but the company has not committed resources to that effort. Before PGI was acquired by NVIDIA in 2013, the compiler-maker was developing such a port using “Knights Corner” Xeon Phi processors on loan from Intel. After the acquisition, Intel took back the processors and the project was put in limbo. Even as late as last year, a Xeon Phi port was on the PGI roadmap and was scheduled to debut in 2016, but apparently the effort was abandoned.

When OpenACC was launched in 2011 with the support of NVIDIA, Cray, PGI, and CAPS, it was in line to become an industry standard for accelerators. But Intel favored the emerging OpenMP accelerator model, which it had a hand in developing and which, not surprisingly, was a better fit with its manycore Xeon Phi architecture. When OpenMP accelerator support was first released in 2013, it was hoped OpenACC would be folded into the OpenMP accelerator specification. But the two models had some fundamental differences in technical approaches that were never reconciled and which were undoubtedly exacerbated by the competitive relationship between Intel and NVIDIA.
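
To make the contrast concrete, here is the same loop written in both models. This is a simplified, illustrative comparison rather than a full account of the incompatibilities the committees wrestled with, but it captures the oft-cited distinction that OpenACC leans descriptive, leaving the mapping to the compiler, while OpenMP’s offload directives are more prescriptive about how work and data are laid out on the device:

    /* OpenACC: describe the parallel region and data movement, and let the
       compiler decide how to map the loop onto the accelerator. */
    void saxpy_acc(int n, float a, const float *restrict x, float *restrict y)
    {
        #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    /* OpenMP 4.x target offload: the programmer spells out the team/thread
       decomposition and the data mapping explicitly. */
    void saxpy_omp(int n, float a, const float *restrict x, float *restrict y)
    {
        #pragma omp target teams distribute parallel for \
                map(to: x[0:n]) map(tofrom: y[0:n])
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }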

Wolfe says a number of people are active members of both standards committees. But according to him, it would take a concerted effort to get everyone to agree on a unified standard. “There’s no active discussion about merging them,” admitted Wolfe.