Aug. 3, 2017
By: Michael Feldman
CSRA, a system integrator and service company, has installed the second phase of the Biowulf supercomputer at the National Institutes of Health (NIH), more than doubling the system’s capacity.
Biowulf was built to serve biologists, medical researchers, and other life scientists associated with NIH projects. Those include research efforts in genomics, molecular biology, bioimage analysis, and structural biology, to name a few. The system hosts dozens of software packages that support these areas, as well as an array of scientific databases.
Biowulf, an HPE Apollo XL1x0r cluster, was initially installed in 2016, and currently sits at number 139 on the TOP500 list. Its peak performance of 1.23 petaflops yielded a Linpack mark of 991.6 teraflops. The phase 1 system is powered by Broadwell-generation Xeon processors, and uses Mellanox FDR InfiniBand as the system interconnect for both the compute nodes and the main storage array. Ethernet provides connectivity to the NIH wide area network, known as NIHnet, and the NFS storage. The system also provides 14 petabytes of GPFS storage, courtesy of DataDirect Networks (DDN).
According to the CSRA press release, the Biowulf upgrade includes an additional 1,104 CPU nodes representing 1.2 peak petaflops of extra capacity, along with 72 GPU nodes that were added to the existing 2,372-node cluster. If the Biowulf website has been updated correctly, those GPUs are NVIDIA K80s, with two per node. That would bring the GPU contribution alone to over 400 teraflops, and the original cluster, with its new GPU nodes, to about 1.6 peak petaflops. Adding the 1,104 new CPU nodes brings the peak capacity of the entire system to about 2.8 petaflops.
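For readers who want to check that arithmetic, here is a rough sketch of the tally in Python. The node counts and CPU figures come from the numbers cited above; the per-GPU figure of 2.91 teraflops is an assumption, based on NVIDIA's quoted double-precision peak for the K80 with GPU Boost enabled, and is not taken from the CSRA release.

```python
# Back-of-the-envelope tally of Biowulf's peak capacity, in teraflops.
# CPU figures and node counts are the ones cited in the article.
PHASE1_PEAK_TF = 1230.0       # 1.23 petaflops for the original cluster
PHASE2_CPU_PEAK_TF = 1200.0   # 1.2 petaflops for the 1,104 new CPU nodes
GPU_NODES = 72
K80S_PER_NODE = 2

# ASSUMPTION: 2.91 teraflops double-precision peak per K80 (with GPU Boost).
K80_PEAK_TF = 2.91


def gpu_peak_tf(nodes, gpus_per_node, tf_per_gpu):
    """Aggregate peak of all GPUs: nodes x GPUs per node x peak per GPU."""
    return nodes * gpus_per_node * tf_per_gpu


gpu_tf = gpu_peak_tf(GPU_NODES, K80S_PER_NODE, K80_PEAK_TF)
phase1_with_gpus_tf = PHASE1_PEAK_TF + gpu_tf
total_tf = phase1_with_gpus_tf + PHASE2_CPU_PEAK_TF

print(f"GPU contribution:          {gpu_tf:.0f} teraflops")
print(f"Original cluster plus GPUs: {phase1_with_gpus_tf / 1000:.2f} petaflops")
print(f"Entire system:              {total_tf / 1000:.2f} petaflops")
```

Under that assumption the GPUs contribute a little over 400 teraflops, and the two totals land slightly above the 1.6 and 2.8 petaflops quoted, which suggests those figures were rounded down. With the K80's lower base-clock rating, the GPU contribution would fall short of 400 teraflops.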
That’s a lot more computational horsepower than the NIH has ever commanded before. Curiously, the press release doesn’t include a quote from any NIH official on what all that extra capacity might be used for. The announcement does, however, offer this:
“The second stage of computing power announced today will enable NIH researchers to make important advances in biomedical fields. This field of research is deeply dependent on computation, such as whole-genome analysis of bacteria, simulation of pandemic spread, and analysis of human brain MRIs. Results from these analyses may enable new treatments for diseases including cancer, diabetes, heart conditions, infectious disease, and mental health.”
The lack of NIH input could reflect uncertainty about the research that will be funded there over the next year. The Trump White House has called for a $1.7 billion reduction for FY2017 and a further decrease of $5.8 billion in FY2018, amounting to almost a 20 percent cutback for the agency. Congress doesn’t appear to be going along with these proposed reductions, however, and has come up with an omnibus agreement to increase spending by $2 billion for at least this fiscal year.
Regardless, the additional capacity in Biowulf will almost certainly fill up with workloads from life scientists who rely on the NIH for computational resources. The desire for the government to provide better healthcare, which drives much of this research, is growing, even in an era when the appetite for public spending is waning.
In a recent interview with the Washington Examiner, NIH Director Francis Collins noted that this type of research can return $8 to the economy for each dollar spent, notwithstanding its ability to improve people’s lives. “This is a really remarkable moment in terms of making rapid progress, whether you're talking about cancer, diabetes, Alzheimer's disease, rare diseases or common diseases,” said Collins. “We are at a particularly exciting moment, scientifically, in terms of the ability to make rapid progress.”