June 9, 2016
By: Michael Feldman
There was a time when the only thing that the high performance computing industry paid attention to was FLOPS. Indeed, for most of the history of HPC, floating point operations per second was the one true metric, and only those machines that delivered them in the largest quantities were deemed to be true supercomputers. Performance, after all, is HPC’s middle name.
Virginia Tech computer science professor Wu-Chun Feng certainly remembers those days, since he was one of the pioneers of energy-efficient supercomputing when almost nobody cared about the topic. He introduced the notion of power-aware HPC at the 2001 Supercomputing Conference, demonstrating his MetaBlade system, a 24-processor bare-bones hunk of hardware that employed low-power (for their day) Transmeta processors. The demonstration generated a lot of interest from attendees that year, but to most of the community, such experiments were just a diversion from big iron performance.
In 2002, Feng refined the MetaBlade design and developed Green Destiny, a 240-node cluster that demonstrated what could be achieved with more sophisticated energy-efficient hardware and dense packaging. Green Destiny delivered just 160 gigaflops of performance, but drew only three to five kilowatts of power. From a pure performance standpoint, it held little interest, but its FLOPS/watt numbers were as good as or better than those of most supercomputers of the day.
In subsequent years, energy efficiency continued to be a back-burner concern, but something had become apparent to even the most performance-obsessed observer. Although the fastest systems in the world had continued their upward performance climb, approximately doubling in performance every year, power consumption was also rising. Energy efficiency was increasing, but at a slower rate than performance.
For example, in 1997, the first terascale supercomputer, ASCI Red, provided 1.3 teraflops of peak performance and took 850 kilowatts to operate. Eleven years later in 2008, the first petaflop system, Roadrunner, delivered 1.7 peak petaflops, while drawing 2.4 megawatts. Performance, from a FLOPS standpoint, had increased a thousand-fold over this period, while energy efficiency had improved by less than half that amount. And Roadrunner was actually an outlier with regard to power usage, since it relied on the highly efficient PowerXCell-8i coprocessor for most of its FLOPS. More conventional supercomputers at that time were a good deal less efficient.
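The arithmetic behind that comparison is easy to reproduce. The short Python sketch below simply divides the peak performance figures quoted above by the corresponding power draw; the inputs are the numbers cited here, and the rounding is mine.

```python
# Efficiency in FLOPS/watt, from the figures quoted above.
asci_red_flops = 1.3e12      # ASCI Red: 1.3 teraflops peak (1997)
asci_red_watts = 850e3       # 850 kilowatts
roadrunner_flops = 1.7e15    # Roadrunner: 1.7 petaflops peak (2008)
roadrunner_watts = 2.4e6     # 2.4 megawatts

eff_red = asci_red_flops / asci_red_watts        # ~1.5 megaflops/watt
eff_rr = roadrunner_flops / roadrunner_watts     # ~708 megaflops/watt

print(f"Performance increase: {roadrunner_flops / asci_red_flops:.0f}x")  # ~1,300x
print(f"Efficiency increase:  {eff_rr / eff_red:.0f}x")                   # ~460x
```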
Architectural differences, especially the choice of processor, can make a big difference with regard to power, but the trend was clear: bigger systems required more power, even when exotic accelerators were used to improve the FLOPS/watt profile. For those with an eye on their HPC datacenter power bill, this spelled disaster. The problem was that most users never paid those bills, at least not directly.
After his foray with Green Destiny, Feng decided a concerted effort was needed to raise the awareness of the power consumption issue with the entire HPC community. In 2006, he launched the Green500, a project whose mission is to promote power-aware supercomputing and track the fastest systems in the world from a performance per watt perspective.
Getting people to report their system’s power usage was a daunting task, since those measurements had to be taken while running a particular code. Just developing the rules for reporting power consumption was a significant challenge. “We tried to balance the need for accurate readings with the ease of getting those readings,” explained Feng.
Feng wanted to develop the perfect metric, the perfect methodology and the perfect benchmark. But fellow HPC whiz and TOP500 author Jack Dongarra told him there was no perfection to be had here. Dongarra advised him just to put his stake in the ground and be ready for any egg-throwing that might ensue.
The choice of Linpack as the benchmark was fairly simple, inasmuch as Feng wanted to track the 500 fastest supercomputers using the same criteria used by the well-recognized TOP500 project. The metric of FLOPS/watt was also a fairly straightforward choice. The methodology was the tricky part, given the various ways to measure power consumption during system operation.
Connecting a power meter and reading the result seems simple enough, but questions quickly arise about where and how to connect the meter, and for how long. Feng developed the original guidelines that he felt would strike the right balance between accuracy and practicality. Those guidelines were later standardized into official run rules, after some careful tweaking, thanks to a collaborative effort between the Green500, the Energy Efficient HPC Working Group (EE HPC WG), the TOP500, and the Green Grid.
The first Green500 rankings were published in November 2007, when an IBM Blue Gene/P system topped the list with a winning metric of 357 megaflops per watt. Fast forward to November 2015, and a Japanese system known as Shoubu was awarded top honors, scoring 7,032 megaflops per watt. That works out to a 20-fold increase in energy efficiency over that eight-year stretch.
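As a quick check on that 20-fold figure, the sketch below computes the overall gain between the two list leaders and the compound annual improvement it implies; the only inputs are the two megaflops-per-watt values quoted above.

```python
# Green500 leaders: Blue Gene/P (Nov 2007) vs. Shoubu (Nov 2015), in megaflops/watt.
leader_2007 = 357.0
leader_2015 = 7032.0
years = 8

gain = leader_2015 / leader_2007          # ~19.7, i.e. roughly 20-fold
annual_rate = gain ** (1 / years) - 1     # implied compound annual improvement

print(f"Overall gain: {gain:.1f}x")                        # 19.7x
print(f"Implied improvement: {annual_rate:.0%} per year")   # ~45% per year
```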
Neither of these systems, nor the vast majority of the most energy-efficient systems on the Green500, show up anywhere near the top of the TOP500 list. In general, the greenest HPC machines are carefully designed and constructed to maximize performance within a given power envelope. Most are smallish, customized constructions that draw just tens of kilowatts of power. Typically, they fall in the bottom half of the TOP500 list.
But the fact that anyone is building such machines, not to mention going through the laborious process of benchmarking them, points to the fact that green supercomputing is getting a lot more attention than it did when Feng started his crusade. That is partially the result of his efforts with the Green500, but the growing interest in minimizing power consumption is also being spurred by the HPC community’s pursuit of exascale supercomputing.
The original goal of a 20 MW Linpack exaflop in 2020 seems like a long shot at this point, but the greater emphasis on energy efficiency over the last eight years by both vendors and users has reduced the exascale power consumption problem to something more manageable. The US Department of Energy’s FastForward and DesignForward exascale programs devoted a good deal of attention to energy-efficient computing, as will the follow-on PathForward program. In Europe, there are a number of exascale efforts in motion, one of which is the Mont-Blanc Project, which is heavily invested in energy-efficient supercomputing.
Based on the top Green500 systems, a 20 MW supercomputer that achieves a Linpack exaflop is not likely to be possible before 2022. Using the trend line of top-ranked TOP500 systems, the date is likely to be somewhat later. There is also some indication that the gains in energy efficiency are slowing, since the initial bump achieved with accelerators earlier in the decade is not likely to be repeated in its latter half.
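For perspective, a Linpack exaflop in a 20 MW envelope implies a system-level efficiency of 50 gigaflops per watt, roughly seven times what Shoubu delivers. The sketch below makes that target explicit; the 45 percent annual improvement rate is an assumption carried over from the 2007–2015 Green500 trend above, not a figure taken from either list.

```python
import math

# What a Linpack exaflop in a 20 MW envelope implies.
exaflop = 1e18                       # FLOPS
power_budget = 20e6                  # watts
required_gf_per_watt = exaflop / power_budget / 1e9      # 50 gigaflops/watt

shoubu_gf_per_watt = 7.032           # Nov 2015 Green500 leader
shortfall = required_gf_per_watt / shoubu_gf_per_watt    # ~7x still to go

# Assumption: the ~45%/year improvement implied by the 2007-2015 trend continues.
assumed_rate = 0.45
years_needed = math.log(shortfall) / math.log(1 + assumed_rate)

print(f"Required efficiency: {required_gf_per_watt:.0f} GF/W ({shortfall:.1f}x Shoubu)")
print(f"At {assumed_rate:.0%}/year, roughly {years_needed:.1f} more years are needed")
```

At that optimistic rate the target sits a little more than five years beyond the November 2015 list; since the accelerator-driven gains are unlikely to repeat, the 2022-or-later estimates above are the safer bet.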
To make matters worse, there are signs that Moore’s Law itself is slowing down, which would reduce future gains in energy efficiency in a fundamental way. Seven years after Roadrunner, the current top system, Tianhe-2, uses 17.8 megawatts to achieve a peak performance of 54.9 petaflops. That represents a 32-fold performance increase since Roadrunner, but only a 4-fold increase in energy efficiency.
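The same back-of-the-envelope arithmetic bears that out, using the peak performance and power figures quoted for the two machines:

```python
# Roadrunner (2008) vs. Tianhe-2 (2015): peak petaflops and megawatts, as quoted above.
roadrunner_pflops, roadrunner_mw = 1.7, 2.4
tianhe2_pflops, tianhe2_mw = 54.9, 17.8

perf_gain = tianhe2_pflops / roadrunner_pflops                                    # ~32x
eff_gain = (tianhe2_pflops / tianhe2_mw) / (roadrunner_pflops / roadrunner_mw)    # ~4.4x

print(f"Performance: {perf_gain:.0f}x higher, efficiency: {eff_gain:.1f}x better")
```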
It’s rather easy to play with the numbers and make projections look as dire or as rosy as one is inclined to. But the fact that such considerations have made their way into nearly every discussion of the future of HPC points to the new status that green supercomputing now enjoys.
In retrospect then, it should not really come as a surprise that the Green500 and TOP500 projects have joined forces, as it were, and unified reporting of the fastest systems in the world. From now on, there will be a single interface to report supercomputing performance and energy efficiency. The two lists themselves will remain separate and on separate websites, at least for the time being, but the submission rules have been integrated.
This will simplify the submission process considerably, but more importantly, it will standardize the data set going forward so that the technology can be tracked and analyzed more accurately. The TOP500 had been tracking power consumption on its own, but with a somewhat different methodology than that of the Green500. Now that the power measurement rules have been standardized, the two lists could theoretically be unified.
Feng will continue to manage the Green500 list and thinks the unification will increase the number of systems that get submitted with power consumption measurements. It will also help realize his original goal of making green computing a mainstream concern for the entire HPC community. “I think the Green500 has contributed to the extent that people now realize that power and energy efficiency are first-order design constraints,” said Feng. “That’s really what I sought out to do.”