June 29, 2018
By: Michael Feldman
According to the two leading analyst firms covering the high performance computing market, the use of the cloud for HPC workloads is looking a lot more attractive to users these days.
At the ISC High Performance conference this week, Intersect360 Research and Hyperion Research presented their respective HPC market forecasts, both of which included some rather encouraging news for the HPC-in-the-cloud crowd.
Intersect360 offered the most upbeat assessment in this regard, noting that cloud spending by HPC customers grew by a whopping 44 percent from 2016 to 2017, calling it a “breakout year” for this product category. According to the company’s market data, that put cloud-based spending at around $1.1 billion for 2017. And even though that represents only about three percent of total HPC revenue for the year, it’s a high-water mark for cloud computing in this space.
Source: Intersect360 Research
To put that in perspective, Intersect360 had the entire HPC market during this period growing at a rather anemic 1.6 percent. While server spending was up by over seven percent and storage by less than three percent, overall growth was held back by decreased spending for networks, software, and non-cloud services.
The surge in cloud use must have been something of a surprise to the Intersect360 team, which as recently as last year was forecasting that spending wouldn’t break $1 billion until 2019. Now, with the new data in hand, they are projecting cloud computing to be the fastest-growing product category for the next five years, reaching nearly $3 billion by 2022.
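As a rough check on how these numbers fit together, the figures quoted above can be run through some back-of-envelope arithmetic. The sketch below is not Intersect360's methodology. It simply takes the rounded values from the article ("around $1.1 billion," "about three percent," "nearly $3 billion") at face value, and the function and variable names are illustrative.

```python
# Back-of-envelope arithmetic on the rounded figures quoted in the article.
# All inputs are approximations, so the outputs are approximations too.

CLOUD_2017 = 1.1          # USD billions ("around $1.1 billion")
GROWTH_2016_2017 = 0.44   # 44 percent growth from 2016 to 2017
CLOUD_SHARE_2017 = 0.03   # "about three percent" of total HPC revenue
CLOUD_2022 = 3.0          # USD billions ("nearly $3 billion"), used as an upper bound


def implied_prior_year(current: float, growth: float) -> float:
    """Value one year earlier, given this year's value and the growth rate."""
    return current / (1.0 + growth)


def implied_total(part: float, share: float) -> float:
    """Total implied by a part and that part's share of the total."""
    return part / share


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0


cloud_2016 = implied_prior_year(CLOUD_2017, GROWTH_2016_2017)
total_hpc_2017 = implied_total(CLOUD_2017, CLOUD_SHARE_2017)
forecast_cagr = cagr(CLOUD_2017, CLOUD_2022, 2022 - 2017)

print(f"Implied 2016 cloud spending: ${cloud_2016:.2f}B")
print(f"Implied 2017 total HPC market: ${total_hpc_2017:.1f}B")
print(f"Implied 2017-2022 cloud CAGR: {forecast_cagr:.1%}")
```

Under those assumptions, 2016 cloud spending works out to roughly $0.76 billion, the total 2017 HPC market to something in the neighborhood of $37 billion, and the five-year forecast to growth of about 22 percent a year. The total-market figure is especially sensitive to the rounding in "about three percent."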
Intersect360 noted big differences in the growth rates of the different cloud sub-categories. For example, spending on raw cycles grew only 5.6 percent from 2016 to 2017, while revenue for software-as-a-service (SaaS) increased by 125 percent.
The big jump in cloud spending was driven by a number of factors, according to the Intersect360 folks, including “increasing facilities costs for hosting HPC, maturation of application licensing models, increased availability of high-performance cloud resources, and a spike in requirements for machine learning applications.”
The Hyperion Research team didn’t offer specific cloud revenue figures or projections during their presentation at the ISC conference, but did note that 64 percent of HPC sites now run at least some of their work in public clouds. That’s up from just 13 percent in 2011. However, according to Hyperion, these same sites used cloud resources for only seven to eight percent of their jobs, which suggests a lot of these users were tapping into the cloud for bursting once their in-house capacity filled up. In fact, the need for extra capacity was the reason these sites cited most often for running some of their applications in the cloud.
According to Hyperion, another important factor driving HPC use in the cloud these days is the demand for special hardware or software features. (It was the number one reason given by government sites.) This is likely to be especially true for users running machine learning applications, where the most recently minted GPUs, like NVIDIA’s V100, have a much higher capability for such workloads than previous models.
Speaking of which, one of the advantages of large public clouds, like AWS, Microsoft Azure, and Google Cloud, is that they are more likely to have newer hardware, in general, than the average HPC cluster. That’s because over the past decade or so, users have been keeping their in-house systems longer than they used to, a trend that is reflected on the TOP500, where the average length of time on the list has doubled in recent years compared to the historical average. This is probably the result of smaller performance increases between succeeding processor generations (due mainly to the demise of Dennard scaling) and a longer time period between those generations (a consequence of the slowdown in Moore’s Law). As a result, the incentive for HPC sites to replace their systems every three years is less than it used to be.
But for cloud providers that buy, deploy, and operate hardware at hyperscale, even marginally better price-performance and performance-per-watt numbers can justify a purchase. The practical ramification is that the two places you’re most likely to encounter the latest Intel Xeon “Skylake” CPUs or NVIDIA V100 GPUs in an HPC setup are the newest systems on the TOP500 and the public clouds.
That said, cloud cycles will continue to be more expensive than in-house cycles for HPC, since in-house HPC systems tend to be fully utilized and their fixed costs are therefore spread across nearly every available hour. That will put an upper limit on cloud usage for high performance computing, but as is apparent from the latest market data from Intersect360 and Hyperion, that upper limit is still to be determined.
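The utilization argument can be made concrete with a small sketch. The prices below are hypothetical placeholders chosen only to show the shape of the trade-off. They do not come from Intersect360, Hyperion, or any cloud provider's price list, and the function names are illustrative.

```python
# Hypothetical illustration of why high utilization favors in-house systems.
# All prices are made-up placeholders, not market data.

def in_house_cost_per_used_hour(cost_per_available_hour: float,
                                utilization: float) -> float:
    """Effective cost of each core-hour actually used on an owned system.

    An owned system costs the same whether or not it is busy, so idle
    hours inflate the cost of the hours that do useful work.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return cost_per_available_hour / utilization


def break_even_utilization(cost_per_available_hour: float,
                           cloud_price_per_hour: float) -> float:
    """Utilization at which owned and cloud core-hours cost the same."""
    return cost_per_available_hour / cloud_price_per_hour


IN_HOUSE = 0.03  # hypothetical USD per available core-hour (owned system)
CLOUD = 0.05     # hypothetical USD per core-hour, paid only when used

for utilization in (0.3, 0.6, 0.9):
    owned = in_house_cost_per_used_hour(IN_HOUSE, utilization)
    cheaper = "in-house" if owned < CLOUD else "cloud"
    print(f"utilization {utilization:.0%}: in-house ${owned:.3f}/hr "
          f"vs cloud ${CLOUD:.3f}/hr -> {cheaper}")

print(f"break-even utilization: {break_even_utilization(IN_HOUSE, CLOUD):.0%}")
```

With these placeholder prices, the owned system is the cheaper option once it is busy more than about 60 percent of the time. That is consistent with the bursting pattern Hyperion describes: a heavily used in-house system handles the baseline, and the cloud picks up the overflow.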