Oct. 3, 2016
By: Michael Feldman
Amazon Web Services (AWS) has announced the availability of a new GPU computing instance based on NVIDIA’s Tesla K80 devices. With up to eight K80s per instance, the web giant has leapfrogged its cloud competition with the most GPU-dense configurations in a public cloud offering.
The addition of the K80 instance, known as P2, was done to shore up the high end of Amazon’s Elastic Compute Cloud (EC2) portfolio. The EC2 upgrade is an indication that demand for GPU capabilities for HPC, machine learning, and data analytics jobs is growing. AWS lists artificial intelligence, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, genomics, and rendering as probable use cases for the new instance.
The P2 is the first new EC2 instance aimed at GPU-accelerated computing since Amazon introduced the CG1 instance back in 2010. The CG1 is powered by the now-ancient M2050 GPUs, from NVIDIA’s Fermi generation. EC2’s other active GPU instance, the G2, is based on an NVIDIA GRID product, the GRID K520 GPU, and targeted at cloud-based graphics applications such as virtual desktops, rendering, and video encoding. The GRID K520 can tackle traditional HPC simulation and modeling jobs as well, but since it’s a single precision processor with no memory error correction, its utility for HPC workloads is fairly limited.
The P2 instances come in three configurations of 1, 8, and 16 GPUs (with 2 GPUs per physical K80), paired with 4, 32, and 64 virtual CPUs, respectively. The CPU hardware in this case is based on Intel’s Xeon E5-2686 v4 “Broadwell” processor. In the top-end configuration, the 16 GPUs are fed by 732 GB of host memory, 192 GB of GPU memory, and 20 Gbps of EC2’s home-grown networking. Memory capacity and networking speeds scale down proportionally with the smaller GPU configurations.
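The configurations above can be sketched in a few lines of code. The instance-type names (p2.xlarge, p2.8xlarge, p2.16xlarge) are assumptions based on AWS's usual naming convention; the GPU and vCPU counts, and the 192 GB aggregate GPU memory, come from the figures quoted above.

```python
# Sketch of the three P2 sizes as described in the article.
# Instance-type names are assumed from AWS's naming convention.
P2_SIZES = {
    "p2.xlarge":   {"gpus": 1,  "vcpus": 4},
    "p2.8xlarge":  {"gpus": 8,  "vcpus": 32},
    "p2.16xlarge": {"gpus": 16, "vcpus": 64},
}

# 192 GB of GPU memory across 16 GPUs works out to 12 GB per K80 GPU.
GPU_MEM_GB_PER_GPU = 192 / 16

def gpu_memory_gb(instance_type: str) -> float:
    """Total GPU memory for an instance, assuming 12 GB per K80 GPU."""
    return P2_SIZES[instance_type]["gpus"] * GPU_MEM_GB_PER_GPU

for name, spec in P2_SIZES.items():
    print(f'{name}: {spec["gpus"]} GPUs, {spec["vcpus"]} vCPUs, '
          f'{gpu_memory_gb(name):.0f} GB GPU memory')
```

Note the fixed 4:1 ratio of vCPUs to GPUs across all three sizes, which is what lets memory and networking scale down proportionally.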
Although this is a big step up for EC2, the P2 instance does not offer NVIDIA’s newest big server silicon, which would be the Pascal-generation P100 or perhaps P40 GPUs. Instead, Amazon opted for the Kepler-generation K80. With the K80, cloud users will get access to plenty of single and double precision flops: 8.7 SP teraflops and 2.9 DP teraflops per K80, respectively. That’s a far cry from the P100, which provides 4.7 DP teraflops, 9.3 SP teraflops, and 18.8 half-precision teraflops, the latter included to optimize deep learning applications. Also, thanks to the P100’s high bandwidth memory, the Pascal device enjoys nearly twice the bandwidth of products like the K80, which are outfitted with GDDR5 memory.
Such is the nature of public cloud infrastructure, where price-performance considerations tend to drive deployment decisions. No one knows what Amazon paid for the K80 modules, but we can be fairly certain that whatever deal they got, the FLOPS per dollar number was better than they could have achieved with the newer, more expensive Pascal products.
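To make that price-performance argument concrete, here is a back-of-the-envelope sketch using the per-device throughput figures quoted above. The unit prices are hypothetical placeholders — as noted, Amazon's actual procurement costs are unknown — but they illustrate how an older part can win on FLOPS per dollar even while losing on raw FLOPS.

```python
# Back-of-the-envelope FLOPS-per-dollar comparison.
# DP TFLOPS figures are from the article; the unit prices are
# HYPOTHETICAL placeholders, not actual procurement costs.
SPECS = {
    #             (DP TFLOPS, hypothetical price in USD)
    "Tesla K80":  (2.9, 4000.0),
    "Tesla P100": (4.7, 9000.0),
}

def dp_tflops_per_dollar(name: str) -> float:
    tflops, price = SPECS[name]
    return tflops / price

for name in SPECS:
    print(f"{name}: {dp_tflops_per_dollar(name):.6f} DP TFLOPS/$")
```

With these placeholder prices, the K80 delivers more double-precision throughput per dollar than the P100 despite the P100's higher absolute performance — the kind of calculus that plausibly drove Amazon's choice.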
Amazon is not the only cloud provider offering GPU computing. NVIDIA currently lists nine providers, including some of the other big players like Microsoft (Azure) and IBM (SoftLayer). Both Azure and SoftLayer offer K80 instances in their respective clouds, although not in such GPU-heavy configurations as EC2’s P2. Also, for the time being, access to Azure K80s is still limited to preview customers. For those with more modest GPU compute needs, Penguin’s on-demand service and Alibaba’s cloud offer access to NVIDIA’s K40, a less powerful, single-GPU Kepler part.
Given its dominant position in the public cloud market, Amazon’s K80 upgrade should be well-received by its customers – both legacy GPU users looking to move up to more powerful hardware, and new customers who want to give GPUs a try, but don’t want to bother building specialized infrastructure in-house. The AWS press release includes a number of user endorsements across the HPC, machine learning, and analytics application spectrum, including Clarifai for image/video recognition, Altair Engineering for CFD, MapD for interactive SQL, and Sonus for real-time communications.
It’s not known how many K80 servers AWS has initially deployed, but for the time being, access to the hardware is limited to certain geographies. At this point, that includes the US East (N. Virginia), US West (Oregon), and EU (Ireland) regions. Presumably, the overall level of demand for these initial instances, and the distribution of use cases, will determine Amazon’s next move.