By: Matthew Ziegler, Director, Lenovo Neptune® & Sustainability, Lenovo
A look back at supercomputing at the turn of the century
When I first attended the Supercomputing (SC) conferences back in the early 2000s as an IBMer working in High Performance Computing (HPC), it was obvious the conference was intended for serious computer science researchers and industries singularly focused on pushing the boundaries of computing. Linux was still in its infancy. I vividly remember re-compiling kernels with newly released drivers every time a new server came to market, just to get the system to PXE boot over the network. But one thing was clear to everyone who attended SC back then: HPC was pushing the envelope in every aspect of computing.
People weren’t afraid to experiment with new tech if it could drive down the cost of compute while still performing well. Open-source software and x86-based computing systems were being adopted at a rapid pace. Supercomputing was no longer reserved for the largest government institutions and industry sectors that could afford to house and maintain basketball-court-sized systems costing upwards of tens of millions of dollars. New standards, consortia, and trade groups formed. Supercomputers were being stood up and put into production all over the world.
I have been thinking about the 2000s lately because being at SC24 in Atlanta gave me the same feeling I had going to SC back then. In those days, booths were occupied by vendors and startups showing off everything from networking, PCIe devices, and Linux cluster-management software to new hardware vendors taking advantage of the x86 architecture standard. I’ve always felt HPC has been a proving ground for tech, and seeing all these new technologies introduced at SC solidified HPC as the place where innovation happens. In Atlanta this past November, when I first walked the show floor, I was astonished to see just how many non-traditional vendors were there: cooling-fluid suppliers, hose and quick-disconnect manufacturers, manifold makers, and immersion cooling providers. I concluded that liquid cooling isn’t just a fun science experiment anymore. It’s now mainstream for those within and on the periphery of HPC.
I know the big topic these days is Artificial Intelligence (AI). However, let me stick with why seeing companies and vendors that specialize in liquid cooling really stuck out to me. I was with IBM back in 2012 when we deployed the first x86-based system to leverage liquid cooling directly on the chip, at Leibniz Supercomputing Centre (LRZ). Back then, the primary reason to move to liquid cooling was energy efficiency: fans, chillers, and air handlers consume power that does no computational work. We used warm (50°C) inlet water and cold plates mounted directly on the silicon, removing the reliance on air flow and on chilled ambient data center temperatures. We succeeded in deploying the most energy-efficient supercomputer of its day, which debuted at #4 on the TOP500.
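To make that energy-efficiency argument concrete, here is a minimal back-of-the-envelope sketch in Python. The IT load and PUE figures are my own illustrative assumptions, not measurements from LRZ or any Lenovo deployment:

```python
# Back-of-the-envelope comparison of cooling overhead: chilled-air cooling
# versus warm-water direct liquid cooling. Every figure here is an
# illustrative assumption, not a measurement from LRZ or Lenovo.

IT_LOAD_KW = 1_000        # assumed IT (compute) load of the facility

# PUE (Power Usage Effectiveness) = total facility power / IT power.
PUE_AIR = 1.5             # assumed for a traditional chilled-air data center
PUE_WARM_WATER = 1.1      # assumed for 50°C inlet-water direct cooling

def total_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE."""
    return it_load_kw * pue

air = total_power_kw(IT_LOAD_KW, PUE_AIR)
water = total_power_kw(IT_LOAD_KW, PUE_WARM_WATER)

print(f"Chilled air: {air:,.0f} kW total ({air - IT_LOAD_KW:,.0f} kW overhead)")
print(f"Warm water:  {water:,.0f} kW total ({water - IT_LOAD_KW:,.0f} kW overhead)")
print(f"Saved:       {air - water:,.0f} kW "
      f"({(air - water) / air:.0%} of the air-cooled total)")
```

Under these assumed numbers, hundreds of kilowatts that once ran fans and chillers are freed up, which is exactly the efficiency we were chasing.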
HPC-ers of the 2000s pushed the envelope of supercomputing
However, let’s talk about what didn’t happen back then but is feasible today. Though the efficiencies of liquid cooling were obvious, it did not suddenly go mainstream, even for supercomputers. But why? If you could eliminate the energy inefficiencies that come with air cooling, such as large air conditioning and air handling units and the chilled water used to maintain lower ambient air temperatures in the data center, why wouldn’t you? The reason is simple. Liquid cooling wasn’t required, and implementing it carried higher capital expenses than traditional air-cooled systems. Air-cooled systems are easy to deploy and maintain, and they suit a multi-vendor strategy. Conversely, liquid cooling requires specialized equipment such as Coolant Distribution Units (CDUs) and hoses, plus ongoing maintenance of fluid chemistry at optimal levels. Today the calculus has flipped: the costs, monetary and environmental, of not deploying liquid cooling outweigh the costs of doing so. The HPC community took the supercomputing of the 2000s and pushed the envelope even further to what it is today.
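As a rough illustration of that flipped calculus, here is a minimal break-even sketch. All three inputs (capex premium, energy saved, electricity price) are assumptions chosen purely to show the shape of the math:

```python
# Minimal break-even sketch for the capex-vs-opex tradeoff described above.
# All three inputs are assumptions chosen purely for illustration.

CAPEX_PREMIUM_USD = 500_000          # assumed extra cost of CDUs, hoses, plumbing
ANNUAL_ENERGY_SAVED_KWH = 3_000_000  # assumed cooling energy avoided per year
PRICE_USD_PER_KWH = 0.12             # assumed electricity price

annual_savings_usd = ANNUAL_ENERGY_SAVED_KWH * PRICE_USD_PER_KWH
breakeven_years = CAPEX_PREMIUM_USD / annual_savings_usd

print(f"Annual savings: ${annual_savings_usd:,.0f}")
print(f"Break-even in:  {breakeven_years:.1f} years")
# With these assumptions the premium pays for itself in ~1.4 years;
# rising rack power or energy prices shorten that further.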
The past couple of years, I have been focusing mostly on liquid cooling at Lenovo. Lenovo’s Neptune® liquid-cooling technology can be found in almost all of our server lines and is designed to support our customers’ transition from less efficient air cooling to more energy-efficient liquid cooling. This focus on liquid cooling has had an interesting effect. Let me explain:
Being in HPC means we are always looking out for and researching the next great thing that will push the boundaries of tech and squeeze out better time to results. Because we never lock ourselves into one solution, we are free to pursue options; that’s how HPC stays at the forefront of technology. By focusing on Lenovo Neptune® technology, however, I now find myself looking for where liquid cooling will be needed next.
I found that I was no longer talking only to HPC users about liquid cooling. Now I am talking to enterprise organizations in sectors such as healthcare and retail. HPC needs liquid cooling because those systems live at the top end of CPU and GPU SKUs, memory configurations, and high-speed networking. The mainstream enterprise sector typically operated at the mid to lower end, with modest computing needs. Fast forward to the present, and even just a few years into the future, and those mid- to lower-end configurations now struggle to operate with air cooling alone due to lower Tcase limits and higher power draw and component density. Liquid cooling has now become mainstream.
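To see why air struggles at those power densities, here is a short sketch of the governing relation for coolant flow, V = Q / (ρ · c_p · ΔT). The 40 kW rack load and 10 K temperature rise are my own illustrative assumptions:

```python
# Why dense systems outrun air: the volumetric flow needed to carry away a
# heat load Q at temperature rise dT is V = Q / (rho * c_p * dT). The 40 kW
# rack and 10 K rise below are illustrative assumptions, not article figures.

Q_WATTS = 40_000      # assumed heat load of one dense rack
DELTA_T_K = 10.0      # assumed allowable coolant temperature rise

# Approximate fluid properties near typical operating temperatures.
RHO_AIR, CP_AIR = 1.2, 1005.0        # kg/m^3, J/(kg*K)
RHO_WATER, CP_WATER = 988.0, 4186.0  # kg/m^3 (~50°C), J/(kg*K)

def flow_m3_per_s(q_w: float, rho: float, cp: float, dt_k: float) -> float:
    """Volumetric flow required to remove q_w watts at a rise of dt_k kelvin."""
    return q_w / (rho * cp * dt_k)

air_flow = flow_m3_per_s(Q_WATTS, RHO_AIR, CP_AIR, DELTA_T_K)
water_flow = flow_m3_per_s(Q_WATTS, RHO_WATER, CP_WATER, DELTA_T_K)

print(f"Air:   {air_flow:.2f} m^3/s (~{air_flow * 2118.88:,.0f} CFM)")
print(f"Water: {water_flow * 60_000:.1f} L/min")
# ~3.3 m^3/s of air versus ~58 L/min of water for the same 40 kW load.
```

Moving thousands of cubic feet of air per minute through a single rack quickly becomes impractical, while a modest water loop handles the same load quietly; that, in a nutshell, is the physics pushing the enterprise toward liquid.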
Walking the floor of SC24 and talking to fellow members of the HPC community, it really stood out to me that over the past two years liquid cooling has stopped being a transformation facilities could put off implementing “one of these days”. The time is now; it is already mainstream. All those vendors on the show floor, the kind one would typically see at an HVAC convention, prove it.
Just like in the 2000s, the HPC community can claim yet again that we are trendsetters and innovators credited with pushing technology forward in ways the rest of the industry and enterprises have benefited from and will continue to benefit from. Yes, even the enterprises deploying liquid-cooled systems for AI can thank the HPC sector.