Nvidia kicked off its second GTC conference of the year by announcing that its new generation of H100 “Hopper” GPUs is in full production, with global partners planning to roll out products and services in October and wide availability expected in Q1 2023.
Hopper features a number of innovations over Ampere, its predecessor architecture introduced in 2020. Chief among them is the new Transformer Engine. Transformers are widely used deep learning models and the standard architecture for natural language processing. Nvidia claims the H100 Transformer Engine can speed up these networks by as much as six times over Ampere without losing accuracy.
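To ground the terminology: the computational core of a transformer is scaled dot-product attention, in which every token in a sequence weighs its relevance to every other token. A minimal PyTorch sketch (with hypothetical shapes; nothing here is H100-specific) looks like this:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """The core transformer operation: each token attends to all others."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Hypothetical shapes: a batch of 8 sequences, 128 tokens, 64-dim heads.
q = k = v = torch.randn(8, 128, 64)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([8, 128, 64])
```

Because this attention step dominates transformer workloads and scales with sequence length, it is exactly the kind of matrix math a dedicated hardware engine targets.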
Hopper also comes with the second generation of Nvidia’s secure Multi-Instance GPU (MIG) technology, which allows a single GPU to be partitioned into multiple smaller, secure instances that operate independently and in isolation.
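To illustrate the workflow, MIG is typically driven through the standard nvidia-smi tool. The following Python sketch wraps those commands; the profile ID is hypothetical, available IDs vary by GPU model, and the commands require administrator privileges:

```python
import subprocess

def run(cmd):
    """Run an nvidia-smi command and return its output (needs admin rights)."""
    return subprocess.run(cmd, shell=True, check=True,
                          capture_output=True, text=True).stdout

# Enable MIG mode on GPU 0 (may require a GPU reset to take effect).
run("nvidia-smi -i 0 -mig 1")

# List the GPU instance profiles this card supports; IDs vary by model.
print(run("nvidia-smi mig -lgip"))

# Create a GPU instance from a profile (ID 9 is hypothetical) plus a
# matching compute instance, yielding an isolated slice of the GPU.
run("nvidia-smi mig -cgi 9 -C")

# The new MIG device now shows up alongside the physical GPUs.
print(run("nvidia-smi -L"))
```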
Also new is a feature called Confidential Computing, which protects AI models and customer data while they are being processed, in addition to protecting them at rest and in transit on the network. Finally, Hopper features the fourth generation of NVLink, Nvidia’s high-speed interconnect technology, which can connect up to 256 H100 GPUs with nine times the bandwidth of the previous generation.
And while GPUs aren’t known for their power efficiency, Nvidia says the H100 enables enterprises to deliver the same AI performance with 3.5 times better power efficiency and three times lower total cost of ownership than the previous generation, because companies need five times fewer server nodes.
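Those figures are fleet-level vendor claims, but a back-of-envelope sketch (with entirely hypothetical inputs) shows how they can fit together: needing five times fewer nodes yields roughly 3.5x better power efficiency if each new node draws somewhat more power than the node it replaces.

```python
# Back-of-envelope illustration of Nvidia's claim; all inputs are hypothetical.
ampere_nodes = 100              # nodes needed for a given AI workload today
h100_nodes = ampere_nodes / 5   # Nvidia: ~5x fewer nodes for the same work

# If each H100 node drew ~1.4x the power of an Ampere node, cluster-level
# efficiency would land near the quoted 3.5x (5 / 1.4 is roughly 3.6).
relative_node_power = 1.4
efficiency_gain = ampere_nodes / (h100_nodes * relative_node_power)
print(f"power efficiency gain: {efficiency_gain:.1f}x")  # ~3.6x
```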
“Our customers are looking to deploy data centers that are essentially AI factories, producing AI for production use cases. And we’re excited to see what H100 will do for those customers, delivering more throughput, more capacity, and [continuing] to democratize AI everywhere,” said Ian Buck, vice president of hyperscale and HPC at Nvidia, during a media call with reporters.
Buck, who invented the CUDA language used to program Nvidia GPUs for HPC and other uses, said large language models (LLMs) will be one of the most important AI use cases for the H100.
Language models are tools trained to predict the next word in a sentence, like autocomplete on a phone or browser. LLMs, as the name suggests, can predict entire sentences and do much more, like writing essays, creating graphs, and generating computer code.
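To make the autocomplete analogy concrete, here is a small next-word-prediction demo using the open GPT-2 model via Hugging Face’s transformers library. It is not an Nvidia product and is far smaller than the LLMs Buck describes, but the underlying mechanism of predicting the next token is the same:

```python
from transformers import pipeline

# Load a small, openly available language model (GPT-2).
generator = pipeline("text-generation", model="gpt2")

# The model continues the prompt one predicted token at a time.
prompt = "The new data center GPUs are designed for"
result = generator(prompt, max_new_tokens=20, num_return_sequences=1)
print(result[0]["generated_text"])
```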
“We’re seeing great language models being used for things outside of human language like coding, and helping software developers write software faster, more efficiently with fewer errors,” Buck said.
H100-powered systems from hardware manufacturers are expected to ship in the coming weeks, with more than 50 server models on the market by the end of the year and dozens more in the first half of 2023. Partners include Atos, Cisco, Dell, Fujitsu, Gigabyte, HPE, Lenovo and Supermicro.
Additionally, Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure say they will be among the first to deploy H100-based instances in the cloud starting next year.
If you want to test out the H100, it will be available through Nvidia LaunchPad, the company’s try-before-you-buy service, where users can log in and test Nvidia hardware and software.