Powering Tomorrow's Cloud: AWS Sets a New Standard with Graviton4 and Trainium2 Chips

By Greg Tavarez, TMCnet Editor  |  December 08, 2023

"Silicon underpins every customer workload."

That quote by David Brown, Vice President of Compute and Networking at AWS, is not one you see every day.

Yet, it does make sense.

Silicon is the base material of computer chips and processors, so it plays a crucial role in the functionality and performance of the hardware that handles customers' tasks and workloads. That explains why Brown and the team at AWS focus their chip designs on real workloads that matter to customers.

With its chips, AWS wants to deliver advancements in price performance and energy efficiency for a broad range of customer workloads, including machine learning training and generative AI applications. And it has done just that with the next generation of two AWS-designed chip families, AWS Graviton4 and AWS Trainium2.

AWS offers over 150 Graviton-powered Amazon EC2 instance types globally and has built more than 2 million Graviton processors. More than 50,000 customers, including top EC2 users, rely on Graviton-based instances for better price performance.

As customers move larger workloads to the cloud, Graviton4 processors offer a 30% boost in compute performance, 50% more cores and 75% more memory bandwidth compared to Graviton3. Security is enhanced with full encryption of high-speed hardware interfaces. Graviton brings its advantages to users of Amazon Aurora, ElastiCache, EMR and more.

Available in memory-optimized Amazon EC2 R8g instances, Graviton4 improves the performance of high-performance databases, in-memory caches and big data analytics. R8g instances feature up to three times more vCPUs and memory than R7g instances, allowing for larger data processing, improved scalability, quicker results and reduced total cost of ownership. Graviton4-powered R8g instances are currently in preview, with general availability expected in the coming months.

Now, on to Trainium2.

Today’s advanced generative AI models, such as foundation models (FMs) and large language models (LLMs), are trained on massive datasets so that they can transform user experiences by generating diverse content such as text, audio, images, video and software code. These models, ranging from hundreds of billions to trillions of parameters, demand high-performance computing across tens of thousands of ML chips.

Trainium2 is purpose-built for high-performance training of FMs and LLMs with trillions of parameters. It offers up to four times faster training performance, three times more memory capacity and twice the energy efficiency of its predecessor. Trainium2 will be available in Amazon EC2 Trn2 instances, each containing 16 Trainium2 chips.

These instances enable customers to scale up to 100,000 Trainium2 chips in next-generation EC2 UltraClusters, connected with AWS Elastic Fabric Adapter for petabit-scale networking, delivering up to 65 exaflops of compute. This setup grants on-demand access to supercomputer-class performance, allowing customers to train a 300-billion-parameter LLM in weeks instead of months. With Trn2 instances offering high scale-out ML training performance at significantly lower costs, they are poised to accelerate the next wave of advances in generative AI.
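To put those scale figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (16 chips per Trn2 instance, up to 100,000 chips and up to 65 exaflops per UltraCluster). The function and constant names are made up for illustration, and because the quoted figures are "up to" maximums, with no numeric precision stated for the exaflops figure, the results are implied upper bounds rather than published per-chip specifications.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the article.
# All inputs are "up to" maximums, so the outputs are implied upper bounds,
# not measured or officially published per-chip numbers.

CHIPS_PER_TRN2_INSTANCE = 16          # Trainium2 chips per EC2 Trn2 instance
MAX_CHIPS_PER_ULTRACLUSTER = 100_000  # quoted maximum cluster size
MAX_CLUSTER_EXAFLOPS = 65             # quoted peak aggregate compute


def instances_needed(total_chips: int, chips_per_instance: int) -> int:
    """Instances required to host total_chips, rounded up."""
    return -(-total_chips // chips_per_instance)


def petaflops_per_chip(cluster_exaflops: float, total_chips: int) -> float:
    """Implied peak petaflops per chip (1 exaflop = 1,000 petaflops)."""
    return cluster_exaflops * 1_000 / total_chips


instances = instances_needed(MAX_CHIPS_PER_ULTRACLUSTER, CHIPS_PER_TRN2_INSTANCE)
per_chip = petaflops_per_chip(MAX_CLUSTER_EXAFLOPS, MAX_CHIPS_PER_ULTRACLUSTER)
per_instance = per_chip * CHIPS_PER_TRN2_INSTANCE

print(f"Trn2 instances in a full UltraCluster: {instances:,}")
print(f"Implied peak per chip: {per_chip:.2f} petaflops")
print(f"Implied peak per instance: {per_instance:.1f} petaflops")
```

In other words, a full-size UltraCluster would span 6,250 Trn2 instances, and the quoted 65 exaflops works out to roughly 0.65 petaflops per chip, or about 10.4 petaflops per 16-chip instance, at peak.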

“Graviton4 marks the fourth generation we’ve delivered in just five years and is the most powerful and energy efficient chip we have ever built for a broad range of workloads,” said Brown. “And with the surge of interest in generative AI, Trainium2 will help customers train their ML models faster, at a lower cost, and with better energy efficiency.”

Graviton4 and Trainium2 mark the latest innovations in chip design from AWS. With each successive generation of chip, AWS delivers better price performance and energy efficiency, giving customers even more options to run any application or workload on Amazon EC2, in addition to chip/instance combinations featuring the latest chips from third parties like AMD, Intel and NVIDIA.

Edited by Alex Passett