Next-generation AI models are pushing the boundaries of creativity, problem-solving and automation. But the challenge isn’t just making them smarter; it’s also making them faster, more efficient and able to scale massively.
The real breakthrough lies in the infrastructure that supports them, and a recent announcement from CoreWeave signals a shift.
CoreWeave has consistently been at the forefront of AI infrastructure advancements. In August, the company was among the first to offer NVIDIA H200 GPUs, accelerating demanding GPT-3-class training workloads. By November, it was demonstrating NVIDIA GB200 systems in action.
Now, the AI Hyperscaler has officially become the first cloud provider to make NVIDIA GB200 NVL72-based instances generally available. Built on the NVIDIA GB200 Grace Blackwell Superchip, these instances unlock new levels of performance and scalability. In other words, they allow businesses to train, deploy and scale AI models with unprecedented speed and efficiency.
One of the biggest bottlenecks in AI development has been the limitations of server architecture, particularly in memory capacity and inter-GPU communication speeds. CoreWeave’s GB200 NVL72 instances eliminate these constraints by pairing rack-level NVLink connectivity with NVIDIA Quantum-2 InfiniBand networking.
With 400Gb/s of bandwidth per GPU, this setup enables seamless communication across clusters of up to 110,000 GPUs. NVIDIA Quantum-2’s SHARP In-Network Computing technology further optimizes collective communication, reducing latency and accelerating AI training times.
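To make “collective communication” concrete, the sketch below shows the kind of operation these interconnects accelerate: a gradient all-reduce across GPUs using PyTorch’s NCCL backend. This is an illustrative example rather than CoreWeave code; the file name, launch command and tensor sizes are assumptions.

```python
# Minimal sketch of the collective communication pattern that rack-level
# NVLink and SHARP in-network computing accelerate: an all-reduce of
# gradients across GPUs, via PyTorch's NCCL backend.
# Illustrative launch (assumed file name): torchrun --nproc_per_node=8 allreduce_sketch.py
import os

import torch
import torch.distributed as dist


def main() -> None:
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Stand-in for one shard of gradients produced by a backward pass.
    grads = torch.randn(1024, 1024, device=f"cuda:{local_rank}")

    # All-reduce sums the tensor across every GPU in the job. On a
    # GB200 NVL72-class system this traffic rides NVLink within the rack
    # and InfiniBand between racks, where SHARP can offload the reduction
    # into the network fabric itself.
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)
    grads /= dist.get_world_size()  # average the summed gradients

    if dist.get_rank() == 0:
        print("averaged gradient norm:", grads.norm().item())

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

In a real training loop this all-reduce runs after every backward pass, which is why interconnect bandwidth and in-network reduction have such a direct effect on training throughput.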
For companies relying on LLMs and other compute-intensive AI applications, CoreWeave’s new offering delivers real-time inference up to 30 times faster than previous generations. It also achieves a 25-fold reduction in total cost of ownership and energy consumption for real-time inference, while improving LLM training speeds by up to four times. In short, AI teams can do more with fewer resources.
“Today’s launch is another achievement in our series of firsts, and represents a force multiplier for businesses to drive innovation while maintaining efficiency at scale,” said Brian Venturo, co-founder and Chief Strategy Officer of CoreWeave. “CoreWeave’s portfolio of cloud services — such as CoreWeave Kubernetes Service, Slurm on Kubernetes (SUNK) and our Observability platform — is purpose-built to make it easier for our customers to run, manage, and scale AI workloads on cutting-edge hardware. We’re eager to see how companies take their AI deployments to the next level with NVIDIA GB200 NVL72-based instances on CoreWeave.”
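CoreWeave Kubernetes Service is a managed Kubernetes offering, so a GPU workload can be submitted the way it would be on any Kubernetes cluster with NVIDIA’s device plugin installed. The following is a minimal, hypothetical sketch using the official Kubernetes Python client; the container image, GPU count and job name are placeholders, not CoreWeave-specific configuration.

```python
# Hypothetical sketch: submitting a GPU training job through the standard
# Kubernetes API with the official Python client (pip install kubernetes).
from kubernetes import client, config

config.load_kube_config()  # uses the cluster credentials in ~/.kube/config

# Assumed container image and GPU count; substitute your own.
container = client.V1Container(
    name="trainer",
    image="nvcr.io/nvidia/pytorch:24.01-py3",
    command=["python", "train.py"],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "8"}  # request 8 GPUs via the device plugin
    ),
)

template = client.V1PodTemplateSpec(
    metadata=client.V1ObjectMeta(labels={"app": "gpu-train"}),
    spec=client.V1PodSpec(restart_policy="Never", containers=[container]),
)

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="gpu-training-job"),
    spec=client.V1JobSpec(template=template, backoff_limit=0),
)

# Create the Job; Kubernetes schedules it onto a node with free GPUs.
client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```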
With this launch, CoreWeave reinforces its position as a leader in high-performance AI cloud infrastructure. As demand for scalable, ultra-fast AI computing continues to rise, this latest innovation ensures businesses have the tools they need to push the boundaries of what’s possible.