Dell's $150M Investment Fuels Imbue's Next-Gen AI Model Development

By Greg Tavarez, TMCnet Editor  |  December 11, 2023

You know the deep learning tasks with computational demands, such as training large neural networks with millions or billions of parameters like GPT-3 or GPT-4? Or data-intensive tasks like image recognition, where models like convolutional neural networks require vast datasets for effective training, often consisting of millions of labeled images?

Well, high-performance computing clusters for training foundation models are crucial for these tasks because these clusters provide the necessary infrastructure, including powerful CPUs, GPUs and accelerators, interconnected through fast networks, to efficiently handle parallelized processing. The scalability and distributed nature of these clusters enable researchers and practitioners to tackle increasingly complex models and larger datasets.

Combined with robust storage solutions, job scheduling systems and deep learning frameworks, these clusters help reduce training times, foster innovation and accelerate the development of sophisticated AI applications across various domains.

Imbue, an independent AI research company, develops its own foundation models and trains them to have more advanced reasoning capabilities. These capabilities include knowing when to ask for more information, analyzing and critiquing their own outputs, and breaking down a difficult goal into a plan and then executing on it.

Imbue also trains AI agents on top of those models that can do work for people across diverse fields in ways that are robust, safe and useful. The aim is to create practical tools for building agents that could assist workers across a broad set of domains, including helping engineers write new code and helping analysts understand and draft complex policy proposals.

Taking that goal further, Imbue looked to Dell and entered into a $150 million agreement to build a new high-performance computing cluster for training foundation models optimized for reasoning.

"The purpose of technology is to drive human progress, and this often begins at the research level," said Jeff Boudreau, Chief AI Officer at Dell. "Dell technology will provide Imbue with the powerful engine to help unearth the next generation of impactful AI innovation."

Imbue currently uses a cluster powered by Dell PowerEdge XE9680 servers featuring NVIDIA H100 Tensor Core GPUs to train AI models and build prototype agents capable of fixing bugs in code and analyzing extensive documents.

The Dell systems are built for extreme acceleration for AI, machine learning and deep learning training. They are equipped to deploy AI computing initiatives with high GPU memory, bandwidth and security. The PowerEdge servers' Smart Cooling features sustain great performance more efficiently while reducing the data center's overall carbon footprint.

As mentioned before, Imbue's long-term vision involves developing advanced and reliable AI agents that operate autonomously, eliminating the need for continuous user supervision. This innovation aims to enable agents, for instance, to plan vacations on behalf of users rather than merely generating travel ideas, providing individuals with more leisure time.

Collaboratively designed by Imbue and Dell, the system incorporates smaller clusters for swift experimentation with novel model architectures. These smaller clusters can also be networked into a larger cluster to efficiently train large-scale foundation models.

"Building a new generation of foundation models requires the very best IT infrastructure, and Dell Technologies has helped us deploy a custom cluster much more quickly than other providers could have," said Josh Albrecht, Chief Technology Officer of Imbue. "Dell has been an invaluable collaborator as we pursue our work to create AI systems with much stronger reasoning abilities."

Edited by Alex Passett