Generative AI has ushered in a new era in AI and machine learning (ML), and stories about it are everywhere.
However, generative AI has consequences. One is a rapid increase in the scale and complexity of AI/ML clusters, which are integrating an escalating number of compute nodes and accelerators into their infrastructure. This surge in hardware is directly linked to the growing demands of sophisticated AI models, which require immense computational power for training and inference.
Because of this, the design of AI/ML clusters has evolved. Traditional hierarchical architectures are being replaced by flat, high-capacity systems that can efficiently manage and process vast amounts of data. These systems are crucial for achieving low-latency and high-throughput operations, which are paramount for real-time and large-scale AI applications.
To facilitate this transition and optimize job completion times, Ethernet-based fabrics have entered the picture. They provide a scalable, high-performance interconnect that helps reduce communication bottlenecks within the cluster, enabling AI practitioners to harness the full potential of their computational resources.
In response to this demand, Edgecore Networks, a provider of open networking solutions, has unveiled the DCS560, a switch optimized for 800G operation. The switch is engineered to deliver an Ethernet-based fabric tailored for AI/ML workloads. The DCS560 is a compact 2RU (two rack unit) system with 51.2 terabits per second (Tbps) of capacity across a total of 64 800G ports.
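The headline capacity follows directly from the port count and port speed quoted above. As a quick check, here is a minimal Python sketch; the function name `aggregate_capacity_tbps` is purely illustrative and not part of any Edgecore tooling.

```python
# Figures taken from the article: 64 ports, each running at 800 Gb/s.
PORT_COUNT = 64
PORT_SPEED_GBPS = 800


def aggregate_capacity_tbps(port_count: int, port_speed_gbps: int) -> float:
    """Return the summed port bandwidth in Tb/s (1 Tb/s = 1000 Gb/s)."""
    return port_count * port_speed_gbps / 1000


capacity = aggregate_capacity_tbps(PORT_COUNT, PORT_SPEED_GBPS)
print(f"{PORT_COUNT} x {PORT_SPEED_GBPS}G = {capacity} Tb/s")  # 64 x 800G = 51.2 Tb/s
```

In other words, the 51.2 Tbps figure is the sum of the front-panel port speeds.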
Built on the 51.2 Tbps Broadcom StrataXGS Tomahawk 5 series, the system provides a high-radix Ethernet fabric that is well suited to deployment. Its 2RU form factor is sturdy and space-efficient, and redundant power and fan trays help it achieve five-nines (99.999%) high availability. It also has an extensive environmental operating range, making it well suited to data center cloud applications.
Notably, the DCS560 adopts a load-balanced port mapping design that eliminates the need for flyover cables, supporting system quality and reliability while maintaining port flexibility for end users.
It also offers a choice between OSFP800 and QSFP-DD800 interface options, catering to various deployment scenarios, including passive copper DAC connections and long-distance ZR+ optics. With each system delivering high-radix connectivity in a flat architecture, it contributes to reduced latency and power consumption, allowing networks to be expanded sustainably and efficiently.
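To illustrate why switch radix matters in a flat design, the sketch below estimates how many endpoint ports a generic two-tier leaf-spine fabric can offer. It assumes a non-blocking (1:1) layout in which every leaf splits its ports evenly between endpoints and spines, with one link per leaf-spine pair. This is a textbook topology used for illustration, not an Edgecore reference design, and the 32-port comparison switch is hypothetical.

```python
def two_tier_endpoint_ports(radix: int) -> int:
    """Max endpoint-facing ports in a non-blocking two-tier leaf-spine fabric.

    Assumptions (illustrative only):
      - every switch has `radix` ports of equal speed
      - each leaf uses radix/2 ports down (endpoints) and radix/2 up (spines)
      - each leaf has exactly one link to each spine
    """
    max_leaves = radix               # a spine has `radix` ports, one per leaf
    downlinks_per_leaf = radix // 2  # half of each leaf's ports face endpoints
    return max_leaves * downlinks_per_leaf


# Compare a hypothetical 32-port switch with a 64-port switch like the one described.
for radix in (32, 64):
    print(f"radix {radix}: up to {two_tier_endpoint_ports(radix)} endpoint ports")
```

Under these assumptions, doubling the radix from 32 to 64 quadruples the endpoint ports reachable in two tiers (512 versus 2,048), which is why higher-radix switches let a fabric stay flat as it grows.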
“With the availability of Edgecore’s 800G system, hyperscalers will want to take advantage of the increase in radix and throughput offered by the 800G AI fabric together with the reduced power consumption,” said Heimdall Siao, President of Edgecore. “The innovative design of this groundbreaking 800G system will deliver a significant improvement in AI cluster performance and enable a lower total cost of ownership.”
Edited by Alex Passett