Built on a 5nm process, Gaudi 3 has more tensor cores and matrix math engines than its predecessor. This translates into a significant increase in processing power, especially for workloads that rely on the critically important BF16 data type.
The new accelerator doubles the FP8 performance of Gaudi 2, reaching 1835 TFLOPS. Intel has not announced specific BF16 figures, but says the increase will be significant. Gaudi 3 also comes with 128GB of HBM2e memory and high-speed networking for efficient processing of large datasets.
Unlike Nvidia’s proprietary networking solutions, Gaudi 3 relies on open standards, with 24 Ethernet ports running at 200 Gbps each. This makes AI computing clusters more flexible and scalable and lets users avoid vendor lock-in.
Intel claims that Gaudi 3 will outperform rivals such as the Nvidia H100 and H200 in training speed, inference and power efficiency across a range of models. This includes an advantage in handling long input and output sequences during inference.
Source: Ferra
