Durham University Puts New SmartNIC to the Test
- November 14, 2020
As a group, HPC practitioners are on the cutting edge of innovation, and are always willing to push the boundaries of performance to get to the next big breakthrough. For example, Dell Technologies customer Durham University is a leading research institution in the fields of astrophysics and particle physics. The University is using BlueField-2 DPUs to enable direct access to remote memory and improve the performance of massively parallel codes, which they believe may pave the way for future exascale systems.
The Durham University team has been using the half-height, half-width NVIDIA BlueField-2 SmartNIC cards in their “Durham Intelligent NIC Environment” (DINE), a 16-node supercomputer powered by Dell EMC PowerEdge C6525 servers. Each card is then connected with a 25G Ethernet cable, and configured to operate in a “host-separated” mode, providing direct access to the Arm cores on the DPU. Researchers can then launch HPC MPI codes across the cluster making use of both the AMD® EPYC™ server processors, and the Arm processors on the DPUs.
To test the DPU technology, the Durham team decided to compile two versions of their code—one that executes on the server processors, and one that executes on the DPU Arm cores. The team reported that recompiling the code for the Arm cores took seconds, while installing the necessary libraries took longer. However, they believe this will be faster in the future now that they have a recipe for the process. When they run a job across the DINE cluster, they direct MPI jobs to run on the DPUs instead of the CPUs, which allows the CPUs to carry on with their tasks without MPI interruptions.
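To illustrate the two-binary approach described above, here is a minimal sketch of how a launch command might be assembled for such a mixed cluster. It assumes an Open MPI style `mpirun` with multiple-program (MPMD) syntax, where a colon separates one application context per architecture. The article does not say which launcher the Durham team uses, and the hostnames, binary names and the `build_mpmd_command` helper are all hypothetical.

```python
def build_mpmd_command(host_nodes, dpu_nodes, host_binary, dpu_binary):
    """Assemble an MPMD launch line: one rank per x86 host, one rank per DPU.

    Assumes Open MPI style options (-np, --host, ':' between app contexts).
    All node and binary names passed in are illustrative placeholders.
    """
    def app_context(nodes, binary):
        # One rank on each listed node, running the binary built for that architecture.
        return ["-np", str(len(nodes)), "--host", ",".join(nodes), binary]

    return (
        ["mpirun"]
        + app_context(host_nodes, host_binary)
        + [":"]
        + app_context(dpu_nodes, dpu_binary)
    )


if __name__ == "__main__":
    # Hypothetical names: two servers and the Arm-based DPUs installed in them.
    cmd = build_mpmd_command(
        ["node01", "node02"],
        ["node01-dpu", "node02-dpu"],
        "./solver_x86",
        "./solver_arm",
    )
    print(" ".join(cmd))
```

Keeping the x86 and Arm builds as separate application contexts mirrors the team's decision to compile the code twice. Which ranks handle communication and which handle computation would still be decided inside the application.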
The team believes the technology has the potential to become mainstream. Their faculty, staff, students, collaborators and other fundamental researchers will benefit from working with cutting-edge technologies that help them design algorithms and investigate ideas which will help redefine the future of HPC for facilities around the world.
Tobias Weinzierl, Project Principal Investigator (PI) for DINE, said, “We have been suffering from a lack of MPI progress and, hence, algorithmic latency for quite a while and invested significant compute effort to decide how to place our tasks on the system. We hope that BlueField will help us to realize these two things way more efficiently. Actually, we started to write software that does this for us on BlueField.”
Based on the results at Durham University, the new NVIDIA SmartNICs with BlueField-2 DPUs are one step further on the journey to the infrastructure-as-code data center, where users can send a job out and have it run where it’s most optimized for performance and efficiency.
As data analytics, artificial intelligence (AI) and High Performance Computing (HPC) converge and move into the mainstream, they’re driving new innovations designed to feed insatiable demands for compute performance. Accelerators, such as graphics processing units (GPUs) and field programmable gate arrays (FPGAs), have revolutionized advanced computing over the last few decades by offloading certain tasks from the CPU, speeding up workloads by several orders of magnitude compared to CPUs alone.
However, once Ethernet reaches 10 Gb/s, the network interface card (NIC) starts to become the bottleneck, sapping cycles from the processor to handle increasingly complex system and data center networks. For example, when message passing interface (MPI) traffic is running on the CPU, HPC threads have to wait their turn, wasting compute cycles and slowing workload performance.
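A rough back-of-the-envelope calculation shows why packet handling starts to compete with application work at these speeds. The sketch below estimates how many clock cycles a single core has per frame if it must keep up with a fully loaded link. The 3 GHz clock, the frame sizes and the `cycles_per_packet` helper are illustrative assumptions, not figures from the article, and the estimate ignores Ethernet preamble and inter-frame gaps.

```python
def cycles_per_packet(link_gbps, frame_bytes, core_ghz):
    """Cycle budget per frame for one core keeping pace with a saturated link.

    Simplified model: payload bits only, no preamble or inter-frame gap,
    and a single core handling every frame (all illustrative assumptions).
    """
    packets_per_second = link_gbps * 1e9 / (frame_bytes * 8)
    return core_ghz * 1e9 / packets_per_second


if __name__ == "__main__":
    # Assumed 3 GHz core on a 10-gigabit-per-second link.
    for frame_bytes in (1500, 64):
        budget = cycles_per_packet(10, frame_bytes, 3.0)
        print(f"{frame_bytes:>5}-byte frames: about {budget:.0f} cycles per frame")
```

Under these assumptions the budget is about 3,600 cycles per 1,500-byte frame and roughly 150 cycles per 64-byte frame, and it shrinks in proportion as link speed rises. Those are cycles not spent on the HPC workload itself, which is the motivation for moving that work onto the NIC.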
To counter this performance drain, the industry has been pushing the boundaries of software-defined networking (SDN), making the NIC smarter so that it can take over some of the processing functions that can slow down the CPU.
After acquiring networking powerhouse Mellanox in April 2020, NVIDIA began leading the charge on novel SDN technologies by announcing the NVIDIA® Mellanox® ConnectX-6 Lx SmartNIC. The SmartNIC is powered by the new NVIDIA BlueField technology, a high-performance, software-programmable, multi-core system on a chip (SoC) based on the Arm processing architecture.
The BlueField-2 Data Processing Unit (DPU) offloads critical network, security, and storage tasks from the CPU to boost performance, networking efficiency and security. With the BlueField-2 DPU-enabled SmartNIC, the full infrastructure stack (compute, storage and networking) can be more granularly software-defined and disaggregated to handle larger volumes of data faster and more efficiently.