
SupremeRAID™ BeeGFS™ Performance with GIGABYTE Servers

by Graid Technology Inc.
The white paper explores how SupremeRAID™ with GIGABYTE S183-SH0 creates an extremely dense and efficient parallel filesystem solution and enhances the performance of BeeGFS—making it ideal for High-Performance Computing and Artificial Intelligence applications.
Executive Summary
SupremeRAID™ by Graid Technology uses GPU-based acceleration to deliver extremely high RAID performance. Using SupremeRAID™ avoids the inherent performance limitations in other RAID products, including ASIC-based hardware RAID and CPU-based software RAID.
This paper explores how SupremeRAID™ enhances the performance of BeeGFS, a parallel file system developed and optimized for high-performance computing (HPC). Performance was measured using StorageBench and IOzone. StorageBench is a BeeGFS benchmark that measures the streaming throughput of the underlying file system and devices independent of the network performance. IOzone tests a wide range of I/O operations to simulate real-world workloads and is designed to find performance bottlenecks in the overall system.
Testing occurred using GIGABYTE servers operating as two storage nodes and four client nodes. Findings show exceptional storage and BeeGFS performance, as summarized in the following pages, demonstrating that choosing SupremeRAID™ for data protection is a highly effective way to maximize performance.
1. Two sets of twelve 7 GB/s SSDs configured as four RAID 5 groups.
2. Four 100G Ethernet links for a total of 400G.
The BeeGFS StorageBench benchmark, designed to measure raw storage performance, demonstrates impressive SupremeRAID™ RAID 5 read speeds of 130.35 GB/s and write speeds of 70 GB/s. The StorageBench RAID 5 read performance also approaches the theoretical performance limit, with read and write performance significantly higher than in the network-bottlenecked IOzone benchmark, showcasing the superior storage performance of SupremeRAID™.
The IOzone benchmark, designed to simulate real-world client workloads that include network transmission overhead, is similarly impressive. Read and write speeds reach 45.10 GB/s and 42.97 GB/s, respectively, with 256 threads. Importantly, these figures approach the theoretical limit of a 400G network (50 GB/s), suggesting SupremeRAID™ can almost fully utilize a 400G network composed of four 100G network links.
Testing Background
Hardware: Storage Nodes (two)
• Server: GIGABYTE S183-SH0-AAV1 x 1
• Processor: Intel® Xeon® Platinum 8468H 48C 2.1GHz x 2
• Memory: Micron MTC20F2085S1RC48BA1 DDR5 32GB 4800MHz x 16
• Network Card: ConnectX-5 Ex MCX556A-EDAT EDR x 1
• SSD: SAMSUNG MZTL23T8HCLS-00A07 3.84TB x 16
• RAID Controller: SupremeRAID™ SR-1010 x 1
 
Hardware: Client Nodes (four)
• Server: GIGABYTE H242-Z10 x 4 (four-node system)
• Processor: AMD EPYC 7663 56C x 2
• Memory: Micron HMA82GR7CJR8N-XN DDR4 16GB 3200MHz x 16
• Network Card: ConnectX-5 Ex MCX556A-EDAT EDR x 1
 
Software: Storage Node
• Operating System: Red Hat Enterprise Linux 8.8
• Kernel: 4.18.0-477.13.1.el8_8.x86_64
• BeeGFS: 7.3.3
• SupremeRAID™ Driver: 1.5.0
• OFED: 5.8-2.0.3.0

Software: Client Nodes
• Operating System: Red Hat Enterprise Linux 8.8
• Kernel: 4.18.0-477.13.1.el8_8.x86_64
• BeeGFS: 7.3.3
• SupremeRAID™ Driver: 1.5.0
• OFED: 5.8-2.0.3.0
• IOzone: 3-506.x86_64

Cluster Architecture
Networking
Each storage node is equipped with a dual-port 100G network card, while each client node features a single-port 100G network card. Both storage nodes and all four client nodes are interconnected using a 100G switch.

Storage
Each storage node is equipped with 16 NVMe drives, with eight located at CPU0 and the remaining eight at CPU1. A single SupremeRAID™ SR-1010 RAID controller, positioned at CPU0, manages all 16 NVMe drives. Each node runs two Metadata Services (MDS), each backed by a RAID 1 group composed of two drives. Each node also has two RAID 5 groups of six drives each, and each RAID 5 group provides three virtual drives for three separate Object Storage Services (OSS). Altogether, the cluster consists of four MDSs and twelve OSSs.
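The drive and service counts above can be cross-checked with a short sketch. It assumes one virtual drive per RAID 1 group and uses the nominal 3.84 TB drive capacity from the hardware list. The usable-capacity figure ignores any controller or file system overhead and is only an illustration, not a number reported in this paper. The class and function names are hypothetical.

```python
from dataclasses import dataclass

DRIVE_TB = 3.84      # nominal SSD capacity from the hardware list
STORAGE_NODES = 2


@dataclass(frozen=True)
class RaidGroup:
    level: int           # 1 or 5
    drives: int          # member drives in the group
    virtual_drives: int  # one virtual drive per BeeGFS service

    def usable_tb(self) -> float:
        # RAID 1 stores one copy's worth of data; RAID 5 gives up one drive to parity.
        data_drives = 1 if self.level == 1 else self.drives - 1
        return data_drives * DRIVE_TB


# Per-node layout as described in the text.
NODE_LAYOUT = (
    RaidGroup(level=1, drives=2, virtual_drives=1),  # backs one MDS
    RaidGroup(level=1, drives=2, virtual_drives=1),  # backs one MDS
    RaidGroup(level=5, drives=6, virtual_drives=3),  # backs three OSSs
    RaidGroup(level=5, drives=6, virtual_drives=3),  # backs three OSSs
)


def cluster_totals(layout=NODE_LAYOUT, nodes=STORAGE_NODES):
    """Return drive and service counts for the whole cluster."""
    return {
        "drives_per_node": sum(g.drives for g in layout),
        "mds": nodes * sum(g.virtual_drives for g in layout if g.level == 1),
        "oss": nodes * sum(g.virtual_drives for g in layout if g.level == 5),
        "raid5_usable_tb": nodes * sum(g.usable_tb() for g in layout if g.level == 5),
    }


print(cluster_totals())
```

The totals agree with the text: 16 drives per node, four MDSs, and twelve OSSs across the two storage nodes.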
Testing Profiles
BeeGFS StorageBench
Once the cluster was built, we used the built-in BeeGFS StorageBench tool to measure the performance of the NVMe drives and the RAID controller. The evaluation began with a write test, which also creates the test files, using a block size of 1M and 64 threads. To bypass the VFS cache and expose the true performance of the storage system, we specified the --odirect option.
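As a rough illustration of the run described above, the following sketch assembles a beegfs-ctl StorageBench command line. The block size, thread count, and --odirect option come from the text. The --alltargets selection, the per-thread size value, and the use of --odirect for the read phase are assumptions, since the paper does not state them. The helper name is hypothetical.

```python
from typing import List


def storagebench_cmd(mode: str, size: str, block_size: str = "1M",
                     threads: int = 64, odirect: bool = True) -> List[str]:
    """Build a beegfs-ctl StorageBench command for a write or read phase.

    size is not given in the paper; the caller must choose it.
    """
    if mode not in ("write", "read"):
        raise ValueError("mode must be 'write' or 'read'")
    cmd = [
        "beegfs-ctl", "--storagebench",
        "--alltargets",               # assumption: benchmark every storage target
        f"--{mode}",
        f"--blocksize={block_size}",
        f"--size={size}",
        f"--threads={threads}",
    ]
    if odirect:
        cmd.append("--odirect")       # bypass the page cache on the storage servers
    return cmd


# The write phase runs first because it creates the files that the read phase uses.
for phase in ("write", "read"):
    print(" ".join(storagebench_cmd(phase, size="20G")))  # 20G is a placeholder
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without BeeGFS installed.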
Upon completion of the write test, we transitioned to the read test phase.
IOzone
To evaluate the cluster's performance under realistic workloads, we utilized IOzone to generate I/O from four client nodes at various I/O depths. This involved conducting both read and write workloads with a 1M block size and a file size of 16GB for each thread. Additionally, the -I option was specified to allow for direct I/O.
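The IOzone parameters above map onto a command line along the following lines. This is a sketch under assumptions: the paper does not give the exact invocation, so the use of cluster mode (-+m with a client list file), the file name clients.txt, and the choice of tests 0 (write) and 1 (read) are illustrative. The 1M record size, the 16GB file per thread, and the -I option come from the text.

```python
from typing import List

FILE_SIZE_GB = 16  # file size per thread, from the text


def iozone_args(threads: int, client_list: str = "clients.txt") -> List[str]:
    """Build an IOzone throughput-mode argument list for one thread count."""
    return [
        "iozone",
        "-+m", client_list,       # assumption: cluster mode driven by a client list file
        "-t", str(threads),       # throughput mode with this many threads
        "-i", "0", "-i", "1",     # test 0 = write/re-write, test 1 = read/re-read
        "-r", "1m",               # 1M record (block) size
        "-s", f"{FILE_SIZE_GB}g", # one 16GB file per thread
        "-I",                     # direct I/O
    ]


def total_data_gb(threads: int) -> int:
    """Aggregate amount of file data touched by one run."""
    return threads * FILE_SIZE_GB


print(" ".join(iozone_args(256)))
print(total_data_gb(256), "GB in total at 256 threads")
```

At 256 threads, each run works on 256 files of 16GB, which is 4096 GB of file data in total.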
Testing Results
The BeeGFS StorageBench benchmark, designed to measure raw storage performance, demonstrates impressive results under a RAID 5 protected environment. The read and write speeds observed during this benchmark peak at 130.35 GB/s and 70 GB/s, respectively, across four RAID 5 groups, as depicted in the chart titled "BeeGFS StorageBench Results vs. IOzone Results". StorageBench RAID 5 read performance approaches the theoretical performance limit, with read and write performance significantly higher than in the network-bottlenecked IOzone benchmark, showcasing the superior storage performance of SupremeRAID™.
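To put the StorageBench figures in context, the sketch below compares them with the raw aggregate bandwidth of the RAID 5 drives, taken from the note on theoretical limits (two sets of twelve 7 GB/s SSDs). Treating that aggregate as the ceiling for writes as well is a simplification, because RAID 5 writes also carry parity. The function names are hypothetical.

```python
SSD_GBPS = 7.0       # per-SSD bandwidth from the note on theoretical limits
SSDS_PER_SET = 12    # RAID 5 member drives per storage node
SETS = 2             # one set per storage node

READ_GBPS = 130.35   # measured StorageBench read
WRITE_GBPS = 70.0    # measured StorageBench write


def aggregate_limit_gbps() -> float:
    """Raw aggregate bandwidth of all RAID 5 member drives."""
    return SSD_GBPS * SSDS_PER_SET * SETS


def fraction_of_limit(measured_gbps: float) -> float:
    return measured_gbps / aggregate_limit_gbps()


print(f"aggregate limit: {aggregate_limit_gbps():.0f} GB/s")
print(f"read:  {fraction_of_limit(READ_GBPS):.1%} of the aggregate")
print(f"write: {fraction_of_limit(WRITE_GBPS):.1%} of the aggregate")
```

Under these assumptions the aggregate is 168 GB/s, so the measured read rate is roughly 78% of the raw drive bandwidth.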

In contrast, the IOzone benchmark simulates real-world client workloads, incorporating network transmission overhead. Although performance in this scenario is lower than the StorageBench results, it remains impressive. The read and write speeds reach 45.10 GB/s and 42.97 GB/s, respectively, with 256 threads. Importantly, these figures approach the theoretical limit of a 400G network (50 GB/s), suggesting SupremeRAID™ can almost fully utilize a 400G network (4 x 100G).
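The 50 GB/s network ceiling and the utilization implied by the IOzone results follow from simple arithmetic, sketched here. The conversion uses 8 bits per byte and ignores protocol overhead, which is an assumption. The function names are hypothetical.

```python
LINKS = 4            # 100G links in the data path
LINK_GBIT = 100      # line rate per link in Gb/s

IOZONE_READ_GBPS = 45.10
IOZONE_WRITE_GBPS = 42.97


def network_limit_gbytes(links: int = LINKS, link_gbit: int = LINK_GBIT) -> float:
    """Theoretical network limit in GB/s, ignoring protocol overhead."""
    return links * link_gbit / 8


def utilization(measured_gbytes: float, limit_gbytes: float) -> float:
    return measured_gbytes / limit_gbytes


limit = network_limit_gbytes()
print(f"network limit: {limit:.0f} GB/s")
print(f"read utilization:  {utilization(IOZONE_READ_GBPS, limit):.1%}")
print(f"write utilization: {utilization(IOZONE_WRITE_GBPS, limit):.1%}")
```

With these inputs, reads use about 90% and writes about 86% of the 400G network.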
Chart: BeeGFS StorageBench Results vs. IOzone Results
Chart: IOzone Read/Write Performance Across Varied Thread Counts
Summary
To summarize, SupremeRAID™ delivers high performance in both raw storage and real-world workload scenarios. As demonstrated by the BeeGFS StorageBench results, SupremeRAID™ achieves remarkably high storage performance levels under RAID 5 protection. Furthermore, the IOzone results reveal that SupremeRAID™ can efficiently handle real-world client workloads while making near-full use of high-speed network infrastructure.

When coupled with GIGABYTE S183-SH0, we can provide an extremely dense and efficient parallel filesystem solution. The ability to offer up to 398.32TB per 1U when all 32 bays are fully populated makes it an ideal solution for High-Performance Computing (HPC) and Artificial Intelligence (AI) applications. SupremeRAID™, in conjunction with GIGABYTE S183-SH0, brings together exceptional performance and maximum storage efficiency, establishing it as a premier choice for HPC/AI scenarios.
Conclusion
SupremeRAID™ by Graid Technology uses GPU-based acceleration to deliver extremely high RAID performance. Using SupremeRAID™ avoids the inherent performance limitations in other RAID products, including ASIC-based hardware RAID and CPU-based software RAID. SupremeRAID™ software version 1.5 further improves how efficiently SSD performance is utilized, delivering significant gains.
StorageBench and IOzone benchmark testing confirms high storage and BeeGFS performance when using SupremeRAID™ and GIGABYTE servers. StorageBench results demonstrate storage performance matching the aggregate of sixteen SSDs and BeeGFS performance approaching the theoretical limits of 400G.
The performance benefits of using SupremeRAID™ and GIGABYTE for BeeGFS include:
• Up to 130.35 GB/s storage performance.
• Up to 45.10 GB/s BeeGFS performance.
Deployment Details
For all Nodes
Install RHEL 8.8 on all servers.
Setup Networking
1. Install the OFED package on all servers.
2. Configure and start an InfiniBand subnet manager on a server.
3. Verify the InfiniBand (IB) status.
Storage Nodes
Install the SupremeRAID™ Drivers
1. Download the Pre-installer and Installer.
2. Run the pre-installer to install the necessary packages.
3. Execute the installer to install the SupremeRAID™ driver.
4. Apply the license key to activate the SupremeRAID™ service.
Install the BeeGFS Packages
1. Add the BeeGFS repo on all servers.
2. Install the BeeGFS packages.
Setup the RAID Array for BeeGFS
1. Verify the SSD NUMA location. Ensure that eight drives are from NUMA0 and eight from NUMA1.
2. Create 16 NVMe drives as physical drives.
3. Construct two RAID 1 groups and two RAID 5 groups.
4. Generate virtual drives for 2 MDSs and 6 OSSs.
5. Format the virtual drives with the appropriate file systems for MDS (ext4) and OSS (xfs).
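The NUMA check in step 1 can be sketched as follows. The sketch assumes a Linux host where each NVMe controller appears under /sys/class/nvme and exposes its PCI device's numa_node attribute. Any additional NVMe devices in the system, such as a boot drive, would also be counted, so the balance check is only a starting point. The function names are hypothetical.

```python
from collections import Counter
from pathlib import Path


def drives_per_numa(sysfs_root: str = "/sys/class/nvme") -> Counter:
    """Count NVMe controllers per NUMA node using sysfs."""
    counts = Counter()
    for ctrl in sorted(Path(sysfs_root).glob("nvme*")):
        node_file = ctrl / "device" / "numa_node"
        if node_file.is_file():
            counts[int(node_file.read_text().strip())] += 1
    return counts


def is_balanced(counts, per_node: int = 8, nodes=(0, 1)) -> bool:
    """True when every listed NUMA node holds exactly per_node drives and no others exist."""
    on_expected_nodes = all(counts.get(n, 0) == per_node for n in nodes)
    return on_expected_nodes and sum(counts.values()) == per_node * len(nodes)


if __name__ == "__main__":
    found = drives_per_numa()
    print(dict(found), "balanced" if is_balanced(found) else "not balanced")
```

On a storage node configured as described, the expected result is eight drives on NUMA node 0 and eight on NUMA node 1.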
Setup the BeeGFS Management Service
Setup Multiple BeeGFS MDSs and OSSs in the Storage Node

1. Create 2 MDS folders and 6 OSS folders.
2. Copy the beegfs-meta config file to the MDS folder.
3. Modify the beegfs-meta TCP/UDP port for each MDS to prevent port conflict.
4. Copy the beegfs-storage config file to the OSS folder.
5. Modify the beegfs-storage TCP/UDP port for each OSS to prevent port conflict.
6. Place the interface file in the /etc/beegfs folder.
7. Set the BeeGFS mount point.
8. Initialize the MDSs and OSSs.
9. Start the MDS and OSS services.
10. Open the required ports in the firewall.
11. Reload the firewall service.
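Because several MDS and OSS instances share one storage node, each needs its own port. The sketch below shows one possible way to plan the port settings for the 2 MDSs and 6 OSSs, starting from the BeeGFS default ports (8005 for beegfs-meta, 8003 for beegfs-storage). The stride of 100 between instances is an arbitrary assumption chosen so the instances do not collide with each other or with other default BeeGFS ports. The paper does not state which ports were actually used, and the function names are hypothetical.

```python
DEFAULT_PORTS = {"meta": 8005, "storage": 8003}   # BeeGFS default service ports
PORT_KEYS = {
    "meta": ("connMetaPortTCP", "connMetaPortUDP"),
    "storage": ("connStoragePortTCP", "connStoragePortUDP"),
}
STRIDE = 100  # assumption: spacing between instances of the same service


def port_settings(service: str, instance: int) -> dict:
    """Config entries for one service instance (instance numbering starts at 0)."""
    port = DEFAULT_PORTS[service] + instance * STRIDE
    tcp_key, udp_key = PORT_KEYS[service]
    return {tcp_key: port, udp_key: port}


def port_plan(n_meta: int = 2, n_storage: int = 6) -> dict:
    """Map (service, instance) to its port settings for one storage node."""
    plan = {}
    for service, count in (("meta", n_meta), ("storage", n_storage)):
        for instance in range(count):
            plan[(service, instance)] = port_settings(service, instance)
    return plan


for (service, instance), settings in port_plan().items():
    print(f"{service}{instance}: {settings}")
```

The distinct port numbers in the plan are also the ones that would need to be opened, for both TCP and UDP, in the firewall steps above.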
Client Nodes
Install the BeeGFS Packages
1. Add the BeeGFS repo on all servers.
2. Install the BeeGFS client packages.
Setup the BeeGFS Client
1. Configure the build option on the client servers in the beegfs-client-autobuild.conf file.
2. Enforce a rebuild of the client kernel modules.
3. Initialize the client service on the client servers.
4. Restart the BeeGFS client service.
BeeGFS Tuning
Object Storage Service
Metadata Service
Client
Filesystem
Introduction
SupremeRAID™
SupremeRAID™ next-generation GPU-accelerated RAID eliminates the traditional RAID bottleneck to unlock the full performance and value of your NVMe SSDs. As the world's fastest RAID cards for PCIe Gen 3, 4, and 5 servers, SupremeRAID™ is designed to deliver superior performance while increasing scalability, improving flexibility, and lowering the total cost of ownership (TCO). A single SupremeRAID™ card blasts performance to up to 28M IOPS and 260 GB/s.
• Flexible & Future Ready – Unmatched flexibility with features added using software only.
• World Record Performance – Delivers the speed to power high-performance applications.
• Liberate CPU Resources – Offloads RAID computations to the SupremeRAID™ GPU card.
• Plug & Play Capability – Add into any open PCIe slot with no cabling re-layout required.
• Highly Scalable Applications – Easily manage up to 32 direct-attached NVMe SSDs.
• User-Friendly Management – Doesn't rely on memory caching to improve performance.
BeeGFS and StorageBench
The BeeGFS parallel file system, developed and designed by ThinkParQ®, delivers high performance, ease of use, and simple management for performance-oriented environments and workloads. Typical examples include high-performance computing, artificial intelligence, media and entertainment, oil and gas, and life sciences. BeeGFS is often considered to be easier to deploy and manage than other parallel file systems on the market and includes the StorageBench storage benchmarking tool.
IOzone
The IOzone synthetic benchmark tests for file system performance using various operations, including read, re-read, write, re-write, and random mix. Testing occurs depending on the options specified using a command line, with numerous types and combinations of test operations supported.