AMD Instinct MI200 Series Platform | Solution

Accelerators are Here to Stay

Even the best processors on the market, such as the AMD EPYC™ 7003 series CPUs, need accelerators for HPC and data-intensive workloads.

What shifts are taking place in data centers?
A dedicated, powerful, parallel-processing accelerator is a necessity
Exascale computing is a reality due to scalability of systems
An open software platform leads the way into exascale and supercomputing systems with AMD ROCm 5.0
Data centers have a need for GPU PCIe cards and more powerful OAM form factor GPUs

GIGABYTE has created and tailored passively cooled servers for the AMD Instinct™ MI250 (OAM) while expanding support and testing for servers with the MI210 (PCIe). The new AMD Instinct MI200 series accelerators are designed for dense computing in our GIGABYTE 2U servers. This means the technology can scale to large computing clusters and can also support smaller deployments such as a single HPC server.

Fields that Benefit from High Parallel Processing

HPC & AI

HPC and AI go hand in hand. HPC provides the compute infrastructure, storage, and networking that lay the groundwork for training accurate and reliable AI models. Additionally, both HPC and AI workloads can choose from many numerical precisions.

Cloud

On-premises HPC continues to grow, yet cloud HPC is growing at a faster rate. By moving to the cloud, companies can quickly and easily use compute resources on demand. Cloud computing can also give access to the latest technology.

Engineering & Sciences

Big data and computational simulations are common needs of engineers and scientists. High parallel processing, low latency, and high bandwidth help create an environment for server virtualization.

AMD Instinct - Innovative Technology

AMD Innovations Pave the Way
AMD innovations in architecture, packaging, and integration are pushing the boundaries of computing by unifying the most important processors in the data center, the CPU and the GPU accelerator. With industry-first multi-chip GPU modules along with 3rd Gen AMD Infinity Architecture, AMD is delivering performance, efficiency and overall system throughput for HPC and AI using AMD EPYC™ CPUs and AMD Instinct™ MI200 series accelerators.

AMD Instinct™ MI200 series accelerators, powered by 2nd Gen AMD CDNA™ architecture, are built on an innovative multi-chip design to maximize throughput and power efficiency for the most demanding HPC and AI workloads.
With AMD CDNA™ 2, the MI250 has new Matrix Cores delivering up to 7.8X the peak theoretical FP64 performance of previous-generation AMD GPUs, and offers the industry's best aggregate peak theoretical memory bandwidth at 3.2 terabytes per second.
3rd Gen AMD Infinity Fabric™ technology enables direct CPU to GPU connectivity extending cache coherency and allowing a quick and simple on-ramp for CPU codes to tap the power of accelerators.
AMD Instinct MI250 accelerators with advanced GPU peer-to-peer I/O connectivity through eight AMD Infinity Fabric™ links deliver up to 800 GB/s of total aggregate theoretical bandwidth.
The Frontier supercomputer, one of the first exascale supercomputers, is the first to offer a unified compute architecture powered by AMD Infinity Platform™ based nodes.
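
The aggregate Infinity Fabric figure above can be broken down with simple arithmetic. The sketch below assumes the quoted 800 GB/s is split evenly across the eight links, which the text does not state; the function names are illustrative only.

```python
# Aggregate peer-to-peer bandwidth and link count as quoted in the text.
QUOTED_AGGREGATE_GBPS = 800  # GB/s
QUOTED_LINKS = 8             # AMD Infinity Fabric links


def per_link_bandwidth(aggregate_gbps: float, links: int) -> float:
    """Return the implied per-link bandwidth, assuming an even split (assumption)."""
    if links <= 0:
        raise ValueError("links must be positive")
    return aggregate_gbps / links


def aggregate_bandwidth(per_link_gbps: float, links: int) -> float:
    """Return the theoretical aggregate bandwidth for a given number of links."""
    if links < 0:
        raise ValueError("links must not be negative")
    return per_link_gbps * links


if __name__ == "__main__":
    per_link = per_link_bandwidth(QUOTED_AGGREGATE_GBPS, QUOTED_LINKS)
    print(f"Implied per-link bandwidth: {per_link:.0f} GB/s")
    for links in (1, 2, 4, 8):
        total = aggregate_bandwidth(per_link, links)
        print(f"{links} link(s): {total:.0f} GB/s theoretical aggregate")
```

These are peak theoretical numbers; measured throughput depends on topology and workload.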

Ecosystem without Borders
AMD ROCm™ is an open software platform allowing researchers to tap the power of AMD Instinct™ accelerators to drive scientific discoveries. The ROCm platform is built on the foundation of open portability, supporting environments across multiple accelerator vendors and architectures. With ROCm 5, AMD extends its open platform powering top HPC and AI applications with AMD Instinct™ MI200 Series accelerators, increasing accessibility of ROCm for developers and delivering leadership performance across key workloads.

About ROCm 5:

ROCm™ - Open software platform for science used worldwide on leading exascale and supercomputing systems
ROCm™ 5 extends the AMD open platform for HPC and AI with optimized compilers, libraries and runtime support for MI200
Open & Portable – ROCm™ open ecosystem supports heterogeneous environments with multiple GPU vendors and architectures
ROCm™ 5 library optimizations using new MI200 features: FP64 Matrix ops, reduced kernel launch overhead, Packed FP32 math and FP64 atomics support
ROCm™ upstream support of key industry frameworks: TensorFlow, PyTorch and ONNX-RT
AMD Infinity Hub gives researchers, data scientists and end-users a quick and easy way to find, download and install optimized containers for HPC apps and Machine Learning frameworks supported on AMD Instinct™ MI200 series accelerators and ROCm™.
AMD Infinity Hub: Ready-to-deploy software containers and guides for HPC, AI & Machine Learning.
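
Since the list above names PyTorch among the frameworks with upstream ROCm support, a quick way to check an installation is to ask PyTorch what it can see. This is a minimal sketch, assuming a ROCm build of PyTorch may or may not be installed; the helper name `describe_rocm_pytorch` is hypothetical. ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` namespace and set `torch.version.hip`.

```python
def describe_rocm_pytorch() -> dict:
    """Report whether a ROCm-enabled PyTorch build can see any accelerators."""
    info = {"torch_installed": False, "hip_version": None, "devices": []}
    try:
        import torch  # optional dependency, not part of the standard library
    except ImportError:
        return info

    info["torch_installed"] = True
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    info["hip_version"] = getattr(torch.version, "hip", None)
    # On ROCm builds, AMD GPUs are reported through the torch.cuda API.
    if torch.cuda.is_available():
        info["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return info


if __name__ == "__main__":
    print(describe_rocm_pytorch())
```

On a machine without PyTorch the function simply reports that nothing was found, so it is safe to run before any drivers or containers are set up.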

G262-ZO0 with AMD EPYC CPUs and AMD Instinct Accelerators

Benefits of the MI200 Series Accelerators

High-performance

The MI200 series leads in HPC and AI performance across a range of math precisions. Peak FP64/FP32 Vector: 45.3 TFLOPS for the MI250 and 22.6 TFLOPS for the MI210.

Scalability

The high memory bandwidth and HBM2e memory capacity in the MI210 and MI250 (multi-chip module) allow for a high degree of scalability.

Connections

The MI250 supports up to six AMD Infinity Fabric links and the MI210 up to three, for fast GPU peer-to-peer and GPU-to-CPU communication.

Leadership

The MI200 series accelerators are the industry's first multi-chip GPUs, built for excellent throughput and efficiency. The Frontier supercomputer chose the MI250.

All AI Workloads

The MI200 accelerator is optimized for BF16, INT4, INT8, FP16, FP32, and FP32 Matrix to meet all your AI system requirements.
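
To make the trade-off between the floating-point precisions listed above concrete, the sketch below rounds one value to FP32, FP16, and an emulated BF16 using only the Python standard library. It runs on the CPU and says nothing about accelerator performance. The BF16 emulation truncates the FP32 encoding instead of rounding to nearest, which is a simplifying assumption, and the integer formats (INT4, INT8) are not covered.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Round a Python float (FP64) to the given IEEE 754 format and back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def to_bf16(value: float) -> float:
    """Emulate BF16 by keeping only the top 16 bits of the FP32 encoding.

    This truncates rather than rounds to nearest (a simplification).
    """
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    x = 0.1
    results = {
        "FP64": x,
        "FP32": round_trip(x, "<f"),
        "FP16": round_trip(x, "<e"),
        "BF16 (truncated)": to_bf16(x),
    }
    for name, value in results.items():
        print(f"{name:>17}: {value:.12f}  abs error {abs(value - x):.3e}")
```

Lower precisions trade accuracy for throughput and memory footprint, which is why training and inference workloads often mix several of them.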

AMD Instinct™ MI250 Accelerator

Model	MI250 OAM
Compute Units	208 CU
Stream Processors	13,312
Peak FP64/FP32 Matrix	90.5 TF
Peak FP64/FP32 Vector	45.3 TF
Peak FP16/BF16	362.1 TF
Memory Size	128GB HBM2e
Memory Clock	1.6GHz
Memory Bandwidth	Up to 3.2 TB/sec
Bus Interface	PCIe Gen4
Infinity Fabric Links	Up to 6
Max Power	560W TDP
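
The memory bandwidth in the table can be related to the listed memory clock with a short calculation. This is a sketch under stated assumptions: the table does not give the memory bus width, so the 8192-bit (MI250) and 4096-bit (MI210) values used below are assumptions, as is the double-data-rate factor of two transfers per clock.

```python
def hbm_bandwidth_tbs(clock_ghz: float, bus_width_bits: int,
                      transfers_per_clock: int = 2) -> float:
    """Return peak theoretical memory bandwidth in TB/s.

    bandwidth = clock * transfers per clock * bus width in bytes
    """
    bytes_per_transfer = bus_width_bits / 8
    gb_per_second = clock_ghz * transfers_per_clock * bytes_per_transfer
    return gb_per_second / 1000.0


if __name__ == "__main__":
    # 1.6 GHz comes from the table; the bus widths are assumptions.
    print(f"8192-bit bus: {hbm_bandwidth_tbs(1.6, 8192):.2f} TB/s")
    print(f"4096-bit bus: {hbm_bandwidth_tbs(1.6, 4096):.2f} TB/s")
```

Under these assumptions the results land slightly above the "up to" figures quoted for the accelerators, which is consistent with the published numbers being rounded down.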

AMD Instinct™ MI210 Accelerator

Model	MI210 PCIe
Compute Units	104 CU
Stream Processors	6,656
Peak FP64/FP32 Matrix	45.3 TF
Peak FP64/FP32 Vector	22.6 TF
Peak FP16/BF16	181.0 TF
Memory Size	64GB HBM2e
Memory Clock	1.6GHz
Memory Bandwidth	Up to 1.6 TB/sec
Bus Interface	PCIe Gen4
Infinity Fabric Links	Up to 3
Max Power	300W TDP
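
The two specification tables can be compared side by side with a few lines of arithmetic. The sketch below copies the peak theoretical figures from the tables and derives MI250-to-MI210 ratios plus a rough peak-per-watt figure. Dividing peak throughput by listed max power is an illustrative metric chosen here, not one the source reports, and the dictionary keys are hypothetical names.

```python
# Peak theoretical figures copied from the MI250 and MI210 tables above.
SPECS = {
    "MI250": {"fp64_vector_tf": 45.3, "fp64_matrix_tf": 90.5, "fp16_tf": 362.1,
              "memory_gb": 128, "memory_bw_tbs": 3.2, "tdp_w": 560},
    "MI210": {"fp64_vector_tf": 22.6, "fp64_matrix_tf": 45.3, "fp16_tf": 181.0,
              "memory_gb": 64, "memory_bw_tbs": 1.6, "tdp_w": 300},
}


def ratio(metric: str) -> float:
    """Return the MI250 value divided by the MI210 value for one metric."""
    return SPECS["MI250"][metric] / SPECS["MI210"][metric]


def gflops_per_watt(model: str, metric: str = "fp64_vector_tf") -> float:
    """Return peak theoretical GFLOPS per watt of listed max power."""
    spec = SPECS[model]
    return spec[metric] * 1000.0 / spec["tdp_w"]


if __name__ == "__main__":
    for metric in SPECS["MI250"]:
        print(f"{metric:>15}: MI250 / MI210 = {ratio(metric):.2f}")
    for model in SPECS:
        value = gflops_per_watt(model)
        print(f"{model}: {value:.1f} peak FP64 vector GFLOPS per watt")
```

The compute, memory size, and memory bandwidth ratios all come out at about two, matching the MI250 having twice the compute units of the MI210 in the tables.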