Besides the central processing unit (CPU), the graphics processing unit (GPU) is also an important part of a high-performing server. Do you know how a GPU works and how it is different from a CPU? Do you know the best way to make them work together to deliver unrivalled processing power? GIGABYTE Technology, an industry leader in server solutions that support the most advanced processors, is pleased to present our latest Tech Guide. We will explain the differences between CPUs and GPUs; we will also introduce GIGABYTE products that will help you inject GPU computing into your server rooms and data centers.
One of the most pivotal breakthroughs in modern computing is the realization that processors can be built differently to make them better suited to specific tasks. As we talked about in our Tech Guide about server processors, the server’s central processing unit, or CPU, is designed to carry out “instruction cycles” that comprise the server’s primary workload. This can range from hosting a webpage to analyzing signals from outer space that will help scientists find a second Earth. Naturally, people started wondering—is there a way to make more specialized processors for different kinds of workloads?
Let’s first explain the basics of a CPU. A CPU is a computer chip made up of hundreds of millions, if not billions, of transistors. These transistors are used to perform calculations: the aforementioned instruction cycles. From start to finish, an instruction cycle can be broken down into four separate steps: fetch, decode, execute, and write-back. Based on which part of the calculating process the transistors are involved in, they are assigned different functions in the CPU, such as the program counter, the CPU register, the arithmetic logic unit (ALU), etc. Since a CPU is general-purpose and must be capable of completing any instruction cycle, enough computing resources must be allocated to each function, so that the chip is not pigeonholed into only being capable of filling a limited number of roles.
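To make the four steps concrete, here is a toy model in Python. It is a minimal sketch and does not describe any real CPU: the three-instruction set (`LOAD`, `ADD`, `HALT`), the register names, and the `run` function are all assumptions invented for this illustration.

```python
# Toy model of the fetch / decode / execute / write-back cycle.
# The instruction set and register names are hypothetical, for illustration only.

def run(program):
    registers = {"r0": 0, "r1": 0, "r2": 0}
    pc = 0  # program counter: index of the next instruction

    while True:
        # 1. Fetch: read the instruction the program counter points to.
        instruction = program[pc]
        pc += 1

        # 2. Decode: split the instruction into an operation and its operands.
        op, *operands = instruction

        # 3. Execute: the "ALU" computes a result (or the cycle stops).
        if op == "HALT":
            return registers
        if op == "LOAD":        # LOAD dest, constant
            dest, value = operands
            result = value
        elif op == "ADD":       # ADD dest, src_a, src_b
            dest, src_a, src_b = operands
            result = registers[src_a] + registers[src_b]
        else:
            raise ValueError(f"unknown operation: {op}")

        # 4. Write-back: store the result in the destination register.
        registers[dest] = result


program = [
    ("LOAD", "r0", 2),
    ("LOAD", "r1", 3),
    ("ADD", "r2", "r0", "r1"),
    ("HALT",),
]
final_registers = run(program)
print(final_registers)
```

Every instruction, however simple, passes through all four steps one after another, which is why a general-purpose chip needs resources for each of them.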
The Origin of the GPU
But what if that’s exactly the kind of chip we want—something that’s very good at some very specific tasks, such as generating computer graphics? Enter the graphics processing unit, or GPU for short. As the name suggests, it was originally invented to help render images on display devices. Mechanically, a GPU is very similar to a CPU, in that it is made up of many of the same components, such as the ALU, the control unit, the cache, etc. Like the CPU, the GPU completes instruction cycles by using its components to perform the calculations necessary to deliver results.
Where the two differ is the design and configuration of their analogous components. Compared with a CPU, a GPU tends to have more cores, less complicated control units and ALUs, and smaller caches. While the structure of a CPU allows it to excel at serial processing (read: completing one instruction cycle at a time), a GPU takes advantage of its large number of cores to engage in parallel computing, which is the practice of breaking a task down into multiple parts and running them simultaneously on multiple cores to accelerate processing. The upshot of this is that while a CPU can theoretically complete any task, a GPU can complete some simpler, more specific tasks—such as creating computer graphics—very efficiently.
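The idea of breaking a task into parts and running them at the same time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `brighten` operation, the helper names, and the four worker threads are invented for the example, and the threads merely stand in for GPU cores. The sketch shows how the work is decomposed; it does not demonstrate GPU-level speed.

```python
# Split one task into parts and process the parts concurrently.
# Worker threads stand in for GPU cores; this shows the decomposition, not GPU speed.
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel, amount=10):
    """A simple, repetitive per-pixel operation (values clamped to 255)."""
    return min(pixel + amount, 255)


def brighten_chunk(chunk):
    return [brighten(p) for p in chunk]


def split(data, parts):
    """Cut data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def brighten_parallel(pixels, workers=4):
    chunks = split(pixels, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so the pieces reassemble correctly.
        results = pool.map(brighten_chunk, chunks)
        return [p for chunk in results for p in chunk]


pixels = list(range(0, 256, 5))  # 52 grayscale values
serial = [brighten(p) for p in pixels]
parallel = brighten_parallel(pixels)
print(serial == parallel)
```

The split works because no pixel depends on any other pixel. Tasks with that property are the ones a GPU handles well.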
More astute readers may already have guessed the next logical step. People discovered that rendering graphics was not the only thing the GPU was good for. A general-purpose graphics processing unit, or GPGPU, is used to support the CPU in tasks other than generating images. In the server industry, when we compare GPUs to CPUs or talk about GPU computing, it is generally in reference to GPGPUs.
What are the Key Differences between CPU and GPU?
To understand why an advanced server needs both top-line CPUs and GPUs to progress to the upper echelons of high performance computing (HPC), we need to delve deeper into how the two types of processors are similar but different, and how they complement one another.《Glossary: What is HPC?》
So, right off the bat, it’s important to explain that no server can operate without a CPU, just as no car can run without an engine. The CPU is the main “calculator” that receives your instructions and works with all the other components to get the job done. It tends to have a higher clock rate and lower latency, its ALUs and control units are more complex, and it has a bigger cache. This ensures that the CPU has the capacity and flexibility to run any kind of calculation and complete even the most complicated instruction cycles.
The components that make up CPUs and GPUs are analogous; they both comprise control units, arithmetic logic units (ALU), caches, and DRAM. The main difference is that GPUs have smaller, simpler control units, ALUs, and caches—and a lot of them. So while a CPU can handle any task, a GPU can complete certain specific tasks very quickly.
Where the CPU runs into trouble is when it is bogged down by a deluge of relatively simple but time-consuming tasks. It is like asking a head chef to flip a hundred burgers at a greasy spoon. They can do it no problem, and they can do it well, but in the meantime the entire kitchen is idly waiting for the big cheese to come back and tell them what to do! A GPU, on the other hand, has smaller caches, simpler ALUs and control units, but higher throughput, and also cores for days. It was designed to help the CPU complete uncomplicated but repetitive calculations very quickly—like some kind of burger-flipping machine. It is ideally suited for generating every pixel of computer graphics we see on screen.
In effect, we can summarize four key differences between CPUs and GPUs, as below:
- Core count: Due to its complexity, a CPU generally has fewer cores, with even ARM CPUs based on the RISC architecture topping out in the low hundreds. The GPU’s simpler design allows it to have up to thousands of cores.
- Components: Because a CPU is designed to handle every kind of calculation, it is equipped with a comprehensive toolset, including a bigger cache to store data, and more intricate control units and ALUs. A GPU only needs small caches and basic control units and ALUs to complete the simple, repetitive tasks that are its forte.
- Operation: Since a CPU carries out instruction cycles sequentially—that is, one at a time—it is usually designed with higher clock rates and lower latency to expedite the process. A GPU, however, achieves acceleration through parallel computing, which means it breaks down a task into smaller parts and executes them concurrently on multiple cores.
- Specialty: It’d be a waste to ask the CPU to carry out easy instruction cycles, because it is designed to handle every kind of calculation, no matter how complicated. The GPU, on the other hand, was expressly designed for simple calculations, such as the type necessary to project images onto our computer screens.
The table above is a handy comparison of CPUs and GPUs. Through heterogeneous computing, different computing tasks can be allocated to the most suitable processors. This will help to accelerate computing speed and make sure you squeeze every drop of performance out of your server.
What people have come to realize about the GPU is that its attributes make it perfect for big data analysis, machine learning, AI development, and other important trends in computer science. To use computer vision as an example, the AI model is capable of reading license plates and even handwriting because it was trained on a deluge of graphical data: it practiced guessing what’s in a picture by looking at millions, even billions of pictures, until its parameters were well adjusted. While AI training can be done with CPUs alone, it is far less efficient without the help of GPUs, which excel at processing a lot of uncomplicated data very quickly through parallel computing. Hence, GPUs are essential to AI training and inference, and the most powerful AI servers on the market predominantly feature GPU modules and accelerators that help the CPUs develop the AI models with the greatest efficiency.
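The "practice guessing until the parameters are adjusted" loop can be shown at a very small scale. The sketch below is an assumption-laden illustration, not how any production model is trained: the one-parameter model, the made-up data (where the correct answer is always twice the input), the learning rate, and the choice of plain gradient descent are ours.

```python
# Minimal training loop: adjust one parameter until predictions match the data.
# The data, model, and learning rate are made up for illustration.

examples = [(x, 2.0 * x) for x in range(1, 6)]  # (input, correct answer) pairs


def gradient_for_example(weight, example):
    """How the squared error on one example changes with the weight."""
    x, target = example
    prediction = weight * x
    return 2.0 * (prediction - target) * x


def train(examples, steps=200, learning_rate=0.01):
    weight = 0.0  # initial guess
    for _ in range(steps):
        # Each example's gradient is independent of the others, which is
        # the kind of work a GPU can spread across many cores.
        gradients = [gradient_for_example(weight, ex) for ex in examples]
        weight -= learning_rate * sum(gradients) / len(gradients)
    return weight


weight = train(examples)
print(round(weight, 3))
```

With five examples the list of gradients is trivial to compute. With billions of pictures and millions of parameters, the same independent, repetitive arithmetic is what makes parallel hardware worthwhile.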
How to Use Both CPUs and GPUs to Achieve Maximum Performance?
At this point in the Tech Guide, it should be pretty clear that while GPUs cannot operate without a CPU, and CPUs are in theory capable of running the whole show by themselves, the smartest way to go about things is to have an optimal configuration of both CPUs and GPUs to handle the tasks that they are better suited for. Collaboration is the key here: rather than having a single polymath trying to do everything by their lonesome, or a gaggle of savants struggling with a job outside of their areas of expertise, the wise employer keeps both types on the payroll and marries the best of both worlds for maximum results.
A phrase that is often passed about in the server industry is “heterogeneous computing”. It describes the process we have just explained: the leveraging of different kinds of processors to make sure the right tools are being used for any given task. This is the tried-and-true method of squeezing every drop of performance out of a server. In fact, there are many other types of computer chips besides the CPU, the GPU, and the data processing unit (DPU). For example, there is the vision processing unit (VPU), which shines in computations related to computer vision; the application-specific integrated circuit (ASIC); and its opposite number, the field-programmable gate array (FPGA). It has gotten to the point that if you have not adopted heterogeneous computing in your server solution, then chances are, your server is not performing as well as it could.
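In software terms, heterogeneous computing comes down to a routing decision: which processor should handle which task? The Python sketch below illustrates that decision only. The `Task` fields, the `choose_processor` function, and the threshold of 1,000 independent items are hypothetical choices made for this example; real schedulers weigh many more factors, such as data transfer cost and memory capacity.

```python
# Sketch of heterogeneous scheduling: send each task to the processor type
# that suits it. Task fields and the routing rule are hypothetical.
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    independent_items: int   # how many pieces can be computed at the same time
    branch_heavy: bool       # True if the work is dominated by complex control flow


def choose_processor(task, gpu_threshold=1000):
    """Route large, uniform, parallel workloads to the GPU; everything else to the CPU."""
    if task.independent_items >= gpu_threshold and not task.branch_heavy:
        return "GPU"
    return "CPU"


tasks = [
    # 1920 x 1080 pixels, each computed independently
    Task("render frame", independent_items=2_073_600, branch_heavy=False),
    Task("parse config file", independent_items=1, branch_heavy=True),
    Task("train neural network batch", independent_items=50_000, branch_heavy=False),
    Task("run database transaction", independent_items=10, branch_heavy=True),
]

assignments = {task.name: choose_processor(task) for task in tasks}
for name, processor in assignments.items():
    print(f"{name}: {processor}")
```

The rule mirrors the comparison made earlier in this guide: many simple, independent calculations go to the GPU, while complicated, sequential work stays on the CPU.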
How to Inject GPU Computing into Your Server Solution?
We have now come to everyone’s favorite part of our Tech Guides. This is where we tell you how you can get both CPUs and GPUs to work together for you to exceed your expectations.
GIGABYTE Technology has a wide range of server solutions designed to support GPU computing. Foremost among them are the G-Series GPU Servers; it is right there in the name. The G-Series offers the dual benefits of a high number of GPU slots and blazing-fast data transmission thanks to PCIe technology. To use the showstopping G591-HS0 as an example, this gem offers up to 32 low-profile half-length GPU slots in a 5U chassis (each U is a rack unit measuring 1.75 inches high). Its CPU is the Intel® Xeon® Scalable processor. The combination of the CPU’s considerable processing power with cutting-edge GPU acceleration makes it abundantly clear why GPUs have become a mainstay of the supercomputing sector.
Of course, the G-Series is not GIGABYTE’s only server product line suited for GPU computing. The H-Series High Density Servers, which specialize in extremely dense processor configurations; the general-purpose R-Series Rack Servers; the compact E-Series Edge Servers for edge computing; and the W-Series Workstations that allow users to deploy enterprise-grade computing right on the desktop—all these models support a varying number of GPU accelerators depending on the customers’ needs.《Glossary: What is Edge Computing?》
Another way to gauge if you can profit from adding GPUs into the mix is by looking at what you will use your servers for. Obviously, there are unlimited tasks you could be doing with your servers, so there is no way to list them all here. What we can do, however, is share how industry leaders in different sectors are bolstering their CPUs with GPUs, so we may glean some insight from their success stories.
By injecting GPU computing into your server solutions, you will benefit from better overall performance. GIGABYTE Technology offers a variety of server products that are the ideal platforms for utilizing advanced CPUs and GPUs. They are used in AI and big data applications in weather forecasting, energy exploration, scientific research, etc.
CERN: GPU Computing for Data Analysis
The European Organization for Nuclear Research (CERN) uses the world-famous Large Hadron Collider (LHC) to conduct subatomic particle experiments that advance human knowledge in the field of quantum physics. The challenge they run into is what happens when the LHC slams particles against one another. A lot of raw data is generated—upwards of 40 terabytes every second. This information must be analyzed to help scientists detect new types of quarks and other elementary particles that are the building blocks of our universe.
CERN chose GIGABYTE’s G482-Z51, a GPU Server which supports AMD EPYC™ CPUs and up to 8 PCIe Gen 4.0 GPUs, to crunch the huge amount of data generated by their experiments. Heterogeneous computing between the processors is enhanced by GIGABYTE’s integrated server design, which maximizes signal integrity by minimizing signal loss in high-speed transmissions. This results in a server solution that features higher bandwidth, lower latency, and unsurpassed reliability.
Another data analysis-related example is the story of a renowned French geosciences research company, which provides geophysical data imaging and seismic data analysis services for customers in the energy sector. To quickly and accurately analyze the complex 2D and 3D images gathered during geological surveys, the client chose GIGABYTE’s industry-leading G291-280, which supports AMD EPYC™ CPUs and an ultra-dense configuration of up to 8 dual-slot GPUs in a 2U chassis. The GPU Server was deployed with revolutionary immersion cooling technology to further unlock the processors’ full potential while reducing power consumption and carbon emission.
Waseda University: GPU Computing for Computer Simulations and Machine Learning
Japan’s Waseda University, known in academic circles as the “Center for Disaster Prevention around the World”, uses GIGABYTE servers in a computing cluster to run simulations on extreme weather phenomena, such as typhoons or tsunamis. These simulations can predict the impact of an approaching storm and help the government come up with a meticulous response plan. The servers can analyze a sea of meteorological data and generate simulations that are so detailed, each individual resident in an affected area can be represented in the computer model.
Waseda University used GIGABYTE’s G221-Z30 GPU Server and W291-Z00 Workstation to build the cluster. The G221-Z30 was outfitted with AMD EPYC™ CPUs, GPU accelerators, and a massive amount of memory and storage so it could fulfill the role of the control node. Not only did the GIGABYTE cluster help to improve prediction accuracy, it also reduced the time it took to complete a simulation by as much as 75%. Thanks to parallel computing, multiple simulations could be carried out simultaneously. The servers also supported applications that involved computer vision, machine learning, neural network technology, and Long Short-Term Memory (LSTM).《Glossary: What is Node?》
Cheng Kung University: GPU Computing for Artificial Intelligence
If you are curious how GPUs and CPUs can work together to accelerate AI development, the story of the award-winning supercomputing team at National Cheng Kung University (NCKU) will be a worthwhile read. As part of GIGABYTE’s long-term commitment to corporate social responsibility (CSR) and environmental, social, and corporate governance (ESG), GIGABYTE provided G482-Z50 GPU Servers to help NCKU’s student team practice for the APAC HPC-AI Competition in 2020.
A part of the competition required contestants to make a breakthrough in international NLP (natural language processing) records by using BERT, a machine learning technique developed by Google. GIGABYTE’s G482-Z50 was the ideal tool because it can support up to 10 PCIe Gen 3.0 GPU cards; each of the server’s dual CPUs is connected to 5 GPUs through a PCIe switch, which minimizes communication latency. NCKU outfitted their GIGABYTE servers with GPU accelerators made by NVIDIA. They ultimately achieved an accuracy rate of 87.7% using BERT, which was higher than what had been achieved by the University of California, San Diego (87.2%) and Stanford University (87.16%). Needless to say, they took home the gold.《Glossary: What is Natural Language Processing?》
We hope this Tech Guide has been able to explain the differences between CPUs and GPUs; we also hope it has clarified why you should consider utilizing them both through heterogeneous computing if you work with AI, HPC, or other exciting trends in computer science. If you are looking for server solutions that can help you benefit from the most advanced CPUs and GPUs, talk to GIGABYTE! We encourage you to reach out to our sales representatives at marketing@gigacomputing.com for consultation.