GIGABYTE’s ARM Server Boosts Development of Smart Traffic Solution by 200%
A team of scientists at NTU has adopted GIGABYTE’s G242-P32 server and the Arm HPC Developer Kit to incubate a “high-precision traffic flow model”, a smart traffic solution that can be used to test autonomous vehicles and identify accident-prone road sections for immediate redress. By the team’s estimate, the ARM-based solution cuts development time by at least half, raising efficiency to 200% of its previous level. This is thanks to the cloud-native processor architecture that “speaks” the same coding language as the roadside sensors, the high number of CPU cores that excel at parallel computing, the synergy with GPUs that enables heterogeneous computing, and the ISO certifications which make the resulting model easily deployable for automakers and government regulators alike.
Dr. Chi-Sheng Shih, Professor and Director at the Graduate Institute of Networking and Multimedia at National Taiwan University (NTU), is leading a team of scientists to develop a “high-precision traffic flow model” of Taiwan’s roads and highways. The benefits of such a model are twofold. One, developers of autonomous vehicles and advanced driver-assistance systems (ADAS) can conduct simulations to test their creations, while government regulators can run safety checks before greenlighting a new product. Two, existing “accident-prone road sections” (locations which exhibit a higher frequency and greater severity of vehicular accidents) can be quickly identified, so steps can be taken to prevent more accidents and save lives. The model is already being tested on roads in northern and central Taiwan. The team is in talks with Tier IV, Inc., a deep-tech startup based in Japan, about incorporating the finished product into Autoware, the world’s leading open-source software project for autonomous driving; this will pave the way for broader adoption and the possibility of commercialization.
How are Dr. Shih and his team developing the model? First, three or four sensor packets, each composed of a lidar and three cameras, are installed along a stretch of road around a hundred to two hundred meters long. During each batch of testing, the sensors gather data from the traffic flow for a duration of around two hours. Data points include the number of vehicles, vehicular speed, and the distance between vehicles, among others. Then, the data is taken back to the computer lab to be processed. The end result is a highly precise computer model that shows intricate details about the traffic flow; it is a kind of digital twin that can be used for mobility simulation and modeling, which is a key component of a smart traffic solution.
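As a rough illustration of the data points mentioned above, the following Python sketch summarizes one snapshot of a lane: the number of vehicles, their mean speed, and the gaps between consecutive vehicles. The `VehicleObservation` record, its fields, and the sample values are hypothetical; the article does not describe the team's actual data format or processing code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class VehicleObservation:
    """One detected vehicle in a single snapshot (hypothetical record layout)."""
    position_m: float  # distance along the monitored stretch of road, in meters
    speed_kmh: float   # measured speed, in km/h


def summarize_snapshot(observations):
    """Return vehicle count, mean speed, and gaps between consecutive vehicles."""
    # Order vehicles along the road so that neighbors are adjacent in the list.
    ordered = sorted(observations, key=lambda o: o.position_m)
    # Gap = distance from each vehicle to the next one ahead of it.
    gaps = [b.position_m - a.position_m for a, b in zip(ordered, ordered[1:])]
    return {
        "vehicle_count": len(ordered),
        "mean_speed_kmh": mean(o.speed_kmh for o in ordered) if ordered else None,
        "gaps_m": gaps,
    }


# Made-up sample values, for illustration only.
snapshot = [
    VehicleObservation(position_m=75.0, speed_kmh=70.0),
    VehicleObservation(position_m=0.0, speed_kmh=50.0),
    VehicleObservation(position_m=30.0, speed_kmh=60.0),
]
print(summarize_snapshot(snapshot))
```

A real pipeline would repeat this kind of summary over every snapshot in the two-hour recording before feeding the results into the model.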
“Our goal is to serve as the Qianliyan and Shunfeng'er of autonomous vehicles,” says Dr. Shih, citing two deities from Chinese mythology known for their far-seeing eyes and all-hearing ears. Not only can the computer model improve the positioning accuracy and safety of self-driving cars, it can also be used to analyze and fine-tune traffic flow, which is beneficial for all vehicles, autonomous or otherwise.
In 2021, Dr. Shih’s team welcomed a valuable new member: NVIDIA’s Arm HPC Developer Kit, an integrated hardware and software platform for creating, evaluating, and benchmarking HPC, AI, and scientific computing applications. At the core of this comprehensive solution is GIGABYTE Technology’s G242-P32, a G-Series GPU Server powered by a single ARM-based Ampere® Altra® Processor.
Its contribution to the research project has been remarkable. By Dr. Shih’s estimates, development time has been reduced by at least half; in other words, the team now works at 200% of its previous efficiency. The scientists have taken to calling GIGABYTE’s ARM-based solution a “machine learning multicooker”: an all-in-one solution that can train the AI, develop the computer model, transfer the data, and more. It is a real boon to the advancement of the traffic flow model, and it has made the team’s work considerably easier.
How have GIGABYTE’s G242-P32 and the Arm HPC DevKit been able to accomplish all this? The four main benefits can be summarized as follows:
1. The cloud-native ARM processor follows the same RISC architecture as the computer chips used in the roadside devices.
2. The Ampere® Altra® CPU has an immense number of cores (up to 80 in a single processor), which makes it eminently suitable for parallel computing.
3. The DevKit is outfitted with dual NVIDIA® A100 GPUs, which complement the CPU through a process known as heterogeneous computing. What’s more, the 8-channel memory architecture, with up to 512GB of DDR4, provides the necessary bandwidth to handle the high data transfer rate.
4. The ARM solution observes the ISO 26262 safety standard, which means computer models developed with ARM can be easily deployed by companies and institutes in the auto industry.
Benefit #1: Cloud-native ARM Processors Based on the RISC Architecture
The roadside sensor packets use industrial PCs (IPCs) that run on ARM processors, which follow the RISC instruction set architecture (ISA, or architecture for short). This is nothing out of the ordinary: due to their lower power draw and better energy efficiency, ARM processors are widely used in mobile and edge devices, making ARM the most popular type of computer chip on the planet. However, before the introduction of the Arm HPC DevKit, the computer lab at NTU used servers that were based on conventional x86 processors, which follow the CISC architecture. The upshot was that the team had to write two sets of code: one for the roadside devices, one for the data center back home. Needless to say, this ate up precious time and made the pipeline only half as efficient as it could be.
At the heart of the Arm HPC DevKit is the GIGABYTE G242-P32. This 2U GPU Server is powered by a single Ampere® Altra® Processor, which is a cloud-native ARM CPU that follows the same ISA as most mobile and edge devices. This drastically improves the efficiency of developing computer models or programs which can be used outside of conventional data centers—such as the roads and highways of Taiwan.
The Arm HPC DevKit changed all this. Because GIGABYTE’s G242-P32 runs on an ARM processor, the roadside IPCs used for testing and the server used to develop the model at last spoke the same “language”. There was no longer any need to write two sets of code; the same RISC code used in the field could be leveraged in the data center. What’s more, because the finished product will be deployed on ARM-based roadside devices, it is imperative that the ISA of the development environment matches that of the application environment.
“We used a compiler to translate our programs from the original x86 system to the new ARM system, but we were able to complete 90% to 95% of the transfer within a month,” says Dr. Shih. “The new ARM solution has greatly improved our DevOps process and reduced our development time by at least half.”
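To make the point about matching ISAs concrete, here is a minimal Python sketch that checks whether a development machine and a deployment target belong to the same ISA family. It relies on the standard library's `platform.machine()`, which typically reports strings such as `aarch64` or `arm64` on 64-bit ARM systems and `x86_64` or `AMD64` on x86 systems. The mapping table and the helper names `isa_family` and `same_isa` are illustrative assumptions, not part of the team's tooling.

```python
import platform

# Common machine strings mapped to an ISA family (illustrative, not exhaustive).
_ISA_FAMILIES = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "x86_64",
    "amd64": "x86_64",
}


def isa_family(machine=None):
    """Normalize a machine string; defaults to the machine running this code."""
    name = (machine or platform.machine()).lower()
    # Unknown strings are returned unchanged so they only match themselves.
    return _ISA_FAMILIES.get(name, name)


def same_isa(dev_machine, target_machine):
    """True if the development and deployment machines share an ISA family."""
    return isa_family(dev_machine) == isa_family(target_machine)


print("This machine:", isa_family())
print("x86 lab server vs. ARM roadside IPC:", same_isa("x86_64", "aarch64"))
print("ARM lab server vs. ARM roadside IPC:", same_isa("aarch64", "aarch64"))
```

When the two families match, binaries built and tested in the lab can run on the roadside devices without a separate cross-compilation step, which is the situation the team describes after moving to the ARM server.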
Benefit #2: An Incredibly High Core Count that’s Ideal for Parallel Computing
The single-socket Ampere® Altra® Processor inside the G242-P32 has another distinct advantage: it boasts an unusually high number of cores, up to 80 in a single CPU. In comparison, top-line x86 processors currently max out at around 64 cores per processor. Having the workload distributed to a greater number of smaller, more energy-efficient cores is one of the reasons why ARM generally offers better performance per watt of power. This is a game-changer for the development of the “high-precision traffic flow model”, because it benefits substantially from task parallelism.
Dr. Shih explains: “In the roughly two hours it takes to finish a batch of roadside tests, we usually generate around 360 gigabytes of raw data. We run twelve programs simultaneously and connect to fifty separate files to compare, calibrate, and compile the data points. Our old system did not have enough cores to support this, and it caused my team quite a headache.”
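The workload Dr. Shih describes is a natural fit for task parallelism: many independent jobs that can each occupy their own core. The Python sketch below fans jobs out across worker processes using the standard library's `concurrent.futures`. Pairing every program with every file is an assumption made for illustration (the article does not say how the twelve programs and fifty files relate), and `run_program` is a hypothetical stand-in for real analysis code.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from itertools import product


def run_program(job):
    """Stand-in for one analysis program applied to one data file."""
    program_id, file_name = job
    # Placeholder work: a real job would open the file and process its contents.
    return (program_id, file_name, len(file_name))


def run_batch(program_ids, file_names, executor_cls=ProcessPoolExecutor,
              max_workers=None):
    """Run every (program, file) job in parallel and collect results in order."""
    jobs = list(product(program_ids, file_names))
    # Default to one worker per available CPU core.
    workers = max_workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        # map() preserves the order of the input jobs in its results.
        return list(pool.map(run_program, jobs))
```

With twelve programs and fifty files, `run_batch(range(12), file_names)` would schedule 600 jobs. On an 80-core processor, up to 80 of them can run at the same time, whereas a CPU with fewer cores has to queue the rest.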
ARM processors feature another advantage that Dr. Shih’s team has not had the chance to use yet, but it is worth mentioning as an addendum. This feature is known as “ARM big.LITTLE”. It is a combination of slower, more energy-efficient cores (“LITTLE”) with high-performing, energy-intensive cores (“big”). In effect, this configuration provides users with a wider range of tools to choose from for different tasks, so that the ARM processor can provide even better power saving without sacrificing an iota of its performance.
Benefit #3: Powerful GPUs and High-Bandwidth Memory
In computing, as in real life, sometimes you can’t make it on your own. No matter how powerful a CPU is, a little help from GPUs (or in this case, GPGPUs) can go a long way towards delivering a truly stellar performance. Fortunately, Arm HPC DevKits come standard with up to two NVIDIA® A100 PCIe Gen4 GPU cards. The 8-channel memory architecture of the G242-P32, which can support up to 512 gigabytes of DDR4 SDRAM, completes the set-up.
Data collected by roadside sensors are sent back to the NTU lab for comparison, calibration, and compilation. Dr. Shih’s team employs a method known as heterogeneous computing to achieve optimal synergy between the CPU and the GPUs, so they can develop the “high-precision traffic flow model” more efficiently.
Once the field data has been delivered back to the computer lab on campus, the ARM CPU and the NVIDIA® GPUs are given assignments best suited to their strengths. The ARM processor handles sequence calibration and comparison, while the GPUs deal with graphical input. This is done to make sure data collected by different sensors, including photographs captured by the cameras and point clouds generated by the lidars, is precisely aligned, down to the millisecond. A margin of error of just 50 to 100 milliseconds (one-twentieth or one-tenth of a second) can equate to a positional disparity of tens of centimeters or more at ordinary road speeds; in other words, the difference between a close call and a deadly collision. There is no room for this kind of error in a smart traffic solution, which is why captured images must sometimes be re-identified to correctly track the trajectory of individual vehicles, which is another task that the GPUs excel at.
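Two small calculations help show why millisecond alignment matters. The Python sketch below first computes how far a vehicle travels during a given timing error, then pairs camera frames with the nearest lidar sweep by timestamp and discards pairs that are too far apart. The function names, the 5 ms default tolerance, and the sample timestamps are assumptions for illustration; the article does not describe the team's actual alignment method.

```python
from bisect import bisect_left


def displacement_m(speed_kmh, timing_error_ms):
    """Distance in meters a vehicle travels during a given timing error."""
    return (speed_kmh / 3.6) * (timing_error_ms / 1000.0)


def align_nearest(camera_ts_ms, lidar_ts_ms, tolerance_ms=5.0):
    """Pair each camera timestamp with the nearest lidar timestamp.

    Pairs further apart than tolerance_ms are dropped (assumed threshold).
    """
    lidar_sorted = sorted(lidar_ts_ms)
    pairs = []
    for t in camera_ts_ms:
        i = bisect_left(lidar_sorted, t)
        # The nearest lidar timestamp is the neighbor just before or after t.
        candidates = lidar_sorted[max(i - 1, 0): i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda s: abs(s - t))
        if abs(best - t) <= tolerance_ms:
            pairs.append((t, best))
    return pairs


# How far does a car move during a 50 ms misalignment?
for speed in (50, 100):
    print(speed, "km/h:", round(displacement_m(speed, 50), 2), "m")

# Made-up timestamps in milliseconds.
print(align_nearest([0, 33, 66, 100], [1, 50, 99]))
```

Speed in km/h divided by 3.6 gives meters per second, so even a modest urban speed turns a 50 ms error into a sizeable offset between what the camera and the lidar report.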
Before moving on, a word about the 8-channel memory, which supports up to 512GB of DDR4, is in order. A lot of data is being whisked back and forth between the CPU and GPUs during the comparison process. The team’s older system did not have enough memory to keep up, so the researchers had to break the data down into smaller chunks, which was time-consuming and inefficient. Having the memory bandwidth to match the processors’ computing speed is key: to use a car-related metaphor, a fleet of Formula One racers would only be able to do so much if all they had was a single-lane highway. The memory upgrade was another way that the Arm HPC DevKit accelerated the development of the “high-precision traffic flow model”.
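The effect of channel count on bandwidth can be estimated with simple arithmetic: theoretical peak bandwidth is the number of channels times the transfer rate times the bytes moved per transfer. The Python sketch below works this out. The article states the channel count and capacity but not the memory speed, so the DDR4-3200 figure (3200 million transfers per second) is an assumption, as is the standard 64-bit data bus per channel.

```python
def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumed DDR4-3200 modules; actual speed depends on the installed memory.
for channels in (1, 2, 8):
    gb_s = peak_bandwidth_gb_s(channels, 3200)
    print(f"{channels} channel(s): {gb_s:.1f} GB/s")
```

These are upper bounds rather than measured figures; sustained throughput in practice is lower. The point of the comparison is that, at the same memory speed, eight channels offer eight times the peak bandwidth of one, which is the multi-lane highway of the metaphor above.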
Benefit #4: Adherence to ISO 26262 Makes ARM Solutions Easier to Deploy
Last but not least, it’s worth noting that ARM products are geared toward functional safety: they are purposely designed “out of context” so that they can satisfy the widest range of applications. What this means is that many ARM solutions meet international safety standards out of the box. Arm Ltd., the main driving force behind ARM technology, plays an active role in supporting customers and manufacturers in the certification processes of ARM-based devices. As it pertains to the NTU team’s project, once the computer model is finalized and commercialized, it will be easier for end users to deploy, because it was developed with an ARM solution that observes ISO 26262, the international standard for the functional safety of electrical and/or electronic systems installed in series production road vehicles.
GIGABYTE Technology is glad to contribute to the development of the NTU team’s smart traffic solution through the Arm HPC Developer Kit and the G242-P32 GPU Server. ARM processors, which have long been the favorite of mobile and edge devices, are making a comeback in the server sector in a big way. In addition to the advantages listed above, ARM processors also offer superb TCO (total cost of ownership), optimal thermal management, and incredible scalability—all of these are perks that data centers around the world can benefit from. This is in line with GIGABYTE Technology’s efforts toward developing innovative solutions for different vertical sectors, in the hopes that groundbreaking new inventions—such as a high-precision traffic flow model that can be seen as the driving force behind smarter, safer transportation—will soon become reality and help to “Upgrade Your Life”.