MyelinTek MLSteam Deep Learning Training Solution

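GIGABYTE’s deep learning solution combines powerful computing performance with a graphical user interface, giving deep learning engineers an easy-to-use environment for dataset management, training job scheduling, real-time monitoring, and trained model analysis.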
Introduction
GIGABYTE’s DNN Training Appliance is a well-integrated software and hardware package that combines powerful computing performance with a user-friendly GUI. It provides DNN developers with an easy-to-use environment for dataset management, training job management, real-time system monitoring, and model analysis. The appliance includes hardware and software optimizations that improve performance and reduce the time required for DNN training.
Use Case Scenarios
Smart Health
Convolutional neural networks (CNNs) can assist with and streamline routine medical image analysis and disease detection tasks, such as eye disease detection and brain MRI segmentation. Deep learning can also be applied to non-image data, for example to predict epileptic seizures.
Automated Traffic Enforcement
Object detection and segmentation techniques can be applied to various traffic enforcement tasks, such as license plate recognition and detection of seat belt or driver cell phone use.
Image Recognition
The DNN Training Appliance can be used to train image recognition algorithms (for people, cars, or other objects, for example) that can power an intelligent video analytics platform.
Intelligent Banking
Deep learning algorithms and natural language processing (NLP) can support bank operations such as customer service automation (through chatbots), contract analysis, intelligent document search, and credit scoring.
Providing Developers and Data Scientists the Following Benefits
Saves Time
Deep learning training can be completed faster with the DNN Training Appliance than with an open-source community solution.
Saves Money
Achieve maximum utilization of your hardware investment with powerful optimization features, so that downtime is minimal.
Ease of Use
Faster startup of a DNN training environment for developers; spend less time and resources on employee training.
Choices & Customization
The standard version supports image classification and object detection; talk to us about a customized solution for other model or application types.
Reduces the Complexity of DNN Training Environment Setup and Management
To generate a production-grade DNN model, a developer needs to go through many difficult and time-consuming steps, including dataset collection, dataset cleansing, dataset labeling, dataset augmentation, dataset format conversion, model selection, model design, hyperparameter tuning, model training, model evaluation, and model format conversion. Each step requires different tools and configurations that take time and effort to prepare, and switching between these tools often means writing additional code to convert data from one tool's format to another's.
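As an illustration of the kind of format-conversion glue code this refers to, the sketch below converts a bounding box between two common conventions: pixel [x_min, y_min, width, height] (as used by COCO-style annotations) and normalized [x_center, y_center, width, height] (as used by YOLO-style label files). The source does not state which formats the appliance converts; the function names, the choice of formats, and the sample values are assumptions made for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def coco_to_yolo(box: Box, img_w: int, img_h: int) -> Box:
    """Convert pixel [x_min, y_min, width, height] to
    normalized [x_center, y_center, width, height]."""
    x, y, w, h = box
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def yolo_to_coco(box: Box, img_w: int, img_h: int) -> Box:
    """Inverse conversion back to pixel [x_min, y_min, width, height]."""
    xc, yc, w, h = box
    return ((xc - w / 2) * img_w, (yc - h / 2) * img_h, w * img_w, h * img_h)


if __name__ == "__main__":
    # Hypothetical 640x480 image with one 100x50 box whose corner is at (200, 120).
    yolo_box = coco_to_yolo((200, 120, 100, 50), 640, 480)
    print(yolo_box)
    print(yolo_to_coco(yolo_box, 640, 480))
```

Even a conversion this small has to be written, tested, and maintained for every pair of tools in a hand-assembled pipeline.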

GIGABYTE’s DNN Training Appliance aims to reduce this complexity by providing a complete training and management platform that incorporates all of these processes into an easy-to-use, browser-based GUI. Users can import, convert, and manage their datasets; design, train, and evaluate different DNN models; and test inference with their trained models. Based on GIGABYTE’s G481-HA1 server, the platform is optimized to use the available bare-metal resources to deliver improved training performance on cost-efficient hardware.
DNN Training Appliance Hardware and Software Stack
Reduces the Time and Improves the Accuracy for Each DNN Training Job
DNN models need to be trained on a large dataset to achieve an acceptable level of accuracy. Depending on the dataset size, this training can take days or even weeks. To adapt to changing business circumstances (such as new products or new regulations), the DNN model also needs to be retrained periodically on the latest datasets. If a DNN training job takes too long, it can seriously affect an organization’s operations, resource management, and competitiveness.

GIGABYTE’s DNN Training Appliance helps reduce training time through several optimization features: GPU memory optimization, to accommodate a large amount of training input or to fit a large model into GPU memory; automatic hyperparameter tuning during a training job, to achieve higher accuracy; and dataset cleaning, to cut the training time wasted on mislabeled or duplicated training data.
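To make the duplicated-data case concrete, here is a minimal sketch of one simple cleaning technique: grouping files whose contents are byte-for-byte identical by hashing them. The source does not describe how the appliance's dataset cleaning works, so this is not its method; the function name and the in-memory sample are hypothetical, and this approach finds neither near-duplicates nor mislabeled samples.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest: Dict[str, List[str]] = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Keep only digests shared by more than one file.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    # Hypothetical in-memory dataset; real code would read the bytes from disk.
    sample = {
        "cat_001.jpg": b"\xff\xd8AAA",
        "cat_002.jpg": b"\xff\xd8BBB",
        "cat_copy.jpg": b"\xff\xd8AAA",
    }
    print(find_duplicates(sample))
```

Removing all but one file from each group avoids spending training steps on repeated samples.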
Project table view
Training jobs view
Cloud IDE & Utilities Interface
Users can easily create a Cloud IDE (based on JupyterLab) for DNN model development or data preprocessing by attaching their dataset. The Cloud IDE also provides utilities that simplify the training process, such as hyperparameter passing, third-party IDE integration (VS Code and PyCharm), TensorBoard, and GPU monitoring.
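One common way for a training script to receive hyperparameters from outside is through command-line flags. The sketch below shows that pattern with Python's standard argparse module. The source does not describe the mechanism the Cloud IDE actually uses for hyperparameter passing, so the flag names and default values here are hypothetical.

```python
import argparse
from typing import List, Optional


def parse_hyperparameters(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Read hyperparameters from command-line flags (hypothetical names)."""
    parser = argparse.ArgumentParser(description="Training script hyperparameters")
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--epochs", type=int, default=10)
    return parser.parse_args(argv)


if __name__ == "__main__":
    # An explicit list is passed for demonstration; call with no argument
    # to read the flags from sys.argv instead.
    args = parse_hyperparameters(["--learning-rate", "0.001", "--epochs", "5"])
    print(f"lr={args.learning_rate} batch={args.batch_size} epochs={args.epochs}")
```

Keeping hyperparameters out of the script body lets the same code be launched repeatedly with different settings, which is what a tuning or scheduling layer needs.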
Cloud IDE
Model Templates and Optimization Tutorial
GIGABYTE’s DNN Training system has built-in templates that guide the user through training different types of models (for image classification, object detection, and so on) with various optimization techniques, such as GPU memory optimization and mixed precision training. These templates let the user easily choose the dataset, DNN model, and hyperparameter settings suited to the DNN application type. Users can also leverage the templates for collaboration.
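Mixed precision training stores many values in 16-bit floating point, where very small gradients can round to zero; a common remedy is to multiply the loss by a scale factor before the backward pass and divide the gradients by it afterwards. The sketch below demonstrates only that numerical effect, using the half-precision format code in Python's standard struct module. It is not how the templates implement mixed precision (the source does not say), deep learning frameworks normally handle this automatically, and the gradient value and scale factor are hypothetical.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    grad = 1e-8          # hypothetical small gradient value
    loss_scale = 1024.0  # hypothetical static loss scale

    # Stored directly in half precision, the gradient underflows.
    print("unscaled fp16 gradient:", to_fp16(grad))

    # Scaling before the cast keeps the value representable; the
    # unscaling is then done in higher precision.
    scaled = to_fp16(grad * loss_scale)
    print("recovered gradient:", scaled / loss_scale)
```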

Real-time Monitoring and Quick Result Verification
Once training starts, it is possible to keep track of the progress in real-time via the training monitoring chart. After each training job is completed, you can quickly verify your DNN model with the Cloud IDE workspace.
Effective Dataset Management Tools
User Friendly File Browser
The platform provides a file browser style management interface. The user can preview image files, delete files, and download files by selecting target files. To upload files, simply drag and drop files from your PC to the dataset.
Upload files by drag and drop
Dataset Annotation Visualization
The platform supports multiple dataset annotation formats, so the user can preview annotations such as bounding boxes and segmentation images on the dataset page.
System Monitoring and Administration
System Resource Monitoring
GIGABYTE’s DNN Training Appliance features real-time monitoring of GPU (utilization, memory usage, and temperature), CPU, disk, and memory usage.
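For readers who want to collect the same GPU metrics by hand on an NVIDIA system, the sketch below queries the nvidia-smi command-line tool and parses its CSV output. The source does not describe how the appliance gathers its metrics, so this is only one possible approach; the function names are hypothetical, the sample text is made up rather than measured, and the parser assumes every field is numeric (some GPUs report "[N/A]" for certain fields).

```python
import subprocess
from typing import Dict, List

QUERY_FIELDS = [
    "index",
    "utilization.gpu",
    "memory.used",
    "memory.total",
    "temperature.gpu",
]


def parse_gpu_csv(text: str) -> List[Dict[str, int]]:
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for line in text.strip().splitlines():
        values = [int(v.strip()) for v in line.split(",")]
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus() -> List[Dict[str, int]]:
    """Run nvidia-smi once; requires an NVIDIA driver on the host."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return parse_gpu_csv(subprocess.check_output(cmd, text=True))


if __name__ == "__main__":
    # Made-up text in the same shape as the CLI output, not a measurement.
    sample = "0, 87, 14210, 16384, 71\n1, 3, 512, 16384, 45\n"
    for gpu in parse_gpu_csv(sample):
        print(gpu)
```

Calling `query_gpus()` in a loop with a short sleep gives a basic polling monitor; memory values are reported in MiB and temperature in degrees Celsius.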
Real-time system resource monitoring
Administration Dashboard
GIGABYTE’s DNN Training Appliance features a dashboard for administration, including an audit log, training tasks overview, dataset overview, and user account management.
Create and Run Training Jobs with a Template
Optimized Hardware Platform
Single-Root GPU Server
GIGABYTE's DNN Training Appliance is built on the G481-HA1, a server optimized for a single-cluster DNN training appliance through its single-root GPU system architecture. Because DNN training requires frequent communication between the GPUs in the system, a single-root architecture (all GPUs communicate via the same CPU root) helps reduce GPU-to-GPU latency and shorten DNN training jobs.
Build Your AI Innovations Ever Faster and Simpler
GIGABYTE Servers as the Hardware Base of DNN
G492-ID0
HPC/AI Server - 3rd Gen Intel® Xeon® Scalable - 4U DP HGX™ A100 8-GPU
G262-ZO0
HPC/AI Server - AMD EPYC™ 7003 - 2U DP Instinct™ MI250 4-GPU | Application: AI Platform, AI Training Server, AI Inference Server & High-Performance Computing Server
H262-Z6B
HCI Server - AMD EPYC™ 7002 - 2U 4 Node DP 8-Bay Gen4 NVMe/SATA | Application: Hyper-Converged Server & Private/Hybrid Cloud Server
G492-PD0
HPC/AI Arm Server - Ampere® Altra® Max - 4U UP HGX™ A100 8-GPU
DNN Training Appliance 1
NVLink 8 x SXM2 V100 GPGPU Server
DNN Training Appliance 2
Single Root 8 / 10 x PCIe GPU Server
DNN Training Appliance 3
AMD EPYC 8 x PCIe Gen 4.0 GPU Server
Related Technologies
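What is machine learning? Machine learning is the scientific study of the algorithms and statistical models that computer systems use to perform specific tasks effectively without explicit instructions, relying instead on models and inference. It is regarded as a subset of artificial intelligence.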
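What is an inference engine? In the field of artificial intelligence, an inference engine is a component of a system that applies logical rules to existing information in order to derive new knowledge. The inference engine applies these logical rules, commonly known as IF-THEN rules, to a knowledge base.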
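A minimal sketch of this idea is forward chaining: start from a set of known facts and keep applying IF-THEN rules until nothing new can be derived. The rule representation, function name, and example facts below are hypothetical and chosen only to illustrate the definition.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (IF all of these facts hold, THEN this fact holds).
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply IF-THEN rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    rules = [
        (frozenset({"has_fur", "says_meow"}), "is_cat"),
        (frozenset({"is_cat"}), "is_mammal"),
    ]
    print(sorted(forward_chain({"has_fur", "says_meow"}, rules)))
```

Here the knowledge base is the set of facts, and the second rule can fire only after the first has added its conclusion, which is why the loop repeats until a full pass adds nothing.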
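What is artificial intelligence (AI)? Artificial intelligence is a broad branch of computer science. Its goal is to create machines that operate independently with intelligent functions and that can work and respond as humans do. To achieve this, machines, software, and applications acquire intelligence the same way humans do: by retaining information in memory and becoming smarter over time. Artificial intelligence is not a new concept; the idea has been widely discussed since the 1950s. Recent advances in computer technology have made it practical: the ability to collect and store large amounts of information now provides enough data for machine learning development, and rapid gains in hardware processing speed and computing power make it possible to use the collected data to train machines and applications and make them "smarter".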
Accelerate Your Technology Innovation