Myelintek MLSteam DNN Training System

Introduction

GIGABYTE’s DNN Training Appliance is an integrated software and hardware package that combines high computing performance with a user-friendly GUI. It gives DNN developers an easy-to-use environment for dataset management, training job management, real-time system monitoring, and model analysis. The appliance includes hardware and software optimizations that improve performance and reduce the time required for DNN training.

Use Case Scenarios

Smart Health

Convolutional Neural Networks (CNNs) can assist with and streamline routine tasks in medical image analysis and disease detection, such as eye disease detection and brain MRI segmentation. Deep learning can also be applied to non-image data, for example to predict epileptic seizures.

Automated Traffic Enforcement

Object detection and segmentation techniques can be applied to various traffic enforcement tasks, such as license plate recognition, seat belt usage, and driver cell phone usage.

Image Recognition

The DNN Training Appliance can be used to train image recognition models (for people, cars, or other objects) that can then be deployed in an intelligent video analytics platform.

Intelligent Banking

Deep learning algorithms and natural language processing (NLP) can be used in banking operations, such as customer service automation (with chatbots), contract analysis, intelligent document search, and credit scoring.

Providing Developers and Data Scientists the Following Benefits

Saves Time

Deep learning can be done faster with the DNN Training Appliance than with an open source community solution.

Saves Money

Achieve maximum utilization of your hardware investment with powerful optimization features, so that downtime is minimal.

Ease of Use

Faster startup of a DNN training environment for developers; spend less time and resources on employee training.

Choices & Customization

The standard version is enabled for image classification and object detection. Talk to us about a customized solution for your model or application type.

Reduces the Complexity of DNN Training Environment Setup and Management

To generate a production-grade DNN model, a developer needs to go through many difficult and time-consuming steps, including dataset collection, dataset cleansing, dataset labeling, dataset augmentation, dataset format conversion, model selection, model design, hyperparameter tuning, model training, model evaluation, and model format conversion. Each step requires different tools and configurations that take time and effort to prepare, and switching between these tools often means writing additional code to convert data from one tool's format to another's.
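As an illustration of the format conversion code mentioned above, here is a minimal sketch that converts a bounding box between two common annotation conventions: COCO-style `[x_min, y_min, width, height]` in pixels and YOLO-style `(x_center, y_center, width, height)` normalized to the image size. The function names are hypothetical and this is not the appliance's own converter, only an example of the glue code a developer would otherwise write by hand.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def coco_to_yolo(bbox: Box, image_width: int, image_height: int) -> Box:
    """Convert a COCO-style pixel box to a YOLO-style normalized box."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, width, height = bbox
    # YOLO stores the box center and size as fractions of the image size.
    x_center = (x_min + width / 2) / image_width
    y_center = (y_min + height / 2) / image_height
    return (x_center, y_center, width / image_width, height / image_height)


def yolo_to_coco(bbox: Box, image_width: int, image_height: int) -> Box:
    """Convert a YOLO-style normalized box back to a COCO-style pixel box."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x_center, y_center, width, height = bbox
    pixel_width = width * image_width
    pixel_height = height * image_height
    # Recover the top-left corner from the center and the pixel size.
    x_min = x_center * image_width - pixel_width / 2
    y_min = y_center * image_height - pixel_height / 2
    return (x_min, y_min, pixel_width, pixel_height)
```

For example, `coco_to_yolo((100, 50, 200, 100), 400, 200)` describes a box centered in the image that covers half of each dimension, and `yolo_to_coco` maps it back to the original pixel coordinates.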

GIGABYTE’s DNN Training Appliance aims to reduce this complexity by providing a complete training and management platform that brings all of these processes into an easy-to-use, browser-based GUI. Users can import, convert, and manage their datasets; design, train, and evaluate different DNN models; and test inferencing with their trained models. Based on GIGABYTE’s G481-HA1 server, the platform is optimized to use the available bare-metal resources to deliver improved training performance on cost-efficient hardware.

DNN Training Appliance Hardware and Software Stack

Reduces the Time and Improves the Accuracy for Each DNN Training Job

DNN models need to be trained on a large dataset to achieve an acceptable level of accuracy. Depending on the dataset size, this training can take days or even weeks. To adapt to changing business circumstances (such as new products or new regulations), the DNN model also needs to be periodically retrained on the latest datasets. If a DNN training job takes too long, it can seriously affect an organization’s operations, resource management, and competitiveness.

GIGABYTE’s DNN Training Appliance helps reduce training time through several optimization features: GPU memory optimization, to accommodate larger training inputs or fit a large model into GPU memory; automatic hyperparameter tuning during a training job, to achieve higher accuracy; and dataset cleaning features, to cut the training time wasted on mislabeled or duplicated training data.
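To make the dataset cleaning idea concrete, the following is a minimal sketch of one simple approach: grouping files by a hash of their contents to find exact duplicates, and flagging groups whose copies carry different labels as likely labeling errors. This is an assumption about how such a check could work, not a description of the appliance's implementation; the function name and the `(name, content, label)` input shape are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Iterable, List, Tuple

Sample = Tuple[str, bytes, str]  # (file name, raw file bytes, label)


def find_duplicates_and_conflicts(
    samples: Iterable[Sample],
) -> Tuple[List[List[str]], List[List[str]]]:
    """Return (duplicates, conflicts) as lists of file-name groups.

    duplicates: byte-identical files that share one label.
    conflicts:  byte-identical files that carry different labels.
    """
    groups = defaultdict(list)
    for name, content, label in samples:
        # Identical bytes always produce the same SHA-256 digest.
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append((name, label))

    duplicates, conflicts = [], []
    for entries in groups.values():
        if len(entries) < 2:
            continue  # unique file, nothing to report
        names = [name for name, _ in entries]
        labels = {label for _, label in entries}
        if len(labels) > 1:
            conflicts.append(names)
        else:
            duplicates.append(names)
    return duplicates, conflicts
```

This only catches byte-identical files. Near-duplicates such as resized or re-encoded images would need a different technique, for example perceptual hashing.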

Project table view

Training jobs view

Cloud IDE & Utilities Interface
Users can easily create a Cloud IDE (based on JupyterLab) for DNN model development or data preprocessing and attach their dataset to it. The Cloud IDE also provides utilities such as hyperparameter passing, third-party IDE integration (VS Code and PyCharm), TensorBoard, and GPU monitoring to simplify the training process.

Cloud IDE

Model Templates and Optimization Tutorial
GIGABYTE’s DNN Training System has built-in templates that guide the user through training different types of models (for image classification, object detection, etc.) with various optimization techniques, such as GPU memory optimization and mixed precision training. These templates let the user easily choose the dataset, DNN model, and hyperparameter settings that suit the DNN application type. Templates can also be shared, making it easier for users to collaborate.
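Mixed precision training stores many values in 16-bit floating point, where very small gradients can round to zero. A common remedy is loss scaling: multiply by a constant before rounding to 16 bits and divide it out again in higher precision. The sketch below shows only that arithmetic, using Python's standard `struct` module to round to IEEE 754 half precision. The helper names and the scale of 1024 are illustrative assumptions, and nothing here reflects how the built-in templates are implemented.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def gradient_without_scaling(gradient: float) -> float:
    """Store the gradient directly in half precision."""
    return to_fp16(gradient)


def gradient_with_scaling(gradient: float, loss_scale: float = 1024.0) -> float:
    """Scale up, store in half precision, then unscale in full precision."""
    scaled = to_fp16(gradient * loss_scale)
    return scaled / loss_scale


tiny_gradient = 1e-8  # smaller than the smallest positive half-precision value
lost = gradient_without_scaling(tiny_gradient)    # underflows to zero
recovered = gradient_with_scaling(tiny_gradient)  # close to the true value
```

Without scaling, the gradient is lost entirely. With scaling, it survives the trip through half precision with a small rounding error.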

Real-time Monitoring and Quick Result Verification
Once training starts, it is possible to keep track of the progress in real-time via the training monitoring chart. After each training job is completed, you can quickly verify your DNN model with the Cloud IDE workspace.

Effective Dataset Management Tools

User Friendly File Browser

The platform provides a file browser style management interface. The user can preview image files, delete files, and download files by selecting target files. To upload files, simply drag and drop files from your PC to the dataset.

Upload files by drag and drop

Dataset Annotation Visualization
The platform supports multiple dataset annotation formats, so the user can preview annotations (for example, bounding boxes and segmentation images) on the dataset page.

System Monitoring and Administration

System Resource Monitoring
GIGABYTE’s DNN Training Appliance features real-time monitoring of GPU (utilization, memory usage, and temperature), CPU, disk, and memory usage.
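For readers curious how GPU metrics like these can be collected on an NVIDIA system, here is a minimal sketch that queries `nvidia-smi` in CSV mode and parses the result. It assumes `nvidia-smi` is installed and on the PATH; the function names are hypothetical, and this is not the appliance's monitoring code.

```python
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi; memory values are reported in MiB.
QUERY_FIELDS = [
    "index",
    "utilization.gpu",
    "memory.used",
    "memory.total",
    "temperature.gpu",
]


def parse_gpu_stats(csv_text: str) -> List[Dict[str, int]]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip malformed lines
        try:
            stats.append({f: int(v) for f, v in zip(QUERY_FIELDS, values)})
        except ValueError:
            continue  # skip GPUs that report a non-numeric value
    return stats


def query_gpu_stats() -> List[Dict[str, int]]:
    """Run nvidia-smi once and return the parsed statistics."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `query_gpu_stats()` at a fixed interval and plotting the results would give a basic version of a real-time monitoring chart.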

Real-time system resource monitoring

Administration Dashboard
GIGABYTE’s DNN Training Appliance features a dashboard for administration, including an audit log, training tasks overview, dataset overview, and user account management.

Create and Run Training Jobs with a Template

Optimized Hardware Platform

Single-Root GPU Server

GIGABYTE's DNN Training Appliance is built on the G481-HA1, a server optimized for use as a single-cluster DNN training appliance through its single-root GPU system architecture. Because DNN training requires frequent communication between the GPUs in the system, a single-root architecture (in which all GPUs communicate via the same CPU root) helps reduce GPU-to-GPU latency and shorten DNN training jobs.