
Empowered, Inspired and Enabled by AI

Task-level Time-sharing Computing Power

AI-Driven Hedge Fund

AI Fundamental Research

Fire-Flyer (萤火)

Computing Power on Demand

Fire-Flyer II is a deep learning platform built by High-Flyer AI.

Built around a highly responsive time-sharing task scheduler, Fire-Flyer II gives every researcher a smooth training experience. Its software layer lets users scale their models up and fully utilize all the GPUs at their fingertips, through a collection of highly optimized DL primitives for model acceleration (hfai.nn), a communication framework for distributed training (hfreduce), and a large-capacity, high-throughput parallel file system for reading training samples (3FS).
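
As a rough illustration of how these pieces fit together, the sketch below annotates a plain PyTorch training loop with where each component would sit. It is a minimal sketch using generic PyTorch stand-ins, not the actual hfai.nn, hfreduce, or 3FS APIs.

```python
# Minimal sketch: a generic PyTorch training loop annotated with where the
# Fire-Flyer II components described above would plug in. Plain PyTorch
# modules and an in-memory dataset are stand-ins; the real hfai.nn, hfreduce
# and 3FS interfaces are not shown here.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Samples would normally be streamed from the 3FS parallel file system;
# a small random dataset stands in for that here.
dataset = TensorDataset(torch.randn(1024, 64), torch.randint(0, 10, (1024,)))
loader = DataLoader(dataset, batch_size=128, shuffle=True)

# hfai.nn provides optimized drop-in layers (e.g. LSTM, Attention);
# a stock nn.Module plays that role in this sketch.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        # In multi-node training, hfreduce handles the gradient allreduce
        # that torch.distributed would otherwise perform here.
        optimizer.step()
```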

hfai.nn operator library
DL primitives optimized by dedicated medal winners
20% to 6x acceleration for LSTM layers
30% acceleration for Attention layers

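To make the "drop-in primitives" idea concrete, here is a hedged baseline sketch that times the stock PyTorch LSTM and attention layers an optimized operator library would replace with like-for-like modules. The hfai.nn API itself is not shown, and the acceleration figures quoted above come from the platform, not from this snippet.

```python
# Baseline sketch of the operators that hfai.nn-style drop-in kernels target.
# Only stock PyTorch modules are used; timings here are illustrative and
# unrelated to the acceleration figures quoted above.
import time
import torch
import torch.nn as nn

torch.set_grad_enabled(False)                     # forward-only timing

x = torch.randn(32, 128, 256)                     # (batch, seq_len, features)
lstm = nn.LSTM(256, 256, num_layers=2, batch_first=True)
attn = nn.MultiheadAttention(256, num_heads=8, batch_first=True)

def bench(fn, iters=10):
    fn()                                          # warm-up run
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

print(f"LSTM forward:      {bench(lambda: lstm(x)):.4f} s")
print(f"Attention forward: {bench(lambda: attn(x, x, x)):.4f} s")
```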

hfreduce
allreduce optimized for the platform's specially designed computing nodes, with good performance on other hardware as well
20% acceleration for a BERT-Large model trained on 100 nodes

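The operation hfreduce accelerates is the gradient allreduce of data-parallel training. The hedged sketch below reproduces that step with the standard torch.distributed API (gloo backend, two CPU processes) purely as a stand-in; hfreduce's own interface is not documented on this page.

```python
# Sketch of the gradient allreduce that hfreduce is built to speed up, using
# the standard torch.distributed API as a stand-in (gloo backend, two CPU
# processes). The hfreduce interface itself is not shown.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Each rank holds a different "gradient"; allreduce sums it across ranks.
    grad = torch.full((4,), float(rank + 1))
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)
    grad /= world_size                  # average, as in data-parallel SGD
    print(f"rank {rank}: averaged gradient = {grad.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)
```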

3FS
Self-developed parallel file system that drives the hardware, high-bandwidth network and SSDs alike, to its limits
IO throughput: 1.8 TIOPS
IO bandwidth: 7.0 TB/s

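For the file system, the access pattern that matters is many training workers issuing parallel reads against a shared mount. The sketch below shows that pattern with a standard PyTorch Dataset; the mount path /3fs/imagenet is hypothetical, and 3FS's own client API is not shown.

```python
# Sketch of the sample-reading pattern 3FS serves: many DataLoader workers
# issuing parallel reads against a shared parallel-file-system mount.
# The path "/3fs/imagenet" is hypothetical; the 3FS client API is not shown.
from pathlib import Path

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class FileDataset(Dataset):
    def __init__(self, root):
        self.files = sorted(Path(root).glob("*.npy"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # Each worker process performs its own read against the shared FS.
        return torch.from_numpy(np.load(self.files[idx]))

# Multiple workers keep independent reads in flight, which is what the IOPS
# and bandwidth figures above are provisioned for.
loader = DataLoader(FileDataset("/3fs/imagenet"), batch_size=256,
                    num_workers=8, pin_memory=True)
```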

Cluster utilization: 96%
GPU utilization: 85%
Read bandwidth: 7.0 TB/s
Write bandwidth: 500 GB/s

The figures above are based on cluster usage statistics from Feb. 2022.

Apply for Service