Benchmarks

The performance analytics of Colossal-AI demonstrate that our software is the fastest and most cost efficient solution for your deep learning infrastructure needs.

Top performance results in a nutshell

10x

faster training time

50%

inference acceleration

14x

larger batch sizes

11x

lower GPU memory consumption

24x

larger model size on same hardware

50%

longer sequence length

50%

saved GPU resources

45%

fine-tuning speedup

1 GPU

suffices for model development

Best performance in its class

Colossal-AI delivers the best performance and provides significant cost savings compared to default framework configurations and to its competitors.



	PyTorch is a machine learning framework for Python.	120x larger model sizes

	Microsoft DeepSpeed is a deep learning optimization library.	3x higher throughput

	NVIDIA NeMo Megatron is a framework to build and deploy LLMs.	5x faster training

Training benchmarks

Colossal-AI supports you in every machine learning model training process in which a machine learning algorithm is fed with sufficient training data to learn from.

ViT model

The Vision Transformer, or ViT, is a model for image classification that employs a Transformer-like architecture over patches of the image. ViTs are being adopted in a wide range of computer vision tasks, from image classification to object detection and segmentation.

Colossal-AI vs. Megatron

Achieve 14x larger batch sizes with Colossal-AI
and 5x faster training for ViT

GPT-3 model

Using text on the internet, GPT-3 is trained to generate realistic human text. GPT-3 has been used to create articles, poetry, stories, news reports and dialogue using just a small amount of input text that can be used to produce large amounts of quality copy.

Colossal-AI vs. Megatron

You can save 50% of your GPU resources with Colossal-AI
and achieve a 10.7% acceleration

GPT-2 model

GPT-2 is an unsupervised deep learning transformer-based language model created by OpenAI back in February 2019 for the single purpose of predicting the next word(s) in a sentence. GPT-2 is an acronym for “Generative Pretrained Transformer 2”.

Colossal-AI vs. Megatron

You benefit from a 11x lower GPU memory consumption with Colossal-AI
and superlinear scaling efficiency due to its tensor parallelism

Colossal-AI vs. PyTorch

Scale to a 24x larger model size on the same hardware

Colossal-AI vs. DeepSpeed

Benefit from a 3x speedup on the same computing devices

BERT model

BERT is an open source machine learning framework for natural language processing (NLP). BERT is designed to help computers understand the meaning of ambiguous language in text by using surrounding text to establish context.

Colossal-AI vs. Megatron

Colossal-AI propels you to a 2x faster training
or enables you to run a 50% longer sequence length

Inference benchmarks

Benefit from Colossal-AI during model inference by processing unseen data using the trained model at higher speed and larger scale to produce accurate results. Colossal-AI provides inference optimization functionality through the Energon-AI extension.

GPT-3 model

Colossal-AI vs. NVIDIA FastTransformer

Colossal-AI enables you to achieve 50% inference acceleration on the same hardware infrastructure

GPT-3 Inference, Padding = 128, TP = 2 grey bg

GPT-3 Inference, Padding = 128, TP = 4 grey bg

Fine-tuning benchmarks

Take a model that has already been trained for one given task and tune or tweak the model faster with Colossal-AI to make it perform a second similar task.

OPT model

Meta AI has introduced a large language model trained on billions of parameters called OPT (Open Pre-trained Transformers). It can be used to generate creative text, solve simple math problems, answer reading comprehension questions.

Colossal-AI vs. DeepSpeed

With Colossal-AI, a 45% speedup fine-tuning OPT is possible
at low cost in lines

Don't wait, accelerate!

Speed up and scale deep learning with Colossal-AI.

Talk to an expert