OpenAI has launched gpt-oss-120b and gpt-oss-20b—powerful open-weight language models optimized for reasoning, tool use, and efficient deployment on consumer hardware.
These models are released under the Apache 2.0 license, one of the most flexible open-source licenses available, so you can integrate and scale your projects freely.
Now you can access OpenAI GPT-OSS 120B & 20B instantly on HPC-AI.COM!
Here’s how to get started in just a few steps:
- Environment Setup
We provide preconfigured, high-performance machine learning environments, so you can hit the ground running. Just choose an image—such as CUDA 12.8—and your GPU instance will be ready in minutes.
Next, install the inference framework of your choice. For example, follow the vLLM setup guide to get started quickly. For detailed documentation, refer to the GPT-OSS vLLM Usage Guide.
pip install uv                # install the uv package manager
uv venv --python 3.12 --seed  # create a Python 3.12 virtual environment
source .venv/bin/activate
# Install the gpt-oss preview build of vLLM with CUDA 12.8 nightly wheels
uv pip install --pre vllm==0.10.1+gptoss \
--extra-index-url https://wheels.vllm.ai/gpt-oss/ \
--extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
--index-strategy unsafe-best-match
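With the environment ready, a minimal sketch of starting an OpenAI-compatible inference server with the vLLM CLI (the port is an assumption; adjust it and the model name to your setup):

```shell
# Launch an OpenAI-compatible HTTP server for gpt-oss-120b on port 8000.
# vLLM pulls the weights from Hugging Face unless a local path is supplied.
vllm serve openai/gpt-oss-120b --port 8000
```

Once the server is up, any OpenAI-compatible client can talk to it at `http://localhost:8000/v1`.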
- Access GPT-OSS in 5 Minutes
We provide cluster-level caching that enables incredibly fast model loading: the entire 183GB model file can be downloaded within 5 minutes at speeds up to 1.1GB/s. With dedicated data disks and high-speed shared storage configured for you, running GPT-OSS in a private environment is effortless.
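As a quick sanity check on those figures (183 GB and 1.1 GB/s are the numbers quoted above):

```python
# Rough time to fetch the full 183 GB checkpoint at a sustained 1.1 GB/s.
size_gb = 183
speed_gb_per_s = 1.1
seconds = size_gb / speed_gb_per_s
print(f"{seconds:.0f} s ≈ {seconds / 60:.1f} min")  # comfortably under 5 minutes
```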
Here is how you can use GPT-OSS-120B models on HPC-AI.COM:
#!/bin/bash
# Install the JuiceFS client
curl -sSL https://d.juicefs.com/install | sh -
cd ${YourModelPath}
export model="openai/gpt-oss-120b"
# Sync the cached weights from the cluster's MinIO mirror to local disk
juicefs sync minio://minio:minio123@minio:9000/hf-model/${model}/ ./${model}/
- Deploy Your Inference Service
We provide a public network forwarding service that lets you expose your inference service to the internet. Set up HTTP port forwarding through the launch or configuration interface, and your service becomes reachable from a public environment.
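Once the port is forwarded, any OpenAI-compatible client can reach the service. A minimal sketch using only Python's standard library (the hostname and port below are placeholders for whatever public address the forwarding service assigns you):

```python
import json
import urllib.request

# Placeholder URL: substitute the public address assigned by the forwarding service.
ENDPOINT = "http://your-forwarded-host:8000/v1/chat/completions"

# Standard OpenAI-style chat-completions request body.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once your service is live:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```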
For more details, including a comprehensive guide and interesting examples of GPT-OSS, please refer to the docs: https://hpc-ai.com/doc/docs/tutorial/gpt-oss
Additionally, we have increased the supply of H200 GPUs in our US cluster and are now offering them at the lowest price of just $2.19/hour — so you can start using GPT-OSS right away!
Subscribe to stay updated — we’ll soon release performance benchmarks for vLLM and SGLang on GPT-OSS to help you better evaluate and compare.
Reference:
https://openai.com/index/introducing-gpt-oss/