
Advanced Compute Options

This feature is part of the Enterprise Hub.

Enterprise Hub organizations gain access to advanced compute options to accelerate their machine learning journey.

Host ZeroGPU Spaces in your organization

ZeroGPU is a dynamic GPU allocation system that optimizes AI deployment on Hugging Face Spaces. It automatically allocates and releases NVIDIA A100 GPUs (40GB VRAM) as needed, so organizations can serve their AI applications efficiently without maintaining dedicated GPU instances.
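
In practice, a Space opts into ZeroGPU by decorating its GPU-dependent functions with `@spaces.GPU`: a GPU is attached for the duration of the decorated call and released when it returns. A minimal Gradio sketch (the checkpoint used here is illustrative, not prescribed):

```python
import gradio as gr
import spaces  # pre-installed in ZeroGPU Spaces
import torch
from diffusers import DiffusionPipeline

# Load the model once at startup; the checkpoint is just an example.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.to("cuda")

@spaces.GPU  # an A100 is allocated for this call and released when it returns
def generate(prompt: str):
    return pipe(prompt).images

gr.Interface(fn=generate, inputs=gr.Text(), outputs=gr.Gallery()).launch()
```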

[Screenshot: Hugging Face Advanced Compute Options (ZeroGPU)]

Key benefits for organizations

  • Free GPU Access: Access powerful NVIDIA A100 GPUs at no additional cost through dynamic allocation
  • Enhanced Resource Management: Host up to 50 ZeroGPU Spaces for efficient team-wide AI deployment
  • Simplified Deployment: Easy integration with PyTorch-based models, Gradio apps, and Hugging Face libraries
  • Enterprise-Grade Infrastructure: Access to high-performance NVIDIA A100 GPUs with 40GB VRAM per workload

Learn more about ZeroGPU →

Train on NVIDIA DGX Cloud

Train on NVIDIA DGX Cloud offers a simple, no-code experience for creating training jobs, powered by Hugging Face AutoTrain and Hugging Face Spaces. Instantly access NVIDIA GPUs and avoid the time-consuming work of writing, testing, and debugging training scripts for AI models.

How it works

Read the blog post for Train on NVIDIA DGX Cloud.

Supported architectures

Transformers

  • Llama
  • Falcon
  • Mistral
  • Mixtral
  • T5
  • Gemma

Diffusers

  • Stable Diffusion
  • Stable Diffusion XL

Pricing

Usage of Train on NVIDIA DGX Cloud is billed by the minute of the GPU instances used during your training jobs. Usage fees accrue to your Enterprise Hub Organization's current monthly billing cycle once a job is completed. You can check your current and past usage at any time within the billing settings of your Enterprise Hub Organization.

| NVIDIA GPU  | GPU Memory | On-Demand Price/hr |
| ----------- | ---------- | ------------------ |
| NVIDIA L40S | 48GB       | $2.75              |
| NVIDIA H100 | 80GB       | $8.25              |
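
As a worked example using the prices above: a job that runs for 90 minutes on a single NVIDIA H100 is billed 1.5 h × $8.25/h ≈ $12.38. Assuming fees scale with the number of GPU instances, as per-minute billing implies, the same job spread across four H100s would accrue 4 × 1.5 h × $8.25/h = $49.50.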

NVIDIA NIM API (serverless)

NVIDIA NIM API (serverless) offers serverless access to NVIDIA Inference Microservices (NIM) running on NVIDIA H100 GPUs. Use standardized APIs and a few lines of code to run inference with pay-as-you-go pricing.
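
As a sketch of what "a few lines of code" looks like, the blog post linked below uses an OpenAI-compatible client pointed at a Hugging Face integration endpoint. The base URL and model below are illustrative assumptions; check the blog post for the current values:

```python
from openai import OpenAI

# Assumed OpenAI-compatible endpoint; verify against the blog post before use.
client = OpenAI(
    base_url="https://huggingface.co/api/integrations/dgx/v1",
    api_key="hf_xxx",  # a fine-grained token of an Enterprise Hub org member
)

completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # any model from the NVIDIA Collection
    messages=[{"role": "user", "content": "What is ZeroGPU?"}],
    max_tokens=256,
)
print(completion.choices[0].message.content)
```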

How it works

Read the blog post for Serverless Inference with Hugging Face and NVIDIA NIMs.

Supported models

You can find all supported models in this NVIDIA Collection.

Pricing

Usage of NVIDIA NIM API (serverless) is billed based on the compute time spent per request. Usage fees accrue to your Enterprise Hub Organization's current monthly billing cycle once a request is completed. You can check your current and past usage at any time within the billing settings of your Enterprise Hub Organization.

| NVIDIA GPU  | GPU Memory | On-Demand Price/hr |
| ----------- | ---------- | ------------------ |
| NVIDIA H100 | 80GB       | $8.25              |

The total cost for a request will depend on the model size, the number of GPUs required, and the time taken to process the request. For each model, you can find which hardware configuration is used in the notes of this NVIDIA Collection.
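
As an illustration of how these factors combine: a request served on four H100s that consumes 10 seconds of compute time would cost roughly 4 × (10 / 3600) h × $8.25/h ≈ $0.09. The actual per-request cost follows the hardware configuration noted in the Collection.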
