H100 PCIe GPU instance

Accelerate your model training and inference with one of the most powerful AI chips on the market!

Fine-tune models like LLaMA 2

Optimize Transformer models and LLMs, and accelerate the training of larger models with cutting-edge 4th-generation Tensor Core technology and the new FP8 8-bit data format.

Accelerate inference workloads up to 30 times

Speed up your model-serving workloads with the Transformer Engine and new data formats, delivering up to 30x faster AI inference.

Maximize GPU utilization to match your needs

With 2nd-generation secure MIG (Multi-Instance GPU), partition the GPU into isolated, right-sized instances to maximize utilization, from the smallest jobs to the biggest multi-GPU workloads.

Available zones:
Paris: PAR2

H100 PCIe GPU technical specifications

  • GPU: NVIDIA H100 PCIe Tensor Core

  • GPU memory: 80GB HBM2e

  • Processor: 24 vCPUs (AMD EPYC Zen 4)

  • Processor frequency: 2.7 GHz

  • Memory: 240GB of RAM

  • Memory type: DDR5

  • Bandwidth: 10 Gbps

  • Storage: Block Storage for the boot volume and 3TB of NVMe Scratch Storage

  • Get up to 25% off your H100 PCIe GPU instance price by committing upfront

    Talk with an expert today

    Numerous AI applications and use cases

    Natural Language Processing

    Understand, interpret, and generate human language in a way that is both meaningful and contextually relevant,
    thanks to models and algorithms specialized in:

    • Text classification
    • Machine translation
    • Entailment prediction
    • Named entity recognition
    • Sequence-to-sequence tasks, like text extraction with BERT
    • Text similarity search, like using BERT to find semantic similarities
    • Language modeling
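    To make the text-similarity item above concrete: BERT-style models map sentences to embedding vectors, which are then compared with cosine similarity. A minimal sketch with made-up vectors (real embeddings would come from a model running on the GPU):

```python
import math

def cosine_similarity(a, b):
    # Compare two embedding vectors by the cosine of their angle.
    # The vectors here stand in for sentence embeddings from a
    # BERT-style encoder; they are made up for illustration.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0, orthogonal directions score 0.0.
```

    In a similarity-search service, you would precompute embeddings for a corpus and rank documents by this score against the query embedding.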

    Choose your instance's format

    Instance Name | Number of GPUs           | TFLOPS FP16 Tensor Cores | GPU memory
    H100-1-80G    | 1× H100 PCIe Tensor Core | Up to 1,513 teraFLOPS    | 80GB
    H100-2-80G    | 2× H100 PCIe Tensor Core | Up to 3,026 teraFLOPS    | 2 × 80GB


    Enjoy the simplicity of a pre-configured AI environment

    Optimized GPU OS Image

    Benefit from a ready-to-use Ubuntu image to launch your favorite deep learning containers (pre-installed NVIDIA driver and Docker environment).

    Learn more

    Enjoy your favorite Jupyter environment

    Easily launch your favorite JupyterLab or Notebook thanks to the pre-installed Docker environment

    Learn more

    Choose your AI containers among multiple registries

    Access multiple container registries: your own container builds, Scaleway AI containers, the NVIDIA NGC registry, or any other registry

    Learn more

    NVIDIA Enterprise AI software at your disposal

    Access hundreds of AI software packages optimized by NVIDIA to maximize the efficiency of your GPUs and boost your productivity. Among the hundreds of tools developed by NVIDIA and tested by industry leaders, harness the efficiency of:

    • NVIDIA NeMo for LLM fine-tuning,
    • NVIDIA TAO for computer vision,
    • NVIDIA Triton for inference
    Learn more

    Deploy and scale your infrastructure with Kubernetes

    Frequently asked questions

    3TB of Scratch Storage is included in the instance price, but any Block Storage you provision is at your own expense.
    We strongly recommend provisioning an extra Block Storage volume for data durability, as Scratch Storage is ephemeral and disappears when you switch off the machine. The purpose of Scratch Storage is to speed up the transfer of your datasets to the GPU.
    How to use Scratch Storage? Follow the guide

    These are two formats of the same instance, embedding the NVIDIA H100 PCIe Tensor Core GPU.

    • H100-1-80G embeds 1 NVIDIA H100 PCIe Tensor Core GPU, offering 80GB of GPU memory
    • H100-2-80G embeds 2 NVIDIA H100 PCIe Tensor Core GPUs, offering 2 × 80GB of GPU memory. This instance enables a faster time-to-train for bigger Transformer models that scale across 2 GPUs. Because of the PCIe board form factor, the servers behind the H100 PCIe GPU instance carry 2 GPUs each, so launching an H100-2-80G instance gives you a fully dedicated server with both GPUs.
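    As a toy illustration of choosing between the two formats (not official Scaleway sizing guidance; real sizing also depends on batch size, activations, and optimizer state), a sketch that picks the smallest format whose total GPU memory covers the workload:

```python
def pick_instance(gpu_memory_needed_gb: float) -> str:
    # Toy heuristic, not official sizing guidance: pick the smallest
    # H100 PCIe format whose total GPU memory covers the stated need.
    if gpu_memory_needed_gb <= 80:
        return "H100-1-80G"    # 1 GPU, 80GB
    if gpu_memory_needed_gb <= 160:
        return "H100-2-80G"    # 2 GPUs, 2 x 80GB
    raise ValueError("workload needs more than one H100 PCIe instance")
```

    A 40GB fine-tuning job fits the single-GPU format, while a 120GB job needs the two-GPU format.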

    NVIDIA announced the H100 to enable companies to slash the cost of deploying AI, "delivering the same AI performance with 3.5x more energy efficiency and 3x lower total cost of ownership, while using 5x fewer server nodes over the previous generation."
    What inside the product backs up this claim?

    • The finer process node used to engrave the chip reduces die area and thus the energy required to power it
    • Thanks to innovations like the new FP8 (8-bit) data format, more calculations are performed for the same power consumption, saving both time and energy

    In addition, at Scaleway we decided to host our H100 PCIe instances in DC5, an adiabatic data center. With a PUE (Power Usage Effectiveness) of 1.15 (the industry average is around 1.6), this data center saves between 30% and 50% of electricity compared with a conventional data center.
    Stay tuned for our benchmarks on the topic!

    NVIDIA Multi-Instance GPU (MIG) is a technology introduced by NVIDIA to enhance the utilization and flexibility of its data center GPUs, specifically designed for virtualization and multi-tenant environments. It allows a single physical GPU to be partitioned into up to seven smaller instances, each of which operates as an independent MIG partition with its own dedicated resources, such as memory, cache, and compute cores.
    Read the dedicated documentation to use MIG technology on your GPU instance.
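    As an illustration of how those partitions add up, the sketch below checks whether a set of MIG profiles fits on one H100 80GB GPU. The profile names and sizes follow NVIDIA's published H100 profiles, but treat them as assumptions and verify the exact list on your instance with `nvidia-smi mig -lgip`:

```python
# Illustrative only: H100 80GB MIG profiles, mapping each profile name
# to (compute slices, GiB of memory). Verify with `nvidia-smi mig -lgip`.
H100_MIG_PROFILES = {
    "1g.10gb": (1, 10),
    "2g.20gb": (2, 20),
    "3g.40gb": (3, 40),
    "4g.40gb": (4, 40),
    "7g.80gb": (7, 80),
}

def fits_on_h100(profiles):
    """True if the requested partitions fit in 7 compute slices / 80 GiB."""
    slices = sum(H100_MIG_PROFILES[p][0] for p in profiles)
    memory = sum(H100_MIG_PROFILES[p][1] for p in profiles)
    return slices <= 7 and memory <= 80
```

    For example, seven isolated 1g.10gb instances fit on one GPU, but an eighth does not, and a 3g.40gb plus a 4g.40gb exactly fill the card.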

    There are many criteria to take into account to choose the right GPU instance:

    • Workload requirements
    • Performance requirements
    • GPU type
    • GPU memory
    • CPU and RAM
    • GPU driver and software compatibility
    • Scaling

    For more guidance, read the dedicated documentation on that topic.