stabilityai/stable-diffusion-xl-1.0-tensorrt

@@ -7,6 +7,7 @@ tags:
   - stable-diffusion
   - stable-diffusion-xl
   - stable-diffusion-xl-lcm
   - tensorrt
   - text-to-image
 ---
@@ -15,12 +16,10 @@ tags:
 ## Introduction
-This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
 See the [usage instructions](#usage-example) for how to run the SDXL pipeline with the ONNX files hosted in this repository.
-Usage instructions for the LCM version of sdxl [here](lcm/README.md)
 ![examples](./examples.jpg)
@@ -50,18 +49,25 @@ Usage instructions for the LCM version of sdxl [here](lcm/README.md)
 | A100        | 0.27 images/sec          | 0.36 images/sec             | ~33%                   |
 | H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |
 ## Usage Example
-1. Follow the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/8.6/demo/Diffusion/README.md) to launch a TensorRT NGC container.
 ```shell
 git clone https://github.com/rajeevsrao/TensorRT.git
 cd TensorRT
-git checkout release/8.6
-docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.06-py3 /bin/bash
 ```
-2. Download the SDXL TensorRT files from this repo
 ```shell
 git lfs install
 git clone https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
@@ -72,14 +78,13 @@ cd ..
 3. Install libraries and requirements
 ```shell
-python3 -m pip install --upgrade pip
-python3 -m pip install --upgrade tensorrt
 cd demo/Diffusion
 pip3 install -r requirements.txt
 ```
-4. Perform TensorRT optimized inference
   * The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner`. These files are specific to the accelerator in use and are reused by later invocations.
 ```
@@ -93,4 +98,19 @@ python3 demo_txt2img_xl.py \
   --denoising-steps 30 \
   --onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
   --onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
 ```

   - stable-diffusion
   - stable-diffusion-xl
   - stable-diffusion-xl-lcm
+  - stable-diffusion-xl-lcmlora
   - tensorrt
   - text-to-image
 ---
 ## Introduction
+This repository hosts the TensorRT versions (sdxl, sdxl-lcm, sdxl-lcmlora) of **Stable Diffusion XL 1.0** created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
 See the [usage instructions](#usage-example) for how to run the SDXL pipeline with the ONNX files hosted in this repository.
 ![examples](./examples.jpg)
 | A100        | 0.27 images/sec          | 0.36 images/sec             | ~33%                   |
 | H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |
+#### Timings for the LCM version (4 steps at 1024x1024)
+| Accelerator | CLIP                     | UNet                        | VAE                    | Total                  |
+|-------------|--------------------------|-----------------------------|------------------------|------------------------|
+| A100        | 1.08 ms                  | 192.02 ms                   | 228.34 ms              | 426.16 ms              |
+| H100        | 0.78 ms                  | 102.8 ms                    | 126.95 ms              | 234.22 ms              |
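
To put the totals in the LCM timing table into the same units as the images/sec table, the latency can be inverted. The sketch below assumes one image per run and back-to-back runs with no extra overhead, which the table does not state; `ms_to_ips` is a hypothetical helper name, not part of the demo scripts.

```shell
# Hypothetical helper: convert an end-to-end latency in milliseconds
# to images per second, assuming one image per run.
ms_to_ips() {
  awk -v ms="$1" 'BEGIN { printf "%.2f\n", 1000 / ms }'
}

ms_to_ips 426.16   # A100 total from the table
ms_to_ips 234.22   # H100 total from the table
```

Under those assumptions the totals correspond to roughly 2.35 images/sec on A100 and 4.27 images/sec on H100.
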
 ## Usage Example
+1. Follow the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/9.2/demo/Diffusion/README.md) to launch a TensorRT NGC container.
 ```shell
 git clone https://github.com/rajeevsrao/TensorRT.git
 cd TensorRT
+git checkout release/9.2
+docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.11-py3 /bin/bash
 ```
+2. Download the SDXL LCM TensorRT files from this repo
 ```shell
 git lfs install
 git clone https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
 3. Install libraries and requirements
 ```shell
 cd demo/Diffusion
+python3 -m pip install --upgrade pip
 pip3 install -r requirements.txt
+python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt
 ```
+4. Perform TensorRT-optimized inference for SDXL
   * The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner`. These files are specific to the accelerator in use and are reused by later invocations.
 ```
   --denoising-steps 30 \
   --onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
   --onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
+```
+5. Perform TensorRT-optimized inference for the SDXL Latent Consistency Model (LCM) version
+  * The first invocation produces plan files in the directory given by `--engine-dir`. These files are specific to the accelerator in use and are reused by later invocations.
+```
+python3 demo_txt2img_xl.py \
+  "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
+  --version=xl-1.0 \
+  --onnx-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm \
+  --engine-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm/engine-sdxl-lcm-nocfg \
+  --scheduler LCM \
+  --denoising-steps 4 \
+  --guidance-scale 0.0 \
+  --seed 42
 ```
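
Because the first invocation builds the plan files and later invocations reuse them, it can help to check the engine directory before timing a run. This is a minimal sketch: `engine_dir_status` is a hypothetical helper, and it only tests whether the directory is non-empty. It does not verify that the files were built for the current accelerator.

```shell
# Hypothetical helper: report whether an engine directory already holds files.
engine_dir_status() {
  dir="$1"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "reuse"   # directory has contents; a later invocation may reuse them
  else
    echo "build"   # missing or empty; expect the slower first-run engine build
  fi
}

# Path taken from the --engine-dir argument in the LCM command above.
engine_dir_status /workspace/stable-diffusion-xl-1.0-tensorrt/lcm/engine-sdxl-lcm-nocfg
```

A result of `build` suggests the next run will include engine construction, so its wall-clock time should not be compared with the timings table.
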