stabilityai/stable-diffusion-xl-1.0-tensorrt

@@ -49,7 +49,7 @@ See the [usage instructions](#usage-example) for how to run the SDXL pipeline wi
 | A100        | 0.27 images/sec          | 0.36 images/sec             | ~33%                   |
 | H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |
-#### Timings for LCM version for 4 steps at 1024x1024
 | Accelerator | CLIP                     | Unet                        | VAE                    |Total                   |
 |-------------|--------------------------|-----------------------------|------------------------|------------------------|
@@ -84,33 +84,37 @@ pip3 install -r requirements.txt
 python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt
 ```
-4. Perform TensorRT optimized inference for the sdxl
-  * The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner` specific to the accelerator being run on and are reused for later invocations.
-```
-python3 demo_txt2img_xl.py \
-  "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
-  --build-static-batch \
-  --use-cuda-graph \
-  --num-warmup-runs 1 \
-  --width 1024 \
-  --height 1024 \
-  --denoising-steps 30 \
-  --onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
-  --onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
-```
-4. Perform TensorRT optimized inference for the sdxl Latent Consistency Model (LCM) version
-  * The first invocation produces plan files in --engine-dir specific to the accelerator being run on and are reused for later invocations.
-```
-python3 demo_txt2img_xl.py \
-  ""Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"" \
-  --version=xl-1.0 \
-  --onnx-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm \
-  --engine-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm/engine-sdxl-lcm-nocfg \
-  --scheduler LCM \
-  --denoising-steps 4 \
-  --guidance-scale 0.0 \
-  --seed 42
-```

 | A100        | 0.27 images/sec          | 0.36 images/sec             | ~33%                   |
 | H100        | 0.40 images/sec          | 0.68 images/sec             | ~70%                   |
+#### Timings for Latent Consistency Model (LCM) version for 4 steps at 1024x1024
 | Accelerator | CLIP                     | Unet                        | VAE                    |Total                   |
 |-------------|--------------------------|-----------------------------|------------------------|------------------------|
 python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt
 ```
+4. Perform TensorRT-optimized inference:
+  - **SDXL**
+    The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner`; they are specific to the accelerator in use and are reused by later invocations.
+    ```
+    python3 demo_txt2img_xl.py \
+      "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
+      --build-static-batch \
+      --use-cuda-graph \
+      --num-warmup-runs 1 \
+      --width 1024 \
+      --height 1024 \
+      --denoising-steps 30 \
+      --onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
+      --onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
+    ```
+  - **SDXL-LCM**
+    The first invocation produces plan files in the directory given by `--engine-dir`; they are specific to the accelerator in use and are reused by later invocations.
+    ```
+    python3 demo_txt2img_xl.py \
+      "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
+      --version=xl-1.0 \
+      --onnx-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm \
+      --engine-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm/engine-sdxl-lcm-nocfg \
+      --scheduler LCM \
+      --denoising-steps 4 \
+      --guidance-scale 0.0 \
+      --seed 42
+    ```
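
Because plan files are built on the first invocation and reused afterwards, it can help to check whether an engine directory is already populated before starting a run. The following is a minimal sketch, not part of the demo scripts: `engine_cache_status` is a hypothetical helper, and it assumes that a non-empty engine directory means plan files were already built there.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the demo): prints "reuse" if the
# given directory exists and contains at least one entry, else "build".
# Assumption: a non-empty engine directory holds previously built plan files.
engine_cache_status() {
  local dir="$1"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "reuse"
  else
    echo "build"
  fi
}

# Report the status of the two SDXL engine directories named above,
# resolved relative to the current working directory.
for dir in engine_xl_base engine_xl_refiner; do
  echo "$dir: $(engine_cache_status "$dir")"
done
```

A "build" result suggests the next invocation will spend extra time generating plan files. Since the plan files are accelerator-specific, a directory populated on a different GPU should not be treated as a valid cache even if this check reports "reuse".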