Update Readme for lcm
README.md
CHANGED
@@ -7,6 +7,7 @@ tags:
|
|
7 |
- stable-diffusion
|
8 |
- stable-diffusion-xl
|
9 |
- stable-diffusion-xl-lcm
|
|
|
10 |
- tensorrt
|
11 |
- text-to-image
|
12 |
---
|
@@ -15,12 +16,10 @@ tags:
|
|
15 |
|
16 |
## Introduction
|
17 |
|
18 |
-
This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
|
19 |
|
20 |
See the [usage instructions](#usage-example) for how to run the SDXL pipeline with the ONNX files hosted in this repository.
|
21 |
|
22 |
-
Usage instructions for the LCM version of sdxl [here](lcm/README.md)
|
23 |
-
|
24 |
|
25 |
![examples](./examples.jpg)
|
26 |
|
@@ -50,18 +49,25 @@ Usage instructions for the LCM version of sdxl [here](lcm/README.md)
|
|
50 |
| A100 | 0.27 images/sec | 0.36 images/sec | ~33% |
|
51 |
| H100 | 0.40 images/sec | 0.68 images/sec | ~70% |
|
52 |
|
53 |
|
54 |
## Usage Example
|
55 |
|
56 |
-
1. Following the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/
|
57 |
```shell
|
58 |
git clone https://github.com/rajeevsrao/TensorRT.git
|
59 |
cd TensorRT
|
60 |
-
git checkout release/
|
61 |
-
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.
|
62 |
```
|
63 |
|
64 |
-
2. Download the SDXL TensorRT files from this repo
|
65 |
```shell
|
66 |
git lfs install
|
67 |
git clone https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
|
@@ -72,14 +78,13 @@ cd ..
|
|
72 |
|
73 |
3. Install libraries and requirements
|
74 |
```shell
|
75 |
-
python3 -m pip install --upgrade pip
|
76 |
-
python3 -m pip install --upgrade tensorrt
|
77 |
-
|
78 |
cd demo/Diffusion
|
|
|
79 |
pip3 install -r requirements.txt
|
|
|
80 |
```
|
81 |
|
82 |
-
4. Perform TensorRT optimized inference
|
83 |
* The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner` that are specific to the accelerator being run on; they are reused for later invocations.
|
84 |
|
85 |
```
|
@@ -93,4 +98,19 @@ python3 demo_txt2img_xl.py \
|
|
93 |
--denoising-steps 30 \
|
94 |
--onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
|
95 |
--onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
|
96 |
```
|
|
|
7 |
- stable-diffusion
|
8 |
- stable-diffusion-xl
|
9 |
- stable-diffusion-xl-lcm
|
10 |
+
- stable-diffusion-xl-lcmlora
|
11 |
- tensorrt
|
12 |
- text-to-image
|
13 |
---
|
|
|
16 |
|
17 |
## Introduction
|
18 |
|
19 |
+
This repository hosts the TensorRT versions (sdxl, sdxl-lcm, sdxl-lcmlora) of **Stable Diffusion XL 1.0**, created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
|
20 |
|
21 |
See the [usage instructions](#usage-example) for how to run the SDXL pipeline with the ONNX files hosted in this repository.
|
22 |
|
|
|
|
|
23 |
|
24 |
![examples](./examples.jpg)
|
25 |
|
|
|
49 |
| A100 | 0.27 images/sec | 0.36 images/sec | ~33% |
|
50 |
| H100 | 0.40 images/sec | 0.68 images/sec | ~70% |
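The percentage column can be reproduced from the two throughput figures in each row. The following is a minimal sketch, assuming the first throughput value is the baseline and the second is the TensorRT-optimized run (the column headers are not visible in this excerpt, so that reading is an assumption):

```python
# Throughput pairs copied from the table rows above (images/sec).
# Assumption: first value is the baseline, second is the optimized run.
throughput = {
    "A100": (0.27, 0.36),
    "H100": (0.40, 0.68),
}


def relative_gain_percent(baseline: float, optimized: float) -> float:
    """Percentage increase of `optimized` over `baseline`."""
    return (optimized / baseline - 1.0) * 100.0


for gpu, (baseline, optimized) in throughput.items():
    print(f"{gpu}: ~{relative_gain_percent(baseline, optimized):.0f}%")
```

Under that assumption the computed values round to the ~33% and ~70% listed in the table.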
|
51 |
|
52 |
+
#### Timings for LCM version for 4 steps at 1024x1024
|
53 |
+
|
54 |
+
| Accelerator | CLIP | UNet | VAE | Total |
|
55 |
+
|-------------|--------------------------|-----------------------------|------------------------|------------------------|
|
56 |
+
| A100 | 1.08 ms | 192.02 ms | 228.34 ms | 426.16 ms |
|
57 |
+
| H100 | 0.78 ms | 102.8 ms | 126.95 ms | 234.22 ms |
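To compare these latencies with the images/sec figures, the totals can be converted to throughput. This is a rough sketch that assumes each timing covers a single image per run (the table does not state a batch size); it also reports how much of the total is not covered by the three listed stages:

```python
# Stage latencies in milliseconds, copied from the LCM timing table above.
timings_ms = {
    "A100": {"CLIP": 1.08, "UNet": 192.02, "VAE": 228.34, "Total": 426.16},
    "H100": {"CLIP": 0.78, "UNet": 102.8, "VAE": 126.95, "Total": 234.22},
}


def images_per_second(total_ms: float) -> float:
    """Convert an end-to-end latency in ms to images/sec (one image per run assumed)."""
    return 1000.0 / total_ms


def unaccounted_ms(row: dict) -> float:
    """Part of Total that is not covered by CLIP + UNet + VAE."""
    stages = row["CLIP"] + row["UNet"] + row["VAE"]
    return round(row["Total"] - stages, 2)


for gpu, row in timings_ms.items():
    print(
        f"{gpu}: {images_per_second(row['Total']):.2f} images/sec, "
        f"{unaccounted_ms(row):.2f} ms outside the three stages"
    )
```

The few milliseconds outside the three stages are presumably pipeline overhead, though the source does not say what they contain.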
|
58 |
+
|
59 |
|
60 |
## Usage Example
|
61 |
|
62 |
+
1. Follow the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/9.2/demo/Diffusion/README.md) to launch a TensorRT NGC container.
|
63 |
```shell
|
64 |
git clone https://github.com/rajeevsrao/TensorRT.git
|
65 |
cd TensorRT
|
66 |
+
git checkout release/9.2
|
67 |
+
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.11-py3 /bin/bash
|
68 |
```
|
69 |
|
70 |
+
2. Download the SDXL LCM TensorRT files from this repo
|
71 |
```shell
|
72 |
git lfs install
|
73 |
git clone https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
|
|
|
78 |
|
79 |
3. Install libraries and requirements
|
80 |
```shell
|
|
|
|
|
|
|
81 |
cd demo/Diffusion
|
82 |
+
python3 -m pip install --upgrade pip
|
83 |
pip3 install -r requirements.txt
|
84 |
+
python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt
|
85 |
```
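After installing, it can help to confirm which TensorRT build pip actually picked up, since the command above allows pre-releases. A small sketch using only the standard library; `installed_version` is a hypothetical helper name, and `tensorrt` is assumed to be the distribution name because that is what the pip command installs:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "tensorrt" is the distribution name used in the pip command above.
    print("tensorrt:", installed_version("tensorrt") or "not installed")
```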
|
86 |
|
87 |
+
4. Perform TensorRT optimized inference for SDXL
|
88 |
* The first invocation produces plan files in `engine_xl_base` and `engine_xl_refiner` that are specific to the accelerator being run on; they are reused for later invocations.
|
89 |
|
90 |
```
|
|
|
98 |
--denoising-steps 30 \
|
99 |
--onnx-base-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-base \
|
100 |
--onnx-refiner-dir /workspace/stable-diffusion-xl-1.0-tensorrt/sdxl-1.0-refiner
|
101 |
+
```
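Because the plan files are built once and then reused, a quick check of the engine directories shows whether the next run will pay the build cost. The sketch below assumes the engines are stored with a `.plan` extension and that the directories are relative to the current working directory; neither detail is confirmed by this README, and `cached_plan_files` is a hypothetical helper:

```python
from pathlib import Path


def cached_plan_files(engine_dir: str) -> list:
    """List names of files ending in .plan under engine_dir (empty if it is missing)."""
    root = Path(engine_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("*.plan"))


# Directory names taken from the note above; the .plan suffix is an assumption.
for name in ("engine_xl_base", "engine_xl_refiner"):
    found = cached_plan_files(name)
    if found:
        print(f"{name}: {len(found)} plan file(s) cached")
    else:
        print(f"{name}: no plan files yet; the first run will build them")
```

Plan files are specific to the accelerator, so a directory copied from a different GPU should not be treated as a valid cache.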
|
102 |
+
|
103 |
+
5. Perform TensorRT optimized inference for the SDXL Latent Consistency Model (LCM) version
|
104 |
+
* The first invocation produces plan files in the directory given by `--engine-dir`; they are specific to the accelerator being run on and are reused for later invocations.
|
105 |
+
```
|
106 |
+
python3 demo_txt2img_xl.py \
|
107 |
+
  "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
|
108 |
+
--version=xl-1.0 \
|
109 |
+
--onnx-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm \
|
110 |
+
--engine-dir /workspace/stable-diffusion-xl-1.0-tensorrt/lcm/engine-sdxl-lcm-nocfg \
|
111 |
+
--scheduler LCM \
|
112 |
+
--denoising-steps 4 \
|
113 |
+
--guidance-scale 0.0 \
|
114 |
+
--seed 42
|
115 |
+
|
116 |
```
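When scripting several prompts, the same LCM invocation can be assembled as an argument list, which sidesteps shell quoting of the prompt. This is only a sketch: `lcm_command` is a hypothetical helper, the flags and paths are copied from the command above, and nothing is executed here:

```python
import shlex


def lcm_command(prompt: str, repo_dir: str, steps: int = 4, seed: int = 42) -> list:
    """Assemble the argv for the LCM invocation shown above."""
    lcm_dir = f"{repo_dir}/lcm"
    return [
        "python3", "demo_txt2img_xl.py", prompt,
        "--version=xl-1.0",
        "--onnx-dir", lcm_dir,
        "--engine-dir", f"{lcm_dir}/engine-sdxl-lcm-nocfg",
        "--scheduler", "LCM",
        "--denoising-steps", str(steps),
        "--guidance-scale", "0.0",
        "--seed", str(seed),
    ]


argv = lcm_command(
    "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
    "/workspace/stable-diffusion-xl-1.0-tensorrt",
)
# Print a shell-safe rendering; pass argv to subprocess.run(argv, check=True) to execute.
print(shlex.join(argv))
```

Run it from `demo/Diffusion` inside the container so that `demo_txt2img_xl.py` resolves.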
|