h94 committed on
Commit 5421281
1 Parent(s): de102ae

Update README.md

Files changed (1):
  1. README.md +81 -0
README.md CHANGED
@@ -26,8 +26,20 @@ An experimental version of IP-Adapter-FaceID: we use face ID embedding from a fa
 
![results](./ip-adapter-faceid.jpg)

+
+ **Update 2023/12/27**:
+
+ IP-Adapter-FaceID-Plus: face ID embedding (for face ID) + CLIP image embedding (for face structure)
+
+ <div align="center">
+
+ ![results](./faceid-plus.jpg)
+ </div>
+
## Usage

+ ### IP-Adapter-FaceID
+
Firstly, you should use [insightface](https://github.com/deepinsight/insightface) to extract the face ID embedding:

```python

@@ -92,6 +104,75 @@ images = ip_model.generate(
 
```

+ ### IP-Adapter-FaceID-Plus
+
+ Firstly, you should use [insightface](https://github.com/deepinsight/insightface) to extract the face ID embedding and an aligned face image:
+
+ ```python
+ import cv2
+ import torch  # needed for torch.from_numpy below
+ from insightface.app import FaceAnalysis
+ from insightface.utils import face_align
+
+ app = FaceAnalysis(name="buffalo_l", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
+ app.prepare(ctx_id=0, det_size=(640, 640))
+
+ image = cv2.imread("person.jpg")
+ faces = app.get(image)
+
+ faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
+ face_image = face_align.norm_crop(image, landmark=faces[0].kps, image_size=224)  # you can also segment the face
+ ```
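The block above takes `faces[0]` unconditionally. If the reference photo might contain no face or several faces, a small guard keeps the failure obvious and the choice deterministic. This is a minimal editorial sketch (not part of the commit), reusing the variables from the block above and the standard insightface `Face` attributes `bbox`, `kps`, and `normed_embedding`:

```python
# Guard against missing detections and pick the largest face when several are found.
if not faces:
    raise ValueError("no face detected in person.jpg")

face = max(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))
faceid_embeds = torch.from_numpy(face.normed_embedding).unsqueeze(0)
face_image = face_align.norm_crop(image, landmark=face.kps, image_size=224)
```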
+
+ Then, you can generate images conditioned on the face embeddings:
+
+ ```python
+ import torch
+ from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
+ from PIL import Image
+
+ from ip_adapter.ip_adapter_faceid import IPAdapterFaceIDPlus
+
+ base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
+ vae_model_path = "stabilityai/sd-vae-ft-mse"
+ image_encoder_path = "h94/IP-Adapter/models/image_encoder"
+ ip_ckpt = "ip-adapter-faceid-plus_sd15.bin"
+ device = "cuda"
+
+ noise_scheduler = DDIMScheduler(
+     num_train_timesteps=1000,
+     beta_start=0.00085,
+     beta_end=0.012,
+     beta_schedule="scaled_linear",
+     clip_sample=False,
+     set_alpha_to_one=False,
+     steps_offset=1,
+ )
+ vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
+ pipe = StableDiffusionPipeline.from_pretrained(
+     base_model_path,
+     torch_dtype=torch.float16,
+     scheduler=noise_scheduler,
+     vae=vae,
+     feature_extractor=None,
+     safety_checker=None
+ )
+
+ # load ip-adapter
+ ip_model = IPAdapterFaceIDPlus(pipe, image_encoder_path, ip_ckpt, device)
+
+ # generate image
+ prompt = "photo of a woman in red dress in a garden"
+ negative_prompt = "monochrome, lowres, bad anatomy, worst quality, low quality, blurry"
+
+ images = ip_model.generate(
+     prompt=prompt, negative_prompt=negative_prompt, face_image=face_image, faceid_embeds=faceid_embeds,
+     num_samples=4, width=512, height=768, num_inference_steps=30, seed=2023
+ )
+ ```
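`generate` returns a batch of `num_samples` images; in the other IP-Adapter examples these are `PIL.Image` objects, so, assuming the same here, they can be written straight to disk. A short usage sketch (not part of the commit):

```python
# Save the generated samples; filenames are arbitrary.
for i, img in enumerate(images):
    img.save(f"faceid_plus_{i}.png")
```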
+

## Limitations and Bias
- The model does not achieve perfect photorealism and ID consistency.