steveant committed on
Commit a40cc0f · verified · 1 Parent(s): 85d8461

Upload 22 files

.gitattributes CHANGED
@@ -33,3 +33,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ samples/1739028394882__000003200_9.jpg filter=lfs diff=lfs merge=lfs -text
37
+ samples/1739030252616__000003400_0.jpg filter=lfs diff=lfs merge=lfs -text
38
+ samples/1739030367779__000003400_1.jpg filter=lfs diff=lfs merge=lfs -text
39
+ samples/1739030480214__000003400_2.jpg filter=lfs diff=lfs merge=lfs -text
40
+ samples/1739030595357__000003400_3.jpg filter=lfs diff=lfs merge=lfs -text
41
+ samples/1739030822778__000003400_5.jpg filter=lfs diff=lfs merge=lfs -text
42
+ samples/1739030937356__000003400_6.jpg filter=lfs diff=lfs merge=lfs -text
43
+ samples/1739031049963__000003400_7.jpg filter=lfs diff=lfs merge=lfs -text
44
+ samples/1739031164540__000003400_8.jpg filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,271 @@
1
+ ---
2
+ tags:
3
+ - text-to-image
4
+ - flux
5
+ - lora
6
+ - diffusers
7
+ - template:sd-lora
8
+ widget:
9
+ - text: 'A photorealistic business headshot of [token], a caucasian man in his 40s
10
+ exuding confidence in a modern office setting.'
11
+ output:
12
+ url: samples/1739030252616__000003400_0.jpg
13
+ - text: 'A professional portrait of [token], a caucasian man in his 40s and full head
14
+ of hair, dressed sharply in a navy suit, captured with soft natural light.'
15
+ output:
16
+ url: samples/1739030367779__000003400_1.jpg
17
+ - text: 'A high-quality headshot of [token], a caucasian man in his 40s and full head
18
+ of hair, taken with an 85mm lens, emphasizing realism and authority.'
19
+ output:
20
+ url: samples/1739030480214__000003400_2.jpg
21
+ - text: 'A corporate headshot of [token] , a caucasian man in his 40s, standing before
22
+ a sleek office backdrop with a confident expression.'
23
+ output:
24
+ url: samples/1739030595357__000003400_3.jpg
25
+ - text: 'An executive portrait of [token], a caucasian man in his 40s and full head
26
+ of hair, softly lit, showcasing subtle facial details and approachability.'
27
+ output:
28
+ url: samples/1739030707660__000003400_4.jpg
29
+ - text: 'A detailed business headshot of [token], a caucasian man in his 40s and full
30
+ head of hair, framed with a blurred modern office environment.'
31
+ output:
32
+ url: samples/1739030822778__000003400_5.jpg
33
+ - text: 'A professional close-up of [token], a caucasian man in his 40s and full head
34
+ of hair, using a shallow depth of field to highlight facial authenticity.'
35
+ output:
36
+ url: samples/1739030937356__000003400_6.jpg
37
+ - text: 'A crisp and polished portrait of [token], a caucasian man in his 40s and full
38
+ head of hair, captured in a well-lit professional setting.'
39
+ output:
40
+ url: samples/1739031049963__000003400_7.jpg
41
+ - text: 'A corporate-style image of [token], a caucasian man in his 40s and full head
42
+ of hair, with a slight smile, reinforcing trust and professionalism.'
43
+ output:
44
+ url: samples/1739031164540__000003400_8.jpg
45
+ - text: 'A photorealistic shot of [token], a caucasian man in his 40s and full head
46
+ of hair, taken with a Nikon D850, emphasizing clarity and natural texture.'
47
+ output:
48
+ url: samples/1739031276857__000003400_9.jpg
49
+ base_model: black-forest-labs/FLUX.1-dev
50
+ instance_prompt: steve
51
+ license: other
52
+ license_name: flux-1-dev-non-commercial-license
53
+ license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
54
+ ---
55
+
56
+ # steve_lora_flux_1_dev_v1.2
57
+
58
+ A LoRA adapter trained to generate images of a man named “Steve” in a wide variety of scenarios. It is fine-tuned from **[black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)** (a FLUX model, not a Stable Diffusion checkpoint) using a flow-matching noise scheduler.
59
+
60
+ <Gallery />
61
+
62
+ ## Trigger Words
63
+
64
+ Use `steve` in your prompt to activate the specific style and character details for this LoRA.
65
+
66
+ ## Model Information
67
+
68
+ - **LoRA Rank / Alpha**: 16 / 16
69
+ - **Number of Steps**: 4000
70
+ - **Batch Size**: 1
71
+ - **Learning Rate**: 0.0001
72
+ - **Noise Scheduler**: `flowmatch`
73
+ - **Optimizer**: `adamw8bit`
74
+ - **Precision**: `bf16`
75
+ - **Gradient Checkpointing**: true
76
+ - **EMA**: true (decay = 0.99)
77
+ - **Quantization**: enabled
78
+
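The rank and alpha values above determine how strongly the adapter's low-rank update is applied. A minimal sketch, assuming the common LoRA convention in which the update `B @ A` is scaled by `alpha / rank`; the helper names are hypothetical and not taken from AI Toolkit:

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Scaling factor for the low-rank update under the usual LoRA convention."""
    return alpha / rank


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B, A, alpha: float, rank: int):
    """Update added to the frozen base weight: (alpha / rank) * B @ A."""
    scale = lora_scale(alpha, rank)
    return [[scale * value for value in row] for row in matmul(B, A)]


# With rank 16 and alpha 16 (the values listed above), the update is applied unscaled.
print(lora_scale(16, 16))
```

Under this convention, equal rank and alpha mean the learned update is added at full strength; any extra strength adjustment would come from the LoRA scale chosen at inference time.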
79
+ ## How to Use
80
+
81
+ This LoRA can be merged or applied to the **FLUX.1-dev** base model through [Diffusers](https://github.com/huggingface/diffusers) or a compatible UI/tool.
82
+
83
+ Example (assumes a CUDA GPU):
84
+
85
+ ```python
86
+ from diffusers import FluxPipeline
87
+ import torch
88
+
89
+ base_model = "black-forest-labs/FLUX.1-dev"
90
+ lora_model = "YOUR_USERNAME/steve_lora_flux_1_dev_v1.2"
91
+
92
+ pipe = FluxPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16).to("cuda")  # FLUX needs FluxPipeline, not StableDiffusionPipeline
93
+ # Apply the LoRA weights on top of the base pipeline
94
+ pipe.load_lora_weights(lora_model)
95
+
96
+ prompt = "steve, man lounging in fitted athletic wear on crisp white linens, strong and confident"
97
+ image = pipe(prompt).images[0]
98
+ image.save("steve_example.jpg")
99
+ ```
100
+
101
+ ## Download Model
102
+ Weights for this LoRA are available in Safetensors format.
103
+ Download them from the Files & versions tab.
104
+
105
+ ## License
106
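+ This model is provided under the FLUX.1 [dev] Non-Commercial License (`flux-1-dev-non-commercial-license`). Please review the [license file](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md) for details on acceptable use.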
107
+
108
+ ## Acknowledgements
109
+ - Trained with AI Toolkit by Ostris
110
+ - Based on the FLUX.1-dev base model
111
+
112
+ ## Disclaimer
113
+ Use responsibly. This model is intended for artistic, non-commercial purposes. The creators are not responsible for any misuse, generation of disallowed content, or potential harm caused by outputs. Always review and curate model outputs before sharing.
114
+
115
+
116
+ # steveant/steve-lora-v1.2
117
+
118
+ This is a [LoRA](https://arxiv.org/abs/2106.09685) adapter fine-tuned on a custom image dataset to generate images featuring a man named “Steve” in various settings and scenarios. It was trained on the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) base model with a flow-matching noise scheduler.
119
+
120
+ > **Note:** This model is in version `v1.2` and is currently considered experimental.
121
+
122
+ ---
123
+
124
+ ## Model Details
125
+
126
+ - **Model type**: LoRA adapter for FLUX.1-dev (trained with the AI Toolkit `sd_trainer` process)
127
+ - **Trigger word**: `steve`
128
+ - **Base model**: [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
129
+ - **Network**: LoRA (rank: 16, alpha: 16)
130
+ - **Quantization**: Enabled
131
+ - **Datasets**: Private dataset containing images and associated textual captions.
132
+
133
+ ### Model Architecture and Training
134
+
135
+ This LoRA was trained using the following key parameters:
136
+
137
+ - **Training steps**: `4000`
138
+ - **Batch size**: `1`
139
+ - **Gradient accumulation steps**: `1`
140
+ - **Learning rate**: `0.0001`
141
+ - **Noise Scheduler**: `flowmatch`
142
+ - **Optimizer**: `adamw8bit`
143
+ - **Precision**: `bf16`
144
+ - **LoRA settings**:
145
+ - Linear rank: `16`
146
+ - Linear alpha: `16`
147
+ - **Sampling configuration** (for sample images):
148
+ - Sampler: `flowmatch`
149
+ - Resolution: `1200 x 1600`
150
+ - Guidance scale: `4.1`
151
+ - Sample steps: `29`
152
+
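The sampling configuration above maps onto Diffusers call arguments roughly as follows. This is a sketch, assuming the `FluxPipeline` keyword names `width`, `height`, `guidance_scale`, and `num_inference_steps`. The helper `sample_kwargs` is hypothetical, and its divisibility check rests on the assumption that FLUX expects image dimensions that are multiples of 16.

```python
def sample_kwargs(width: int = 1200, height: int = 1600,
                  guidance_scale: float = 4.1, steps: int = 29) -> dict:
    """Build keyword arguments mirroring the sample settings listed in this card."""
    for name, value in (("width", width), ("height", height)):
        # Assumption: dimensions should be multiples of 16 for FLUX latents.
        if value % 16 != 0:
            raise ValueError(f"{name}={value} is not a multiple of 16")
    return {
        "width": width,
        "height": height,
        "guidance_scale": guidance_scale,
        "num_inference_steps": steps,
    }


kwargs = sample_kwargs()
print(kwargs)
# With an already loaded pipeline (not created here):
# image = pipe(prompt, **kwargs).images[0]
```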
153
+ During training, image captions were read from `.txt` files. Related settings:
154
+ - **Caption dropout**: `0.00`
155
+ - **Token shuffling**: `true`
156
+ - **Gradient checkpointing**: `true`
157
+ - **Exponential moving average**: `use_ema = true` with `ema_decay = 0.99`
158
+
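For reference, an exponential moving average with `ema_decay = 0.99` keeps a smoothed copy of the trained weights. A minimal sketch of the standard update rule, assuming the plain form `ema = decay * ema + (1 - decay) * value` with no warm-up or bias correction (AI Toolkit's exact implementation may differ):

```python
def ema_update(ema: list[float], params: list[float], decay: float = 0.99) -> list[float]:
    """Blend the current parameters into the running average."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema, params)]


# Track two toy parameters that stay fixed at 1.0 and 2.0.
ema = [0.0, 0.0]
for _ in range(100):
    ema = ema_update(ema, [1.0, 2.0])

# The average moves toward the parameter values at a rate set by the decay.
print(ema)
```

With a decay of 0.99, each step moves the average 1% of the way toward the current weights, so it reflects roughly the last hundred or so steps.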
159
+ ---
160
+
161
+ ## Intended Use
162
+
163
+ This model is intended to generate images of a “Steve” character in various poses, outfits, and scenarios. Possible use cases include:
164
+
165
+ - Creative media and content generation
166
+ - Character concepting for artistic projects
167
+ - Testing and experimentation with flow-matching schedulers on FLUX.1-dev
168
+
169
+ > **Important**: This model is not intended to generate explicit or harmful content. Users are advised to comply with local regulations and handle outputs responsibly.
170
+
171
+ ---
172
+
173
+ ## How to Use
174
+
175
+ 1. **Installation**
176
+ Make sure you have the [Diffusers library](https://github.com/huggingface/diffusers) or another FLUX-compatible framework installed.
177
+
178
+ 2. **Loading the Model**
179
+ ```python
180
+ from diffusers import FluxPipeline
181
+ import torch
182
+
183
+ # Load the FLUX.1-dev base model, then apply the LoRA adapter
184
+ base_model_id = "black-forest-labs/FLUX.1-dev"
185
+ lora_model_id = "steveant/steve-lora-v1.2" # hypothetical path on HF hub
186
+
187
+ pipeline = FluxPipeline.from_pretrained(base_model_id, torch_dtype=torch.bfloat16).to("cuda")
188
+ # Load LoRA weights
189
+ pipeline.load_lora_weights(lora_model_id)
190
+ ```
191
+
192
+ 3. **Prompting**
193
+ Use the **trigger word** `steve` in your prompt to invoke the specific style or character details. For instance:
194
+ ```python
195
+ prompt = (
196
+ "steve, man lounging in fitted athletic wear on crisp white linens, "
197
+ "strong and confident expression, warm ambient lighting, full-body shot, "
198
+ "textured fabric details"
199
+ )
200
+ result = pipeline(prompt).images[0]
201
+ result.save("steve_lounging.png")
202
+ ```
203
+
204
+ 4. **Negative Prompting (Optional)**
205
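+ FLUX.1-dev is guidance-distilled, so a negative prompt has no effect unless true classifier-free guidance is enabled. Recent Diffusers releases accept `negative_prompt` on `FluxPipeline` together with `true_cfg_scale` greater than 1. The training samples here used an empty `neg` prompt.
206
+ ```python
207
+ neg_prompt = "low resolution, bad quality"
208
+ result = pipeline(prompt=prompt, negative_prompt=neg_prompt, true_cfg_scale=2.0).images[0]  # true_cfg_scale value is illustrative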
209
+ ```
210
+
211
+ ---
212
+
213
+ ## Sample Prompts
214
+
215
+ Below are some of the prompts used to render sample images during training, with the trigger word filled in:
216
+
217
+ - `A photorealistic business headshot of steve, a Caucasian man in his 40s exuding confidence in a modern office setting.`
218
+ - `A professional portrait of steve, a Caucasian man in his 40s and full head of hair, dressed sharply in a navy suit, captured with soft natural light.`
219
+ - `A high-quality headshot of steve, a Caucasian man in his 40s and full head of hair, taken with an 85mm lens, emphasizing realism and authority.`
220
+ - `A corporate headshot of steve, a Caucasian man in his 40s, standing before a sleek office backdrop with a confident expression.`
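The widget metadata and the training config write these prompts with a `[token]` placeholder, while the list above shows them with the trigger word filled in. A small sketch of that substitution, assuming `[token]` simply stands for the trigger word `steve`; the function name is hypothetical and not part of AI Toolkit:

```python
TRIGGER_WORD = "steve"


def fill_trigger(prompt: str, trigger: str = TRIGGER_WORD,
                 placeholder: str = "[token]") -> str:
    """Replace every occurrence of the placeholder with the trigger word."""
    return prompt.replace(placeholder, trigger)


template = "A photorealistic business headshot of [token], a caucasian man in his 40s."
print(fill_trigger(template))
```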
221
+ ---
222
+
223
+ ## Limitations and Biases
224
+
225
+ - The model’s outputs depend on the style and content of the dataset.
226
+ - Since the training data is limited to “Steve” images in specific scenarios, the model may not generalize well to drastically different contexts.
227
+ - **Bias**: Any biases in the original dataset might be reflected in the generated images.
228
+
229
+ ---
230
+
231
+ ## Training Data
232
+
233
+ - **Private dataset** of images featuring “Steve,” labeled with text captions.
234
+ - **Resolution** used for latent caching: `1200` (the only value listed under `resolution` in `config.yaml`).
235
+ - **Data augmentation**: Token shuffling. Caption dropout was disabled (`caption_dropout_rate: 0.0`).
236
+
237
+ ---
238
+
239
+ ## Citation
240
+
241
+ If you use this model or find it helpful for your research/projects, please cite:
242
+
243
+ ```
244
+ @misc{steve_lora_flux_1_dev_v1.2,
245
+ author = {steveant},
246
+ title = {steve_lora_flux_1_dev_v1.2 (LoRA model)},
247
+ year = {2024},
248
+ howpublished = {\url{https://huggingface.co/steveant/steve-lora-v1.2}},
249
+ }
250
+ ```
251
+
252
+ ---
253
+
254
+ ## License
255
+
256
+ This model is released under the FLUX.1 [dev] Non-Commercial License, as declared in this card's metadata. Please refer to the [license text](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md) or contact the author for more details.
257
+
258
+ ---
259
+
260
+ ## Contributing
261
+
262
+ Contributions are welcome! If you wish to improve this model card or have new use cases and improvements to propose:
263
+
264
+ 1. Open an issue on the [GitHub/Spaces project](#) (if available).
265
+ 2. Submit pull requests or suggestions.
266
+ 3. Respect the usage and license guidelines.
267
+
268
+ ---
269
+
270
+ **Disclaimer**:
271
+ This model is for research and educational purposes. Always validate and review images generated to ensure they align with your intended use and do not violate any regulations or ethical standards.
config.yaml ADDED
@@ -0,0 +1,82 @@
1
+ job: extension
2
+ config:
3
+ name: steve_lora_flux_1_dev_v1.2
4
+ process:
5
+ - type: sd_trainer
6
+ training_folder: output
7
+ performance_log_every: 1000
8
+ device: cuda:0
9
+ trigger_word: steve
10
+ network:
11
+ type: lora
12
+ linear: 16
13
+ linear_alpha: 16
14
+ save:
15
+ dtype: float16
16
+ save_every: 200
17
+ max_step_saves_to_keep: 4
18
+ push_to_hub: false
19
+ datasets:
20
+ - folder_path: ../images/steve3-lora
21
+ caption_ext: txt
22
+ caption_dropout_rate: 0.0
23
+ shuffle_tokens: true
24
+ cache_latents_to_disk: true
25
+ resolution:
26
+ - 1200
27
+ train:
28
+ batch_size: 1
29
+ steps: 4000
30
+ gradient_accumulation_steps: 1
31
+ train_unet: true
32
+ train_text_encoder: false
33
+ content_or_style: balanced
34
+ gradient_checkpointing: true
35
+ noise_scheduler: flowmatch
36
+ optimizer: adamw8bit
37
+ lr: 0.0001
38
+ skip_first_sample: false
39
+ linear_timesteps: true
40
+ ema_config:
41
+ use_ema: true
42
+ ema_decay: 0.99
43
+ dtype: bf16
44
+ model:
45
+ name_or_path: black-forest-labs/FLUX.1-dev
46
+ is_flux: true
47
+ quantize: true
48
+ sample:
49
+ sampler: flowmatch
50
+ sample_every: 200
51
+ width: 1200
52
+ height: 1600
53
+ prompts:
54
+ - A photorealistic business headshot of [token], a caucasian man in his 40s
55
+ exuding confidence in a modern office setting.
56
+ - A professional portrait of [token], a caucasian man in his 40s and full head
57
+ of hair, dressed sharply in a navy suit, captured with soft natural light.
58
+ - A high-quality headshot of [token], a caucasian man in his 40s and full head
59
+ of hair, taken with an 85mm lens, emphasizing realism and authority.
60
+ - A corporate headshot of [token] , a caucasian man in his 40s, standing before
61
+ a sleek office backdrop with a confident expression.
62
+ - An executive portrait of [token], a caucasian man in his 40s and full head
63
+ of hair, softly lit, showcasing subtle facial details and approachability.
64
+ - A detailed business headshot of [token], a caucasian man in his 40s and full
65
+ head of hair, framed with a blurred modern office environment.
66
+ - A professional close-up of [token], a caucasian man in his 40s and full head
67
+ of hair, using a shallow depth of field to highlight facial authenticity.
68
+ - A crisp and polished portrait of [token], a caucasian man in his 40s and full
69
+ head of hair, captured in a well-lit professional setting.
70
+ - A corporate-style image of [token], a caucasian man in his 40s and full head
71
+ of hair, with a slight smile, reinforcing trust and professionalism.
72
+ - A photorealistic shot of [token], a caucasian man in his 40s and full head
73
+ of hair, taken with a Nikon D850, emphasizing clarity and natural texture.
74
+ neg: ''
75
+ seed: 42
76
+ walk_seed: true
77
+ lora_scale: 1.3
78
+ guidance_scale: 4.1
79
+ sample_steps: 29
80
+ meta:
81
+ name: steve_lora_flux_1_dev_v1.2
82
+ version: '1.2'
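The sample files in this commit appear to follow a `<timestamp>__<step>_<prompt index>.jpg` naming pattern. This is inferred from the file names themselves (the steps are multiples of `sample_every: 200`, and the indices run from 0 to 9 for the ten prompts), not from documentation. A hedged sketch of a parser under that assumption, with a hypothetical function name:

```python
import re
from typing import NamedTuple


class SampleName(NamedTuple):
    timestamp_ms: int   # leading digits, assumed to be a Unix timestamp in milliseconds
    step: int           # zero-padded training step
    prompt_index: int   # position of the prompt in the `prompts` list


_PATTERN = re.compile(r"^(\d+)__(\d+)_(\d+)\.jpg$")


def parse_sample_name(filename: str) -> SampleName:
    """Split a sample file name into its three numeric fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected sample file name: {filename!r}")
    timestamp_ms, step, prompt_index = (int(group) for group in match.groups())
    return SampleName(timestamp_ms, step, prompt_index)


print(parse_sample_name("1739030252616__000003400_0.jpg"))
```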
samples/1739028394882__000003200_9.jpg ADDED

Git LFS Details

  • SHA256: 88c7f52fd9dda32cda97b519a5ae232f24935668f546071429e28bc1d0dc47ca
  • Pointer size: 131 Bytes
  • Size of remote file: 112 kB
samples/1739028394882__000003200_9.jpgZone.Identifier ADDED
File without changes
samples/1739030252616__000003400_0.jpg ADDED

Git LFS Details

  • SHA256: 95c1801771db9d82452da1a5d3b9dbd6d912f9495757902afdc22505f63a9c34
  • Pointer size: 131 Bytes
  • Size of remote file: 134 kB
samples/1739030252616__000003400_0.jpgZone.Identifier ADDED
File without changes
samples/1739030367779__000003400_1.jpg ADDED

Git LFS Details

  • SHA256: 2fbdc97c50ca79207c614d59d620a7f98b3ec625da99be08932ac5eb04a431d1
  • Pointer size: 131 Bytes
  • Size of remote file: 125 kB
samples/1739030367779__000003400_1.jpgZone.Identifier ADDED
File without changes
samples/1739030480214__000003400_2.jpg ADDED

Git LFS Details

  • SHA256: d1c7f6e62a91458542b66ab30ed964325e967c481485e7716e8128c295a9333e
  • Pointer size: 131 Bytes
  • Size of remote file: 123 kB
samples/1739030480214__000003400_2.jpgZone.Identifier ADDED
File without changes
samples/1739030595357__000003400_3.jpg ADDED

Git LFS Details

  • SHA256: 9878c0a991d70379d4b21f949f23a2129d02ee697bfa1df343f5c603577c8624
  • Pointer size: 131 Bytes
  • Size of remote file: 134 kB
samples/1739030595357__000003400_3.jpgZone.Identifier ADDED
File without changes
samples/1739030707660__000003400_4.jpg ADDED
samples/1739030707660__000003400_4.jpgZone.Identifier ADDED
File without changes
samples/1739030822778__000003400_5.jpg ADDED

Git LFS Details

  • SHA256: 06787bd0f80e18fe727d41b34c68ad32cd54da8a3b816fb6f969b9153a5098db
  • Pointer size: 131 Bytes
  • Size of remote file: 118 kB
samples/1739030822778__000003400_5.jpgZone.Identifier ADDED
File without changes
samples/1739030937356__000003400_6.jpg ADDED

Git LFS Details

  • SHA256: 5180902160763c1209aceef0b8419e54ce45db3bf953eb7ea3f1ec55cb272724
  • Pointer size: 131 Bytes
  • Size of remote file: 128 kB
samples/1739030937356__000003400_6.jpgZone.Identifier ADDED
File without changes
samples/1739031049963__000003400_7.jpg ADDED

Git LFS Details

  • SHA256: 3f9d620c6ff40dae93af055373ac8d6f0d09b133a258ec2a13b934985c80d693
  • Pointer size: 131 Bytes
  • Size of remote file: 118 kB
samples/1739031049963__000003400_7.jpgZone.Identifier ADDED
File without changes
samples/1739031164540__000003400_8.jpg ADDED

Git LFS Details

  • SHA256: cb4c71509fe5ca142a2862b2828008a3e062366f1e6350ae90a4ff76ac5f3d82
  • Pointer size: 131 Bytes
  • Size of remote file: 119 kB
samples/1739031164540__000003400_8.jpgZone.Identifier ADDED
File without changes