Spaces:
Running
on
Zero
Addition of New CFG Methods
Hello!
Is it possible to get this supported?
https://huggingface.co./docs/diffusers/main/en/using-diffusers/pag (should have an enable/disable button)
And CFG rescale (rescaled classifier-free guidance, which I believe is labeled guidance_rescale
in diffusers)? (It should also have an enable/disable button.)
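For reference, guidance_rescale implements the rescaling from "Common Diffusion Noise Schedules and Sample Steps are Flawed" (arXiv 2305.08891). A minimal numpy sketch of the operation as I understand it (the arrays are random stand-ins for UNet noise predictions, not a real pipeline):

```python
import numpy as np

def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.7):
    """Rescale the CFG output so its std matches the text-conditioned
    prediction, then blend by guidance_rescale (0.0 = plain CFG)."""
    std_text = noise_pred_text.std()
    std_cfg = noise_cfg.std()
    rescaled = noise_cfg * (std_text / std_cfg)
    return guidance_rescale * rescaled + (1.0 - guidance_rescale) * noise_cfg

rng = np.random.default_rng(0)
eps_text = rng.standard_normal((4, 64, 64))
eps_uncond = rng.standard_normal((4, 64, 64))
eps_cfg = eps_uncond + 7.5 * (eps_text - eps_uncond)  # plain CFG at scale 7.5
out = rescale_noise_cfg(eps_cfg, eps_text, guidance_rescale=1.0)
# with guidance_rescale=1.0 the output std matches eps_text's std
```

High CFG scales inflate the std of the combined prediction, which is what causes the washed-out/burned look this option fixes.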
Also is it possible to implement CFG++ Samplers?
PAG is possible. Or rather, stablepy supports it, but I just haven't made a GUI for it. I'm just being lazy.
I wonder what the rest of it is like...?
For better or worse, it's abstracted using stablepy, so it'll probably break if it's not supported by stablepy.
DiffuseCraft is a demo for stablepy, so most of the things that stablepy itself can do have been implemented. So, if it's a feature that DiffuseCraft has, I can add it straight away. If it's a feature that DiffuseCraft doesn't have, but is supported by the pip version of Diffusers, r3gm will probably support it quite quickly, like he did with the scheduler the other day.
He is a busy but friendly and proactive person, so he will probably answer your questions unless they are unreasonable requests in a programmatic context.
If it's something really simple, I can even submit a PR.
https://huggingface.co./spaces/r3gm/DiffuseCraft/discussions?status=open&type=discussion
https://github.com/R3gm/stablepy
Edit:
Since we're here, let's identify the features that DiffuseCraft (stablepy) is missing. It would be easier if we put together a list of ideas. He writes 100 times faster and more accurately than I do...
This is purely a difference in coding ability...
Edit:
Also, tell me which features you want to see prioritized that are not yet in VP, but are in DiffuseCraft. I'll add PAG first.
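For anyone curious what the PAG knob actually does: my reading of the PAG paper (arXiv 2403.17377) is that it adds one extra guidance term on top of CFG. A rough numpy sketch of just the guidance arithmetic (the arrays are random stand-ins for UNet noise predictions; this is not diffusers' actual implementation):

```python
import numpy as np

def combine_guidance(eps_uncond, eps_cond, eps_perturbed,
                     cfg_scale=7.5, pag_scale=3.0):
    """CFG term plus the perturbed-attention guidance term.
    eps_perturbed is the conditional prediction from a forward pass
    where self-attention is replaced with an identity map."""
    return (eps_uncond
            + cfg_scale * (eps_cond - eps_uncond)
            + pag_scale * (eps_cond - eps_perturbed))

rng = np.random.default_rng(1)
shape = (4, 64, 64)
eps_u, eps_c, eps_p = (rng.standard_normal(shape) for _ in range(3))
eps = combine_guidance(eps_u, eps_c, eps_p)
# pag_scale=0.0 falls back to plain CFG
```

So a "PAG scale" slider in the GUI just controls that last coefficient.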
I added PAG scale and FreeU.
I added PAG scale and FreeU.
Thanks!
Since we're here, let's identify the features that DiffuseCraft (stablepy) is missing. It would be easier if we put together a list of ideas. He writes 100 times faster and more accurately than I do...
This is purely a difference in coding ability...
There is a kind of scheduler/sampler that is very promising, and I'd love to see it supported in stablepy. It's called CFG++ (cfgpp in ComfyUI): https://arxiv.org/abs/2406.08070
Aside from this, I think stablepy is feature-complete so far.
I also believe that img2img in VP (supported by DiffuseCraft) would be very good; ControlNets and IP-Adapters would be great too!
Also inference with LyCORIS!
I can't use LoKrs or LoHas on VP or DiffuseCraft. There are a lot of LyCORIS models out there that are better than LoRAs, but I'm unable to use them since they're not supported in stablepy (nor in diffusers), so this would probably be a great addition.
Thanks.
I've also been wondering about LyCORIS, and I can use it if I call it directly from PEFT, but as you say, there is no way to call it from Diffusers. I think r3gm has the ability to create functions that are not in Diffusers, but I think the wrapping of Diffusers is the theme of stablepy, so it would be better to improve Diffusers itself first...
The issue with the LyCORIS implementation is how to determine that it is a LyCORIS file. I think this is also true for other LoRA variants.
Is CFG++ supported by Diffusers?
Edit:
Oh... it's similar to rescale.
Typically, LyCORIS algorithms/models contain an identifier called "hada" in the keys, which I believe stands for "Hadamard product".
https://github.com/KohakuBlueleaf/LyCORIS/blob/main/docs/Algo-Details.md
This was also discovered by sayakpaul here:
https://github.com/huggingface/diffusers/issues/4133
There was also an issue about this here https://github.com/huggingface/diffusers/issues/3087
I also found this gist about lycoris inference
https://gist.github.com/adhikjoshi/2c6da89cbcd7a6a3344d3081ccd1dda0
Is CFG++ supported by Diffusers?
I believe not.
Though the reForge webui has implemented it:
https://github.com/Panchovix/stable-diffusion-webui-reForge
I have informed r3gm about guidance_rescale. Good night.
This was also discovered by sayakpaul here
There may be some issues with sayakpaul knowing about it but not having it implemented...
It looks like it still hasn't been implemented even with this major LoRA renovation.
Edit:
PEFT's one.
https://huggingface.co./docs/peft/package_reference/adapter_utils
Edit:
CFG++ is not easy even in Forge (nor reForge)?
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1864
CFG++ is not easy even in Forge (nor reForge)
CFG++ is implemented in reforge.
https://github.com/Panchovix/stable-diffusion-webui-reForge/blob/main/modules%2Fsd_samplers_extra.py#L77
I have informed r3gm about guidance_rescale.
Great!
There may be some issues with sayakpaul knowing about it but not having it implemented...
I thought so too.
It looks like it still hasn't been implemented even with this major LoRA renovation.
I'm also unsure why it hasn't been implemented
CFG++ is implemented in reforge.
yea.
@xi0v
@r3gm
https://github.com/huggingface/diffusers/issues/4133#issuecomment-2510141183
I was told that LyCORIS should already be supported. I'm sure there is some LyCORIS that will cause changes when applied, but I can't seem to find any logic that reads like a header...I wonder if PEFT is taking over that part of the process.
It would be nice if there was a LyCORIS that would produce the problem.
Hmmm
I suggest we try to find a LoHa or a LoKr.
I believe the way to distinguish between a LoRA and a LoHa should be the "hada" identifier in the LyCORIS keys.
Otherwise, if diffusers has implemented a way to automatically load LyCORIS as if you were loading a regular LoRA, it should work as-is.
Taken from https://github.com/KohakuBlueleaf/LyCORIS/blob/main/lycoris%2Fmodules%2Floha.py#L10-L29
class LohaModule(LycorisBaseModule):
    name = "loha"
    support_module = {
        "linear",
        "conv1d",
        "conv2d",
        "conv3d",
    }
    weight_list = [
        "hada_w1_a",
        "hada_w1_b",
        "hada_w2_a",
        "hada_w2_b",
        "hada_t1",
        "hada_t2",
        "alpha",
        "dora_scale",
    ]
    weight_list_det = ["hada_w1_a"]
As for LoKr, the identifier should be "lokr".
Taken from https://github.com/KohakuBlueleaf/LyCORIS/blob/main/lycoris%2Fmodules%2Flokr.py#L23-L43
class LokrModule(LycorisBaseModule):
    name = "kron"
    support_module = {
        "linear",
        "conv1d",
        "conv2d",
        "conv3d",
    }
    weight_list = [
        "lokr_w1",
        "lokr_w1_a",
        "lokr_w1_b",
        "lokr_w2",
        "lokr_w2_a",
        "lokr_w2_b",
        "lokr_t1",
        "lokr_t2",
        "alpha",
        "dora_scale",
    ]
    weight_list_det = ["lokr_w1", "lokr_w1_a"]
So a hacky way would be to dump the LyCORIS keys each time one is loaded and try to find the identifier, though this is definitely not the best way.
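To make the "dump the keys" idea concrete, here is a sketch of the kind of detector I mean, using the weight_list_det markers quoted above (the key names below are simplified, made-up examples just to exercise the logic, not real checkpoint keys):

```python
def detect_lycoris_type(state_dict_keys):
    """Guess the adapter family from checkpoint key names,
    based on the weight_list_det markers in the LyCORIS repo."""
    def has(marker):
        return any(marker in k for k in state_dict_keys)
    if has("hada_w1_a"):
        return "loha"
    if has("lokr_w1"):  # also covers lokr_w1_a / lokr_w1_b
        return "lokr"
    if has("lora_down") or has("lora_A"):
        return "lora"
    return "unknown"

# simplified, hypothetical key names
loha_keys = ["lora_unet_mid_block.hada_w1_a", "lora_unet_mid_block.alpha"]
lokr_keys = ["lora_unet_mid_block.lokr_w1", "lora_unet_mid_block.alpha"]
lora_keys = ["lora_unet_mid_block.lora_down.weight"]
```

Scanning all keys is cheap compared to loading the weights, so doing this at load time shouldn't hurt.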
I see. There are some differences even within LyCORIS? If I can find one or two LyCORIS files that don't actually work, I can raise an issue, so I'll try to find some later. If they all do work, then that solves the problem.
I see. There are some differences even within LyCORIS
Yup.
If I can find one or two LyCORIS files that don't actually work
Time to browse civit
I'm currently working on improving the pipeline code, so I'll do that first. Before that, I still have to finish my daily routine...
Hopefully I'm not out of place dropping in on this, but I have a Flux LoKr you could test out here: https://huggingface.co./davidrd123/Mary-Cassatt-Oil-FullAndCrops-Phase-4-beta_2-4_ss0_7-Flux-LoKr
Thank you! You're not wrong. However, as I understand it at the moment, LyCORIS requires assistance from external libraries in order to operate fully. It is possible that this is because it is LoKr.
Incidentally, the error that occurs when you perform inference using the GUI there is another unresolved bug in HF.
from diffusers import DiffusionPipeline
from lycoris import create_lycoris_from_weights
Edit:
I spent too much time on the pipeline and couldn't get to LyCORIS today, so I'll try writing the code tomorrow...
Can we get IPAdapters (and maybe controlnets) in VP?
@John6666
I believe diffusecraft supports both.
With the ability to download custom ones, like the NoobAi ones
With the ability to download custom ones, like the NoobAi ones
I think this is the part that stablepy is in charge of. I'll read the code again and ask r3gm if that's the case. Other than that, I just need to make a GUI equivalent to DiffuseCraft. However, I might prioritize LyCORIS or something else for today and tomorrow.
Edit:
Well, it might be a bit of a high hurdle, but it might be quicker to make a direct request to him. Requests from actual users are valuable to the author. As long as he doesn't get overwhelmed by too many requests.
Also, the author can provide support with more time to spare if he finds out about it sooner. I was also wondering about the separate ControlNet for NoobAI.
https://huggingface.co./posts/nyuuzyou/820726264775936
This is a bad situation. It may be necessary to delete more than 90% of the models.
I can't be doing any programming for a while.
Even if I'm pretty much a goofball, I don't think it was there until yesterday before I went to bed...
Edit:
I made a download script to evacuate models locally. It's not something I can do by hand with the amount I have...
HF_TOKEN = "hf_******"

import os
import shutil
from pathlib import Path
from huggingface_hub import snapshot_download, HfApi

def download_repos(repos: list[str], token: str | None = None):
    try:
        api = HfApi(token=token)
        success, fail, exists = [], [], []
        for repo in repos:
            if api.repo_exists(repo_id=repo, repo_type="model", token=token): repo_type = "model"
            elif api.repo_exists(repo_id=repo, repo_type="dataset", token=token): repo_type = "dataset"
            elif api.repo_exists(repo_id=repo, repo_type="space", token=token): repo_type = "space"
            else: continue
            username = repo.split("/")[0]
            reponame = repo.split("/")[-1]
            if repo_type == "model": local_dir = f"{username}/{reponame}"
            elif repo_type == "dataset": local_dir = f"{username}/datasets/{reponame}"
            else: local_dir = f"{username}/spaces/{reponame}"
            if Path(local_dir).is_dir():
                print(f"{repo} already exists")
                exists.append(repo)
                continue
            os.makedirs(local_dir, exist_ok=True)
            print(f"Downloading {repo}")
            try:
                snapshot_download(repo_id=repo, repo_type=repo_type, local_dir=local_dir, token=token)
                success.append(repo)
            except Exception as e:
                print(f"Download failed: {repo} {e}")
                shutil.rmtree(local_dir)
                fail.append(repo)
        print("Downloaded:\n" + "\n".join(success))
        print("Already exists:\n" + "\n".join(exists))
        print("Download failed:\n" + "\n".join(fail))
    except Exception as e:
        print(f"Download error: {e}")

def download_all_repos(user: str, token: str | None = None):
    api = HfApi(token=token)
    models = [i.id for i in api.list_models(author=user, token=token)]
    datasets = [i.id for i in api.list_datasets(author=user, token=token)]
    spaces = [i.id for i in api.list_spaces(author=user, token=token)]
    download_repos(models + datasets + spaces, token)  # pass the token through

def download_all_spaces(user: str, token: str | None = None):
    api = HfApi(token=token)
    spaces = [i.id for i in api.list_spaces(author=user, token=token)]
    download_repos(spaces, token)

download_all_repos("John6666", HF_TOKEN)
I'm not even sure what these limits are supposed to do. Am I no longer allowed to upload models, or what?
Uploads are possible. The old models haven't been deleted or anything. The current explanation is that "the bar is just displayed now", but this is not an explanation...
As usual, there is no announcement...
From the HF staff's post on Reddit, it doesn't seem like a big deal, and it looks like a countermeasure against random vandalism...
I see. I guess we'll have to wait for a formal explanation/blog post from HF
I agree.
BTW, there are a few features that DiffuseCraft supports which aren't in VP, such as hires after image generation and IPAdapters.
Can we get those in VP?
Of course. I just make the GUI. It's just that this situation is a bit... difficult to move around.
@xi0v
https://discuss.huggingface.co/t/information-about-the-disk-usage-quota-for-hugging-face-users-established-in-december-2024/129521
Now that things have calmed down a bit, I've started working on updating DC and VP.
To my surprise, in addition to the guidance_rescale, there is a change that allows us to select ControlNet as standard in DiffuseCraft!
I've almost finished with DC, but I'm still thinking about what to do with the interface for VP.
To be honest, I'm just going to place a lot of Gradio components and pass them as arguments, but there are so many!
If you have any ideas about where to place components, please let me know.
I think I'll end up using Accordion or tabs or tabs within tabs.
Now that things have calmed down a bit, I've started working on updating DC and VP.
Let's go! Now that we've got confirmation from HF that no models are going to get deleted, everything should be good to go.
To my surprise, in addition to the guidance_rescale, there is a change that allows you to select ControlNet as standard in DiffuseCraft!
Interesting!
If you have any ideas about where to place components, please let me know.
I think I'll end up using Accordion or tabs or tabs within tabs.
I'd say tabs within tabs would be perfect.
tabs within tabs would be perfect.
That idea has been adopted. The VP will probably be completed tomorrow. It's late at night.
DC is still being debugged, but it can be released immediately if we want to.
Edit:
Since the VP modification is likely to be delayed, I've released the DC Mod for now.
The reason is trivial: there are more items than I expected, and, as always, the Gradio image-related components are buggy in subtle ways, so I'm not sure how to use them.
That idea has been adopted. The VP will probably be completed tomorrow. It's late at night.
Great!
DC is still being debugged, but it can be released immediately if we want to.
I see.
Since the VP modification is likely to be delayed, I've released the DC Mod for now.
Will check it out!
there are more items than I expected, and as always, the Gradio image-related components are buggy in a subtle way, so I'm not sure how to use them.
I see, take your time!
I was busy getting used to the Discord environment for a while, so I didn't set up my own Spaces.
I'm going to use this disaster as an opportunity to take a general look around at the Spaces of myself and people I know, and do some maintenance.
Of course, I'll be doing some modifications to VP first.
If you have any ideas for things you'd like to see, or if you find a layout easier to use, or if something is out of date, or if something is broken, please feel free to write to me about it. Whether I do anything about it or not, it will still be useful as a reference. At the very worst, I can at least submit an issue to prompt the author to make a fix.
I was busy getting used to the Discord environment for a while, so I didn't set up my own Spaces.
Cool! What is your username? So that (if you want) I can DM you any weird behavior/issues in VP or any other Space.
I'll be doing some modifications to VP first.
Great!
ideas for things you'd like to see
Hmmmmmmmmmmm
Off the top of my head, what I want to see in a diffusion-related space is the following:
- Controlnet support
- IP adapter support + loading custom ones (like the NoobAi IPA)
- CFG++ or any of the modified/Custom Samplers that comfy supports
- rescale cfg
- pag scale (already in VP)
- hires after image generation
john6666cat
I was nuked by stepping on a mine in my main account. I think it will heal eventually. Just letting you know for now.
I was nuked by stepping on a mine in my main account. I think it will heal eventually. Just letting you know for now.
Wait, why did that happen?
I don't know for sure, but I think that the Spaces-related restrictions are malfunctioning.
I think that HF won't know the cause until the HF server engineer comes to work on Monday morning in America...
There is nothing like a history; it's just that a Space which seemed to not be working was duplicated. At that moment, the account was forcibly logged out and it became impossible to log in...
If I had to say, the only distinguishing feature is that it is written in JavaScript in a Docker space, not Gradio and Python.
You can follow everything from the link below, but in any case, do not click the link while you are logged in with your main account!
It is not necessarily activated at the moment of duplication.
https://discuss.huggingface.co/t/cant-log-in-to-my-account/126966
Have you tried emailing HF or making a forum post?
Yes. And via Discord. While investigating this issue on Discord, HF staff member Adam fell into the same trap as me and was locked out... I understood the situation more or less, but in short, it was a complicated server-side malfunction. We decided to wait until Monday (US time).
Are you already unlocked? I see your space already up and running, nice space, by the way.
@gonegirl Thank you! I noticed that it was unlocked by your post. Before I knew it...
I'll check the situation after I've eaten...
Great! I tried out your space, and it looks like DPM++ 3M SDE Karras creates some pretty bad ghosting in realistic models. DPM++ 2M SDE Karras works fine, though. Not sure why that is. Also, the width and height are unfortunately capped at 1216. Thanks!
Anyway, I tried to post a comment, and it's rate-limited for 12 minutes! lol. It looks like HF needs a bunch of GPUs just to post a comment :D
It's highly likely that the problem with the scheduler is with Diffusers itself...
I'll talk to r3gm about this at some point.
capped at 1216
I just forgot about it after setting it up the first time.
I'll fix it in the next update. I don't know if it'll work, but I'll leave the setting there.
It seems like the build of Spaces using this account is not 100% yet. I'll do some testing.
Anyway, I tried to post a comment, and it's rate-limited for 12 minutes!
I got hit with it too, using a sub-account! There's been a massive amount of vandalism, so it can't be helped.
@xi0v I've added almost all of the features from DiffuseCraft. There are so many of them that I think there is a possibility of a bug (a mistake in passing arguments), but it seems that Inpaint works.
And I've increased the resolution limit, but I don't know if it will work.
Great work! You added everything I want from diffusecraft.
Though I have a few tips that may interest you:
- "Generation Settings" and "Model & Prompt" should be grouped into one tab (as they used to be) for ease of access.
- Textual inversion isn't as widely used as it used to be, so it's safe to remove it to debloat the UI a bit.
- "Style Presets" and "Quality Tag Presets" should be grouped into an accordion or a tab to make the UI look more pleasing.
- Try experimenting with merging hires-fix and detail-fix into one tab called "detailfix / hiresfix".
- Also, allow usage of custom IP adapters such as NoobAI's IP adapter. If you can, please test it out and see if it works.
@r3gm
apologies for the mention!
I wanted to ask, can we have either of these 2 samplers/schedulers implemented in stablepy? (If you are available, of course!)
CFG++ Scheduler,
Paper: https://arxiv.org/html/2406.08070v1
Implementation: https://github.com/CFGpp-diffusion/CFGpp
Restart Sampler,
Paper: https://arxiv.org/abs/2306.14878
Implementation: https://github.com/newbeeer/diffusion_restart_sampling
few tips
Thank you! I generally applied it.
custom IP adapters such as noobai's IP adapter
Only this part is handled by stablepy.
CFG++ Scheduler
This is more of a feature that should be implemented on the Diffusers side. However, adding a scheduler is a big deal...
There are several schedulers on Diffusers that have been ported from k_diffusion, which is also the basis for reForge's CFG++ implementation, so perhaps modifications of them might be able to handle this issue.
Implementation of cfgpp in reForge (modified from k_diffusion)
https://github.com/Panchovix/stable-diffusion-webui-reForge/blob/main/modules%2Fsd_samplers_extra.py#L77
https://github.com/Panchovix/stable-diffusion-webui-reForge/blob/main/ldm_patched%2Fk_diffusion%2Fsampling.py#L2167
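To make the difference concrete: my reading of the CFG++ paper is that the denoising (x0-prediction) part of each step uses the CFG-combined noise estimate, while the re-noising part uses the unconditional one. A toy DDIM-style numpy sketch under that reading (the alphas and tensors are made-up placeholders, not reForge's or the paper's code):

```python
import numpy as np

def ddim_step(x, eps_denoise, eps_renoise, alpha_t, alpha_prev):
    """One DDIM-style step that lets the x0 prediction and the
    re-noising use different noise estimates."""
    x0 = (x - (1 - alpha_t) ** 0.5 * eps_denoise) / alpha_t ** 0.5
    return alpha_prev ** 0.5 * x0 + (1 - alpha_prev) ** 0.5 * eps_renoise

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 64, 64))
eps_u = rng.standard_normal(x.shape)
eps_c = rng.standard_normal(x.shape)
eps_cfg = eps_u + 7.5 * (eps_c - eps_u)

standard = ddim_step(x, eps_cfg, eps_cfg, 0.8, 0.9)  # plain CFG
cfgpp = ddim_step(x, eps_cfg, eps_u, 0.8, 0.9)       # CFG++: renoise with eps_uncond
```

If that reading is right, it explains why this needs changes inside the sampler loop rather than a new scheduler class alone.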
By the way, r3gm is a busy person, so if you have something to ask him, it's probably more reliable to go to his repo's community discussion or his github issue.
The following has effectively become a chat space.
https://huggingface.co./spaces/r3gm/DiffuseCraft/discussions/4
Select a task from the dropdown menu? The one with the default option txt2img, then run. I've only tried img2img, though.
Is there any documented way of getting either of those to work
I think there isn't. If I had to say, it's the official DiffuseCraft sample?
There's a need for it to be easy to understand.
Select a task from the dropdown menu?
That's correct. If I made it gr.Radio, there would be too many options...
However, it's difficult to understand just by looking at it, so I'll think of some supplementary means.
Also, the adetailer section is too cramped in portrait mode
Thanks for the report. I didn't know that nesting Row() would cause Gradio to behave like that... I've fixed it. I think.
Is the adetailer detector hardcoded?
Rather than hard-coding, I'm throwing it into a library called stablepy.
I think you probably want to be able to select the detection model to use in ADetailer, or something like that.
The author of stablepy is r3gm, who is active on HF and GitHub. He's busy, but he'll respond to decent requests.
There's also the option of us just committing on GitHub.
https://huggingface.co./r3gm
https://huggingface.co./spaces/r3gm/DiffuseCraft/discussions
https://github.com/R3gm/stablepy/issues
@r3gm apologies for the mention!
I wanted to ask, can we have either of these 2 samplers/schedulers implemented in stablepy? (If you are available, of course!)
CFG++ Scheduler,
Paper: https://arxiv.org/html/2406.08070v1
Implementation: https://github.com/CFGpp-diffusion/CFGpp
Restart Sampler,
Paper: https://arxiv.org/abs/2306.14878
Implementation: https://github.com/newbeeer/diffusion_restart_sampling
Hi
Some samplers employ an implementation that is incompatible with the Diffusers pipeline implementation.
This would require significant modifications to the __call__ method. Therefore, they will not be added at this time.
However, with the modular Diffusers PR, this integration will become feasible in the future.
Thank you.
Some samplers employ an implementation that is incompatible with the Diffusers pipeline implementation.
Wow... Is that actually working in its current state?
a button to send the generated image to the inpaint mask maker?
Thanks for the request. I've implemented it. If we have an idea for a feature like that, we can implement it in a few lines of code.
Thank you.
Some samplers employ an implementation that is incompatible with the Diffusers pipeline implementation.
Wow... Is that actually working in its current state?
With Restart Sampler some changes are needed in the pipeline https://github.com/Newbeeer/diffusion_restart_sampling/blob/8cbcb076330381216b2f60578bb6e381ce182683/diffuser/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L676
While these changes can be made, given that there are many different pipelines for various tasks, it would be better to explore an alternative solution, like modular Diffusers
given that there are many different pipelines for various tasks, it would be better to explore an alternative solution, like modular Diffusers
I see. There are basically a total of at least six pipelines: the normal pipeline, Img2Img, Inpainting, and their respective ControlNet variants. That is multiplied by the number of model architectures.
Yeah. Let's wait for Modular.
Hi! I opened your space today, and it's generating images very slowly, often taking several minutes before the image appears. Usually, it only takes about 20 seconds. Last time, the counter reached 700.0s and displayed a big red [error].
I don't think it's a quota issue? I haven't used it in the last two days, and the other spaces are working fine as well.
Thank you. I've also received reports today that the display of the window is slow in other spaces. I haven't made any changes on this side today...
I'll try restarting it.
Maybe it's due to RAM usage? Is clearing the least used model supported?
It's possible.
Is clearing the least used model supported?
A certain amount of memory is freed up by default, but it may not be enough.
I turned off stablepy's cache option for now.
It slows down a lot if RAM usage is maxed out.
RAM
69/69 GB
Is it possible to clean up zeroGPU RAM as well?
I understand VRAM consumption, but I wonder what is consuming the RAM...
For now, I've added gc.collect(). It's a spell. Goodnight.
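For the record, the "spell" amounts to something like this; the torch part is my guess at what helps on Zero, and it is skipped entirely if torch isn't installed:

```python
import gc

def free_memory():
    """Run Python's garbage collector, then release cached CUDA
    blocks if torch is available (no-op otherwise)."""
    collected = gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass
    return collected

freed = free_memory()
```

Note that gc.collect() only reclaims unreachable Python objects; it won't help if something is still holding a reference to a loaded model.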
How does it work? Is 69 the sum of CPU RAM and GPU RAM?
Maybe CPU RAM. The VRAM on Zero is 40GB.
gc is just a garbage collector.
Oh, okay. Good night!
@John6666
Hello! I'm unable to run inference on SDXL models that are trained with flow matching. Can you get r3gm to take a look at this and maybe see what he can do?
Here is the issue in reForge:
https://github.com/Panchovix/stable-diffusion-webui-reForge/issues/210
Fixed with this commit:
https://github.com/Panchovix/stable-diffusion-webui-reForge/commit/13a88f281f466906f09d9bbd955a3f285ddb31e3
On a side note, here is the result each discrete sampling mode outputs: automatic and epsilon produce a fully burned image, and v-prediction produces a somewhat coherent result, but the image is heavily oversaturated and noisy.
Merry Christmas.
I actually know that model, but I think it's not half SDXL anymore. Specifically, I think it won't work unless we modify the Diffusers pipeline.
@r3gm
Merry Christmas! Also, congratulations on the new stable release of stablepy.
I think this is the kind of thing that would be better off waiting for Community Pipeline or release of the Modular one. What do you think?
https://huggingface.co./nyanko7/nyaflow-xl-alpha
https://huggingface.co./spaces/nyanko7/toaru-xl-model
Extra: Other SDXL enhancement plans by civilians. I think the 8GB pony will work perfectly with just the DiffuseCraft modification. Maybe we just need to add a mode to the GUI where the load is not fixed at torch_dtype=torch.float16.
https://huggingface.co./nyanko7/sdxl_smoothed_energy_guidance
https://civitai.com/models/1051705/ultrareal-8gb-pony
not half SDXL anymore
It's still an SDXL model; the difference is the sampling. Like the vpred models, they're still SDXL models.
In fact, I did some merges with it, and they should work.
Merry Christmas.
I actually know that model, but I think it's not half SDXL anymore. Specifically, I think it won't work unless we modify the Diffusers pipeline.
@r3gm Merry Christmas! Also, congratulations on the new stable release of stablepy.
I think this is the kind of thing that would be better off waiting for Community Pipeline or release of the Modular one. What do you think?
https://huggingface.co./nyanko7/nyaflow-xl-alpha
https://huggingface.co./spaces/nyanko7/toaru-xl-model
Yes, it needs changes as you said
Extra: Other SDXL enhancement plans by civilians. I think the 8GB pony will work perfectly with just the DiffuseCraft modification. Maybe we just need to add a mode to the GUI where the load is not fixed at torch_dtype=torch.float16.
https://huggingface.co./nyanko7/sdxl_smoothed_energy_guidance
https://civitai.com/models/1051705/ultrareal-8gb-pony
An easy way to do it is to use: model.pipe.text_encoder.to(torch.float32)
model.pipe.text_encoder_2.to(torch.float32)
But it might be a good idea to have a parameter in stablepy for that.
I think that a major renovation for the scheduler related to FlowMatch is still underway, so I think we'll just have to wait and see for a while. Both of these have a significant impact on this part.
As for the behavior of fp32 CLIP, I'm leaving the combination of fp32 and bf16 during conversion so as not to damage the upsampled data, so I think that, in the case of Diffusers and Transformers, if torch_dtype= is not passed, it will probably work as expected without being .to()'d.
If the current implementation of stablepy is passing torch_dtype= as is, this is probably the job of the UI side.
Or even if it's CLIP saved in fp16 precision, is there any significant benefit to just casting it to fp32 precision during calculations...?
This depends on the calculation error during calculations, so we won't know until we do some experiments (the difference will probably be minimal), but if there is a benefit, it might be worth including it as an option in stablepy. After all, VRAM consumption only changes by about 2GB.
If there isn't much benefit, fp16 is fine.
https://huggingface.co./John6666/ultrareal-8gb-pony-v2hybrid-sdxl
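The "do some experiments" part can be roughed out without a GPU. A small numpy sketch of the rounding error a CLIP-sized matmul picks up in fp16 (the sizes are arbitrary stand-ins for the text encoder, not a real benchmark):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((77, 768)).astype(np.float32)   # token embeddings
w = rng.standard_normal((768, 768)).astype(np.float32)  # projection weight

ref = x @ w  # fp32 reference
half = (x.astype(np.float16) @ w.astype(np.float16)).astype(np.float32)
rel_err = float(np.abs(half - ref).max() / np.abs(ref).max())
print(f"max relative error in fp16: {rel_err:.4f}")
```

On data like this the error is tiny, which is consistent with the hunch that the difference will probably be minimal unless the weights themselves were never downcast.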
I think that a major renovation for the scheduler related to FlowMatch is still underway, so I think we'll just have to wait and see for a while. Both of these have a significant impact on this part.
As for the behavior of fp32 CLIP, I'm leaving the combination of fp32 and bf16 during conversion so as not to damage the upsampled data, so I think that in the case of Diffusers and Transformers, if torch_dtype= is not done, it will probably work as expected without being .to()ed.
If the current implementation of stablepy is passing torch_dtype= as is, this is probably the job of the UI side.
Or even if it's CLIP saved in fp16 precision, is there any significant benefit to just casting it to fp32 precision during calculations...?
This depends on the calculation error during calculations, so we won't know until we do some experiments (the difference will probably be minimal), but if there is a benefit, it might be worth including it as an option in stablepy. After all, VRAM consumption only changes by about 2GB.
If there isn't much benefit, fp16 is fine.
https://huggingface.co./John6666/ultrareal-8gb-pony-v2hybrid-sdxl
In my tests, I didn't notice much difference, but maybe I need to load the components separately to prevent them from being affected when I use torch.float16
Thanks for the verification!
So, it seems that for models where CLIP was downcast to fp16 at some point, which is the case for over 95% of models, it has no effect. It's only useful if the model was trained, saved, and released with fp32 precision, or if it was upsampled.
Well, the good news is that there are no problems with the precision of CLIP calculations in fp16.
Anyway, it's not worth adding an option for.