[questions on that fantastic tool] (LoRa's, Model merge, Weight map, Sample Size)

#1
by FashionStash - opened

Hi there @John6666 !!

First, thanks a lot for making that fantastic tool public !
It will be really useful, since I need to learn more about converting models to Diffusers and about model merging.

Well, I've been asking myself a few things..

  1. LoRAs: what exactly is the LoRA 1, 2, 3, 4 and 5 feature in the tool?
    Is it a way to merge features from other, improved model checkpoints into the model I decided to convert to Diffusers?
    For example: if I wanted to merge a .safetensors checkpoint from HuggingFace or CiVITAI, that would be impossible, right?

  2. If the LoRA feature in your tool can't merge models (other than the ones listed when I click on a selectbox):
    Before quitting HF, Yntec told me that he used to merge models. I didn't think to ask him how! And now he's no longer here.
    So I searched for a tool that could help me convert existing models to the HF Diffusers format.. at least...
    And ... here I am, on your tool, it's fantastic!
    But it seems I can't merge models with the tool, right?
    If so, is there any help you can give me for merging models (on HuggingFace, as my computer is not a tournament-grade beast of power, haha)?

  3. About the weight map:
    Correct me if I'm wrong, but LoRA is a way to improve details or quality in image generation..
    So is the LoRA weight map (in your tool) for regulating those improvements? Or is it used for something else?
    If it's for something else, can you explain the purpose of the LoRA weight map sliders in your tool?

  4. Finally, the last question concerns Sample Size:
    I feel that this option is missing in your tool?
    To be clearer (if I'm not): the tool currently shows no setting to define the model's default Sample Size.
    I converted an SD 1.5 model (because I love them, probably out of nostalgia, and also because there are plenty on HuggingFace and I started with this version of Stable Diffusion to play, learn, discover and improve my knowledge of this AI world).
    I tested a quick inference on the free api-inference with no payload (using the gradio .load() method directly for quick prototyping of the converted model).
    And, aww.. it generates images with a size of 512px by 512px.
    (With Yntec's models, or some others on HF, I saw they were able to go to 768px by 768px, even with no payload!)
    I checked my HF repository once your tool had converted the model I wanted (Cryptids-13.safetensors), and I saw a lot of config files..
    But I'm pretty unsure how to tune them so that the model uses a default sample size of 768px by 768px instead of 512px by 512px.
    Can you help me with a trick, a tutorial, or a quick config tweak that I can test?

THANKS A LOT again for your tool, and if you answer me, then I will thank you twice !!!!

Cheers !

edit: fixing typo errors 🤭

Hello!

The Diffusers and PEFT libraries are used for the conversion in this tool and the image generation in Hugging Face, and this is basically an explanation of those.

First, regarding 4, there is no place to specify the resolution when converting the model in the current Diffusers. It seems that there used to be one, but it seems to have been consolidated. If you want to specify the resolution, specify the height and width when generating.
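As a rough illustration of "specify the height and width when generating", here is a minimal sketch. The helper names `size_kwargs` and `generate` and the `repo_id` argument are made up for this example; the only library calls assumed are `StableDiffusionPipeline.from_pretrained()` and calling the pipeline with `height`/`width`, and the heavy import is kept inside the function so the size check can be tried without Diffusers installed.

```python
def size_kwargs(width=768, height=768):
    """Build the width/height keyword arguments for a Stable Diffusion call.

    SD 1.5 works on latents 8x smaller than the image, so both sides
    are expected to be multiples of 8.
    """
    if width % 8 or height % 8:
        raise ValueError("width and height should both be multiples of 8")
    return {"width": width, "height": height}


def generate(repo_id, prompt, width=768, height=768):
    """Load a converted model and generate one image at the requested size.

    Needs diffusers and torch; repo_id is a placeholder for your own repository.
    """
    from diffusers import StableDiffusionPipeline  # heavy import, only when generating

    pipe = StableDiffusionPipeline.from_pretrained(repo_id)
    return pipe(prompt, **size_kwargs(width, height)).images[0]


print(size_kwargs(768, 768))
```

This does not change the model's default; it only overrides the size for one call, which is what an explicit payload would do.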

It's difficult to explain points 1 to 3, but if there is a LoRA that has been separated out in advance on HF, you can convert the model with it by writing its URL directly into the dropdown box.
Most LoRA are like skins, and are not suitable for making essential changes to the model. Think of them as a final adjustment before use.

However, there are various types of LoRA, and there also seems to be a sub-type of LoRA that is suited to essential, large-scale changes.
This tool is not suitable for this. The power of the space where it is placed is low, and it is not an interface that is designed for any purpose other than simple conversion...
The amount of information that can be conveyed in a single post is not enough to discuss model improvements in general, so let's resolve any questions through conversation. I don't know much either... but I should still be able to help you find your way. 😀

Edit:
I can understand to some extent about model conversion and how to use HF, so let's proceed while consulting with each other in general.
It would be best if Yntec could return, but I would also like to restore his work as much as possible as a sign of respect.
Unfortunately, I was unable to back up Yntec's models in time, so it is impossible to recreate everything, but there may be a way to do it for models that have the originals somewhere.

Hey! Already an answer! Wow 🤓

Here is mine. It will (in the end, not) be shorter than my first post for the moment, since I need to digest everything you said about LoRA, PEFT and Diffusers before writing a complete, worthwhile answer!

Then here is my (not-so) quick-answer on those subjects!

  1. In the file "./vae/config.json" I saw a "sample_size" parameter. Naively, I thought that changing the value there from 512 to 768 would change the model's default sampling size.
    I do not understand: every 1.5 model from Yntec has, on the free Inference API without any manual payload settings (width/height), a default size of 768px by 768px..
    It's curious that Yntec succeeded in making 768 the default for both width and height, isn't it?
    There is certainly a "way" to do it, but I really can't find anything on the subject, even on Medium.com, Google and so many other great sites such as StackOverflow!

  2. So, the Diffusers library, I know (I mean, I know it by name, plus some concepts from trying some stuff, lol).
    But PEFT, I have seen in some Medium.com tutorials, no more, no less. I have certainly used it on HuggingFace without even noticing, but as far as my current knowledge goes, PEFT is not among the lessons I've learned!
    So I will look up some information on that.

  3. Oh, LoRA are mainly skins then. It would be interesting to use skins as well on some models.
    Thanks for the precious clarifications!
    Yeah, I do not really need high-quality models; what I mostly do is create game assets (such as Trading Card Game pictures, or 2D characters after removing the background, also with AI).
    It's true that sometimes I get in a fancier mood and do concept-art images, such as "cake made with plastic" lol. (One day, a model from Yntec generated a TNT and a C4 asset hahahaha, funny.)
    So LoRA skins would be what I'm looking for.
    Then, how can I verify whether a LoRA is a skin?

  4. Your tool (even without high-level features) is really useful, and I'm pretty sure that many people use it :D
    Thanks for it! And making it public!

  5. Discussing in a conversational way on HuggingFace is difficult, no? I mean, there is no "private inbox" or "real-time chat" feature. Not yet at least, as far as I know :D
    But it would be interesting to discuss some AI Stable Diffusion themes with you (more 1.5 than XL)!

  6. As for Yntec, I personally think he has left for good.
    He really was terribly upset after the "Storage Quota UI changes from HuggingFace staff"..
    I've given up hope of his models coming back to life on HuggingFace (perhaps on CiVITAI, but I doubt that too).
    So I try to convert and fine-tune the models I want to use, which are "a part of what Yntec used to love too" (and, why not, in the distant future, create my own models by training them!).

  7. (as bonus lol)
    If you can help me sometimes with your own knowledge, it would be very appreciated.
    But It's just a suggestion.

  8. EOT
    Finally, the answer was longer than planned lol.
    (as usual coming from me..)

The skin is a metaphor for something, but I think of LoRA as a means of applying instant learning to AI models. Actually, there are probably more than 10 variants of LoRA, and I've heard that some of them are easy to use for model enhancement. But, although the method is simple, it's not easy to find a good way to do it. So, let's put that aside for now.

If you understand that PEFT is a library that takes care of LoRA, that's OK.

1

I see, Yntec was able to change the default value...
There may be a way to do it. Or rather, it's probably not difficult, it's just hard to find. I'll try to find it later.

4

I think I'll try to make an effort to get closer to the type of tool Yntec was using, too. I think he was using an official tool that he had improved for his own use.

5

I've told Yntec and a few others directly, you all can use the Discussion section of any of my repositories, including this one, as a message board. You can even use it to talk about dinner with someone. As long as it's not a crime or anything, it's fine.
There's no other appropriate place. 😅 The forum is for serious discussions only. Posts can only be created by Pro users.

The HF forum actually has a direct messaging function, but it's only available to users with a certain amount of activity to prevent abuse...
Discord is usually recommended for contacting HF staff and active members. Unless you live in Russia or Turkey and can't use Discord, or you hate Discord like Yntec, it's the fastest way.
https://discuss.huggingface.co/t/join-the-hugging-face-discord/11263/19
In my case, I spend a lot of time on Hub, so if there are any posts in my repository, I'll notice. The notification button will turn yellow.

7

You can ask me anything at any time, except when I'm sleeping. Except for matters concerning privacy.

BTW, it was my mistake to forget to inform Yntec about the recent situation with HF. Specifically, I forgot to tell him that there were a large number of trolls attacking the forums and posts.
Without this premise, it is difficult to imagine that the purpose of this measure was to stop the actions of the trolls, rather than abuse of existing users...
It's not my responsibility, but it was clearly a stupid mistake. I regret it quite a bit.

A new answer deserves an answer from me :D

A.
about: LoRa Metaphor
Ok for the metaphor thing. Yeah, for now I'm only concentrating on converting models and running some experiments with baked-in VAEs (on models that don't have one).
I've only just started "the convert model lesson", so it will be another year before I start "the create model lesson" or "the use LoRA properly lesson" (although PEFT is a subject for me to study, just to learn the logic of how it works)!

B.
about [your regrets about not warning Yntec about the trolls abusing the generous offers of the HuggingFace services]:
Well, I think you're not responsible for how Yntec overreacted; he was upset, but you're not the one who upset him.
I doubt that telling him about the HF situation would have had any positive impact on his decisions. Sadly, people are sometimes pushed past their limits by situations they can't accept or handle.
We are all like that, sooner or later. The most important thing is to remember that at least the trolls have been warned by that "UI assault" from the Hugging Face staff!
Plus, Yntec has not deleted his account, so he would be notified by mail of any new message that replies to him or tags his username (if he has notifications enabled, of course).
But well, Yntec made his choice. We must accept it and have no regrets at all; it's his choice, not ours and not yours.
Of course, I wish Yntec would come back as he was in his glory days (before the UI update xD), but the choice is his to make; there is nothing we can do!
So, let's accept it and skip to the next chapters of the future!
Also..
As a piece of advice: don't be so hard on yourself. Think positive; you won't live well with regrets!

  1. Yes, Yntec found a way to do it; each of his models sampled at 768px.
    I'm aware of that from the last time I had a long, friendly discussion with him (because he is a cool guy (erhm.. was, since outside of HuggingFace I have no way to contact him)).
    He even created a model fine-tuned specifically for me: LadyNostalgia (but it has been taken down as well now; I wish I had downloaded it before that fatal day lol).
    If you find a way that I would be able to understand and reproduce myself, I would be really happy to know it :D

  2. By that wording of yours, hmm.. does it mean that you will develop a fancy new tool that we, the community on HuggingFace, will be able to use for more processing/more conversion settings for models like SD 1.5 (or SDXL/3.5, as other examples)?
    I look forward to that. I'll start following you, as you've been cool with me.
    And.. guessing from the pseudonym you use on HuggingFace, I suppose you're a guy. Me, I'm a girl.
    I also prefer to follow helpful people, because helpful people are like candies: sweet and full of emotion.
    For my part, as all my spaces/models are private (they have no value, since they're just tests and experiments to learn how things work), following me would make no sense.
    But I also want to open public models (now I can do conversions! It will be fantastic).
    Also, I made a specific Gradio UI to use with the Inference API (HF's one). Some bugs still occur lol, but well, development is also that: hunting bugs to make everything work smoothly!
    (By the way, my job in real life is developing websites in PHP/HTML5/CSS3/JS/SQL databases, so... lol, not really related to AI models. I do this only as a hobby!)

  3. Then when I have questions, I will post them to your space (this one?)!
    But really, using only one space's discussion section would be ideal for keeping a long-term discussion, no?
    So do you really have no preferred space that we can use as a "discussion room" xD?
    Yes, I have Discord, and I've been on the HuggingFace Discord server too (for a long time).
    Honestly, I spend a lot of my free time on HuggingFace lol, more than on Discord/the HuggingFace Discord server!
    Since I discovered the AI world, I also play games very little :D
    Awww, Yntec hates Discord? I didn't know. I always talked with him through HuggingFace, in fact lol. I never asked for his Discord tag..
    I don't live in Russia or Turkey.

  4. What a shame, I was just about to ask your underwear color while you're taking a nap! 🤣
    Well, I suppose we can only discuss on the AI subjects then. 🥲

Thank you. It makes me feel better. I think he's passionate by nature... I'll respect his decision. But I'd be happy if he came back.
You're right. I'm a man. My icon is a male cat. Since you're a woman, I'll be careful about sexual harassment. 😅

Even if I say I'm avoiding talking about privacy, I'll talk about things that don't lead to 100% personal identification.
The color of my underwear is dark blue. I'm a trunks guy. This might be sexual harassment.
It's 00:33 here in Japan, so I'll save the serious stuff for tomorrow.
I'm usually at the Hub (working on the Hub while doing my day job), so it's quicker to contact me here with mention (@+username) than on Discord. BTW, I've set up Discord so that I can receive DMs from anyone. I only use HF Discord anyway, so it's not a big inconvenience.

Of course, I recall his passion (his clear passion) for creating/merging/fine-tuning/baking Stable Diffusion models!
I'd certainly be happy too if he came back, but well, time goes on and we can't do as much as we would want, so no regrets, and let's step away from bad moods :)

Yeah, a male cat, lol, I didn't notice it. Sorry!

Oh you know, sexual harassment is not something that scares me. I'm an emotional person, and the virtual world/AI matter more to me than human relationships.
Personal identification, yeah, but people will do what they want, and I will take measures they won't appreciate.
The fact is ...
I'm an open-minded girl, not a closed one. I have an explosive personality, but I'm also truly, purely emotional!
I tolerate a lot of things, even dark blue male underwear (lol).
I clearly prefer pink and pastel colors to harsh ones.
But if I like someone, it's the person I like, not his/her underwear/lingerie lol.

From my viewpoint you're an interesting guy, so do not worry when you discuss with me.

Of course, I know that this discussion is not private.
And, honestly, I don't care at all :)

WOLO ;)

Thanks for letting me know your availability :O

As I said, I'm on the HF Discord server too.

Here it is currently 21:55 in France. I'm not going to sleep for a while yet. I'm working a little on some stuff :)

Have a nice sleep, see you later!

see you! im john6666cat on Discord

Yes, thank you. Let's think positively. My regrets won't be of any help to him. It's probably better to be making something!
I'm currently maintaining another space, so reworking this conversion space will have to wait...
I'll be posting little by little to take a break and organize my thoughts.

Like you, I haven't had time to play games since I got involved with AI...

my job in real life is developing websites in PHP/HTML5/CSS3/JS/SQL databases

I'm a complete amateur, but you're a pro! 🙀
Then it's a quick conversation, so I'll explain the basics, including things you might already know. In principle, you can look at the source code for other people's Spaces, so if you look at that and just know how to use Python and Gradio, you can do most things. Or rather, Python is too easy except for the constraints of indentation, dependencies, and the GIL. Even I, who's not good at asynchronous programming, can handle it, so if you can use JS, it's a piece of cake. You can call APIs from JS in the first place.
You can also use Docker with Spaces, and you don't have to use Gradio for the GUI, but in many cases Python + Gradio is convenient for HF. Gradio is full of bugs, but it's easy to use, and about 60% of the time it works as you expect.
In any case, most of the processing is handled by libraries and APIs. These are the libraries that use HF as a warehouse. For many other companies' libraries, HF is also the warehouse.

In the case of SD1.5, Diffusers is the basis, torch is used as the backend, PEFT is used for LoRA-related tasks, Transformers is used for language models that interpret prompts (CLIP in the case of SD1.5), Accelerate is used for handling multi-GPU and models that are too large to fit in VRAM, huggingface_hub is used for the HF API, and Gradio is used for the GUI.
Basically, just by operating the Diffusers, you can use and convert SD1.5 models.
Other libraries are called by Diffusers as needed, so as long as they are installed, there should be no problems.
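Since the stack above only works "as long as they are installed", a quick check like the following can save some head-scratching. It is just a sketch: the list holds the import names of the libraries mentioned above, and `missing_libraries` is a name made up for this example.

```python
from importlib.util import find_spec

# Import names of the libraries listed above.
STACK = ["diffusers", "torch", "peft", "transformers",
         "accelerate", "huggingface_hub", "gradio"]


def missing_libraries(names=STACK):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


print("missing:", missing_libraries() or "nothing")
```

On a Space, anything reported as missing would normally go into requirements.txt.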

By that wording of yours, hmm.. does it mean that you will develop a fancy new tool that we, the community on HuggingFace, will be able to use for more processing/more conversion settings for models like SD 1.5 (or SDXL/3.5, as other examples)?

No, I don't have any great skills, 🥶 so I can't do that. However, I thought that if I followed what Yntec was doing, I might be able to find better SD1.5 conversion settings. There is a high possibility that this could be reflected in the tool. Also, since the tool itself is just a version that I made for the time being, improving its usability is an issue.
Also, the HF conversion demo hasn't been updated for about a year, so there are many cases where the number of functions increases just by enabling the use of ordinary functions that have been added since then.
This is a common thing with HF, and although it's open source and no one is hiding it, it's practically full of hidden functions.

Then when I have questions, I will post them to your space (this one?)!
But really, using only one space's discussion section would be ideal for keeping a long-term discussion, no?

Yea. That's true. It's fine to use random repositories, but it's easier to discuss a single topic in a single place.

see you! im john6666cat on Discord

I sent a friend request on Discord 😇.

For all you said in your last answer, I will answer tomorrow (perhaps on Discord).
Today, I was overwhelmed by some stuff to do!

Thanks for the friend request! I've registered. 😀 I was nuked for stepping on a landmine in my main account, so I'm a bit late in replying. Discord and the HF forum are treated separately from the Hub, so I'm fine, but I'm stuck in the Hub. 🥶

BTW, I found that if I pass the image_size (I'm not sure of the name, but it's an int and is 512 or 768 or so) to the config when initializing VAE, it will probably work. Basically, any changes made after loading the pipeline will be reflected when saving, so if we just save it as .save_pretrained(), we should be able to reproduce it... I think.

Hey!
Yeah, it seems you're not using your main HuggingFace account; I noticed that before adding you lol.
But does that mean you currently no longer have access to your main account on HuggingFace?

Hmm well, let's discuss your idea!

About the image_size thing (because it's image_size, not sample_size like I thought):
I think that Yntec used a function from the Diffusers conversion module rather than a method of StableDiffusionPipeline (which inherits from the general DiffusionPipeline).
It's called download_from_original_stable_diffusion_ckpt
And it seems to be accessible through this module path:
for a simple checkpoint:
diffusers.pipelines.stable_diffusion.convert_from_ckpt.download_from_original_stable_diffusion_ckpt
for a ControlNet checkpoint, after searching, it's:
diffusers.pipelines.stable_diffusion.convert_from_ckpt.download_controlnet_from_original_ckpt

If I say that, it's thanks to my investigation across the Web... and as shown by this script, which converts a single safetensors file to Diffusers (without fusing LoRA):
https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py#L160
Look.. There is the famous image_size parameter
https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py#L164
(and many more).

Full API documentation for the
download_from_original_stable_diffusion_ckpt() function is here:
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L1138
and for
download_controlnet_from_original_ckpt()
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L1817

Well, I'm not sure whether, in your current convert tool, it would be easy to implement the use of download_from_original_stable_diffusion_ckpt()
or
download_controlnet_from_original_ckpt()

Because..
Currently I see that your convert script uses:
for a local file,
from_pretrained() from the StableDiffusionPipeline class (probably inherited from elsewhere in Diffusers?...);
for a .ckpt file (or a remote file?),
from_single_file() from the StableDiffusionPipeline class (which comes from the diffusers.loaders.FromSingleFileMixin mixin...).
Also, your convert tool is more complex, because it performs a lot of actions during the conversion of a Stable Diffusion 1.5 model (in order to convert safetensors files to HuggingFace Diffusers)..

  • downloading the model from HuggingFace, from CIVITAI.COM, or from another kind of remote URL..
  • fusing LoRAs (5 per conversion, no more possible?) and setting a weight scale (what is it exactly?) for each
  • fine-tuning the VAE (to enhance the quality and details of models)
  • fine-tuning the prompt CLIP (not BLIP, why? Is it even possible to load a BLIP in the Gradio selectbox component?)
  • setting the file's precision data type
  • optionally extracting EMA
  • choosing the scheduler (the sampling method used by the model) from a fixed list of possibilities
  • managing the CIVITAI.COM API
  • managing the upload/update of our own private/public model repo

So you're using from_single_file() and not download_from_original_stable_diffusion_ckpt(), which seems to offer more possibilities than the method you use, as described at this documentation link:
https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders/single_file.py#L270

So.. Perhaps, given all the features in your convert tool, it would be impossible to use download_from_original_stable_diffusion_ckpt() instead of from_single_file()..

I also checked what you said about save_pretrained. Hmm.. I'm still not sure, but according to this API documentation page https://huggingface.co./docs/diffusers/v0.11.0/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.save_pretrained it seems that the save_pretrained() method (1) is not really StableDiffusionPipeline's own method, as it is inherited from the DiffusionPipeline class, and (2) does not seem to have any parameter called image_size; there seem to be only save_directory and safe_serialization:


save_directory (str or os.PathLike): Directory to which to save. Will be created if it doesn't exist.

safe_serialization (bool, optional, defaults to False): Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

So I doubt it's possible by passing sample_size to the save_pretrained() call.

Thanks for reading!

But does that mean you currently no longer have access to your main account on HuggingFace?

Yes! 😭
Well, when it comes to things like this, there's nothing to do but wait. 😅

The reason there is a limit of 5 LoRA is because it is troublesome to create a GUI, but there is no limit to the number of LoRA in terms of processing. As long as there is enough RAM, you can increase the number.
The scale of LoRA is the number that is passed directly to PEFT, and basically refers to the applicable strength. Many LoRA are trained assuming a range of 0.7 to 1.1.
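To make "applicable strength" a bit more concrete, here is a toy sketch of what fusing one LoRA into one weight matrix means in the usual LoRA formulation: the low-rank update is multiplied by the scale before being added. This is plain Python on tiny lists, not the converter's or PEFT's actual code, and `merge_lora`, `lora_down`, `lora_up` and `alpha` are names chosen for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_down, lora_up, scale, alpha=None):
    """Return weight + scale * (alpha / rank) * (lora_up @ lora_down)."""
    rank = len(lora_down)                 # number of rows of the "down" matrix
    alpha = rank if alpha is None else alpha
    factor = scale * alpha / rank
    delta = matmul(lora_up, lora_down)    # same shape as weight
    return [[w + factor * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


base = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]        # rank 1, shape (1, 2)
up = [[1.0], [0.5]]        # shape (2, 1)

print(merge_lora(base, down, up, scale=0.5))
print(merge_lora(base, down, up, scale=0.0))  # scale 0 leaves the model untouched
```

A scale of 1.0 applies the LoRA as trained, lower values fade it out, and each LoRA slot simply repeats this step on the already-updated weights.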

Incidentally, I've left the LoRA merge function in the old-fashioned way, but there are more things that can be done with PEFT now. However, even if I add various things to the converter, there is no use...
https://huggingface.co./docs/diffusers/main/en/tutorials/using_peft_for_inference
https://huggingface.co./blog/peft_merging

I forgot BLIP was missing!
And it's a bit rough, but I think it's possible to get around the problem by using download_from_original_stable_diffusion_ckpt() as a preprocessor with options.

download_from_original_stable_diffusion_ckpt()

I was also looking into that function. If you follow where image_size ends up in that function, you'll end up at the Config that is passed to the VAE class.
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L1138
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L356
You were right with sample_size!

I think this will work in principle, but when we think about custom VAEs, it might be better to edit the class variables directly later. We could also mess around with the saved .json file (Config).

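from diffusers.models import AutoencoderKL
from diffusers import StableDiffusionPipeline
import torch

model = "stable-diffusion-v1-5/stable-diffusion-v1-5"
path = "local_model"
# Load the VAE with its trained weights (from_config alone would create a randomly initialized model)
vae = AutoencoderKL.from_pretrained(model, subfolder="vae", torch_dtype=torch.float16)
# Record the new sample_size in the VAE config so that it is written out on save
vae.register_to_config(sample_size=768)
pipe = StableDiffusionPipeline.from_pretrained(model, vae=vae, torch_dtype=torch.float16)
# Writes the whole pipeline, including the modified vae/config.json
pipe.save_pretrained(path)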

https://huggingface.co./docs/diffusers/main/en/api/models/autoencoderkl#diffusers.AutoencoderKL
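For the "mess around with the saved .json file" route, a small sketch using only the standard library is below. One caveat, stated as an assumption to verify: as far as I can tell from the StableDiffusionPipeline source, when no height/width is passed the default size comes from the UNet config's sample_size multiplied by the VAE scale factor (8 for SD 1.5), so 64 gives 512 and 96 gives 768. If that holds, the file to edit for a 768 default would be unet/config.json (with 96) rather than only vae/config.json. The function names and the "local_model" folder are placeholders.

```python
import json
from pathlib import Path


def set_sample_size(model_dir, subfolder, sample_size):
    """Rewrite sample_size in <model_dir>/<subfolder>/config.json; return the old value."""
    config_path = Path(model_dir) / subfolder / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    old = config.get("sample_size")
    config["sample_size"] = sample_size
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return old


def default_resolution(unet_sample_size, vae_scale_factor=8):
    """Default image side = UNet latent size times the VAE downscale factor (assumed)."""
    return unet_sample_size * vae_scale_factor


print(default_resolution(64), default_resolution(96))

# Hypothetical usage on a converted model saved in ./local_model:
# set_sample_size("local_model", "unet", 96)   # latent size for 768px
# set_sample_size("local_model", "vae", 768)   # keep the VAE config consistent
```

Only the config changes; no weights are touched, so it is easy to revert by writing the old value back.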

I suspect that Yntec was using one of the A1111 WebUI Spaces on HF to bake in the VAE and merge other LoRA and models before conversion.
If that's the case, it would avoid situations where the converted model doesn't work because there is no VAE, and it would also avoid the need to have the converter itself do various things. The official one is fine.
https://huggingface.co./spaces/dasghost65536/SD-Webui12

I read your answer carefully!

Thanks, by the way, for taking the time!

Firstly, ok for the number of LoRA files that can be loaded together in a single conversion.
I admit, Gradio is sometimes really a pain for "easy things", such as manipulating the HTML DOM simply to extend the number of components in a page dynamically at runtime.
Basically, Gradio is good at creating "static interfaces", not really "dynamic" ones.

One workaround that I think may work is to use a single Gradio component to declare the LoRA URLs and weight scales, all in plain-text syntax.

How did I get that idea?
Well..
There is certainly a way to get this working by adding a Gradio component similar to an HTML "textarea" (a Textbox that allows line breaks) to the page, in which to write any number of such declarations at runtime:

url_lora1 > https://site.url/path/to/file.safetensors
weight_scale_lora1 > 1
url_lora2 > https://site.url/path/to/file.safetensors
weight_scale_lora2 > 0.90
url_lora3 > https://site.url/path/to/file.safetensors
weight_scale_lora3 > 0.80
url_lora4 > https://site.url/path/to/file.safetensors
weight_scale_lora4 > 0.70
url_lora5 > https://site.url/path/to/file.safetensors
weight_scale_lora5 > 0.60
url_lora6 > https://site.url/path/to/file.safetensors
weight_scale_lora6 > 0.50
url_lora7 > https://site.url/path/to/file.safetensors
weight_scale_lora7 > 0.40
url_lora8 > https://site.url/path/to/file.safetensors
weight_scale_lora8 > 0.30
url_lora9 > https://site.url/path/to/file.safetensors
weight_scale_lora9 > 0.20
url_lora10 > https://site.url/path/to/file.safetensors
weight_scale_lora10 > 0.10
(and so on, with as many more declarations as needed, to define any number of LoRA checkpoints to merge!)

And then pass this string to the part of the converter tool that is in charge of fusing LoRAs.
Then, before fusing the LoRAs, it will call custom code that (using a regex on the passed string) converts it into a Python dictionary (like an associative array in PHP).
Such as:

loras_to_fuse_as_a_dictionary = {
    "url_lora1": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora1":  float(1),
    "url_lora2": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora2":  float(0.90),
    "url_lora3": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora3":  float(0.80),
    "url_lora4": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora4":  float(0.70),
    "url_lora5": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora5":  float(0.60),
    "url_lora6": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora6":  float(0.50),
    "url_lora7": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora7":  float(0.40),
    "url_lora8": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora8":  float(0.30),
    "url_lora9": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora9":  float(0.20),
    "url_lora10": "https://site.url/path/to/file.safetensors", 
    "weight_scale_lora10":  float(0.10),
}

And finally, iterate over that dynamically built Python dictionary in a for loop to get each LoRA as (lora url, lora scale).
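The loop over such a flat dictionary could look roughly like this. It is only a sketch of the idea: `iter_loras` is a made-up helper, the key names follow the dictionary above, and defaulting a missing scale to 1.0 is an assumption of this example, not something the tool does.

```python
import re


def iter_loras(flat):
    """Yield (url, scale) for every url_loraN key, ordered by N."""
    matches = (re.fullmatch(r"url_lora(\d+)", key) for key in flat)
    numbers = sorted(int(m.group(1)) for m in matches if m)
    for n in numbers:
        # Assumption: a LoRA declared without a scale gets 1.0.
        yield flat[f"url_lora{n}"], float(flat.get(f"weight_scale_lora{n}", 1.0))


loras_to_fuse = {
    "url_lora1": "https://site.url/path/to/file.safetensors",
    "weight_scale_lora1": 1.0,
    "url_lora2": "https://site.url/path/to/file.safetensors",
    "weight_scale_lora2": 0.90,
}

for url, scale in iter_loras(loras_to_fuse):
    print(url, scale)
```

Sorting on the number rather than on the key string keeps url_lora10 after url_lora2. A plain list of (url, scale) tuples would avoid the numbered keys altogether.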

Later I will certainly write a quick code sample to show how I see this kind of regex extracting matches from the plain string passed as a parameter.

Here is the working regex:

url_lora[0-9]{1,} > (.+)\sweight_scale_lora[0-9]{1,} > (.+)

Options:
g (for multi-matches)
m (for multi-lines)

It will match each piece of useful data we need, as this output demonstrates:

url == https://site.url/path/to/file.safetensors
  weight == 1
url == https://site.url/path/to/file.safetensors
  weight == 0.90
url == https://site.url/path/to/file.safetensors
  weight == 0.80
url == https://site.url/path/to/file.safetensors
 weight == 0.70
url == https://site.url/path/to/file.safetensors
 weight == 0.60
url == https://site.url/path/to/file.safetensors
 weight == 0.50
url == https://site.url/path/to/file.safetensors
 weight == 0.40
url == https://site.url/path/to/file.safetensors
 weight == 0.30
url == https://site.url/path/to/file.safetensors
 weight == 0.20
url == https://site.url/path/to/file.safetensors
 weight == 0.10
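
In Python, the same regex could be applied roughly as follows. This is only a sketch: the input format ("key > value", one entry per line) is inferred from the pattern, and the function name `parse_loras` is hypothetical. The "g" option corresponds to `re.finditer()`, and the "m" option to `re.MULTILINE`.

```python
import re

# Same pattern as above. re.MULTILINE mirrors the "m" option; it has no
# effect here because the pattern contains no ^ or $ anchors.
LORA_PATTERN = re.compile(
    r"url_lora[0-9]{1,} > (.+)\sweight_scale_lora[0-9]{1,} > (.+)",
    re.MULTILINE,
)


def parse_loras(text: str) -> list[tuple[str, float]]:
    """Return the (url, weight) pairs found in the text, in order."""
    return [
        (m.group(1).strip(), float(m.group(2).strip()))
        for m in LORA_PATTERN.finditer(text)  # finditer == "g" (all matches)
    ]


# Assumed input format: one "key > value" entry per line.
sample = """url_lora1 > https://site.url/path/to/file.safetensors
weight_scale_lora1 > 1
url_lora2 > https://site.url/path/to/file.safetensors
weight_scale_lora2 > 0.90"""

for url, weight in parse_loras(sample):
    print("url ==", url)
    print(" weight ==", weight)
```

Since `.` does not match a newline, each `(.+)` group stops at the end of its line, and the `\s` between the two halves consumes the line break.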

As for the rest of your answer:

I really think my knowledge is not sufficient to fully understand those subjects, so here are some new questions:

  1. I don't really understand the meaning of:
    https://huggingface.co./docs/diffusers/main/en/tutorials/using_peft_for_inference
    or
    https://huggingface.co./blog/peft_merging
    In fact, I'm not sure where those lines of code would be added in the conversion .py files of your converter tool, since they seem to be for image generation, not for conversion.
    So on that subject, you completely confused me lol. If you can clarify that point of your answer, it would be useful to me. I'm also not sure I fully understand what you mean by "in the old-fashioned way", or by ending your paragraph with "However, even if I add various things to the converter, there is no use...".
    What do you mean? (Sorry if it is very clear to you; for me, it's so confusing 🤕.)

  2. As for the BLIP subject, you said "I forgot BLIP was missing!" So finally, what do you mean by that?
    Does your answer mean that you weren't aware your converter tool missed a BLIP integration..
    or
    does your answer mean something else? 😅

Also not sure about that answer from your idea "using download_from_original_stable_diffusion_ckpt() as a preprocessor with options". Would this mean that you think this method would make it easy to get BLIP instead of CLIP?
I mean..
In your converter tool you wrote the file presets.py, which at some point defines this Python list:

clips = [
    "",
    "openai/clip-vit-large-patch14",
]

Is it not possible to add new items to that list in order to offer a BLIP URL among the CLIP choices in the converter UI?
Let's say, like this?

clips = [
    "",
    "openai/clip-vit-large-patch14",
    "Salesforce/blip2-opt-2.7b", 
]

I only ask if it's possible to load BLIP instead of CLIP without changing other part in code of your converter tool?

  3. About the VAE class: where is it going to be defined in your converter tool?
    I mean, you said (I quote you) "it might be better to edit the class variables directly later. We could also mess around with the saved .json file (Config)." But I didn't see any JSON config file in the converter tool repo?
    So that subject is really confusing me too 😅

  4. Honestly, the WebUI, when used from a docker container from a Gradio interface from a HuggingFace space, is a pain to have access to files generated within the well-said docker container.
    I don't know how Yntec was able to get all those messy things working, but I am sure that he doesn't have a powerful computer (for doing those tasks outside HuggingFace): as he told me, he used to stick to the free Inference API of HuggingFace to generate his test images.
    So I doubt he installed the WebUI on his personal computer (we will certainly never know the real truth 😂 like in the X-Files series, the truth is elsewhere).

Well... I can't contribute much new knowledge yet, as I am really not sure how to do a lot of things..

But I think that together, you and I will be able to make further improvements to your converter tool.

For now, though, only you will be able to find new ways to get things working, as I am only in the research phase... 🕵🏻‍♀️

Edit: fixed typos, since I wrote my answer on my smartphone 😂 and sometimes my fingers slipped to a wrong character on the virtual touch keyboard (Microsoft SwiftKey on Android).

I wrote this at a bit of a fast pace, so there may be a lot of mistakes.🙄

If writing a GUI is too much trouble for us, it's good that we can just pass it in as text. If it's YAML or JSON, there's not much ambiguity.
Since it's a VM, we don't really need to worry about vulnerabilities.
I think I'll try adding it for provisional functions, not just for LoRA. It would be convenient if we could use the HF CPU space as a console.
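
As a rough sketch of the JSON variant, assuming a hypothetical schema (a list of objects with `url` and `scale` keys). Neither the schema nor the function name `parse_lora_json` exists in the converter:

```python
import json


def parse_lora_json(text: str) -> list[tuple[str, float]]:
    """Parse a JSON array of {"url": ..., "scale": ...} objects into (url, scale) pairs."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of LoRA entries")
    # Assumption: "scale" is optional and defaults to 1.0.
    return [(str(e["url"]), float(e.get("scale", 1.0))) for e in entries]


# Text as it might be pasted into a single textbox in the UI.
text_from_textbox = """
[
  {"url": "https://site.url/path/to/file.safetensors", "scale": 1.0},
  {"url": "https://site.url/path/to/file.safetensors", "scale": 0.9}
]
"""
print(parse_lora_json(text_from_textbox))
```

Malformed input raises `json.JSONDecodeError` (or `KeyError` for a missing `url`), which can be shown to the user as is.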

you weren't aware your converter tool missed a BLIP integration..

This one.😅 I forgot about BLIP because I added FLUX and SD1.5 later, based on the SDXL converter.

Also not sure about that answer from your idea using download_from_original_stable_diffusion_ckpt() as a preprocessor with options

I put the sentence in the wrong place.😭 This is what happens when I rely on machine translation half the time. That has nothing to do with BLIP; what I mean is that if we use that function to convert once before the normal conversion, we can achieve part of the target result. It will be a double conversion, but SD1.5 is small, so it won't take much time.

1

It's a converter, not a merger, so it would be difficult to use if it was too complicated.
But it wouldn't be a problem if I added more functions, so I might as well add them. I can just hide them with Accordion.
Sorry for the confusion about LoRA. What I mean is that, with the current PEFT, you can specify how to merge the LoRA in a more complex way than just specifying a single scale. PEFT is a library that many people use only at runtime for LoRA, but it is actually a library that controls all of the indirect training of AI... Anyway, you can do a lot of advanced things with this.
However, you have to experiment to understand what the many options mean.

3

Sorry about that too! The json is in the folder of the saved model repo. It's just a list of class variables and numbers.
Where should I operate on the VAE (AutoencoderKL) class? This is still in the draft stage...😅
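
For the "mess around with the saved .json file" route, a minimal sketch using only the standard library. The helper name `set_config_value` is hypothetical, and the path and key in the usage comment (`vae/config.json`, `sample_size`) are assumptions about the saved repo layout that should be checked against the actual files:

```python
import json
from pathlib import Path


def set_config_value(config_path: str, key: str, value) -> dict:
    """Load a config.json, set one key, write it back, and return the new config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config[key] = value
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Hypothetical usage on a locally saved repo (path and key are assumptions):
# set_config_value("my_model/vae/config.json", "sample_size", 768)
```

This only changes the stored number. Whether the pipeline then behaves as intended depends on how the class uses that value.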

I only ask if it's possible to load BLIP instead of CLIP without changing other part in code of your converter tool?

Since there is virtually no reliable way to distinguish between the different types of HF models, I will either have to make a separate list or a dictionary. Or I'll just have to download it and take a look at the contents...
Well, it's quicker to add the BLIP option.
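
The "separate list or a dictionary" idea could look something like this. It is only a sketch: the name `KNOWN_ENCODERS`, the family labels, and the helper `encoder_family` are made up for illustration and are not in presets.py:

```python
# Hypothetical lookup table: repo ID -> model family.
# The empty string stands for "use the default" in the existing clips list.
KNOWN_ENCODERS = {
    "": "default",
    "openai/clip-vit-large-patch14": "clip",
    "Salesforce/blip2-opt-2.7b": "blip2",
}


def encoder_family(repo_id: str) -> str:
    """Return the model family for a repo ID, or "unknown" if it is not listed."""
    return KNOWN_ENCODERS.get(repo_id, "unknown")


# The list for the UI dropdown can then be derived from the dictionary keys.
clips = list(KNOWN_ENCODERS)
print(clips)
print(encoder_family("Salesforce/blip2-opt-2.7b"))
```

The converter would still need separate loading code for each family. The table only tells it which branch to take.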

Honestly, the WebUI, when used from a docker container from a Gradio interface from a HuggingFace space, is a pain to have access to files generated within the well-said docker container

It's true, but he said it took a few hours to generate the test, so it seems like he actually used it...🥶
Well, in any case, the truth is still a mystery.😎
Specifically, I have a copy of the actual Spaces wreckage here, but it's difficult to show you a sample until my main account is fixed. If I find the source, I'll bring it, but if you just want to reproduce the technique, it's easier to use a GPU locally. Sometimes there are model files that are broken or deviate from the expected format, and it's difficult to repair them with Diffusers or other CLI tools. I might be able to port the repair parts, but it's easy enough to use a mouse, and there aren't that many damaged files these days.

Fine!
Don't worry about the language translation.
I'm French, so my English is perhaps not perfect, and you are Japanese, so... we just do what we can to understand each other as best we can!

So it seems you like my regex logic for fusing multiple LoRA checkpoints!
My preferred development languages are PHP, HTML, CSS and JavaScript rather than Python lol, but well, I learn anything very fast; it may just take following some extra tutorials haha 😂!

But as a PHP developer, I must say Python is really less used for websites.

I only learnt Python to work on Gradio lol.
Now I do fancy things with my new knowledge, and I can certainly help!

I know it's a lot of hard work to try to reproduce what the WebUI is currently capable of!

I'm also working on it on my side!
Once I have some interesting stuff, I will show you my work!

As for the rest of your answers, I will try to write a longer reply in a few days!

Thanks again, you are like candy on a fluffy cloud to my eyes, since you try to help not only me but also anyone who just wants to convert models and have fun fusing them!
