Not working on Colab
I can get it to run, but it does not produce the output I expected:
"Cutting Knowledge Date: December 2023
Today Date: 26 July 2024
You are a helpful image captioner._firestore"
or
"# 1. Introduction
We have a collection of 30,000 images that we would like to use to train a stable diffusion model. We want to use a training prompt that will ensure the model generates images that are similar to our collection. We also want to use a stable diffusion model that can generate images that are similar to our collection."
This is the log:
"2024-10-05 19:21:00.992473: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-05 19:21:01.012372: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-05 19:21:01.018288: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-05 19:21:01.032313: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-05 19:21:02.411498: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Running on cuda
Loading in NF4
Loading CLIP
Loading VLM's custom vision model
Loading tokenizer
Loading LLM: unsloth/Meta-Llama-3.1-8B-bnb-4bit
VLM's custom text model isn't loaded
Loading image adapter
..."
It seems that some modules are not working in the Colab environment and are returning strange results. I may need to create a branch for the Colab environment. I don't know much about it, but I have seen other programs do so.
I will look into it.
Sorry, it seems I can't do it.
I figured out how to determine if it is a Colab environment and branch in Python. But I can't figure out what's wrong with the Colab environment, so I don't know what to do after branching.
Given the nature of the problem, perhaps the version of torch, CUDA, or TensorRT is wrong. But I don't know which versions of torch and CUDA the original JoyCaption recommends, so I have no way to remedy it.
https://python-jp.dev/articles/345620546
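For readers who want the same branch, here is a minimal sketch of one way to detect Colab. It assumes the runtime ships the `google.colab` package and sets a `COLAB_RELEASE_TAG` environment variable; both checks are best-effort heuristics, and the function name `is_colab` is only illustrative, not taken from the script.

```python
import importlib.util
import os


def is_colab() -> bool:
    """Best-effort guess at whether this process runs inside Google Colab."""
    try:
        # find_spec raises ModuleNotFoundError when the parent "google"
        # package is missing, so treat that case as "not Colab".
        has_module = importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        has_module = False
    # Assumption: Colab runtimes export COLAB_RELEASE_TAG.
    return has_module or "COLAB_RELEASE_TAG" in os.environ


IS_COLAB = is_colab()
print("Running on Colab" if IS_COLAB else "Not running on Colab")
```

Detecting the environment only tells you where the code runs. It does not say what to change after branching, which is the open question here.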
Wi-zz's version might work with Colab.
https://huggingface.co./Wi-zz/joy-caption-pre-alpha
@dominic1021 Do you know anything about the compatibility of Colab and JoyCaption's dependencies?
Yea, I know. I made a working notebook with the latest JoyCaption that supports batch image processing and saves the captions in text files for easier training of SD models (for image1.png you get image1.txt, and so on).
Check this out: https://huggingface.co./BullseyeMxP/joy-caption-alpha-two/blob/main/Bullseye_joycaption_alpha_two.ipynb
My Discord, if you're willing to collaborate: bullseye3886.
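The image1.png to image1.txt convention described above can be sketched roughly as follows. This is not the notebook's code: `write_captions`, `IMAGE_SUFFIXES`, and the `caption_fn` callback are hypothetical names, and the callback stands in for whatever actually runs the model.

```python
from pathlib import Path
from typing import Callable

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_captions(image_dir: Path, caption_fn: Callable[[Path], str]) -> list[Path]:
    """Write one sidecar .txt caption next to every image in image_dir."""
    written = []
    # sorted() snapshots the listing, so the new .txt files are not revisited.
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")  # image1.png -> image1.txt
        caption_path.write_text(caption_fn(image_path).strip() + "\n", encoding="utf-8")
        written.append(caption_path)
    return written
```

Keeping the caption next to the image with the same stem is the layout many SD training tools expect, which is why the sidecar file is used instead of one combined file.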
Incredibly, I don't have a Discord account.
But maybe I am overthinking this. I think the cause is simply that the checkpoint path was not resolved correctly, so LoRA is not applied.
I can create a branching function, so I might try using your Path.
Your code
CHECKPOINT_PATH = Path("/content/joy-caption-alpha-two/cgrkzexw-599808")
My code
BASE_DIR = Path(__file__).resolve().parent # Define the base directory
CHECKPOINT_PATH = BASE_DIR / Path("cgrkzexw-599808")
New code
CHECKPOINT_PATH = Path("/content/joy-caption-alpha-two-cli-mod/cgrkzexw-599808") # Like this?
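A minimal sketch of resolving the checkpoint directory so that it works both as a script and in a notebook cell, where `__file__` is not defined. The helper names `resolve_base_dir` and `resolve_checkpoint` are hypothetical, and falling back to the current working directory is an assumption about how the notebook is launched.

```python
from pathlib import Path

CHECKPOINT_NAME = "cgrkzexw-599808"


def resolve_base_dir() -> Path:
    """Directory of this script, or the working directory inside a notebook."""
    try:
        return Path(__file__).resolve().parent
    except NameError:
        # __file__ does not exist when the code is pasted into a notebook cell.
        return Path.cwd()


def resolve_checkpoint(base_dir: Path) -> Path:
    """Return the checkpoint directory, failing loudly if it is missing."""
    path = base_dir / CHECKPOINT_NAME
    if not path.is_dir():
        raise FileNotFoundError(f"Checkpoint directory not found: {path}")
    return path
```

Failing early on a missing directory makes a path problem visible right away instead of letting the script continue without the LoRA weights.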
https://huggingface.co./John6666/joy-caption-alpha-two-cli-mod/blob/main/app.py
https://huggingface.co./John6666/joy-caption-alpha-two-cli-mod/blob/main/joycaption_alpha_two_cli_mod.ipynb
How about this? The code for the logic part is almost the same as the GUI version, so if it works, it should work...
My GUI ver.
https://huggingface.co./spaces/John6666/joy-caption-pre-alpha-mod
> New code
> CHECKPOINT_PATH = Path("/content/joy-caption-alpha-two-cli-mod/cgrkzexw-599808") # Like this?
Maybe.
> How about this? The code for the logic part is almost the same as the GUI version, so if it works, it should work...
Gonna test it if I have some spare time.
Thanks.
> How about this? The code for the logic part is almost the same as the GUI version, so if it works, it should work...
Nope, that produced similar random things but not descriptions of the images.
It still says: "VLM's custom text model isn't loaded"
BASE_DIR: /content/joy-caption-alpha-two-cli-mod
LORA_PATH: /content/joy-caption-alpha-two-cli-mod/cgrkzexw-599808/text_model
They seem to be pointing to the right locations.
Thanks for testing it.
> Nope, that produced similar random things but not descriptions of the images.
Then there's another cause.
But what on earth does that mean?
The most likely cause is a malfunction in the LLM part, but it is unusual for the LLM itself to run and yet produce abnormal output.
> it still says: "VLM's custom text model isn't loaded"
That message is expected when the LoRA files are not found or when running in NF4 mode.
LoRA should be applicable in NF4 mode as well, but I left it out because bitsandbytes or PEFT is buggy and causes errors.
The bf16 option disables NF4 mode, so LoRA is applied. This may fix the behavior.
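The rule described here, as I understand it from this thread, can be sketched as a small pure function. The names `should_apply_lora` and `text_model_status` are hypothetical, and the real script may structure the check differently.

```python
from pathlib import Path


def should_apply_lora(lora_path: Path, nf4: bool) -> bool:
    """LoRA is applied only when its files exist and NF4 mode is off."""
    return lora_path.exists() and not nf4


def text_model_status(lora_path: Path, nf4: bool) -> str:
    """Mirror the two log messages a user would see for each case."""
    if should_apply_lora(lora_path, nf4):
        return "Loading VLM's custom text model"
    return "VLM's custom text model isn't loaded"
```

Under this rule a correct LORA_PATH still produces the "isn't loaded" message in NF4 mode, which matches what was reported above.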
However, the real question is why the CLI version, which changes nothing in the logic or the library dependencies, fails in Colab. Until that is solved, potential errors will keep appearing. I think this is an incompatibility between HF's Zero GPU space and Colab, not just a problem in my script.
It's just a guess, but the problem may be some library, or a version of it, that is implicitly installed or not installed.
In the GUI version, is that dependency overwritten when gradio and spaces are installed?
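One way to test the guess about implicit dependencies is to dump the installed versions in both environments and compare them. This is a sketch: the package list is an assumption (torch, bitsandbytes, PEFT, gradio, and spaces come up in this thread, while transformers and accelerate are added as likely candidates), and `installed_versions` and `diff_versions` are hypothetical helper names.

```python
from importlib import metadata

# Assumed list of packages worth comparing between Colab and the Space.
PACKAGES = ("torch", "transformers", "peft", "bitsandbytes",
            "accelerate", "gradio", "spaces")


def installed_versions(packages=PACKAGES) -> dict:
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(a: dict, b: dict) -> dict:
    """Return only the packages whose versions differ between two reports."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it once in Colab and once in the Space, then pass the two dictionaries to `diff_versions` to see only the mismatches.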
I think you are using the wrong base model. Instead of unsloth/Meta-Llama-3.1-8B-bnb-4bit, use unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit. The original author changed this in the alpha-two version. Swap it out and it works as expected. Change
DEFAULT_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
to
DEFAULT_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
Also, the Colab path is not necessary (it works the same either way).
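A small guard against picking a base model by mistake might look like the sketch below. It is only a naming heuristic: it assumes instruction-tuned checkpoints carry an `Instruct` segment in the repository name, which holds for the two model IDs in this thread but is not guaranteed elsewhere. The names `OLD_MODEL_PATH` and `looks_instruction_tuned` are hypothetical.

```python
OLD_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
DEFAULT_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"


def looks_instruction_tuned(model_id: str) -> bool:
    """Heuristic: does the repo name contain an 'Instruct' segment?"""
    repo_name = model_id.rsplit("/", 1)[-1].lower()
    return "instruct" in repo_name.split("-")


for model_id in (OLD_MODEL_PATH, DEFAULT_MODEL_PATH):
    if not looks_instruction_tuned(model_id):
        print(f"Warning: {model_id} does not look instruction-tuned")
```

The odd output reported at the top of the thread fits this explanation: a base model given a chat-formatted prompt tends to continue the text, echoing the "Cutting Knowledge Date" header, instead of following the captioning instruction.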
So that's how it was...
My GUI demo also uses a different Llama3 model by default. That's why I don't see the problem.
I'll fix that when I get home.
But does it really make that much of a difference?
The output is so obviously wrong at first glance. It's strange.
I fixed them. The model in alpha_one_cli is unchanged; I only added a comment.
I also reverted the change to how the working directory is obtained.
I left IS_COLAB in, as it might be useful for some people.
It would make things easier if it works now.