Not working on Colab

#1
by Krummrey - opened

I can get it to run, but it does not produce the output I expected:

"Cutting Knowledge Date: December 2023
Today Date: 26 July 2024

You are a helpful image captioner._firestore"

or

"# 1. Introduction

We have a collection of 30,000 images that we would like to use to train a stable diffusion model. We want to use a training prompt that will ensure the model generates images that are similar to our collection. We also want to use a stable diffusion model that can generate images that are similar to our collection."

This is the log:
"2024-10-05 19:21:00.992473: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-05 19:21:01.012372: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-05 19:21:01.018288: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-05 19:21:01.032313: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-05 19:21:02.411498: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Running on cuda
Loading in NF4
Loading CLIP 📎
Loading VLM's custom vision model 📎
Loading tokenizer 🪙
Loading LLM: unsloth/Meta-Llama-3.1-8B-bnb-4bit 🤖
VLM's custom text model isn't loaded 🤖
Loading image adapter 🖼️
..."

It seems that some modules are not working in the Colab environment and are returning strange results. I may need to create a branch for the Colab environment. I don't know much about it, but I have seen other programs do so.
I will look into it.

Sorry. Seems I can't do it. 😔
I figured out how to determine if it is a Colab environment and branch in Python. But I can't figure out what's wrong with the Colab environment, so I don't know what to do after branching.
Given the nature of the problem, perhaps the version of torch, CUDA, or TensorRT is wrong, but I don't know which versions of torch and CUDA the original JoyCaption recommends, so I have no way to remedy the problem.
https://python-jp.dev/articles/345620546
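
For reference, detecting Colab and branching could be sketched roughly as follows. This is a minimal sketch, not the script's actual code: the helper name `is_colab` is hypothetical, and relying on the `COLAB_RELEASE_TAG` environment variable is an assumption about current Colab runtimes. The environment check is included because a script launched with `!python app.py` runs in a separate process, where `google.colab` has not been imported.

```python
import os
import sys


def is_colab(environ=None, modules=None) -> bool:
    """Best-effort check for a Google Colab runtime (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    modules = sys.modules if modules is None else modules
    # Inside a Colab notebook kernel, the google.colab package is already imported.
    if "google.colab" in modules:
        return True
    # Assumption: Colab runtimes export COLAB_RELEASE_TAG, which child processes inherit.
    return "COLAB_RELEASE_TAG" in environ


IS_COLAB = is_colab()

if IS_COLAB:
    print("Running on Colab")
else:
    print("Not running on Colab")
```

The two optional parameters exist only so the check can be exercised without a real Colab session.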

Wi-zz's version might work with Colab.
https://huggingface.co./Wi-zz/joy-caption-pre-alpha

@dominic1021 Do you know anything about the compatibility of Colab and JoyCaption's dependencies?

Yeah, I know. I made a working notebook for the latest JoyCaption with batch image processing. It saves the captions as text files for easier training of SD models (for image1.png you get an image1.txt caption, and so on).
Check this out: https://huggingface.co./BullseyeMxP/joy-caption-alpha-two/blob/main/Bullseye_joycaption_alpha_two.ipynb
My Discord, if you're willing to collaborate: bullseye3886.

Incredibly, I don't have a Discord account.
But maybe I am overthinking it. I think the cause is simply that my path does not resolve correctly, so the LoRA is not applied.
I can create a branching function, so I might try using your Path.

Your code

CHECKPOINT_PATH = Path("/content/joy-caption-alpha-two/cgrkzexw-599808")

My code

BASE_DIR = Path(__file__).resolve().parent # Define the base directory

CHECKPOINT_PATH = BASE_DIR / Path("cgrkzexw-599808")

New code

CHECKPOINT_PATH = Path("/content/joy-caption-alpha-two-cli-mod/cgrkzexw-599808") # Like this?
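
One way to avoid hard-coding either location is to resolve the checkpoint relative to the script and fail loudly if it is missing. The following is a minimal sketch under assumptions: the helper names `script_dir` and `resolve_checkpoint` are hypothetical, and only the directory name `cgrkzexw-599808` comes from this thread.

```python
from pathlib import Path

CHECKPOINT_NAME = "cgrkzexw-599808"  # directory name taken from the thread


def script_dir() -> Path:
    """Directory of the running script; falls back to the CWD in a notebook cell."""
    try:
        return Path(__file__).resolve().parent
    except NameError:  # __file__ is undefined inside a notebook cell
        return Path.cwd()


def resolve_checkpoint(base_dir: Path) -> Path:
    """Return the checkpoint directory under base_dir, or raise if it is missing."""
    checkpoint = base_dir / CHECKPOINT_NAME
    if not checkpoint.is_dir():
        raise FileNotFoundError(f"Checkpoint directory not found: {checkpoint}")
    return checkpoint
```

Calling `resolve_checkpoint(script_dir())` at startup would turn a silently wrong path into an immediate error instead of a model that loads without its LoRA.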

Maybe.

https://huggingface.co./John6666/joy-caption-alpha-two-cli-mod/blob/main/app.py
https://huggingface.co./John6666/joy-caption-alpha-two-cli-mod/blob/main/joycaption_alpha_two_cli_mod.ipynb

How about this? The code for the logic part is almost the same as the GUI version, so if the GUI version works, this should work too...

My GUI ver.

https://huggingface.co./spaces/John6666/joy-caption-pre-alpha-mod

Gonna test it if I have some spare time.

Thanks.

Nope, that produced similar random things but not descriptions of the images.
It still says: "VLM's custom text model isn't loaded 🤖"

BASE_DIR: /content/joy-caption-alpha-two-cli-mod
LORA_PATH: /content/joy-caption-alpha-two-cli-mod/cgrkzexw-599808/text_model

They seem to be pointing to the right locations.

Thanks for testing.

Nope, that produced similar random things but not descriptions of the images.

Then there's another cause.
But what on earth could it be?
The most likely cause is a malfunction in the LLM part, but it is unusual for an LLM to run and yet behave abnormally.

It still says: "VLM's custom text model isn't loaded 🤖"

It is supposed to print that if the file is not found or if it is running in NF4 mode.
The LoRA should be applicable in NF4 mode as well, but I left that out because bitsandbytes or PEFT is buggy and causes errors.
The bf16 option disables NF4 mode, so the LoRA is applied. This may fix the behavior.
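
The rule described here can be written down as a small decision function. This is only a sketch of the stated behavior, not the script's actual code: the names `should_apply_lora`, `lora_path`, and `use_nf4` are hypothetical.

```python
from pathlib import Path


def should_apply_lora(lora_path: Path, use_nf4: bool) -> bool:
    """Skip the LoRA in NF4 mode or when its files are missing (rule as described above)."""
    if use_nf4:
        # NF4 mode: the LoRA is deliberately not applied.
        return False
    # bf16 mode: apply the LoRA only if the path actually exists.
    return lora_path.exists()


def report(lora_path: Path, use_nf4: bool) -> str:
    """Return the status line that would be logged for this configuration."""
    if should_apply_lora(lora_path, use_nf4):
        return "Loading VLM's custom text model"
    return "VLM's custom text model isn't loaded"
```

Under this rule, seeing the "isn't loaded" message in NF4 mode is expected and does not by itself indicate a wrong path.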

However, the real question is why the CLI version, which changes nothing in the logic or the library dependencies, gives errors in Colab. If this is not solved, potential errors will keep appearing. I think this is an incompatibility between HF's Zero GPU Spaces and Colab, not just a problem with my script.
It's just a guess, but there may be a problem with some library, or its version, that is implicitly installed or not installed.
In the GUI version, is that dependency overridden by installing gradio and spaces?
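
To test a guess like this, the installed versions in both environments could be dumped and compared. Below is a minimal sketch using the standard library's `importlib.metadata`. The package list is an assumption: torch, peft, bitsandbytes, gradio, and spaces are mentioned in this thread, while transformers and accelerate are added as likely dependencies.

```python
from importlib import metadata

# Assumed list of packages worth comparing between Colab and the Space.
PACKAGES = ["torch", "transformers", "peft", "bitsandbytes", "accelerate", "gradio", "spaces"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both Colab and the Space and diffing the two outputs would show whether an implicitly installed library differs.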

I think you are using the wrong base model: instead of unsloth/Meta-Llama-3.1-8B-bnb-4bit, use unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit. The original author changed this in the alpha-two version. Swap it out and it works as expected.

DEFAULT_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"

to

DEFAULT_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"

And the Colab path is not necessary (it works the same either way).
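
A plausible explanation for the symptoms, offered as an assumption rather than something confirmed in this thread, is that the base checkpoint is not trained on the chat format, so it continues the prompt as free text instead of answering it. A cheap guard against repeating the mix-up is to check the model ID at startup. The sketch below uses a hypothetical helper, `looks_instruction_tuned`, and a naming heuristic that only works for repositories that put "Instruct" in their name.

```python
import warnings

DEFAULT_MODEL_PATH = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"


def looks_instruction_tuned(model_id: str) -> bool:
    """Heuristic only: the repo name contains 'instruct' (case-insensitive)."""
    return "instruct" in model_id.lower()


def check_model_id(model_id: str) -> None:
    """Warn when the model ID does not look like an instruction-tuned checkpoint."""
    if not looks_instruction_tuned(model_id):
        warnings.warn(
            f"{model_id} does not look instruction-tuned; captions may be unrelated text."
        )


check_model_id(DEFAULT_MODEL_PATH)
```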

So that's what it was...
My GUI demo also uses a different Llama3 model by default. That's why I don't see the problem.
I'll fix that when I get home.

But does it really make that much of a difference? 🤔
The output is so obviously wrong at first glance. It's strange.

I fixed them. The alpha_one_cli model is still the same; I just added a comment.
I also reverted how the working directory is obtained.
I left IS_COLAB in, as it might be useful for some people.
It would be nice if it works now.
