---
license: unlicense
datasets:
- poloclub/diffusiondb
language:
- en
metrics:
- wer
pipeline_tag: image-to-text
---

# Untitled7-colab_checkpoint

This model was lovingly named after the Google Colab notebook that made it. It is a finetune of Microsoft's [git-large-coco](https://huggingface.co./microsoft/git-large-coco) model on the 1k subset of [poloclub/diffusiondb](https://huggingface.co./datasets/poloclub/diffusiondb/viewer/2m_first_1k/train).

It is supposed to read an image and extract a Stable Diffusion prompt from it, but it might not do a good job of it. I wouldn't know, as I haven't extensively tested it.

As the title suggests, this is a checkpoint: I originally intended to train on the entire dataset, but I'm unsure if I want to now...

This is my first public model, so please be nice!

## Intended use

Fun!

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("SE6446/Untitled7-colab_checkpoint")
model = AutoModelForCausalLM.from_pretrained("SE6446/Untitled7-colab_checkpoint")

#################################################################
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-to-text", model="SE6446/Untitled7-colab_checkpoint")
```
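
Loading the model is only half the job; to get a prompt out, an image still has to go through the processor and `generate`. A minimal sketch of that step, assuming the checkpoint behaves like its GIT base model (pixel values in, generated token ids out). The helper name `generate_prompt`, the `max_length=50` default, and the `example.png` path are illustrative assumptions, not tuned or tested settings:

```python
def generate_prompt(image, processor, model, max_length=50):
    """Return the text the model generates for a single image.

    `image` is anything the processor accepts (e.g. a PIL image).
    The function name and the max_length default are illustrative only.
    """
    # Turn the image into the pixel tensor the vision encoder expects.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # Generate token ids conditioned on the image.
    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    # Decode the ids back to text, dropping special tokens.
    texts = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return texts[0].strip()


# Usage (requires transformers, torch and Pillow; "example.png" is a placeholder):
# from PIL import Image
# image = Image.open("example.png").convert("RGB")
# print(generate_prompt(image, processor, model))
```

Passing `processor` and `model` in as arguments keeps the helper independent of how they were loaded.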

## Out-of-scope use

Don't use this model to discriminate, alienate, or in any other way harm or harass individuals. You guys know the drill...

## Bias, Risks, and Limitations

This model does not produce accurate prompts; it is merely a bit of fun (and a waste of funds). It can also suffer from bias present in the original git-large-coco model.

## Training

*I.e. boring stuff*

- lr = 5e-5
- epochs = 150
- optim = adamw
- fp16
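
For anyone reproducing this with the `transformers` Trainer, the settings above map roughly onto `TrainingArguments` keyword arguments. This is a sketch only: using the Trainer API is an assumption here, `adamw_torch` is an assumed reading of "adamw", and `output_dir` is a placeholder.

```python
# Keyword arguments for transformers.TrainingArguments mirroring the list above.
training_kwargs = {
    "output_dir": "git-diffusiondb-checkpoints",  # placeholder, not from this card
    "learning_rate": 5e-5,
    "num_train_epochs": 150,
    "optim": "adamw_torch",  # assumed reading of "adamw"
    "fp16": True,  # mixed precision; needs a CUDA GPU
}

# Usage (requires transformers and torch):
# from transformers import TrainingArguments
# args = TrainingArguments(**training_kwargs)
```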

If you want to finetune it further, you should freeze the embedding and vision transformer layers.
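
Freezing can be done by switching off gradients for the matching parameters. A minimal sketch, assuming the parameter names of a GIT causal-LM checkpoint start with `git.embeddings.` and `git.image_encoder.`; both prefixes and the helper `freeze_by_prefix` are assumptions, so check them against `model.named_parameters()` first:

```python
def freeze_by_prefix(model, prefixes):
    """Disable gradients for every parameter whose name starts with a prefix.

    Returns the names of the parameters that were frozen.
    """
    frozen = []
    for name, param in model.named_parameters():
        if name.startswith(tuple(prefixes)):
            param.requires_grad = False
            frozen.append(name)
    return frozen


# Assumed name prefixes for the embedding and vision transformer layers.
GIT_FROZEN_PREFIXES = ("git.embeddings.", "git.image_encoder.")

# Usage:
# frozen = freeze_by_prefix(model, GIT_FROZEN_PREFIXES)
# print(f"froze {len(frozen)} parameter tensors")
```

If the returned list is empty, the prefixes did not match this model's parameter names and nothing was frozen.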