Update README.md
README.md
CHANGED
@@ -8,52 +8,52 @@ language:
 pipeline_tag: image-to-text
 ---
 
-# Model Card for Model ID
-
-<!-- Provide a quick summary of what the model is/does. -->
-
-
 
 ## Model Details
 
 ### Model Description
 
-
-
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
-- **Developed by:** [More Information Needed]
-- **Shared by [optional]:** [Mit]
 - **Finetuned from model [optional]:** [microsoft/kosmos-2-patch14-224]
 
-
-
-<!-- Provide the basic links for the model. -->
-
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-
-[More Information Needed]
 pipeline_tag: image-to-text
 ---
 
 
 ## Model Details
 
 ### Model Description
 
+- **Developed by:** [https://huggingface.co/Mit1208]
 - **Finetuned from model [optional]:** [microsoft/kosmos-2-patch14-224]
 
+[More Information Needed]
 
+## Training Details
+https://github.com/mit1280/fined-tuning/blob/main/Kosmos_2_fine_tune_PokemonCards_trl.ipynb
 
+## Inference Details
+https://github.com/mit1280/fined-tuning/blob/main/kosmos2_fine_tuned_inference.ipynb
 
+### How to Use
+```python
+# Load model directly
+from io import BytesIO
+
+import requests
+import torch
+from PIL import Image
+from transformers import AutoProcessor, Kosmos2ForConditionalGeneration
 
+# The processor builds the model inputs and decodes the generated token IDs
+processor = AutoProcessor.from_pretrained("Mit1208/Kosmos-2-PokemonCards-trl-merged")
+my_model = Kosmos2ForConditionalGeneration.from_pretrained(
+    "Mit1208/Kosmos-2-PokemonCards-trl-merged", device_map="auto", low_cpu_mem_usage=True
+)
 
+# load image
+image_url = "https://images.pokemontcg.io/sm9/24_hires.png"
+response = requests.get(image_url)
+response.raise_for_status()
+# Read the image from the response content
+image = Image.open(BytesIO(response.content))
 
+prompt = "Pokemon name is"
 
+# Move the inputs to the device the model was placed on (e.g. "cuda:0")
+inputs = processor(text=prompt, images=image, return_tensors="pt").to(my_model.device)
+with torch.no_grad():
+    # autoregressively generate completion
+    generated_ids = my_model.generate(**inputs, max_new_tokens=30)
+# convert generated token IDs back to strings
+generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+# Keep the text after the image placeholder and cut it at the first " and"
+print(generated_text.split("</image>")[-1].split(" and")[0] + ".")
 
+'''
+Output: Pokemon name is Wartortle.
+'''
 
+```
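
The final `print` line of the snippet above trims the decoded text in place. A minimal sketch of the same trimming as a reusable helper is shown below. The function name `extract_answer`, its `stop` parameter, and the sample string are hypothetical and used only for illustration; the sample is not real model output.

```python
def extract_answer(generated_text: str, stop: str = " and") -> str:
    """Keep the text after the image placeholder and cut it at the first `stop`."""
    # Everything up to the last "</image>" belongs to the image placeholder.
    completion = generated_text.split("</image>")[-1]
    # Drop whatever follows the first stop phrase and trim whitespace.
    answer = completion.split(stop)[0].strip()
    # End with exactly one period.
    return answer.rstrip(".") + "."


# Hypothetical decoded string, for illustration only.
sample = "<image>placeholder</image>Pokemon name is Wartortle and it has 70 HP"
print(extract_answer(sample))
```

Cutting at the first `" and"` is the heuristic the README itself uses. It will also truncate a legitimate answer that contains `" and"`, so the stop phrase may need adjusting for other prompts.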
 
+### Limitations
+This model was fine-tuned on the free tier of Google Colab, so training used only 300 samples, for **85** epochs.
+The model hallucinates very frequently, so its output needs post-processing. Other ways to address this are to update the training data (use conversation data) *and/or* set the tokenizer's padding token to the tokenizer's EOS token.
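
The padding-token change mentioned above could look roughly like the following. This is a sketch under assumptions, not the author's training code: the helper name `use_eos_as_pad` is hypothetical, and a stand-in object replaces the real tokenizer so that the example runs without downloading anything.

```python
from types import SimpleNamespace


def use_eos_as_pad(tokenizer):
    """Reuse the EOS token as the padding token when the two differ."""
    if tokenizer.pad_token != tokenizer.eos_token:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer


# Stand-in with the two attributes the helper touches (not a real tokenizer).
dummy = SimpleNamespace(pad_token="<pad>", eos_token="</s>")
use_eos_as_pad(dummy)
print(dummy.pad_token)
```

With a loaded processor, the call would presumably be `use_eos_as_pad(processor.tokenizer)` before building the training batches. Whether this reduces hallucination for this model is untested here.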