---
datasets:
- deepghs/character_similarity
- deepghs/character_index
metrics:
- f1
- adjust_random_score
language:
- en
- ja
- zh
pipeline_tag: zero-shot-image-classification
library_name: dghs-imgutils
tags:
- art
- anime
- character
license: openrail
---

# CCIP

CCIP (Contrastive Anime Character Image Pre-Training) is a model that calculates the visual similarity between the anime characters in two images. It is limited to images that each contain a single anime character. The more similar the characters in the two images are, the higher the similarity score should be.

# Usage

Use CCIP with [imgutils](https://dghs-imgutils.deepghs.org/main/tutorials/installation/index.html).

![](https://dghs-imgutils.deepghs.org/main/_images/ccip_small.plot.py.svg)

Calculate the pairwise character differences between images (a lower value means the characters are more similar):

```python
from imgutils.metrics import ccip_batch_differences

# Entry [i][j] is the difference between the characters in image i and image j.
differences = ccip_batch_differences(['ccip/1.jpg', 'ccip/2.jpg', 'ccip/6.jpg', 'ccip/7.jpg'])

# `differences` is a float32 array:
# array([[6.5350548e-08, 1.6583106e-01, 4.2947042e-01, 4.0375218e-01],
#        [1.6583106e-01, 9.8025822e-08, 4.3715334e-01, 4.0748104e-01],
#        [4.2947042e-01, 4.3715334e-01, 3.2675274e-08, 3.9229470e-01],
#        [4.0375218e-01, 4.0748104e-01, 3.9229470e-01, 6.5350548e-08]],
#       dtype=float32)
```
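
To turn the difference matrix into same-character decisions, each entry can be compared against a threshold. The sketch below is a minimal illustration in plain Python that reuses the matrix shown above. It assumes the `Threshold` value 0.213231 listed for `ccip-caformer_b36-24` in the performance table; the model that produced the matrix is not stated here, so the threshold is illustrative only. The names `THRESHOLD` and `same_character_pairs` are hypothetical and not part of imgutils.

```python
# Difference matrix copied from the example output above.
differences = [
    [6.5350548e-08, 1.6583106e-01, 4.2947042e-01, 4.0375218e-01],
    [1.6583106e-01, 9.8025822e-08, 4.3715334e-01, 4.0748104e-01],
    [4.2947042e-01, 4.3715334e-01, 3.2675274e-08, 3.9229470e-01],
    [4.0375218e-01, 4.0748104e-01, 3.9229470e-01, 6.5350548e-08],
]

# Assumption: threshold taken from the ccip-caformer_b36-24 row of the table.
THRESHOLD = 0.213231


def same_character_pairs(matrix, threshold):
    """Return index pairs (i, j), i < j, whose difference is at or below the threshold."""
    n = len(matrix)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] <= threshold
    ]


pairs = same_character_pairs(differences, THRESHOLD)
print(pairs)  # [(0, 1)]
```

Under this assumed threshold, only the first two images (`ccip/1.jpg` and `ccip/2.jpg`) would be treated as showing the same character.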

[More detailed instructions](https://dghs-imgutils.deepghs.org/main/api_doc/metrics/ccip.html)

# Performance

| Model                               | F1 Score   | Precision   | Recall   | Threshold   | Cluster_2   | Cluster_Free   |
|:-----------------------------------:|:----------:|:-----------:|:--------:|:-----------:|:-----------:|:--------------:|
| ccip-caformer_b36-24                | 0.940925   | 0.938254    | 0.943612 | 0.213231    | 0.89508     | 0.957017       |
| ccip-caformer-24-randaug-pruned     | 0.917211   | 0.933481    | 0.901499 | 0.178475    | 0.890366    | 0.922375       |
| ccip-v2-caformer_s36-10             | 0.906422   | 0.932779    | 0.881513 | 0.207757    | 0.874592    | 0.89241        |
| ccip-caformer-6-randaug-pruned_fp32 | 0.878403   | 0.893648    | 0.863669 | 0.195122    | 0.810176    | 0.897904       |
| ccip-caformer-5_fp32                | 0.864363   | 0.90155     | 0.830121 | 0.183973    | 0.792051    | 0.862289       |
| ccip-caformer-4_fp32                | 0.844967   | 0.870553    | 0.820842 | 0.18367     | 0.795565    | 0.868133       |
| ccip-caformer_query-12              | 0.823928   | 0.871122    | 0.781585 | 0.141308    | 0.787237    | 0.809426       |
| ccip-caformer-23_randaug_fp32       | 0.81625    | 0.854134    | 0.781585 | 0.136797    | 0.745697    | 0.8068         |
| ccip-caformer-2-randaug-pruned_fp32 | 0.78561    | 0.800148    | 0.771592 | 0.171053    | 0.686617    | 0.728195       |
| ccip-caformer-2_fp32                | 0.755125   | 0.790172    | 0.723055 | 0.141275    | 0.64977     | 0.718516       |

* The calculation of `F1 Score`, `Precision`, and `Recall` treats "the characters in both images are the same" as the positive case. `Threshold` is the difference threshold at which the F1 Score curve reaches its maximum.
* `Cluster_2` is the approximately optimal clustering result obtained by tuning the `eps` value of the DBSCAN clustering algorithm with `min_samples` set to `2`. The similarity between the resulting clusters and the true distribution is evaluated with the adjusted Rand score (`adjusted_rand_score`).
* `Cluster_Free` is the approximately optimal clustering result obtained by tuning the `max_eps` and `min_samples` values of the OPTICS clustering algorithm. The similarity between the resulting clusters and the true distribution is again evaluated with the adjusted Rand score (`adjusted_rand_score`).
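
The threshold selection described in the first note can be sketched as a sweep over candidate thresholds that keeps the one with the highest F1 score. This is a minimal illustration in plain Python, not the evaluation code used for the table. The function name `best_f1_threshold` and the toy differences and labels are hypothetical, and the sketch assumes a pair is predicted positive (same character) when its difference is at or below the threshold.

```python
def best_f1_threshold(differences, same_labels):
    """Sweep thresholds over the observed differences and return (threshold, f1).

    differences: one difference value per image pair.
    same_labels: True where the pair shows the same character (positive case).
    Returns (None, -1.0) if no threshold yields a true positive.
    """
    best_threshold, best_f1 = None, -1.0
    for threshold in sorted(set(differences)):
        # Predict "same character" when the difference is at or below the threshold.
        predictions = [d <= threshold for d in differences]
        tp = sum(p and y for p, y in zip(predictions, same_labels))
        fp = sum(p and not y for p, y in zip(predictions, same_labels))
        fn = sum((not p) and y for p, y in zip(predictions, same_labels))
        if tp == 0:
            continue  # no true positives, so F1 is zero at this threshold
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


# Toy data (hypothetical): six image pairs with their differences and ground truth.
toy_differences = [0.05, 0.12, 0.18, 0.25, 0.40, 0.45]
toy_labels = [True, True, True, False, True, False]

threshold, f1 = best_f1_threshold(toy_differences, toy_labels)
print(threshold, round(f1, 3))  # 0.4 0.889
```

On the toy data, the threshold 0.40 captures all four positive pairs at the cost of one false positive, which gives the highest F1 score among the candidates.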

![operations benchmark](https://dghs-imgutils.deepghs.org/main/_images/ccip_benchmark.plot.py.svg)

# Citation

```bibtex
@misc{CCIP,
  title={Contrastive Anime Character Image Pre-Training},
  author={Ziyi Dong and narugo1992},
  year={2024},
  howpublished={\url{https://huggingface.co/deepghs/ccip}}
}
```