Update README.md
README.md
CHANGED
@@ -1,9 +1,148 @@
---
license: mit
datasets:
- imagenet1k
metrics:
- accuracy
---
# VGG-like Kolmogorov-Arnold Convolutional network with Gram polynomials

This model is a convolutional version of the Kolmogorov-Arnold Network (KAN) with a VGG-11-like architecture, pretrained on the Imagenet1k dataset. KANs were originally presented in [1, 2]. The Gram version of KAN was originally presented in [3]. For more details, visit our [torch-conv-kan](https://github.com/IvanDrokin/torch-conv-kan) repository on GitHub.

## Model description

The model consists of 10 consecutive Gram ConvKAN layers with InstanceNorm2d and a polynomial degree of 5, followed by global average pooling and a linear classification head:

1. BottleNeckKAGN Convolution, 32 filters, 3x3
2. Max pooling, 2x2
3. BottleNeckKAGN Convolution, 64 filters, 3x3
4. Max pooling, 2x2
5. BottleNeckKAGN Convolution, 128 filters, 3x3
6. BottleNeckKAGN Convolution, 128 filters, 3x3
7. Max pooling, 2x2
8. BottleNeckKAGN Convolution, 256 filters, 3x3
9. BottleNeckKAGN Convolution, 256 filters, 3x3
10. Max pooling, 2x2
11. BottleNeckKAGN Convolution, 256 filters, 3x3
12. BottleNeckKAGN Convolution, 256 filters, 3x3
13. Max pooling, 2x2
14. BottleNeckKAGN Convolution, 512 filters, 3x3
15. BottleNeckKAGN Convolution, 512 filters, 3x3
16. BottleNeckSelfKAGNtention, 512 filters, 3x3
17. Global Average pooling
18. Output layer, 1000 nodes.
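
As a rough sanity check on the layer list above, the sketch below traces the spatial size of the feature map through the stack. It assumes a 224x224 input and that every 3x3 convolution (and the attention layer) preserves the spatial size, which the list does not state. The filter counts are taken from the list as written, and `LAYERS` and `trace_shapes` are illustrative names that are not part of torch-conv-kan.

```python
# Layer list transcribed from the model description above.
# Each entry is (kind, filters); pooling layers carry no filter count.
LAYERS = [
    ("conv", 32), ("pool", None),
    ("conv", 64), ("pool", None),
    ("conv", 128), ("conv", 128), ("pool", None),
    ("conv", 256), ("conv", 256), ("pool", None),
    ("conv", 256), ("conv", 256), ("pool", None),
    ("conv", 512), ("conv", 512),
    ("attn", 512),
]


def trace_shapes(size=224, channels=3):
    """Return (kind, channels, size) after every layer.

    Assumption: 3x3 convolutions and the attention layer keep the spatial
    size (stride 1, padding 1); each 2x2 max pooling halves it.
    """
    shapes = []
    for kind, filters in LAYERS:
        if kind == "pool":
            size //= 2
        else:
            channels = filters
        shapes.append((kind, channels, size))
    return shapes


if __name__ == "__main__":
    for kind, channels, size in trace_shapes():
        print(f"{kind:>4}: {channels:4d} channels, {size}x{size}")
```

Under these assumptions the five pooling steps reduce a 224x224 input to a 7x7 map with 512 channels, which global average pooling then collapses to a single 512-dimensional vector for the output layer.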

## Intended uses & limitations

You can use the raw model for image classification or as a pretrained model for further fine-tuning.

### How to use

First, clone the repository:

```
git clone https://github.com/IvanDrokin/torch-conv-kan.git
cd torch-conv-kan
pip install -r requirements.txt
```
Then you can initialize the model and load the weights.

```python
import torch
import torch.nn as nn  # needed for nn.BatchNorm2d below
from models import vggkagn_bn


model = vggkagn_bn(
    3,     # input channels (RGB)
    1000,  # number of output classes (Imagenet1k)
    groups=1,
    degree=5,
    dropout=0.05,
    l1_decay=0,
    width_scale=2,
    affine=True,
    norm_layer=nn.BatchNorm2d,
    expected_feature_shape=(1, 1),
    vgg_type='VGG11v4',
    last_attention=True,
    sa_inner_projection=None
)

model.from_pretrained('brivangl/vgg_kagn_bn11sa_v4')
```

Transforms used for validation on Imagenet1k:

```python
import torch
from torchvision.transforms import v2


transforms_val = v2.Compose([
    v2.ToImage(),
    v2.Resize(256, antialias=True),
    v2.CenterCrop(224),
    v2.ToDtype(torch.float32, scale=True),
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```
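
To make the geometry of the validation pipeline concrete, here is a dependency-free sketch of what `Resize(256)` followed by `CenterCrop(224)` does to an image's dimensions, along with the per-channel normalization arithmetic. It follows how torchvision is understood to treat an integer size (the shorter side is scaled to 256 and the aspect ratio is kept). The helper names and the 500x375 example image are illustrative assumptions, not part of the repository.

```python
# Mean and std copied from the Normalize step of the validation transforms.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def resized_size(width, height, target=256):
    """Scale so that the shorter side equals `target`, keeping the aspect ratio."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop-by-crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop


def normalize_pixel(rgb_uint8):
    """Map an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return [(v / 255.0 - m) / s for v, m, s in zip(rgb_uint8, MEAN, STD)]


if __name__ == "__main__":
    w, h = resized_size(500, 375)
    print((w, h), center_crop_box(w, h))
    print(normalize_pixel((255, 128, 0)))
```

For a 500x375 input this gives a 341x256 resized image, so the 224x224 crop discards more of the longer side than of the shorter one.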

## Training data
This model was trained on the Imagenet1k dataset (1,281,167 images in the train set).

## Training procedure

The model was trained for 200 full epochs with the AdamW optimizer, using the following parameters:
```python
{'learning_rate': 0.0009, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_weight_decay': 5e-06,
'adam_epsilon': 1e-08, 'lr_warmup_steps': 7500, 'lr_power': 0.3, 'lr_end': 1e-07, 'set_grads_to_none': False}
```
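
The `lr_warmup_steps`, `lr_power`, and `lr_end` entries suggest a polynomial decay schedule with linear warmup. The scheduler is not named here, so the following is only a sketch under that assumption, using the common formulation found in the Hugging Face `get_polynomial_decay_schedule_with_warmup` helper. `total_steps` is a hypothetical value, since the batch size and total step count are not given.

```python
def polynomial_lr(step, total_steps, lr_init=0.0009, lr_end=1e-07,
                  warmup_steps=7500, power=0.3):
    """Learning rate at `step` for linear warmup plus polynomial decay.

    Assumption: this mirrors a common schedule; it may differ from the one
    actually used to train the model.
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Linear ramp from 0 to lr_init.
        return lr_init * step / max(1, warmup_steps)
    if step > total_steps:
        return lr_end
    # Fraction of the decay phase that is still ahead, in [0, 1].
    remaining = 1.0 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (lr_init - lr_end) * remaining ** power + lr_end


if __name__ == "__main__":
    total = 100_000  # hypothetical number of optimizer steps
    for s in (0, 3750, 7500, 50_000, 100_000):
        print(s, polynomial_lr(s, total))
```

With a power below 1 (here 0.3), the rate stays close to its peak for most of training and drops sharply only near the end.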
And these augmentations:
```python
import torch
from torchvision.transforms import AutoAugmentPolicy, v2


transforms_train = v2.Compose([
    v2.ToImage(),
    v2.RandomHorizontalFlip(p=0.5),
    v2.RandomResizedCrop(224, antialias=True),
    # Pick one of the two AutoAugment policies at random for each sample.
    v2.RandomChoice([v2.AutoAugment(AutoAugmentPolicy.CIFAR10),
                     v2.AutoAugment(AutoAugmentPolicy.IMAGENET)
                     ]),
    v2.ToDtype(torch.float32, scale=True),
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```

## Evaluation results

On Imagenet1k Validation:

| Accuracy, top1 | Accuracy, top5 | AUC (ovo) | AUC (ovr) |
|:--------------:|:--------------:|:---------:|:---------:|
| 70.684 | 89.462 | 99.624 | 99.624 |
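
For reference, top-1 and top-5 accuracy count a prediction as correct when the true label is among the 1 or 5 highest-scoring classes. A minimal pure-Python sketch of that metric follows. The function name and the toy scores are illustrative and are not taken from the evaluation code behind the table above.

```python
def top_k_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    `scores` is a list of per-class score lists, `labels` a list of class indices.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    scores = [
        [0.1, 0.7, 0.2],
        [0.5, 0.3, 0.2],
        [0.2, 0.3, 0.5],
        [0.6, 0.1, 0.3],
    ]
    labels = [1, 1, 2, 2]
    print(top_k_accuracy(scores, labels, k=1))
    print(top_k_accuracy(scores, labels, k=2))
```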

On Imagenet1k Test:
Coming soon

### BibTeX entry and citation info

If you use this project in your research or wish to refer to the baseline results, please use the following BibTeX entry.

```bibtex
@misc{torch-conv-kan,
  author = {Ivan Drokin},
  title = {Torch Conv KAN},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/IvanDrokin/torch-conv-kan}}
}
```

## References

- [1] Ziming Liu et al., "KAN: Kolmogorov-Arnold Networks", 2024, arXiv. https://arxiv.org/abs/2404.19756
- [2] https://github.com/KindXiaoming/pykan
- [3] https://github.com/Khochawongwat/GRAMKAN