Files changed (1)
  1. README.md +55 -0
README.md ADDED
@@ -0,0 +1,55 @@
+ ---
+ datasets:
+ - zalando-datasets/fashion_mnist
+ language:
+ - en
+ metrics:
+ - accuracy
+ pipeline_tag: image-classification
+ tags:
+ - fashion
+ - clothes
+ - fashion_mnist
+ - CNN
+ - Classification
+ ---
+ # BeitForImageClassification
+
+ ## Model Structure
+
+ ### BeitModel
+ - **Embeddings: BeitEmbeddings**
+   - Uses patch embeddings with a `Conv2d` layer (3 input channels, 768 output channels, kernel size 16x16, stride 16x16).
+   - Includes a dropout layer with probability 0.0 (effectively disabled).
+
+ - **Encoder: BeitEncoder**
+   - Contains 12 `BeitLayer` modules.
+   - Each `BeitLayer` includes:
+     - **Attention: BeitAttention**
+       - `BeitSelfAttention` with linear layers for query, key, and value, dropout, and relative position bias.
+       - `BeitSelfOutput` with a linear layer and dropout.
+     - **Intermediate: BeitIntermediate**
+       - Dense layer increasing dimensions from 768 to 3072, followed by GELU activation.
+     - **Output: BeitOutput**
+       - Dense layer reducing dimensions back to 768, with dropout.
+     - **LayerNorm** applied before the self-attention block and again before the feed-forward block.
+     - **Drop Path** (stochastic depth) with a drop probability that varies across layers.
+
+ - **Pooler: BeitPooler**
+   - Applies layer normalization.
+
+ ### Classifier: Linear
+ - Linear layer mapping 768-dimensional embeddings to 10 output classes.
+
+ ## Detected Classes
+ The model has been trained to classify images into the following classes:
+ 1. T-shirt/top
+ 2. Trouser
+ 3. Pullover
+ 4. Dress
+ 5. Coat
+ 6. Sandal
+ 7. Shirt
+ 8. Sneaker
+ 9. Bag
+ 10. Ankle boot
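
The dimensions and class list in the README above can be sanity-checked with a little arithmetic. The following is a minimal sketch in plain Python, not the model's actual code: the helper names are hypothetical, the 224-pixel input size is an assumption (the card does not state it), each layer is assumed to carry a bias term, and the zero-based class ids are assumed to follow the Fashion-MNIST label convention in the order listed.

```python
from typing import Sequence

# Class names in the order listed in the README. Zero-based ids are an
# assumption that follows the Fashion-MNIST convention (0 = T-shirt/top).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

HIDDEN_SIZE = 768         # embedding width stated in the README
INTERMEDIATE_SIZE = 3072  # feed-forward width stated in the README
PATCH_SIZE = 16           # Conv2d kernel size and stride stated in the README
IN_CHANNELS = 3           # Conv2d input channels stated in the README
IMAGE_SIZE = 224          # ASSUMPTION: input resolution is not given in the card


def num_patches(image_size: int = IMAGE_SIZE, patch_size: int = PATCH_SIZE) -> int:
    """Number of non-overlapping square patches cut from a square image."""
    return (image_size // patch_size) ** 2


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a dense layer (bias assumed present)."""
    return in_features * out_features + out_features


def patch_embedding_params() -> int:
    """Weights plus biases of the Conv2d patch projection (bias assumed present)."""
    return IN_CHANNELS * HIDDEN_SIZE * PATCH_SIZE * PATCH_SIZE + HIDDEN_SIZE


def decode_prediction(logits: Sequence[float]) -> str:
    """Map a vector of 10 class scores to the highest-scoring class name."""
    if len(logits) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} scores, got {len(logits)}")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return CLASS_NAMES[best]


if __name__ == "__main__":
    print("patches per image:", num_patches())
    print("patch embedding params:", patch_embedding_params())
    print("intermediate dense params:", linear_params(HIDDEN_SIZE, INTERMEDIATE_SIZE))
    print("classifier params:", linear_params(HIDDEN_SIZE, len(CLASS_NAMES)))
    print("example prediction:", decode_prediction([0.1] * 9 + [2.5]))
```

Under these assumptions the classifier head is tiny compared with a single encoder layer, so almost all of the capacity sits in the BEiT backbone. Note that Fashion-MNIST images are 28x28 grayscale, so they would need to be resized and expanded to three channels before reaching the `Conv2d` patch embedding; the card does not describe that preprocessing.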