thilinadj/image_classification_tdj_fashion-mnist

Create README.md

#2

by goldpotatoes - opened Jun 9, 2024

base: refs/heads/main ← from: refs/pr/2

README.md +55 -0

	@@ -0,0 +1,55 @@

+---
+datasets:
+- zalando-datasets/fashion_mnist
+language:
+- en
+metrics:
+- accuracy
+pipeline_tag: image-classification
+tags:
+- fashion
+- clothes
+- fashion_mnist
+- CNN
+- Classification
+---
+# BeitForImageClassification
+## Model Structure
+### BeitModel
+- **Embeddings: BeitEmbeddings**
+  - Uses patch embeddings with a `Conv2d` layer (3 input channels, 768 output channels, kernel size 16x16, stride 16x16).
+  - Includes a dropout layer with probability 0.0.
+- **Encoder: BeitEncoder**
+  - Contains 12 `BeitLayer` modules.
+  - Each `BeitLayer` includes:
+    - **Attention: BeitAttention**
+      - `BeitSelfAttention` with linear layers for query, key, and value, dropout, and relative position bias.
+      - `BeitSelfOutput` with a linear layer and dropout.
+    - **Intermediate: BeitIntermediate**
+      - Dense layer increasing dimensions from 768 to 3072, followed by GELU activation.
+    - **Output: BeitOutput**
+      - Dense layer reducing dimensions back to 768, with dropout.
+    - **LayerNorm** applied before self-attention and again before the intermediate (feed-forward) block.
+    - **Drop Path** (stochastic depth) with a drop probability that varies across layers.
+- **Pooler: BeitPooler**
+  - Contains a single `LayerNorm` layer.
+### Classifier: Linear
+- Linear layer mapping 768-dimensional embeddings to 10 output classes.
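The dimensions listed in the model structure can be sanity-checked with a little arithmetic. The sketch below is not taken from the model's code: it assumes a 224x224 input image (the README does not state the input resolution) and assumes every layer keeps its default bias term. All names in it are illustrative.

```python
# Back-of-the-envelope check of the shapes described in the README.
# ASSUMPTIONS (not stated in the README): 224x224 inputs, bias terms enabled.

def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

IMAGE_SIZE = 224      # assumed input resolution
IN_CHANNELS = 3       # from the README: Conv2d with 3 input channels
PATCH_SIZE = 16       # from the README: kernel 16x16, stride 16x16
HIDDEN_SIZE = 768     # from the README: 768 output channels
INTERMEDIATE = 3072   # from the README: 768 -> 3072 -> 768
NUM_CLASSES = 10      # from the README: 10 output classes

# Non-overlapping patches: kernel size equals stride.
grid = conv_output_size(IMAGE_SIZE, PATCH_SIZE, PATCH_SIZE)
num_patches = grid * grid

# Parameter counts (weights + biases) for the layers with stated sizes.
patch_embed_params = IN_CHANNELS * PATCH_SIZE * PATCH_SIZE * HIDDEN_SIZE + HIDDEN_SIZE
mlp_params = (HIDDEN_SIZE * INTERMEDIATE + INTERMEDIATE) + (INTERMEDIATE * HIDDEN_SIZE + HIDDEN_SIZE)
classifier_params = HIDDEN_SIZE * NUM_CLASSES + NUM_CLASSES

print(f"patch grid: {grid}x{grid} = {num_patches} patches")
print(f"patch embedding parameters: {patch_embed_params}")
print(f"feed-forward parameters per layer: {mlp_params}")
print(f"classifier parameters: {classifier_params}")
```

Under these assumptions, each image becomes a sequence of 196 patch tokens of width 768. Fashion-MNIST images are 28x28 grayscale, so some preprocessing (resizing and channel replication) would presumably be needed to match the 3-channel patch embedding.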
+## Detected Classes
+The model has been trained to classify images into the following classes:
+1. T-shirt / top
+2. Trouser
+3. Pullover
+4. Dress
+5. Coat
+6. Sandal
+7. Shirt
+8. Sneaker
+9. Bag
+10. Ankle boot
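As an illustration of how the 10 classifier outputs relate to the class list above, here is a minimal sketch that maps a logit vector to a class name. It assumes the output indices follow the Fashion-MNIST label order starting at 0 (so index 0 is the first list entry); the README does not confirm this ordering, and the logits below are made up for the example.

```python
import math

# Class names in the order listed in the README.
# ASSUMPTION: output index i corresponds to list entry i + 1.
LABELS = [
    "T-shirt / top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Return (index, class name, probability) of the highest-scoring class."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, LABELS[best], probs[best]

# Hypothetical logits, invented purely for illustration.
example_logits = [0.1, -1.2, 0.3, 0.0, 0.2, -0.5, 0.4, 2.5, -0.3, 1.0]
index, name, prob = predict_label(example_logits)
print(index, name, round(prob, 3))
```

With a real checkpoint, the label mapping stored in the model configuration should take precedence over this hard-coded list.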