aoxo committed
Commit b1630fd
1 Parent(s): 2aef983

Update README.md

Files changed (1)
  1. README.md +45 -45
README.md CHANGED
@@ -125,54 +125,54 @@ Images and their corresponding style semantic maps were resized to **512 x 512**
  
  #### Training Hyperparameters
  
- **v1**
- - **Precision**:fp32
- - **Embedded dimensions**: 768
- - **Hidden dimensions**: 3072
- - **Attention Type**: Linear Attention
- - **Number of attention heads**: 16
- - **Number of attention layers**: 8
- - **Number of transformer encoder layers (feed-forward)**: 8
- - **Number of transformer decoder layers (feed-forward)**: 8
- - **Activation function(s)**: ReLU, GeLU
- - **Patch Size**: 8
- - **Swin Window Size**: 7
- - **Swin Shift Size**: 2
- - **Style Transfer Module**: AdaIN (Adaptive Instance Normalization)
-
- **v2**
- - **Precision**: fp32
- - **Embedded dimensions**: 768
- - **Hidden dimensions**: 3072
- - **Attention Type**: Location-Based Multi-Head Attention (Linear Attention)
- - **Number of attention heads**: 16
- - **Number of attention layers**: 8
- - **Number of transformer encoder layers (feed-forward)**: 8
- - **Number of transformer decoder layers (feed-forward)**: 8
- - **Activation function(s)**: ReLU, GELU
- - **Patch Size**: 16
- - **Swin Window Size**: 7
- - **Swin Shift Size**: 2
- - **Style Transfer Module**: AdaIN
-
- **v3**
- - **Precision:** FP32, FP16, BF16, INT8
- - **Embedding Dimensions:** 768
- - **Hidden Dimensions:** 3072
- - **Attention Type:** Location-Based Multi-Head Attention (Linear Attention)
- - **Number of Attention Heads:** 42
- - **Number of Attention Layers:** 16
- - **Number of Transformer Encoder Layers (Feed-Forward):** 16
- - **Number of Transformer Decoder Layers (Feed-Forward):** 16
- - **Activation Functions:** ReLU, GeLU
- - **Patch Size:** 8
- - **Swin Window Size:** 7
- - **Swin Shift Size:** 2
- - **Style Transfer Module:** Style Adaptive Layer Normalization (SALN)
+ **v1**
+ - Precision: fp32
+ - Embedded dimensions: 768
+ - Hidden dimensions: 3072
+ - Attention Type: Linear Attention
+ - Number of attention heads: 16
+ - Number of attention layers: 8
+ - Number of transformer encoder layers (feed-forward): 8
+ - Number of transformer decoder layers (feed-forward): 8
+ - Activation function(s): ReLU, GeLU
+ - Patch Size: 8
+ - Swin Window Size: 7
+ - Swin Shift Size: 2
+ - Style Transfer Module: AdaIN (Adaptive Instance Normalization)
+
+ **v2**
+ - Precision: fp32
+ - Embedded dimensions: 768
+ - Hidden dimensions: 3072
+ - Attention Type: Location-Based Multi-Head Attention (Linear Attention)
+ - Number of attention heads: 16
+ - Number of attention layers: 8
+ - Number of transformer encoder layers (feed-forward): 8
+ - Number of transformer decoder layers (feed-forward): 8
+ - Activation function(s): ReLU, GELU
+ - Patch Size: 16
+ - Swin Window Size: 7
+ - Swin Shift Size: 2
+ - Style Transfer Module: AdaIN
+
+ **v3**
+ - Precision: FP32, FP16, BF16, INT8
+ - Embedding Dimensions: 768
+ - Hidden Dimensions: 3072
+ - Attention Type: Location-Based Multi-Head Attention (Linear Attention)
+ - Number of Attention Heads: 42
+ - Number of Attention Layers: 16
+ - Number of Transformer Encoder Layers (Feed-Forward): 16
+ - Number of Transformer Decoder Layers (Feed-Forward): 16
+ - Activation Functions: ReLU, GeLU
+ - Patch Size: 8
+ - Swin Window Size: 7
+ - Swin Shift Size: 2
+ - Style Transfer Module: Style Adaptive Layer Normalization (SALN)
  
  #### Speeds, Sizes, Times
  
- **Model size:** There are currently four versions of the model:
+ **Model size:** There are currently five versions of the model:
  - v1_1: 224M params
  - v1_2: 200M params
  - v1_3: 93M params
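
The v1 and v2 cards list AdaIN (Adaptive Instance Normalization) as the style transfer module. As a minimal sketch of what that operation computes (illustrative only, not this repository's code; the function name and NCHW tensor layout are assumptions), AdaIN re-normalizes the content features so their per-channel mean and standard deviation match those of the style features:

```python
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Adaptive Instance Normalization: align the per-channel statistics of
    `content` with those of `style`. Both tensors are (N, C, H, W)."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    # AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y)
    return s_std * (content - c_mean) / c_std + s_mean
```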
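v3 swaps AdaIN for Style Adaptive Layer Normalization (SALN). A common formulation (e.g. in Meta-StyleSpeech) predicts the layer-norm gain and bias from a style vector; the sketch below follows that pattern under assumed dimensions and is not necessarily how this model wires it:

```python
import torch
from torch import nn

class SALN(nn.Module):
    """Style-Adaptive Layer Normalization (sketch): a LayerNorm without its
    own affine parameters, whose gain and bias are predicted per input from
    a style vector. `dim` and `style_dim` are assumptions."""

    def __init__(self, dim: int = 768, style_dim: int = 768):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.affine = nn.Linear(style_dim, 2 * dim)  # predicts (gamma, beta)

    def forward(self, x: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); style: (batch, style_dim)
        gamma, beta = self.affine(style).unsqueeze(1).chunk(2, dim=-1)
        return gamma * self.norm(x) + beta
```

Compared with AdaIN, the learned projection lets one set of weights produce a different normalization per style embedding rather than copying statistics directly from a style feature map.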
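The Swin Window Size and Swin Shift Size entries refer to shifted-window attention: patch tokens are split into fixed-size windows for local attention, and alternating blocks cyclically shift the feature map so information can cross window borders. A sketch of the partitioning, assuming NHWC feature maps and the window/shift sizes listed above (grids that are not multiples of the window size are typically padded first):

```python
import torch

def shifted_window_partition(x: torch.Tensor, window: int = 7, shift: int = 2) -> torch.Tensor:
    """Cyclically shift an (N, H, W, C) feature map, then split it into
    non-overlapping window x window patches for local attention.
    H and W must be divisible by `window`."""
    if shift > 0:
        # Roll the map so tokens near window borders land in the same window.
        x = torch.roll(x, shifts=(-shift, -shift), dims=(1, 2))
    n, h, w, c = x.shape
    x = x.view(n, h // window, window, w // window, window, c)
    # -> (num_windows * N, window, window, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window, window, c)
```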