Commit 42f4d27 by laurentiubp (1 parent: 8f2cdcf)
Update README.md

README.md CHANGED
@@ -101,7 +101,6 @@ The model was trained **with the same prompt template of Llama-3 Instruct**.
 
 The model was trained for two epochs on **8x A100 80GB GPUs using DeepSpeed ZeRO** State-3 without CPU offloading.
 
-Then training lasted approximately 8 hours for a total GPU cost of 150€.
 
 ### Training hyperparameters
 