togethercomputer/RedPajama-INCITE-Base-3B-v1

danfu09 committed on May 5, 2023

Commit c852771 • 1 Parent(s): 8cf442c

Add training details

Files changed (1)

README.md +5 -4

README.md CHANGED

@@ -119,11 +119,12 @@ Please refer to [togethercomputer/RedPajama-Data-1T](https://huggingface.co/data
 **Training Procedure**
-- **Hardware:** TODO @Dan
-- **Optimizer:**
-- **Gradient Accumulations**:
+- **Hardware:** 256 nodes of 6xV100 (IBM Power9), on the OLCF Summit cluster
+- **Optimizer:** Apex FusedAdam
+- **Parallelism:** Pipeline parallel 6, tensor parallel 2
+- **Gradient Accumulations**: 8 (global batch size 4M tokens)
 - **Num of Tokens:** 800B Tokens
-- **Learning rate:** 
+- **Learning rate:** 0.00016
 ## Community
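
The training details added in this commit imply a few derived quantities, which the sketch below works out. It is only back-of-the-envelope arithmetic, not the authors' configuration: the `TrainingSetup` class and its field names are hypothetical, "4M tokens" is read as 4,000,000, and the data-parallel degree assumes that every GPU in the allocation belongs to exactly one model replica.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical container for the numbers listed in the commit."""

    nodes: int = 256                       # from the commit
    gpus_per_node: int = 6                 # from the commit (6xV100 per node)
    pipeline_parallel: int = 6             # from the commit
    tensor_parallel: int = 2               # from the commit
    grad_accum_steps: int = 8              # from the commit
    global_batch_tokens: int = 4_000_000   # assumption: "4M" read as 4e6
    total_tokens: int = 800_000_000_000    # from the commit (800B tokens)

    @property
    def total_gpus(self) -> int:
        return self.nodes * self.gpus_per_node

    @property
    def gpus_per_replica(self) -> int:
        # One model replica spans pipeline stages x tensor-parallel ranks.
        return self.pipeline_parallel * self.tensor_parallel

    @property
    def data_parallel(self) -> int:
        # Assumption: all GPUs are used and split evenly into replicas.
        if self.total_gpus % self.gpus_per_replica != 0:
            raise ValueError("GPU count is not a multiple of the replica size")
        return self.total_gpus // self.gpus_per_replica

    @property
    def optimizer_steps(self) -> int:
        # Number of full global batches needed to consume the token budget.
        return self.total_tokens // self.global_batch_tokens

    @property
    def tokens_per_micro_batch(self) -> float:
        # Tokens each replica processes per accumulation step.
        return self.global_batch_tokens / (self.data_parallel * self.grad_accum_steps)


setup = TrainingSetup()
print("total GPUs:", setup.total_gpus)
print("GPUs per model replica:", setup.gpus_per_replica)
print("data-parallel replicas:", setup.data_parallel)
print("optimizer steps:", setup.optimizer_steps)
print("tokens per replica per micro-batch:", setup.tokens_per_micro_batch)
```

Under these assumptions the job uses 1,536 GPUs, 12 per model replica, 128 data-parallel replicas, and 200,000 optimizer steps. If "4M" instead means 4 × 2^20 tokens, passing `global_batch_tokens=4 * 2**20` gives 4,096 tokens per replica per micro-batch and roughly 190,700 steps; the commit does not say which reading is intended.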