project-baize/baize-lora-13B

project-baize committed on Apr 3, 2023

Commit 181da30 • 1 Parent(s): 4e1fc0c

Update README.md

Files changed (1)

README.md +32 -0

@@ -1,3 +1,35 @@
 ---
 license: cc-by-nc-4.0
 ---

+---
+license: cc-by-nc-4.0
+---
+<p align="center">
+<img width="500px" alt="Project Baize" src="https://user-images.githubusercontent.com/22514219/229195563-0cddfa74-e52f-4413-b4b4-e4ba489c4b3d.png">
+</p>
+<hr>
+## What's Baize?
+Baize is an open-source chat model fine-tuned with [LoRA](https://github.com/microsoft/LoRA). It is trained on 100k dialogs generated by letting ChatGPT chat with itself. We also use Alpaca's data to improve its performance. This repo contains the 13B model.
+## Why is it called Baize?
+Baize (白泽) is a mythical creature in Chinese folklore, who speaks human languages and knows everything. This is exactly what we expect from a chat model.
+## Training Parameters
+- Base Model: [LLaMA-13B](https://arxiv.org/pdf/2302.13971.pdf)
+- Training Epoch: 1
+- Batch Size: 64
+- Maximum Input Length: 512
+- Learning Rate: 1e-4
+- LoRA Rank: 8
+- Updated Modules: All linear layers
+## Training Dataset
+- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) (51,942)
+- [Quora Dialogs](https://github.com/project-baize/baize) (54,456)
+- [StackOverflow Dialogs](https://github.com/project-baize/baize) (57,046)
+More details can be found in the Baize [GitHub](https://github.com/project-baize/baize) repository.
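
The training parameters and dataset sizes added in this diff can be cross-checked with a little arithmetic. The sketch below is not part of the commit and is not the authors' training code. The names `BaizeTrainingParams`, `DATASET_SIZES`, `total_examples`, and `optimizer_steps` are hypothetical. It assumes that all three listed datasets are used in full, that "Batch Size: 64" is the effective global batch size, and that the final partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BaizeTrainingParams:
    """Values copied from the README's "Training Parameters" list."""
    base_model: str = "LLaMA-13B"
    epochs: int = 1
    batch_size: int = 64
    max_input_length: int = 512
    learning_rate: float = 1e-4
    lora_rank: int = 8
    updated_modules: str = "all linear layers"


# Example counts copied from the README's "Training Dataset" list.
DATASET_SIZES = {
    "Stanford Alpaca": 51_942,
    "Quora Dialogs": 54_456,
    "StackOverflow Dialogs": 57_046,
}


def total_examples(sizes: dict) -> int:
    """Sum the per-dataset example counts."""
    return sum(sizes.values())


def optimizer_steps(params: BaizeTrainingParams, n_examples: int) -> int:
    """Estimate update steps, assuming one step per batch and that the
    last partial batch is kept (an assumption, not stated in the README)."""
    return params.epochs * math.ceil(n_examples / params.batch_size)


if __name__ == "__main__":
    params = BaizeTrainingParams()
    n = total_examples(DATASET_SIZES)
    print(n)                           # 163444
    print(optimizer_steps(params, n))  # 2554
```

Under these assumptions, one epoch over 163,444 examples at batch size 64 comes to roughly 2,554 update steps. The actual count depends on details the README does not state, such as gradient accumulation or filtering of examples longer than 512 tokens.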