---
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
# phi-2-alpaca-cleaned
|
This model is an instruction-tuned version of [microsoft/phi-2](https://huggingface.co./microsoft/phi-2), fine-tuned on the [yahma/alpaca-cleaned](https://huggingface.co./datasets/yahma/alpaca-cleaned) dataset.
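
## Usage

A minimal inference sketch using the `transformers` pipeline API. The Alpaca-style prompt template below is an assumption based on the training dataset; this card does not state the exact format used during fine-tuning.

```python
from transformers import pipeline

# Replace with the full repo id of this model; older transformers
# versions may additionally require trust_remote_code=True for phi-2.
model_id = "phi-2-alpaca-cleaned"

generator = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",  # drop this to run on CPU
)

# Alpaca-style prompt template -- an assumption inferred from the
# yahma/alpaca-cleaned dataset, not confirmed by this card.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what instruction tuning is.\n\n"
    "### Response:\n"
)

output = generator(prompt, max_new_tokens=128, do_sample=False)
print(output[0]["generated_text"])
```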
|
|
|
## Training
|
- GPUs: 8 × A6000 48GB
- per_device_train_batch_size 8
- gradient_accumulation_steps 8
- per_device_eval_batch_size 8
- num_train_epochs 3
- learning_rate 2e-5
- warmup_ratio 0.03
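
For reference, a minimal sketch of how these hyperparameters might be expressed with the `transformers` `TrainingArguments` API. Only the numbers in the list above come from the actual run; the output directory and precision setting are illustrative assumptions. Note the effective global batch size is 8 GPUs × 8 per-device × 8 accumulation steps = 512.

```python
from transformers import TrainingArguments

# Hyperparameters from the list above, expressed as TrainingArguments.
training_args = TrainingArguments(
    output_dir="phi-2-alpaca-cleaned",  # hypothetical output path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,      # effective batch: 8 GPUs x 8 x 8 = 512
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    learning_rate=2e-5,
    warmup_ratio=0.03,
    bf16=True,  # assumption: mixed precision on A6000s; not stated in this card
)
```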