license: apache-2.0
## Model
base_model: beomi/OPEN-SOLAR-KO-10.7B
## Dataset
* Collected from publicly available data
* Deduplicated using the algorithm from "Deduplicating Training Data Makes Language Models Better"
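
The card does not say which variant of the deduplication method was applied, or with what settings. As a rough illustration of the near-duplicate idea only, the sketch below drops any document whose word n-gram set overlaps heavily with one already kept. It is not the author's pipeline: it computes exact Jaccard similarity rather than a MinHash approximation, and the names `shingles`, `jaccard`, `deduplicate` and the values `n=3` and `threshold=0.8` are assumptions chosen for the example.

```python
def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in `text`."""
    words = text.split()
    if len(words) < n:
        # Short texts are represented by a single shingle of all their words.
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs, n=3, threshold=0.8):
    """Keep the first copy of each group of near-duplicate documents.

    Illustrative only: `n` and `threshold` are assumed values, and the
    pairwise comparison is quadratic, so this suits small corpora.
    """
    kept = []
    kept_shingles = []
    for doc in docs:
        s = shingles(doc, n)
        # Skip the document if it is too similar to any one already kept.
        if any(jaccard(s, t) >= threshold for t in kept_shingles):
            continue
        kept.append(doc)
        kept_shingles.append(s)
    return kept
```

For a corpus of realistic size, the pairwise comparison would need to be replaced by an approximate scheme such as MinHash with locality-sensitive hashing.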
## Code
```python
### KO-Platypus
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Hugging Face Hub repository ID of this model
model_name = "jingyeom/SOLAR_KO_1.3_deup"

# Load the model weights with the library's default settings
model = AutoModelForCausalLM.from_pretrained(
    model_name,
)
# Load the matching tokenizer from the same repository
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
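
Once the model and tokenizer are loaded, text can be generated with the standard `generate` method. The helper below is a minimal sketch, not part of the original card: the name `generate_text` and the default `max_new_tokens=64` are assumptions, and the card does not specify a prompt template, so the prompt is passed through unchanged.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a continuation of `prompt` and return it as a string.

    Hypothetical helper: `model` and `tokenizer` are expected to be the
    objects returned by the `from_pretrained` calls above.
    """
    # Tokenize the prompt into PyTorch tensors on the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Generate up to `max_new_tokens` tokens after the prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence, dropping special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as above, a call such as `print(generate_text(model, tokenizer, "대한민국의 수도는"))` prints the prompt followed by the generated continuation.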