---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
- ruslandev/tagengo-subset-gpt-4o
---

# Uploaded model

- **Developed by:** ruslandev
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit

This model is fine-tuned on the [ruslandev/tagengo-subset-gpt-4o](https://huggingface.co/datasets/ruslandev/tagengo-subset-gpt-4o) dataset.

Please note: this model was created for educational purposes and needs further training/fine-tuning.

# How to use

I recommend using my framework [gptchain](https://github.com/RuslanPeresy/gptchain).

```
git clone https://github.com/RuslanPeresy/gptchain.git
cd gptchain
pip install -r requirements-train.txt
python gptchain.py chat -m ruslandev/llama-3-8b-gpt-4o \
    --chatml true \
    -q '[{"from": "human", "value": "Из чего состоит нейронная сеть?"}]'
```

# Training

The [gptchain](https://github.com/RuslanPeresy/gptchain) framework was used for training.

```
python gptchain.py train -m unsloth/llama-3-8b-bnb-4bit \
    -dn tagengo_subset_gpt4o \
    -sp checkpoints/llama-3-8b-gpt-4o \
    -hf llama-3-8b-gpt-4o \
    --num-epochs 3
```

# Training hyperparameters

- learning_rate: 2e-4
- seed: 3407
- gradient_accumulation_steps: 4
- per_device_train_batch_size: 2
- optimizer: adamw_8bit
- lr_scheduler_type: linear
- warmup_steps: 5
- num_train_epochs: 3
- weight_decay: 0.01

# Training results

[wandb report](https://api.wandb.ai/links/ruslandev/2i1pukst)

[unsloth](https://github.com/unslothai/unsloth)
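
As a rough illustration of what the training hyperparameters listed in this card imply, the sketch below computes the effective batch size and the shape of a linear learning-rate schedule with warmup. It is not taken from gptchain: the helper names `effective_batch_size` and `linear_lr` are hypothetical, a single training device is assumed, the total step count is a made-up placeholder (it depends on the dataset size, which is not stated here), and the schedule is assumed to follow the common "warm up linearly, then decay linearly to zero" shape.

```python
# Values copied from the "Training hyperparameters" list in this card.
HYPERPARAMS = {
    "learning_rate": 2e-4,
    "seed": 3407,
    "gradient_accumulation_steps": 4,
    "per_device_train_batch_size": 2,
    "optimizer": "adamw_8bit",
    "lr_scheduler_type": "linear",
    "warmup_steps": 5,
    "num_train_epochs": 3,
    "weight_decay": 0.01,
}


def effective_batch_size(hp: dict, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step: int, total_steps: int, hp: dict) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay.

    Assumed schedule shape; the framework's actual scheduler may differ.
    """
    warmup = hp["warmup_steps"]
    if step < warmup:
        # Ramp from 0 up to the peak learning rate during warmup.
        scale = step / max(1, warmup)
    else:
        # Decay linearly from the peak down to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return hp["learning_rate"] * scale


if __name__ == "__main__":
    total_steps = 100  # hypothetical; the real value depends on dataset size
    print("effective batch size:", effective_batch_size(HYPERPARAMS))
    for step in (0, 5, 50, 100):
        print(step, linear_lr(step, total_steps, HYPERPARAMS))
```

Under the single-device assumption, each optimizer update sees 2 × 4 = 8 examples, and with only 5 warmup steps the learning rate reaches its 2e-4 peak almost immediately before decaying for the rest of training.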