---
tags:
- generated_from_trainer
model-index:
- name: myBit-Llama2-jp-127M-3
  results: []
---


# myBit-Llama2-jp-127M-3

This model is a fine-tuned version of an unspecified base model on an unknown dataset.
It achieves the following results on the evaluation set at the final checkpoint (epoch 100, step 800):
- Loss: 13.0221
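
To put the reported loss on a more familiar scale, it can be converted to perplexity. This is a minimal sketch that assumes the loss is a mean per-token cross-entropy in nats, which this card does not confirm. The vocabulary size used for comparison is a hypothetical value chosen for illustration and is not taken from this model.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 13.0221

# Assumption: the loss is a mean per-token cross-entropy measured in nats.
perplexity = math.exp(eval_loss)

# A model that guesses uniformly over a vocabulary of size V has loss ln(V).
# hypothetical_vocab_size is illustrative only; the real value is not stated here.
hypothetical_vocab_size = 32_000
uniform_loss = math.log(hypothetical_vocab_size)

print(f"perplexity: {perplexity:,.0f}")
print(f"uniform-guess loss for V={hypothetical_vocab_size}: {uniform_loss:.4f}")
print(f"eval loss exceeds uniform-guess loss: {eval_loss > uniform_loss}")
```

If the assumptions hold, an evaluation loss above the uniform-guess baseline suggests the model is confidently wrong on held-out text, which is consistent with heavy overfitting.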

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- num_epochs: 100
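
The `polynomial` scheduler decays the learning rate from its initial value toward a small floor. The sketch below is a plain-Python approximation, not the Trainer's own code. It assumes no warmup (none is listed above), a floor of `1e-7`, and `power=1.0`; these are believed to match the Transformers defaults, but this card does not record them. The total of 800 steps comes from the training results table.

```python
def polynomial_lr(step, total_steps, lr_init=1e-3, lr_end=1e-7, power=1.0):
    """Learning rate after `step` optimizer steps of polynomial decay.

    lr_end and power are assumed defaults, not values recorded in this card.
    """
    if step >= total_steps:
        return lr_end
    remaining = 1.0 - step / total_steps
    return (lr_init - lr_end) * remaining ** power + lr_end


total_steps = 800  # last step listed in the training results table

for step in (0, 200, 400, 600, 800):
    print(f"step {step:3d}: lr = {polynomial_lr(step, total_steps):.3e}")
```

With `power=1.0` the schedule reduces to a linear decay, so under these assumptions the learning rate would be roughly half its initial value at step 400.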

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.8184        | 1.25  | 10   | 8.3355          |
| 5.4327        | 2.5   | 20   | 7.6000          |
| 5.0861        | 3.75  | 30   | 7.8126          |
| 4.7586        | 5.0   | 40   | 7.5748          |
| 4.4392        | 6.25  | 50   | 7.4509          |
| 4.1938        | 7.5   | 60   | 7.3834          |
| 4.0095        | 8.75  | 70   | 7.2750          |
| 3.905         | 10.0  | 80   | 7.3800          |
| 3.6536        | 11.25 | 90   | 7.4560          |
| 3.3187        | 12.5  | 100  | 7.6310          |
| 3.3315        | 13.75 | 110  | 8.0397          |
| 2.9308        | 15.0  | 120  | 8.3902          |
| 2.679         | 16.25 | 130  | 9.0364          |
| 2.2896        | 17.5  | 140  | 9.8766          |
| 1.8407        | 18.75 | 150  | 10.7682         |
| 1.5081        | 20.0  | 160  | 11.7175         |
| 0.9778        | 21.25 | 170  | 12.8239         |
| 0.6572        | 22.5  | 180  | 13.6506         |
| 0.5411        | 23.75 | 190  | 14.2579         |
| 0.44          | 25.0  | 200  | 14.5732         |
| 0.3283        | 26.25 | 210  | 15.1087         |
| 0.2507        | 27.5  | 220  | 15.0569         |
| 0.2044        | 28.75 | 230  | 15.1893         |
| 0.1838        | 30.0  | 240  | 15.6291         |
| 0.1626        | 31.25 | 250  | 15.4617         |
| 0.1124        | 32.5  | 260  | 15.2738         |
| 0.1011        | 33.75 | 270  | 15.2130         |
| 0.0845        | 35.0  | 280  | 15.2749         |
| 0.0852        | 36.25 | 290  | 15.3292         |
| 0.1025        | 37.5  | 300  | 15.1574         |
| 0.1075        | 38.75 | 310  | 15.1100         |
| 0.079         | 40.0  | 320  | 14.8177         |
| 0.0857        | 41.25 | 330  | 14.8609         |
| 0.0629        | 42.5  | 340  | 14.6443         |
| 0.0713        | 43.75 | 350  | 14.5514         |
| 0.0594        | 45.0  | 360  | 14.6032         |
| 0.0557        | 46.25 | 370  | 14.3489         |
| 0.0554        | 47.5  | 380  | 14.3289         |
| 0.0548        | 48.75 | 390  | 14.1991         |
| 0.0528        | 50.0  | 400  | 14.1350         |
| 0.0515        | 51.25 | 410  | 13.9952         |
| 0.0529        | 52.5  | 420  | 13.9788         |
| 0.0516        | 53.75 | 430  | 13.9438         |
| 0.0506        | 55.0  | 440  | 13.8746         |
| 0.049         | 56.25 | 450  | 13.7564         |
| 0.0491        | 57.5  | 460  | 13.7900         |
| 0.0493        | 58.75 | 470  | 13.6992         |
| 0.0491        | 60.0  | 480  | 13.6421         |
| 0.0497        | 61.25 | 490  | 13.6419         |
| 0.0489        | 62.5  | 500  | 13.5448         |
| 0.0504        | 63.75 | 510  | 13.5048         |
| 0.0508        | 65.0  | 520  | 13.5077         |
| 0.0488        | 66.25 | 530  | 13.5045         |
| 0.0485        | 67.5  | 540  | 13.4404         |
| 0.0493        | 68.75 | 550  | 13.4167         |
| 0.0507        | 70.0  | 560  | 13.3758         |
| 0.0491        | 71.25 | 570  | 13.3239         |
| 0.0484        | 72.5  | 580  | 13.3139         |
| 0.0472        | 73.75 | 590  | 13.2933         |
| 0.0493        | 75.0  | 600  | 13.3105         |
| 0.0475        | 76.25 | 610  | 13.2306         |
| 0.0465        | 77.5  | 620  | 13.2378         |
| 0.0474        | 78.75 | 630  | 13.2074         |
| 0.0468        | 80.0  | 640  | 13.1871         |
| 0.0466        | 81.25 | 650  | 13.2055         |
| 0.0459        | 82.5  | 660  | 13.1327         |
| 0.0466        | 83.75 | 670  | 13.1801         |
| 0.0485        | 85.0  | 680  | 13.1610         |
| 0.046         | 86.25 | 690  | 13.1439         |
| 0.0467        | 87.5  | 700  | 13.1114         |
| 0.0455        | 88.75 | 710  | 13.1123         |
| 0.0456        | 90.0  | 720  | 13.0635         |
| 0.0447        | 91.25 | 730  | 13.0997         |
| 0.0449        | 92.5  | 740  | 13.0704         |
| 0.0453        | 93.75 | 750  | 13.0531         |
| 0.0451        | 95.0  | 760  | 13.0432         |
| 0.0442        | 96.25 | 770  | 13.0311         |
| 0.0444        | 97.5  | 780  | 13.0329         |
| 0.0432        | 98.75 | 790  | 13.0491         |
| 0.0442        | 100.0 | 800  | 13.0221         |
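
The table shows training loss falling steadily while validation loss bottoms out early and then climbs, a pattern typical of overfitting. The sketch below picks out the relevant checkpoints from a handful of rows copied from the table (the selection includes the rows with the lowest and highest validation loss in the full table). It also estimates the training set size from the step and epoch columns, assuming a single device and no gradient accumulation, neither of which this card states.

```python
# (step, epoch, training_loss, validation_loss), copied from the table above.
rows = [
    (10, 1.25, 7.8184, 8.3355),
    (40, 5.0, 4.7586, 7.5748),
    (70, 8.75, 4.0095, 7.2750),
    (100, 12.5, 3.3187, 7.6310),
    (200, 25.0, 0.44, 14.5732),
    (240, 30.0, 0.1838, 15.6291),
    (400, 50.0, 0.0528, 14.1350),
    (800, 100.0, 0.0442, 13.0221),
]

best = min(rows, key=lambda r: r[3])   # lowest validation loss
worst = max(rows, key=lambda r: r[3])  # highest validation loss
final = rows[-1]

print(f"best:  step {best[0]}, epoch {best[1]}, val loss {best[3]}")
print(f"worst: step {worst[0]}, epoch {worst[1]}, val loss {worst[3]}")
print(f"final train/val gap: {final[3] - final[2]:.4f}")

# 10 steps cover 1.25 epochs, so each epoch is 8 optimizer steps.
steps_per_epoch = rows[0][0] / rows[0][1]

# Assumption: one device, no gradient accumulation, so one step = one batch of 32.
train_batch_size = 32
max_examples_per_epoch = int(steps_per_epoch) * train_batch_size
print(f"steps per epoch: {steps_per_epoch}")
print(f"training examples per epoch (upper bound): {max_examples_per_epoch}")
```

Under these assumptions the training set holds at most 256 examples, which would help explain why the model memorizes it within a few dozen epochs. The lowest validation loss occurs at step 70 (epoch 8.75), so a checkpoint from around that point, if one was saved, would likely generalize better than the final one.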


### Framework versions

- Transformers 4.39.1
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2