deepseek-coder-6.7b-instruct_En__translations_size_104_epochs_10_2024-06-22_06-06-55_3557999
This model is a fine-tuned version of deepseek-ai/deepseek-coder-6.7b-instruct on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.1144
- Accuracy: 0.047
- Chrf: 0.5
- Bleu: 0.39
- Sacrebleu: 0.4
- Rouge1: 0.501
- Rouge2: 0.239
- Rougel: 0.448
- Rougelsum: 0.497
- Meteor: 0.413
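These overlap metrics can be recomputed with the Hugging Face `evaluate` library. The sketch below is illustrative only: the prediction and reference strings are hypothetical placeholders, and the exact metric configuration used for this evaluation is not documented.

```python
# Illustrative metric computation with the `evaluate` library.
# The prediction/reference pairs below are hypothetical placeholders.
import evaluate

predictions = ["def add(a, b):\n    return a + b"]    # model outputs (placeholder)
references = [["def add(a, b):\n    return a + b"]]   # gold translations (placeholder)

chrf = evaluate.load("chrf").compute(predictions=predictions, references=references)
sacrebleu = evaluate.load("sacrebleu").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=[r[0] for r in references])
meteor = evaluate.load("meteor").compute(predictions=predictions, references=[r[0] for r in references])

print(chrf["score"], sacrebleu["score"], rouge["rougeL"], meteor["meteor"])
```

Note that `evaluate` reports chrF and SacreBLEU on a 0-100 scale, whereas the values above appear to be normalized to [0, 1].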
Model description
More information needed
Intended uses & limitations
More information needed
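Although the intended use is undocumented, the model name suggests English-instruction code translation. The minimal inference sketch below assumes the checkpoint is a full fine-tune loadable with `AutoModelForCausalLM`; the prompt is a hypothetical example.

```python
# Minimal inference sketch (assumes a full fine-tune; prompt is hypothetical).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "vdavidr/deepseek-coder-6.7b-instruct_En__translations_size_104_epochs_10_2024-06-22_06-06-55_3557999"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Translate this Python function to C: def add(a, b): return a + b"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```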
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 3407
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 4
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 104
- training_steps: 1040
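For reference, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows. This is a reconstruction from the reported values, not the original training script; the output directory is a placeholder. With 4 devices and a per-device batch size of 1, the effective batch size of 4 follows automatically.

```python
# Reconstructed TrainingArguments matching the reported hyperparameters.
# The output_dir is a hypothetical placeholder.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="deepseek-coder-6.7b-instruct-translations",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=1,   # x 4 GPUs = total train batch size 4
    per_device_eval_batch_size=1,    # x 4 GPUs = total eval batch size 4
    seed=3407,
    max_steps=1040,
    warmup_steps=104,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
)
```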
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Chrf | Bleu | Sacrebleu | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1338 | 4.0 | 104 | 1.1433 | 0.033 | 0.731 | 0.594 | 0.6 | 0.675 | 0.459 | 0.617 | 0.668 | 0.515 |
| 0.1034 | 8.0 | 208 | 1.2599 | 0.03 | 0.723 | 0.594 | 0.6 | 0.687 | 0.462 | 0.611 | 0.674 | 0.495 |
| 0.0974 | 12.0 | 312 | 1.3238 | 0.037 | 0.739 | 0.63 | 0.6 | 0.719 | 0.505 | 0.638 | 0.7 | 0.545 |
| 0.6468 | 16.0 | 416 | 1.9216 | 0.042 | 0.664 | 0.547 | 0.5 | 0.652 | 0.409 | 0.567 | 0.64 | 0.458 |
| 0.1566 | 20.0 | 520 | 2.1866 | 0.045 | 0.656 | 0.538 | 0.5 | 0.657 | 0.428 | 0.581 | 0.649 | 0.494 |
| 0.2056 | 24.0 | 624 | 2.5536 | 0.042 | 0.585 | 0.468 | 0.5 | 0.586 | 0.334 | 0.527 | 0.58 | 0.441 |
| 0.9677 | 28.0 | 728 | 2.8086 | 0.051 | 0.561 | 0.455 | 0.5 | 0.565 | 0.314 | 0.509 | 0.559 | 0.417 |
| 0.2163 | 32.0 | 832 | 2.9769 | 0.044 | 0.532 | 0.42 | 0.4 | 0.523 | 0.26 | 0.475 | 0.517 | 0.412 |
| 0.3617 | 36.0 | 936 | 3.1040 | 0.052 | 0.502 | 0.392 | 0.4 | 0.496 | 0.239 | 0.447 | 0.492 | 0.427 |
| 0.2153 | 40.0 | 1040 | 3.1144 | 0.047 | 0.5 | 0.39 | 0.4 | 0.501 | 0.239 | 0.448 | 0.497 | 0.413 |
Framework versions
- Transformers 4.37.0
- Pytorch 2.2.1+cu121
- Datasets 2.20.0
- Tokenizers 0.15.2
Base model
deepseek-ai/deepseek-coder-6.7b-instruct