---
license: apache-2.0
base_model: distilbert/distilgpt2
tags:
- generated_from_trainer
model-index:
- name: tiny-gpt2-br
  results: []
---

# tiny-gpt2-br

This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.2128
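
For a causal language model, an evaluation loss is often easier to interpret as perplexity. The sketch below converts the reported loss, assuming (this card does not state it) that the value is the mean per-token cross-entropy in nats, which is the usual convention for this kind of fine-tuning run.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 3.2128

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the reported loss corresponds to a perplexity of roughly 24.85 on the evaluation set.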

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0007
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 5
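
The learning-rate settings above (peak 0.0007, linear schedule, 500 warmup steps) can be illustrated with a small standalone function. This is a sketch of the general shape of a linear schedule with warmup, not the Trainer's own code. The name `linear_lr` is hypothetical, and `TOTAL_STEPS` is an assumption: a rough estimate from the step and epoch columns of the results table, since the card does not report the exact number of optimizer steps.

```python
BASE_LR = 7e-4        # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 49_000  # ASSUMPTION: rough estimate, not stated in the card


def linear_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Hypothetical helper: linear warmup to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 24_750, 49_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.2e}")
```

With these values the rate peaks at step 500 and then decays for the rest of the run, so most of training happens well below the peak rate.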

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 5.8959        | 0.1   | 1000  | 4.8993          |
| 4.6543        | 0.2   | 2000  | 4.4073          |
| 4.329         | 0.31  | 3000  | 4.1635          |
| 4.1446        | 0.41  | 4000  | 4.0202          |
| 4.0133        | 0.51  | 5000  | 3.9119          |
| 3.9236        | 0.61  | 6000  | 3.8271          |
| 3.8622        | 0.72  | 7000  | 3.7583          |
| 3.7928        | 0.82  | 8000  | 3.7028          |
| 3.7379        | 0.92  | 9000  | 3.6607          |
| 3.672         | 1.02  | 10000 | 3.6198          |
| 3.5527        | 1.12  | 11000 | 3.5873          |
| 3.5428        | 1.23  | 12000 | 3.5617          |
| 3.514         | 1.33  | 13000 | 3.5328          |
| 3.4959        | 1.43  | 14000 | 3.4995          |
| 3.4762        | 1.53  | 15000 | 3.4816          |
| 3.4621        | 1.63  | 16000 | 3.4536          |
| 3.4392        | 1.74  | 17000 | 3.4368          |
| 3.4149        | 1.84  | 18000 | 3.4150          |
| 3.4006        | 1.94  | 19000 | 3.3950          |
| 3.3313        | 2.04  | 20000 | 3.3951          |
| 3.228         | 2.15  | 21000 | 3.3820          |
| 3.223         | 2.25  | 22000 | 3.3694          |
| 3.2234        | 2.35  | 23000 | 3.3470          |
| 3.215         | 2.45  | 24000 | 3.3350          |
| 3.2037        | 2.55  | 25000 | 3.3257          |
| 3.2265        | 2.66  | 26000 | 3.3122          |
| 3.2012        | 2.76  | 27000 | 3.2943          |
| 3.1827        | 2.86  | 28000 | 3.2816          |
| 3.1801        | 2.96  | 29000 | 3.2706          |
| 3.0519        | 3.06  | 30000 | 3.2998          |
| 3.0003        | 3.17  | 31000 | 3.2847          |
| 3.0091        | 3.27  | 32000 | 3.2764          |
| 3.0007        | 3.37  | 33000 | 3.2682          |
| 3.0013        | 3.47  | 34000 | 3.2586          |
| 2.9951        | 3.58  | 35000 | 3.2452          |
| 2.9943        | 3.68  | 36000 | 3.2452          |
| 2.9941        | 3.78  | 37000 | 3.2311          |
| 2.9839        | 3.88  | 38000 | 3.2174          |
| 2.9861        | 3.98  | 39000 | 3.2149          |
| 2.8311        | 4.09  | 40000 | 3.2509          |
| 2.8113        | 4.19  | 41000 | 3.2432          |
| 2.8074        | 4.29  | 42000 | 3.2450          |
| 2.8123        | 4.39  | 43000 | 3.2359          |
| 2.8086        | 4.5   | 44000 | 3.2245          |
| 2.8028        | 4.6   | 45000 | 3.2261          |
| 2.8046        | 4.7   | 46000 | 3.2204          |
| 2.7978        | 4.8   | 47000 | 3.2148          |
| 2.7982        | 4.9   | 48000 | 3.2128          |
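
One way to read the table is to compare training and validation loss near the end of each epoch. The sketch below does this for five rows copied from the table (the last logged step before each epoch boundary, plus the final row). The variable names are illustrative, and the choice of rows is a convenience rather than part of the original training setup.

```python
# (epoch, step, training_loss, validation_loss), copied from the table above.
rows = [
    (0.92, 9000, 3.7379, 3.6607),
    (1.94, 19000, 3.4006, 3.3950),
    (2.96, 29000, 3.1801, 3.2706),
    (3.98, 39000, 2.9861, 3.2149),
    (4.90, 48000, 2.7982, 3.2128),
]

# Gap between validation and training loss at each selected step.
gaps = [round(val - train, 4) for _, _, train, val in rows]

# Row with the lowest validation loss among the selected rows.
best = min(rows, key=lambda r: r[3])

for (epoch, step, _, _), gap in zip(rows, gaps):
    print(f"epoch {epoch:.2f} (step {step}): val - train = {gap:+.4f}")
print(f"lowest validation loss in this subset: {best[3]} at step {best[1]}")
```

On these rows the gap widens every epoch while validation loss still improves slightly, which suggests that additional epochs at these settings would yield diminishing returns on held-out data.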


### Framework versions

- Transformers 4.39.1
- Pytorch 2.0.1+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2