## bert-base-uncased fine-tuned on the IMDB dataset

The evaluation (dev) set was created by taking 1,000 samples from the test set, which leaves 24,000 samples for testing.

```
DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 25000
    })
    dev: Dataset({
        features: ['text', 'label'],
        num_rows: 1000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 24000
    })
})
```
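
The card does not say how the 1,000 dev samples were chosen. As a rough illustration only, the sketch below assumes a seeded random hold-out over the 25,000 original test indices; the helper name `hold_out_dev` and the seed value are hypothetical.

```python
import random


def hold_out_dev(test_indices, n_dev=1000, seed=42):
    """Split test indices into (dev, remaining_test) with no overlap."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    shuffled = list(test_indices)
    rng.shuffle(shuffled)
    dev = sorted(shuffled[:n_dev])
    test = sorted(shuffled[n_dev:])
    return dev, test


# 1,000 dev + 24,000 test rows above imply a 25,000-row original test split.
dev_idx, test_idx = hold_out_dev(range(25000))
print(len(dev_idx), len(test_idx))  # 1000 24000
```

With the Hugging Face `datasets` library, a comparable split could probably be obtained with `Dataset.train_test_split(test_size=1000, seed=...)` on the test split, though that is an assumption about tooling rather than something the card confirms.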

## Parameters

```
max_sequence_length = 128
batch_size = 32
eval_steps = 100
learning_rate = 2e-05
num_train_epochs = 5
early_stopping_patience = 10
```
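
These settings are enough to reproduce the step counts that appear in the training log. The short calculation below is a sketch that assumes a single device and no gradient accumulation, neither of which the card states explicitly.

```python
import math

train_rows = 25000      # size of the train split
batch_size = 32
num_train_epochs = 5

# The last batch of each epoch is partial, hence the ceiling.
steps_per_epoch = math.ceil(train_rows / batch_size)
total_steps = steps_per_epoch * num_train_epochs

# Training stopped at global step 2700 according to the log.
epoch_at_stop = 2700 / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_at_stop, 2))  # 782 3910 3.45
```

The results agree with the `2700/3910` progress counter and the `'epoch': 3.45` entry reported for the training run.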

## Training Run

```
[2700/3910 1:11:43 < 32:09, 0.63 it/s, Epoch 3/5]

Step  Training Loss  Validation Loss  Accuracy  Precision  Recall    F1        Runtime    Samples Per Second
 100  No log         0.371974         0.845000  0.798942   0.917004  0.853911  15.256900  65.544000
 200  No log         0.349631         0.850000  0.873913   0.813765  0.842767  15.288600  65.408000
 300  No log         0.359376         0.845000  0.869281   0.807692  0.837356  15.303900  65.343000
 400  No log         0.307613         0.870000  0.851351   0.892713  0.871542  15.358400  65.111000
 500  0.364500       0.309362         0.856000  0.807018   0.931174  0.864662  15.326100  65.248000
 600  0.364500       0.302709         0.867000  0.881607   0.844130  0.862461  15.324400  65.255000
 700  0.364500       0.300102         0.871000  0.894168   0.838057  0.865204  15.474900  64.621000
 800  0.364500       0.383784         0.866000  0.833333   0.910931  0.870406  15.380100  65.019000
 900  0.364500       0.309934         0.874000  0.881743   0.860324  0.870902  15.358900  65.109000
1000  0.254600       0.332236         0.872000  0.894397   0.840081  0.866388  15.442700  64.756000
1100  0.254600       0.330807         0.871000  0.877847   0.858300  0.867963  15.410900  64.889000
1200  0.254600       0.352724         0.872000  0.925581   0.805668  0.861472  15.272800  65.476000
1300  0.254600       0.278529         0.881000  0.891441   0.864372  0.877698  15.408200  64.900000
1400  0.254600       0.291371         0.878000  0.854962   0.906883  0.880157  15.427400  64.820000
1500  0.208400       0.324827         0.869000  0.904232   0.821862  0.861082  15.338600  65.195000
1600  0.208400       0.377024         0.884000  0.898734   0.862348  0.880165  15.414500  64.874000
1700  0.208400       0.375274         0.885000  0.881288   0.886640  0.883956  15.367200  65.073000
1800  0.208400       0.378904         0.880000  0.877016   0.880567  0.878788  15.363900  65.088000
1900  0.208400       0.410517         0.874000  0.866534   0.880567  0.873494  15.324700  65.254000
2000  0.130800       0.404030         0.876000  0.888655   0.856275  0.872165  15.414200  64.875000
2100  0.130800       0.390763         0.883000  0.882353   0.880567  0.881459  15.341500  65.183000
2200  0.130800       0.417967         0.880000  0.875502   0.882591  0.879032  15.351300  65.141000
2300  0.130800       0.390974         0.883000  0.898520   0.860324  0.879007  15.396100  64.952000
2400  0.130800       0.479739         0.874000  0.856589   0.894737  0.875248  15.460500  64.681000
2500  0.098400       0.473215         0.875000  0.883576   0.860324  0.871795  15.392200  64.968000
2600  0.098400       0.532294         0.872000  0.889362   0.846154  0.867220  15.364100  65.087000
2700  0.098400       0.536664         0.881000  0.880325   0.878543  0.879433  15.351100  65.142000

TrainOutput(global_step=2700, training_loss=0.2004435383832013, metrics={'train_runtime': 4304.5331, 'train_samples_per_second': 0.908, 'total_flos': 7258763970957312, 'epoch': 3.45})
```
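
The run ended at step 2700 of 3910, which looks like early stopping. The card does not say which metric was monitored, so the sketch below assumes accuracy and replays a generic patience rule (similar in spirit to the `EarlyStoppingCallback` in Transformers) over the logged values. The function name `early_stop_step` is hypothetical.

```python
def early_stop_step(scores, patience, eval_steps):
    """Return (stop_step, best_step) for a higher-is-better metric."""
    best = float("-inf")
    best_step = None
    bad_evals = 0  # consecutive evaluations without improvement
    for i, score in enumerate(scores, start=1):
        step = i * eval_steps
        if score > best:
            best, best_step, bad_evals = score, step, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return step, best_step
    return None, best_step


# Accuracy column of the log, one value per evaluation (steps 100..2700).
accuracy = [
    0.845, 0.850, 0.845, 0.870, 0.856, 0.867, 0.871, 0.866, 0.874,
    0.872, 0.871, 0.872, 0.881, 0.878, 0.869, 0.884, 0.885, 0.880,
    0.874, 0.876, 0.883, 0.880, 0.883, 0.874, 0.875, 0.872, 0.881,
]

stop_step, best_step = early_stop_step(accuracy, patience=10, eval_steps=100)
print(stop_step, best_step)  # 2700 1700
```

Under this assumption the best accuracy (0.885) falls at step 1700 and the tenth non-improving evaluation falls at step 2700, matching `global_step=2700`. The F1 column also peaks at step 1700, so it would be equally consistent. Validation loss is lowest at step 1300, and monitoring it with the same patience would have stopped the run at step 2300.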

## Classification Report

```
              precision    recall  f1-score   support

           0       0.90      0.87      0.89     11994
           1       0.87      0.90      0.89     12006

    accuracy                           0.89     24000
   macro avg       0.89      0.89      0.89     24000
weighted avg       0.89      0.89      0.89     24000
```
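
As a reading aid, the sketch below shows how the `macro avg` and `weighted avg` rows relate to the per-class rows. It starts from the two-decimal values printed in the report, so it can only approximate the reported averages, which were presumably computed from unrounded scores. The helper names are hypothetical.

```python
# Per-class rows copied from the report (already rounded to two decimals).
report = {
    "0": {"precision": 0.90, "recall": 0.87, "f1": 0.89, "support": 11994},
    "1": {"precision": 0.87, "recall": 0.90, "f1": 0.89, "support": 12006},
}


def macro_avg(metric):
    """Unweighted mean over classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(metric):
    """Mean over classes, weighted by the number of samples per class."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


total_support = sum(row["support"] for row in report.values())
for metric in ("precision", "recall", "f1"):
    print(metric, macro_avg(metric), weighted_avg(metric))
```

Because the two classes are almost perfectly balanced (11,994 vs. 12,006 samples), the macro and weighted averages nearly coincide.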