---
language:
- en
license: apache-2.0
tags:
- generated_from_keras_callback
datasets:
- Babelscape/multinerd
metrics:
- seqeval
base_model: distilbert-base-uncased
pipeline_tag: token-classification
widget:
- text: After months of meticulous review and analysis, I am proud to present a study that explores the deep connections between Epstein-Barr virus (EBV), Long COVID and Myalgic Encephalomyelitis.
  example_title: Example 1
- text: The boy is, of course, Cupid. The image of a cupid riding a lion was a common theme in classical and Renaissance art, representing the Virgilian maxim Amor vincit omnia – love conquers all.
  example_title: Example 2
- text: Billionaire Charlie Munger, Warren Buffett's right-hand man, dies at 99.
  example_title: Example 3
model-index:
- name: i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A
  results:
  - task:
      type: token-classification
      name: ner
    dataset:
      name: Babelscape/multinerd
      type: Babelscape/multinerd
      split: test
    metrics:
    - type: seqeval
      value: 0.9053582270795385
      name: precision
    - type: seqeval
      value: 0.9303178007408852
      name: recall
    - type: seqeval
      value: 0.9176683270188665
      name: f1
    - type: seqeval
      value: 0.9863554498955407
      name: accuracy
---

# i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co./distilbert-base-uncased) on the English subset (all named entity types) of the [Babelscape/multinerd](https://huggingface.co./datasets/Babelscape/multinerd) dataset.
It achieves the following results on the validation set:
- Train Loss: 0.0163
- Validation Loss: 0.1024
- Train Precision: 0.8763
- Train Recall: 0.8862
- Train F1: 0.8812
- Train Accuracy: 0.9750
- Epoch: 2

## Model description

[distilbert-base-uncased-finetuned-ner-exp_A](https://huggingface.co./i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A) is a Named Entity Recognition model fine-tuned from [distilbert-base-uncased](https://huggingface.co./distilbert-base-uncased).
This model is uncased, so it makes no distinction between "sarah" and "Sarah".

## Training and evaluation data

This model has been evaluated on the English subset of the test set of [Babelscape/multinerd](https://huggingface.co./datasets/Babelscape/multinerd).

### Evaluation results

| metric    |    value |
|:----------|---------:|
| precision | 0.905358 |
| recall    | 0.930318 |
| f1        | 0.917668 |
| accuracy  | 0.986355 |

Per-tag results on the test set:

| metric/tag | ANIM     | BIO      | CEL      | DIS      | EVE      | FOOD     | INST     | LOC      | MEDIA    | MYTH     | ORG      | PER      | PLANT    | TIME     | VEHI     |
|:-----------|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|
| precision  | 0.667262 | 0.666667 | 0.508197 | 0.662324 | 0.896277 | 0.637809 | 0.642857 | 0.964137 | 0.931915 | 0.638889 | 0.941176 | 0.99033  | 0.558043 | 0.756579 | 0.735294 |
| recall     | 0.698878 | 0.75     | 0.756098 | 0.803689 | 0.957386 | 0.637809 | 0.75     | 0.963656 | 0.956332 | 0.71875  | 0.962224 | 0.992023 | 0.752796 | 0.795848 | 0.78125  |
| f1         | 0.682704 | 0.705882 | 0.607843 | 0.72619  | 0.925824 | 0.637809 | 0.692308 | 0.963897 | 0.943966 | 0.676471 | 0.951584 | 0.991176 | 0.640952 | 0.775717 | 0.757576 |
| number     | 3208     | 16       | 82       | 1518     | 704      | 1132     | 24       | 24048    | 916      | 64       | 6618     | 10530    | 1788     | 578      | 64       |

## Training procedure

All scripts for training can be found in this [GitHub repository](https://github.com/i-be-snek/rise-assignment-ner-finetune).
Training used early stopping, monitoring `val_loss`.
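The exact callback configuration lives in the linked repository; as a minimal sketch, early stopping on `val_loss` is typically wired up in Keras like this, where the `patience` value is an assumption rather than a documented setting:

```python
import tensorflow as tf

# Sketch of an early-stopping setup monitoring validation loss.
# The patience value is assumed; the real configuration is in the
# GitHub repository linked above.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=2,
    restore_best_weights=True,
)

# Passed to model.fit(...) via the `callbacks` argument, e.g.:
# model.fit(train_set, validation_data=val_set, epochs=3, callbacks=[early_stopping])
```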
### Training hyperparameters

The following hyperparameters were used during training:
- optimizer:

  ```python
  {
      "name": "AdamWeightDecay",
      "learning_rate": 2e-05,
      "decay": 0.0,
      "beta_1": 0.9,
      "beta_2": 0.999,
      "epsilon": 1e-07,
      "amsgrad": False,
      "weight_decay_rate": 0.0,
  }
  ```
- training_precision: `float32`

### Training results

| Train Loss | Validation Loss | Train Precision | Train Recall | Train F1 | Train Accuracy | Epoch |
|:----------:|:---------------:|:---------------:|:------------:|:--------:|:--------------:|:-----:|
| 0.0709     | 0.0710          | 0.8563          | 0.8875       | 0.8716   | 0.9735         | 0     |
| 0.0295     | 0.0851          | 0.8743          | 0.8835       | 0.8789   | 0.9748         | 1     |
| 0.0163     | 0.1024          | 0.8763          | 0.8862       | 0.8812   | 0.9750         | 2     |

Per-entity results for each training epoch:

Epoch 0

| Named Entity | precision | recall    | f1       |
|:------------:|:---------:|:---------:|:--------:|
| ANIM         | 0.699150  | 0.620124  | 0.657270 |
| BIO          | 0.480000  | 0.782609  | 0.595041 |
| CEL          | 0.815385  | 0.876033  | 0.844622 |
| DIS          | 0.628939  | 0.806709  | 0.706818 |
| EVE          | 0.898876  | 0.924855  | 0.911681 |
| FOOD         | 0.624774  | 0.602266  | 0.613314 |
| INST         | 0.467391  | 0.741379  | 0.573333 |
| LOC          | 0.967354  | 0.969634  | 0.968493 |
| MEDIA        | 0.911227  | 0.939856  | 0.925320 |
| MYTH         | 0.941860  | 0.771429  | 0.848168 |
| ORG          | 0.924471  | 0.937629  | 0.931003 |
| PER          | 0.988699  | 0.990918  | 0.989807 |
| PLANT        | 0.622521  | 0.781333  | 0.692944 |
| TIME         | 0.743902  | 0.738499  | 0.741191 |
| VEHI         | 0.785714  | 0.791367  | 0.788530 |

Epoch 1

| Named Entity | precision | recall    | f1       |
|:------------:|:---------:|:---------:|:--------:|
| ANIM         | 0.701040  | 0.747340  | 0.723450 |
| BIO          | 0.422222  | 0.826087  | 0.558824 |
| CEL          | 0.729167  | 0.867769  | 0.792453 |
| DIS          | 0.731099  | 0.749794  | 0.740328 |
| EVE          | 0.864865  | 0.924855  | 0.893855 |
| FOOD         | 0.652865  | 0.572632  | 0.610122 |
| INST         | 0.871795  | 0.586207  | 0.701031 |
| LOC          | 0.968255  | 0.966143  | 0.967198 |
| MEDIA        | 0.946346  | 0.918312  | 0.932118 |
| MYTH         | 0.914894  | 0.819048  | 0.864322 |
| ORG          | 0.906064  | 0.943582  | 0.924442 |
| PER          | 0.990389  | 0.988367  | 0.989377 |
| PLANT        | 0.625889  | 0.743556  | 0.679667 |
| TIME         | 0.755981  | 0.765133  | 0.760529 |
| VEHI         | 0.737500  | 0.848921  | 0.789298 |

Epoch 2

| Named Entity | precision | recall    | f1       |
|:------------:|:---------:|:---------:|:--------:|
| ANIM         | 0.730443  | 0.687057  | 0.708086 |
| BIO          | 0.330882  | 0.978261  | 0.494505 |
| CEL          | 0.798561  | 0.917355  | 0.853846 |
| DIS          | 0.738108  | 0.750894  | 0.744446 |
| EVE          | 0.904899  | 0.907514  | 0.906205 |
| FOOD         | 0.628664  | 0.623184  | 0.625912 |
| INST         | 0.533333  | 0.551724  | 0.542373 |
| LOC          | 0.967915  | 0.973997  | 0.970946 |
| MEDIA        | 0.949627  | 0.913824  | 0.931382 |
| MYTH         | 0.910000  | 0.866667  | 0.887805 |
| ORG          | 0.924920  | 0.934136  | 0.929505 |
| PER          | 0.989506  | 0.991020  | 0.990263 |
| PLANT        | 0.637648  | 0.742222  | 0.685972 |
| TIME         | 0.766355  | 0.794189  | 0.780024 |
| VEHI         | 0.818182  | 0.647482  | 0.722892 |

### Framework versions

- Transformers 4.35.2
- TensorFlow 2.14.0
- Datasets 2.15.0
- Tokenizers 0.15.0
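## How to use

This card does not include a usage snippet from the training repository; below is a minimal inference sketch using the standard 🤗 Transformers `pipeline` API, where `aggregation_strategy="simple"` merges wordpiece tokens into whole entity spans:

```python
from transformers import pipeline

# Load the fine-tuned NER model from the Hub. The checkpoint was trained
# with TensorFlow/Keras, so TensorFlow must be installed unless PyTorch
# weights are also available in the model repository.
ner = pipeline(
    "token-classification",
    model="i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

print(ner("Billionaire Charlie Munger, Warren Buffett's right-hand man, dies at 99."))
```

Each returned dict contains the entity group (e.g. `PER`), a confidence score, and the character span of the match in the input text.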