metadata

language:
  - es
license: apache-2.0
datasets:
  - eriktks/conll2002
metrics:
  - precision
  - recall
  - f1
  - accuracy
pipeline_tag: token-classification

Model Name: bert-finetuned-ner-1

This is a BERT model fine-tuned for Named Entity Recognition (NER).

Model Description

This model was fine-tuned for the Named Entity Recognition (NER) task on the CoNLL-2002 dataset.

First, the dataset is pre-processed so that it can be fed to the model. This is done with the 🤗 Transformers library and the BERT tokenizer. The model is then fine-tuned from bert-base-cased using the 🤗 AutoModelForTokenClassification class.
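The pre-processing step can be sketched as follows. BERT's tokenizer splits words into sub-word tokens, so the word-level NER labels have to be re-aligned to the tokens. This is a minimal sketch, not the authors' actual code: the function name `align_labels_with_tokens` is hypothetical, and `word_ids` is assumed to be the list a fast 🤗 tokenizer returns from `BatchEncoding.word_ids()` (`None` for special tokens such as [CLS] and [SEP]). Labelling only the first sub-token of each word is one common strategy and is an assumption here.

```python
from typing import List, Optional

# PyTorch's cross-entropy loss ignores targets equal to -100 by default.
IGNORE_INDEX = -100


def align_labels_with_tokens(
    word_labels: List[int], word_ids: List[Optional[int]]
) -> List[int]:
    """Expand word-level label ids to one label id per sub-word token.

    Hypothetical helper: special tokens and non-first sub-tokens of a word
    get IGNORE_INDEX so that they do not contribute to the loss.
    """
    aligned = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:
            # Special token ([CLS], [SEP], padding).
            aligned.append(IGNORE_INDEX)
        elif word_id != previous_word:
            # First sub-token of a new word keeps the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Continuation sub-token of the same word.
            aligned.append(IGNORE_INDEX)
        previous_word = word_id
    return aligned


# Three words, the second one split into two sub-tokens.
example = align_labels_with_tokens([3, 0, 5], [None, 0, 1, 1, 2, None])
print(example)
```

In a full pipeline this helper would typically be applied to every example with `Dataset.map`, after tokenizing with `is_split_into_words=True`.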

Finally, the model is trained and evaluated, reporting the metrics necessary to assess its performance (Precision, Recall, F1 and Accuracy).
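To make the metrics concrete, here is a minimal pure-Python sketch of entity-level precision, recall and F1 over BIO tag sequences. It assumes the scores are computed per entity span (as libraries such as seqeval do); the card does not state which implementation was used, and the helper names `extract_entities` and `precision_recall_f1` are hypothetical.

```python
from typing import List, Set, Tuple


def extract_entities(tags: List[str]) -> Set[Tuple[str, int, int]]:
    """Return (type, start, end_exclusive) spans from one BIO tag sequence.

    An I- tag that does not continue a span of the same type is treated
    leniently as the start of a new span.
    """
    entities = set()
    start, current_type = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        if tag == "O" or tag.startswith("B-") or tag[2:] != current_type:
            if current_type is not None:
                entities.add((current_type, start, i))
            start, current_type = None, None
        if tag != "O" and current_type is None:
            start, current_type = i, tag[2:]
    return entities


def precision_recall_f1(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Entity-level scores: a prediction counts only on an exact span match."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= {(idx, *e) for e in extract_entities(g)}
        pred_spans |= {(idx, *e) for e in extract_entities(p)}
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
print(precision_recall_f1(gold, pred))
```

In this toy example the model finds one of the two gold entities and predicts nothing wrong, so precision is perfect while recall is one half.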

A summary of the tests that were run can be found at: https://docs.google.com/spreadsheets/d/1lI7skNIvRurwq3LA5ps7JFK5TxToEx4s7Kaah3ezyQc/edit?usp=sharing

The model can be found at: https://huggingface.co./paulrojasg/bert-finetuned-ner-1

GitHub repository: https://github.com/paulrojasg/nlp_4th_workshop

Training

Training Details

Epochs: 10
Learning Rate: 2e-05
Weight Decay: 0.01
Batch Size (Train): 16
Batch Size (Eval): 8
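
The hyperparameters above could be passed to the 🤗 Trainer roughly as follows. This is a sketch under assumptions: the batch sizes are taken to be per-device values, the output directory name is a guess based on the model name, and any setting not listed above is left at the library default.

```python
# Hyperparameters from the list above, keyed by the corresponding
# transformers.TrainingArguments parameter names.
HYPERPARAMS = {
    "num_train_epochs": 10,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 16,  # assumed to be a per-device value
    "per_device_eval_batch_size": 8,    # assumed to be a per-device value
}


def build_training_arguments(output_dir: str = "bert-finetuned-ner-1"):
    """Create TrainingArguments from HYPERPARAMS.

    The import is done lazily so that the dictionary above can be inspected
    without transformers installed. output_dir is a hypothetical default.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The resulting object would then be handed to `Trainer(model=..., args=build_training_arguments(), ...)` together with the tokenized datasets.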

Training Metrics

Epoch	Training Loss	Validation Loss	Precision	Recall	F1 Score	Accuracy
1	0.1729	0.1462	0.6739	0.7376	0.7043	0.9590
2	0.0750	0.1432	0.7030	0.7684	0.7342	0.9625
3	0.0496	0.1394	0.7725	0.7983	0.7852	0.9667
4	0.0351	0.1460	0.7678	0.8017	0.7844	0.9672
5	0.0255	0.1521	0.7656	0.8081	0.7863	0.9678
6	0.0182	0.1709	0.7573	0.8015	0.7787	0.9667
7	0.0134	0.1753	0.7794	0.8159	0.7973	0.9691
8	0.0110	0.1806	0.7674	0.8125	0.7893	0.9685
9	0.0080	0.1938	0.7724	0.8139	0.7926	0.9683
10	0.0072	0.1938	0.7722	0.8139	0.7925	0.9685
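
As a quick sanity check of the table, the short script below recomputes F1 as the harmonic mean of precision and recall for each epoch and looks up the epoch with the highest F1 and the one with the lowest validation loss. The values are copied from the table; the variable names are illustrative only.

```python
# (epoch, training loss, validation loss, precision, recall, F1, accuracy)
rows = [
    (1, 0.1729, 0.1462, 0.6739, 0.7376, 0.7043, 0.9590),
    (2, 0.0750, 0.1432, 0.7030, 0.7684, 0.7342, 0.9625),
    (3, 0.0496, 0.1394, 0.7725, 0.7983, 0.7852, 0.9667),
    (4, 0.0351, 0.1460, 0.7678, 0.8017, 0.7844, 0.9672),
    (5, 0.0255, 0.1521, 0.7656, 0.8081, 0.7863, 0.9678),
    (6, 0.0182, 0.1709, 0.7573, 0.8015, 0.7787, 0.9667),
    (7, 0.0134, 0.1753, 0.7794, 0.8159, 0.7973, 0.9691),
    (8, 0.0110, 0.1806, 0.7674, 0.8125, 0.7893, 0.9685),
    (9, 0.0080, 0.1938, 0.7724, 0.8139, 0.7926, 0.9683),
    (10, 0.0072, 0.1938, 0.7722, 0.8139, 0.7925, 0.9685),
]

# F1 = 2PR / (P + R); the largest gap to the reported F1 should be
# no more than rounding error, since P and R are rounded to 4 decimals.
max_f1_gap = max(abs(2 * p * r / (p + r) - f1) for _, _, _, p, r, f1, _ in rows)

best_f1_epoch = max(rows, key=lambda row: row[5])[0]
lowest_val_loss_epoch = min(rows, key=lambda row: row[2])[0]

print(f"largest F1 discrepancy: {max_f1_gap:.5f}")
print(f"best F1 at epoch {best_f1_epoch}")
print(f"lowest validation loss at epoch {lowest_val_loss_epoch}")
```

Validation loss is lowest at epoch 3 and rises afterwards while training loss keeps falling, whereas F1 peaks at epoch 7, so the choice of checkpoint depends on which criterion is used.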

Authors

Made by:

Paul Rodrigo Rojas Guerrero
Jose Luis Hincapie Bucheli
Sebastián Idrobo Avirama

With help from:

Raúl Ernesto Gutiérrez