Model Overview

A fine-tuned DistilBERT model for Named Entity Recognition (NER) in bias detection.

Model Details

We fine-tuned distilbert-base-uncased on the vector-institute/NMB-Plus-Named-Entities dataset.

How to Get Started with the Model

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_name = "vector-institute/nmb-plus-bias-ner-bert"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# BIO tagging scheme: O = outside, B-BIAS = start of a biased span, I-BIAS = inside one
label_list = ["O", "B-BIAS", "I-BIAS"]
id2label = {i: label for i, label in enumerate(label_list)}
label2id = {label: i for i, label in enumerate(label_list)}

model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    id2label=id2label, 
    label2id=label2id
)


# Build a token-classification pipeline from the loaded model and tokenizer
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer)

text = "Fox News reported that Joe Biden met with CNN executives."
predictions = ner_pipeline(text)
print(predictions)
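The pipeline returns one prediction per token, so a common follow-up step is to merge B-BIAS and I-BIAS tokens into contiguous text spans. The following is a minimal sketch of that post-processing, not part of the released model. The helper name `group_bias_spans` is hypothetical, and the sample predictions are hand-written for illustration rather than real model output. It assumes each prediction carries the `entity`, `index`, `start`, and `end` keys that the token-classification pipeline emits when no aggregation strategy is set.

```python
def group_bias_spans(text, predictions):
    """Merge token-level B-BIAS / I-BIAS predictions into text spans."""
    spans = []      # finished (start, end) character offsets
    current = None  # span being built: [start, end, last_token_index]
    for pred in predictions:
        label, idx = pred["entity"], pred["index"]
        # An I-BIAS token extends the open span only if it directly follows it.
        continues = (
            label == "I-BIAS"
            and current is not None
            and idx == current[2] + 1
        )
        if continues:
            current[1], current[2] = pred["end"], idx
        elif label in ("B-BIAS", "I-BIAS"):
            # Start a new span. A stray I-BIAS is treated as a span start,
            # which is a lenient choice made for this sketch.
            if current is not None:
                spans.append((current[0], current[1]))
            current = [pred["start"], pred["end"], idx]
        else:
            # Any other label (for example "O") closes the open span.
            if current is not None:
                spans.append((current[0], current[1]))
            current = None
    if current is not None:
        spans.append((current[0], current[1]))
    return [text[start:end] for start, end in spans]


# Hand-written, hypothetical predictions (NOT real model output).
sample_text = "The corrupt senator gave a shameful speech."
sample_predictions = [
    {"entity": "B-BIAS", "index": 2, "start": 4, "end": 11},
    {"entity": "I-BIAS", "index": 3, "start": 12, "end": 19},
    {"entity": "B-BIAS", "index": 6, "start": 27, "end": 35},
]
bias_spans = group_bias_spans(sample_text, sample_predictions)
print(bias_spans)
```

Checking token-index adjacency matters because the pipeline typically omits tokens labeled O, so two entries that sit next to each other in the output list are not necessarily adjacent in the text.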

Training Hyperparameters

Training regime: we used the following training arguments:

from transformers import TrainingArguments

training_args = TrainingArguments(
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=32,
    num_train_epochs=10,
    weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    output_dir="./results",
    logging_dir="./logs",
    logging_steps=50,
    group_by_length=True,
)

Evaluation

We split the data into train (80%), validation (10%), and test (10%) sets.
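As an illustration of an 80/10/10 split, here is a minimal sketch in plain Python. The exact procedure used for this model (shuffling, seed, stratification) is not stated, so the function name `split_80_10_10` and the seed value are assumptions made for the example.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle a sequence and split it into train/validation/test lists."""
    items = list(items)
    rng = random.Random(seed)  # hypothetical seed; the source does not state one
    rng.shuffle(items)
    # Integer arithmetic avoids floating-point surprises in the cut points.
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


train, val, test = split_80_10_10(range(1000))
print(len(train), len(val), len(test))
```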

Results

We used common classification metrics:

precision
recall
f1-score

Overall Results:

Average	Precision	Recall	F1-Score	Support
Macro Avg	0.6405	0.5589	0.5922	48710
Weighted Avg	0.9330	0.9418	0.9366	48710

Per-class Results:

Label	Precision	Recall	F1-Score	Support
O	0.9615	0.9792	0.9703	45921
B-BIAS	0.5314	0.4183	0.4681	930
I-BIAS	0.4286	0.2792	0.3381	1859
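The gap between the macro and weighted averages follows from class imbalance: the O label accounts for most of the support, so it dominates the weighted average, while the macro average treats all three labels equally. The sketch below recomputes both averages from the per-class figures reported above. The dictionary layout and function names are chosen for illustration, and the results should agree with the reported averages up to rounding.

```python
# Per-class results copied from the table above.
per_class = {
    "O":      {"precision": 0.9615, "recall": 0.9792, "f1": 0.9703, "support": 45921},
    "B-BIAS": {"precision": 0.5314, "recall": 0.4183, "f1": 0.4681, "support": 930},
    "I-BIAS": {"precision": 0.4286, "recall": 0.2792, "f1": 0.3381, "support": 1859},
}


def macro_avg(metric):
    """Unweighted mean over classes: every label counts equally."""
    return sum(row[metric] for row in per_class.values()) / len(per_class)


def weighted_avg(metric):
    """Mean over classes weighted by support: frequent labels dominate."""
    total = sum(row["support"] for row in per_class.values())
    return sum(row[metric] * row["support"] for row in per_class.values()) / total


for metric in ("precision", "recall", "f1"):
    print(f"{metric}: macro={macro_avg(metric):.4f} weighted={weighted_avg(metric):.4f}")
```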

Environmental Impact

Total energy consumption for fine-tuning: 0.032804 kWh.

Local CO₂ emissions: approximately 3.12 grams of CO₂ equivalent.

License

CC BY 4.0 (Creative Commons Attribution 4.0): Allows sharing and adaptation with proper credit.
