BERT SMALL + Typo Detection
BERT SMALL fine-tuned on GitHub Typo Corpus for typo detection (using NER style)
Details of the downstream task (Typo detection as NER)
Dataset: GitHub Typo Corpus
Fine-tuning script: NER example script provided by Hugging Face
Metrics on test set
Metric | Score |
---|---|
F1 | 89.12 |
Precision | 93.82 |
Recall | 84.87 |
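As a quick sanity check, the reported F1 can be recomputed from the precision and recall in the table, assuming the usual definition of F1 as their harmonic mean (the card does not say how the metrics were computed, so treat this as an illustration only):

```python
# Values copied from the metrics table above.
precision = 93.82
recall = 84.87

# F1 as the harmonic mean of precision and recall (assumed definition).
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 2))  # 89.12
```

The result matches the F1 row of the table.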
Model in action
Fast usage with pipelines
from transformers import pipeline
typo_checker = pipeline(
"ner",
model="mrm8488/bert-small-finetuned-typo-detection",
tokenizer="mrm8488/bert-small-finetuned-typo-detection"
)
result = typo_checker("here there is an error in coment")
result[1:-1]
# Output:
[{'entity': 'ok', 'score': 0.9021041989326477, 'word': 'here'},
{'entity': 'ok', 'score': 0.7975626587867737, 'word': 'there'},
{'entity': 'ok', 'score': 0.8596242070198059, 'word': 'is'},
{'entity': 'ok', 'score': 0.7071516513824463, 'word': 'an'},
{'entity': 'ok', 'score': 0.943381130695343, 'word': 'error'},
{'entity': 'ok', 'score': 0.8047608733177185, 'word': 'in'},
{'entity': 'ok', 'score': 0.8240702152252197, 'word': 'come'},
{'entity': 'typo', 'score': 0.5004884004592896, 'word': '##nt'}]
It works! We typed coment instead of comment.
Let's try another example:
result = typo_checker("Adddd validation midelware")
result[1:-1]
# Output:
[{'entity': 'ok', 'score': 0.7128152847290039, 'word': 'add'},
{'entity': 'typo', 'score': 0.5388424396514893, 'word': '##dd'},
{'entity': 'ok', 'score': 0.94792640209198, 'word': 'validation'},
{'entity': 'typo', 'score': 0.5839331746101379, 'word': 'mid'},
{'entity': 'ok', 'score': 0.5195121765136719, 'word': '##el'},
{'entity': 'ok', 'score': 0.7222476601600647, 'word': '##ware'}]
Yeah! We mistyped Add and middleware.
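The pipeline labels each WordPiece token separately, so a misspelled word can come back as several pieces with mixed labels (for example `mid`, `##el`, `##ware`). A minimal sketch of one way to get word-level results is shown here. It assumes that a word should count as a typo if any of its pieces is tagged `typo`. The helper name `merge_wordpieces` is hypothetical and not part of the model or of `transformers`. The sample input is copied from the second output above, so the snippet runs without downloading the model.

```python
def merge_wordpieces(tokens):
    """Group WordPiece tokens into words.

    Returns a list of (word, is_typo) tuples. A word is flagged when
    any of its pieces carries the 'typo' label (an assumed rule).
    """
    words = []
    for tok in tokens:
        piece = tok["word"]
        is_typo = tok["entity"] == "typo"
        if piece.startswith("##") and words:
            # Continuation piece: glue it onto the previous word.
            prev_text, prev_flag = words[-1]
            words[-1] = (prev_text + piece[2:], prev_flag or is_typo)
        else:
            # First piece of a new word.
            words.append((piece, is_typo))
    return words


# Output of the second example above, copied verbatim.
sample = [
    {'entity': 'ok', 'score': 0.7128152847290039, 'word': 'add'},
    {'entity': 'typo', 'score': 0.5388424396514893, 'word': '##dd'},
    {'entity': 'ok', 'score': 0.94792640209198, 'word': 'validation'},
    {'entity': 'typo', 'score': 0.5839331746101379, 'word': 'mid'},
    {'entity': 'ok', 'score': 0.5195121765136719, 'word': '##el'},
    {'entity': 'ok', 'score': 0.7222476601600647, 'word': '##ware'},
]

merged = merge_wordpieces(sample)
print(merged)
# [('adddd', True), ('validation', False), ('midelware', True)]
```

The words come back lowercased because that is how the tokens appear in the pipeline output. The "any piece" rule is deliberately simple. A stricter variant could also require a minimum `score` before flagging a word.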
Created by Manuel Romero/@mrm8488
Made with ♥ in Spain