Tevfik-istanbullu
/

Arabic_Named_Entity_Recognition_NER_Model

Token Classification


Tevfik istanbullu committed on Nov 12, 2024

Commit 547994d · verified · 1 Parent(s): 7580568

Update README.md

Files changed (1)

README.md +43 -3


@@ -1,3 +1,43 @@
----
-license: mit
----

+---
+license: mit
+language:
+- ar
+metrics:
+- accuracy
+---
+### Arabic Named Entity Recognition (NER) Model
+# Overview
+This named entity recognition (NER) model is designed specifically for Arabic. It was built from scratch, without pretrained models, and recognizes entities such as company names, personal names, and cities.
+The model is trained using TensorFlow and works with a custom dataset split into training, validation, and test sets.
+# Model Highlights
+- Language: Arabic
+- Framework: TensorFlow
+- Data Format: Text files (txt format) with train, validation, and test splits
+# Entities Recognized:
+- ORG: Organizations (e.g., company names)
+- LOC: Locations (e.g., cities, countries)
+- PERS: Persons (e.g., names, excluding common/popular names)
+- MISC: Miscellaneous (e.g., other identifiable private information)
+- Intended Use: Arabic text processing, personal data anonymization, data extraction.
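The anonymization use case listed above can be sketched as a small post-processing step. This is a minimal illustration, not the repository's code: the function name `anonymize` is hypothetical, and it assumes the model returns one plain tag per token (`O` for non-entities, otherwise `ORG`, `LOC`, `PERS`, or `MISC`) without a BIO prefix scheme, which the README does not specify.

```python
def anonymize(tokens, tags):
    """Replace each run of entity tokens with a single [TAG] placeholder.

    Assumes one plain tag per token, with "O" marking non-entity tokens.
    """
    output = []
    previous_tag = "O"
    for token, tag in zip(tokens, tags):
        if tag == "O":
            # Non-entity tokens pass through unchanged.
            output.append(token)
        elif tag != previous_tag:
            # First token of a new entity run: emit one placeholder.
            output.append(f"[{tag}]")
        # Later tokens of the same run are dropped.
        previous_tag = tag
    return " ".join(output)


# Hypothetical tagged sentence ("Ahmed Ali works in Cairo").
tokens = ["يعمل", "أحمد", "علي", "في", "القاهرة"]
tags = ["O", "PERS", "PERS", "O", "LOC"]
masked = anonymize(tokens, tags)
```

Because the tags carry no begin/inside markers, two adjacent entities of the same type collapse into one placeholder. That is acceptable for masking but would lose information for extraction.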
+# Dataset and Preprocessing
+The dataset used in this model is split into three parts:
+- Training Set: For model training.
+- Validation Set: For tuning model hyperparameters and monitoring overfitting.
+- Test Set: For evaluating final model performance.
+Each sample in the dataset carries entity labels, so the model is trained with supervised learning.
+Preprocessing consists of tokenization, normalization, and conversion of the entity labels into a TensorFlow-compatible format.
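The preprocessing steps named above might look roughly like the following. This is a sketch under stated assumptions: the README does not list the actual normalization rules, tokenizer, or label scheme, so the diacritic stripping, alef unification, whitespace tokenization, and the `PAD`/`O` tag ids below are all illustrative choices, and every function name is hypothetical.

```python
import re

# Assumed normalization rules (not confirmed by the README).
DIACRITICS = re.compile(r"[\u064B-\u0652]")          # fathatan through sukun
TATWEEL = "\u0640"                                   # elongation character
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda / hamza
BARE_ALEF = "\u0627"


def normalize(text):
    """Strip diacritics and tatweel, and unify alef variants."""
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    return ALEF_VARIANTS.sub(BARE_ALEF, text)


def tokenize(text):
    """Whitespace tokenization after normalization (an assumption)."""
    return normalize(text).split()


# Hypothetical label vocabulary: id 0 is reserved for padding.
TAGS = ["PAD", "O", "ORG", "LOC", "PERS", "MISC"]
TAG_TO_ID = {tag: index for index, tag in enumerate(TAGS)}


def encode_tags(tags, max_len):
    """Map tag strings to integer ids, truncated or padded to max_len."""
    ids = [TAG_TO_ID[tag] for tag in tags][:max_len]
    return ids + [TAG_TO_ID["PAD"]] * (max_len - len(ids))


encoded = encode_tags(["PERS", "O"], max_len=4)
```

Fixed-length integer sequences like `encoded` can be stacked into arrays and fed to a TensorFlow model. Reserving id 0 for padding makes it straightforward to mask padded positions during training and evaluation.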
+# Model Evaluation
+The model achieved an accuracy of 0.9675 on the test set, indicating strong performance in recognizing and classifying entities in Arabic text.
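The README does not say how the accuracy figure was computed. One common reading is token-level accuracy over non-padding positions, sketched below with toy data. The function name, the `PAD` and `O` tag names, and the sample sequences are assumptions made for illustration and are not taken from the repository.

```python
def token_accuracy(true_tags, predicted_tags, ignore=("PAD",)):
    """Share of correctly tagged tokens, skipping tags listed in `ignore`."""
    pairs = [
        (true, predicted)
        for true_seq, predicted_seq in zip(true_tags, predicted_tags)
        for true, predicted in zip(true_seq, predicted_seq)
        if true not in ignore
    ]
    correct = sum(true == predicted for true, predicted in pairs)
    return correct / len(pairs)


# Toy example: the only error is one missed LOC token.
y_true = [["O", "PERS", "O", "LOC"], ["ORG", "O", "O", "PAD"]]
y_pred = [["O", "PERS", "O", "O"], ["ORG", "O", "O", "PAD"]]

overall = token_accuracy(y_true, y_pred)                            # 6 of 7 tokens
entities_only = token_accuracy(y_true, y_pred, ignore=("PAD", "O"))  # 2 of 3 tokens
```

In NER data most tokens are usually tagged `O`, so overall token accuracy can look high even when entity tokens are missed. Reporting accuracy on entity tokens only, or per-entity precision and recall, gives a fuller picture.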