---
library_name: span-marker
tags:
- span-marker
- token-classification
- ner
- named-entity-recognition
- generated_from_span_marker_trainer
metrics:
- precision
- recall
- f1
widget: []
pipeline_tag: token-classification
language:
- ar
---

# SpanMarker

This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model for Arabic Named Entity Recognition (NER).

## Model Details

Further details are available at https://iahlt.github.io/arabic_ner/

### Model Description
- **Model Type:** SpanMarker
- **Maximum Sequence Length:** 512 tokens
- **Maximum Entity Length:** 150 words
- **Language:** Arabic (`ar`)

### Tags
```
ANG - Any named language (Hebrew, Arabic, English, French, etc.)
DUC - Any branded product: objects, vehicles, medicines, foods, etc. (Apple, BMW, Coca-Cola, etc.)
EVE - Any named event (Olympics, World Cup, etc.)
FAC - Any named facility, building, airport, etc. (Eiffel Tower, Ben Gurion Airport, etc.)
GPE - Geo-political entities: nation states, counties, cities, etc.
INFORMAL - Informal language (slang)
LOC - Non-GPE locations: geographical regions, mountain ranges, bodies of water, etc.
ORG - Companies, agencies, institutions, political parties, etc.
PER - People, including fictional ones
TIMEX - Time expressions: absolute or relative dates or periods
TTL - Any named title, position, profession, etc. (President, Prime Minister, etc.)
WOA - Any named work of art (books, movies, songs, etc.)
MISC - Miscellaneous entities that do not belong to the previous categories
```
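
Downstream code often needs only a subset of these tags. The sketch below shows one possible way to keep the tag set in a lookup table and filter predictions with it. `TAG_DESCRIPTIONS` and `keep_labels` are hypothetical names introduced here for illustration, the sample entities are hand-written, and the code assumes each prediction is a dictionary with a `label` key.

```python
# Hypothetical lookup table built from the tag list above.
TAG_DESCRIPTIONS = {
    "ANG": "Named language",
    "DUC": "Branded product",
    "EVE": "Named event",
    "FAC": "Named facility",
    "GPE": "Geo-political entity",
    "INFORMAL": "Informal language (slang)",
    "LOC": "Non-GPE location",
    "ORG": "Organization",
    "PER": "Person",
    "TIMEX": "Time expression",
    "TTL": "Title, position, or profession",
    "WOA": "Work of art",
    "MISC": "Miscellaneous entity",
}


def keep_labels(entities, wanted):
    """Return only the entities whose label is in `wanted`."""
    unknown = set(wanted) - set(TAG_DESCRIPTIONS)
    if unknown:
        raise ValueError(f"Unknown tags: {sorted(unknown)}")
    return [e for e in entities if e["label"] in wanted]


# Hand-written sample in the assumed shape of the predictions (not model output).
sample = [
    {"span": "القاهرة", "label": "GPE", "score": 0.99},
    {"span": "محمد صلاح", "label": "PER", "score": 0.98},
    {"span": "كأس العالم", "label": "EVE", "score": 0.95},
]
people_and_places = keep_labels(sample, {"PER", "GPE"})
print([(e["label"], e["span"]) for e in people_and_places])
```

Validating the requested tags against the table catches typos such as `PERS` early instead of silently returning an empty list.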

## Uses

### Direct Use for Inference

```python
from span_marker import SpanMarkerModel

# Download the model from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("iahlt/xlm-roberta-base-ar-ner-flat")

# Replace the placeholder with the Arabic text to analyze
text = "<text>"
entities = model.predict(text)
print(entities)
```
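
The printed predictions can be post-processed before use. The following is a minimal sketch, assuming that `predict` on a single string returns a list of dictionaries with the keys `span`, `label`, `score`, `char_start_index`, and `char_end_index`. The helper name `summarize_entities` and the 0.5 threshold are illustrative choices, and the entity list is a hand-written stand-in so the snippet runs without downloading the model.

```python
def summarize_entities(entities, min_score=0.5):
    """Drop low-confidence predictions and order the rest by text position."""
    kept = [e for e in entities if e["score"] >= min_score]
    kept.sort(key=lambda e: e["char_start_index"])
    return [(e["label"], e["span"], e["score"]) for e in kept]


# Hand-written stand-in for `model.predict(text)`; these are not real model scores.
text = "زار محمد القاهرة"
entities = [
    {"span": "القاهرة", "label": "GPE", "score": 0.97,
     "char_start_index": 9, "char_end_index": 16},
    {"span": "محمد", "label": "PER", "score": 0.91,
     "char_start_index": 4, "char_end_index": 8},
    {"span": "زار", "label": "MISC", "score": 0.20,
     "char_start_index": 0, "char_end_index": 3},
]

for label, span, score in summarize_entities(entities):
    print(f"{label}\t{span}\t{score:.2f}")
```

The character offsets make it possible to recover each span from the original string with `text[start:end]`, which helps when highlighting entities in a user interface.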

## Training Details

### Framework Versions
- Python: 3.10.12
- SpanMarker: 1.5.0
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.16.1
- Tokenizers: 0.15.1
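
To compare a local environment against the versions listed above, a sketch along the following lines may help. It relies only on the standard-library `importlib.metadata` module. The distribution names in `EXPECTED` (for example `span_marker` and `torch`) are assumed to match the names the packages are installed under, and `version_report` is a hypothetical helper.

```python
from importlib import metadata

# Versions from the list above. Python itself is omitted, and the "+cu121"
# CUDA build tag is dropped from the PyTorch version.
EXPECTED = {
    "span_marker": "1.5.0",
    "transformers": "4.35.2",
    "torch": "2.1.0",
    "datasets": "2.16.1",
    "tokenizers": "0.15.1",
}


def version_report(expected=EXPECTED, lookup=metadata.version):
    """Map each package to (expected, installed or None, whether they match)."""
    report = {}
    for name, want in expected.items():
        try:
            # Strip local build tags such as "+cu121" before comparing.
            have = lookup(name).split("+")[0]
        except metadata.PackageNotFoundError:
            have = None
        report[name] = (want, have, have == want)
    return report


for name, (want, have, ok) in version_report().items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {want}, installed {have} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the list records the training environment, not a strict requirement.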

## Citation

### BibTeX

```
@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}
```