Commit 7887897 by bourdoiscatie
Parent(s): 7fa7b52
Update README.md

README.md CHANGED
@@ -17,24 +17,24 @@ widget:
 - text: "Assurés de disputer l'Euro 2024 en Allemagne l'été prochain (du 14 juin au 14 juillet) depuis leur victoire aux Pays-Bas, les Bleus ont fait le nécessaire pour avoir des certitudes. Avec six victoires en six matchs officiels et un seul but encaissé, Didier Deschamps a consolidé les acquis de la dernière Coupe du monde. Les joueurs clés sont connus : Kylian Mbappé, Aurélien Tchouameni, Antoine Griezmann, Ibrahima Konaté ou encore Mike Maignan."
 library_name: transformers
 pipeline_tag: token-classification
-co2_eq_emissions:
+co2_eq_emissions: 20
 ---
 
 
-# Camembert-base-
+# Camembert-base-frenchNER_4entities
 
 ## Model Description
 
 We present **Camembert-base-frenchNER_4entities**, which is a [CamemBERT base](https://huggingface.co/camembert-base) fine-tuned for the Name Entity Recognition task for the French language on four French NER datasets for 4 entities (LOC, PER, ORG, MISC).
 All these datasets were concatenated and cleaned into a single dataset that we called [frenchNER_4entities](https://huggingface.co/datasets/CATIE-AQ/frenchNER_4entities).
-There are a total of **384,773** rows, of which **328,757** are for training, **24,131** for validation and **31,885** for testing.
+There are a total of **384,773** rows, of which **328,757** are for training, **24,131** for validation and **31,885** for testing.
 Our methodology is described in a blog post available in [English](https://blog.vaniila.ai/en/NER_en/) or [French](https://blog.vaniila.ai/NER/).
 
 
 
 ## Dataset
 
-The dataset used is [
+The dataset used is [frenchNER_4entities](https://huggingface.co/datasets/CATIE-AQ/frenchNER_4entities), which represents ~385k sentences labeled in 4 categories :
 * PER: personality ;
 * LOC: location ;
 * ORG: organization ;
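As a quick sanity check on the row counts quoted in the README text above, the following sketch confirms that the three splits add up to the stated total and reports each split's share. The numbers come directly from the diff; the variable and function names (`splits`, `split_shares`) are illustrative only and are not part of the repository.

```python
# Split sizes as quoted in the README text (rows per split).
splits = {"train": 328_757, "validation": 24_131, "test": 31_885}
stated_total = 384_773  # total quoted in the same sentence


def split_shares(sizes):
    """Return each split's share of the total as a percentage (2 decimals)."""
    total = sum(sizes.values())
    return {name: round(100 * n / total, 2) for name, n in sizes.items()}


# The three splits should add up to the stated total.
total = sum(splits.values())
assert total == stated_total

shares = split_shares(splits)
for name, pct in shares.items():
    print(f"{name:<10} {splits[name]:>7,} rows  {pct:5.2f}%")
```

The total is consistent with the "~385k sentences" figure given in the Dataset section, with roughly 85% of rows used for training.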