irenegirard committed on
Commit e893961 · verified · 1 Parent(s): 0865da6

Update README.md

  1. README.md +14 -18
README.md CHANGED
@@ -24,48 +24,44 @@ The Karibu project is a collaboration between pleIAs, Bibliothèque sans fronti
 
  ## Karibu Language Level Classifier
  Karibu is a DeBERTa-based classifier that automatically assigns CEFR language proficiency levels (A1-C2) to French educational content.
- Model Characteristics
 
  ## Architecture: DeBERTa with multi-head classification
- Base Model: PleIAs/celadon
- Model Size: Fine-tuned from DeBERTa-v3-small
- Output: 6 classification levels (A1, A2, B1, B2, C1, C2)
+ - Base Model: PleIAs/celadon
+ - Model Size: Fine-tuned from DeBERTa-v3-small
+ - Output : 6 classification levels (A1, A2, B1, B2, C1, C2)
 
  🤖 [Explore the Celadon model](https://huggingface.co/PleIAs/celadon)
 
 
  ## Training Details
 
- Training Data: 9,000 synthetic samples
+ - Training Data: 9,000 synthetic samples
 
- Source: French press articles + Wikimedia content
- Processing: Sequential text simplification using an open source model (to come)
- Validation: 1,000 samples per level manually verified by BSF experts
+ - Source: French press articles + Wikimedia content
+ - Processing: Sequential text simplification using an open source model (to come)
+ - Validation: 1,000 samples per level manually verified by BSF experts
 
  ## Topics Coverage:
  - solidarity, geography, African literature, agriculture, tourism, cultural events, African history, geopolitics, communication
- Topic Filtering: Meta-Llama-3-8B-Instruct for content categorization
- Annotation Method:
+ - Topic Filtering: Meta-Llama-3-8B-Instruct for content categorization
 
  🔍 [Explore the full dataset](https://huggingface.co/datasets/PleIAs/KaribuAI/viewer/default)
 
 
  ## levels
- Manual verification using CEFR framework criteria
- Statistical validation using Louvain word-level classification
+ - Manual verification using CEFR framework criteria
+ - Statistical validation using Louvain word-level classification
 
  ## Technical Integration
 
- Deployment: Offline-capable via microSD cards
- Format: H5P-compatible for interactive exercises
- Input Processing: Handles various text types (academic writing, press articles, emails, letters, stories)
+ - Deployment: Offline-capable via microSD cards
+ - Format: H5P-compatible for interactive exercises
+ - Input Processing: Handles various text types (academic writing, press articles, emails, letters, stories)
 
 
  ## Collaborators
 
- PleIAs: Technical development
- Bibliothèque Sans Frontières (BSF): Educational expertise
- Kajou: Distribution platform
+ PleIAs: Technical development, Bibliothèque Sans Frontières (BSF): Educational expertise, Kajou: Distribution platform
 
 
 
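
The README lists six output levels (A1 to C2) under Architecture. As a minimal sketch, the snippet below shows how the logits of a single six-way classification head could be turned into a CEFR label with a confidence score. The label order, the helper names `softmax` and `predict_level`, and the example logits are assumptions made for illustration; the README does not give the model's actual id-to-label mapping or say how its multiple heads are combined.

```python
import math

# Assumed label order for illustration only; the real id-to-label
# mapping of the classifier is not stated in the README.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_level(logits):
    """Return (CEFR label, probability) for one six-way head's logits."""
    if len(logits) != len(CEFR_LEVELS):
        raise ValueError(f"expected {len(CEFR_LEVELS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CEFR_LEVELS[best], probs[best]


# Hypothetical logits, not real model output.
level, confidence = predict_level([0.1, 0.2, 2.5, 0.3, 0.0, -1.0])
print(level, round(confidence, 3))
```

In practice the logits would come from the fine-tuned model rather than a hand-written list, and the label list should be taken from the model's own configuration.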