irenegirard committed on
Commit 1ed25f6 · verified
1 Parent(s): 28600cd

Update README.md

Files changed (1)
  1. README.md +35 -10
README.md CHANGED
@@ -16,25 +16,50 @@ base_model:
16
 
17
  The Karibu project is a collaboration between PleIAs, Bibliothèque Sans Frontières (BSF) and Kajou. Our platform delivers comprehensive educational activities across six CEFR proficiency levels (A1 to C2), making quality language learning accessible to all, even in offline environments through microSD card deployment. By combining reading comprehension, interactive exercises, and personalized learning paths, Karibu creates an immersive educational experience that adapts to each learner's needs.
18
 
 
 
 
19
 
20
- ## Text Classification Model
 
 
 
21
 
22
- Our innovative approach begins with the creation of a rich, diverse corpus of educational content. Drawing from high-quality sources and utilizing advanced AI models, we've developed a sophisticated methodology for generating educational content based on French press articles available online (model to come). Each text undergoes a careful transformation process to create variations suitable for different proficiency levels, ensuring that learners at every stage have access to appropriate, engaging content. This systematic approach allows us to maintain high educational standards while scaling our content library effectively.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
23
 
24
  🔍 [Explore the full dataset](https://huggingface.co/datasets/PleIAs/KaribuAI/viewer/default)
25
 
26
- ## Cultural Relevance and Ethical Content Curation
27
 
28
- Understanding the importance of cultural context in language learning, we've implemented a robust content filtering system that ensures all materials are not only educationally sound but also culturally sensitive. Our platform covers diverse topics including solidarity, African literature and history, agriculture, tourism, and cross-cultural communication. This careful curation process, powered by Celadon, guarantees that learning materials resonate with our users' experiences and educational needs while maintaining the highest standards of ethical content delivery.
 
 
 
 
 
 
 
 
29
 
30
- 🤖 [Explore the Celadon model](https://huggingface.co/PleIAs/celadon)
31
 
 
32
 
33
- ## Advanced Level Classification
 
 
34
 
35
- Our classification system precisely evaluates and assigns appropriate difficulty levels to all educational content. The system utilizes DeBERTa (Decoding-enhanced BERT with Disentangled Attention) to capture the subtle linguistic features that distinguish different CEFR levels, from basic A1 constructions to advanced C2 language use. This precision allows for consistent, reliable assessment and appropriate content delivery, creating a foundational framework for personalized learning experiences.
36
 
37
- ## AI-Powered Tutoring Experience
38
- Karibu transforms traditional language learning through its innovative dual-component system. Each learning block combines two key elements: interactive H5P-formatted exercises (including quizzes, drag-and-drop activities, and multimedia content) and an AI tutoring system for essay evaluation. The AI tutor analyzes written submissions in detail, identifying grammatical errors, suggesting improvements, and providing targeted feedback. By analyzing user performance across both structured exercises and free-form writing, our platform creates personalized learning pathways that adapt to each student's progress. Unlike conventional systems limited to multiple-choice questions and scripted interactions, Karibu offers natural, dynamic learning experiences that develop real-world language skills. Our focus on practical, task-based learning modules ensures that educators can immediately apply their knowledge in real-world teaching contexts, creating a multiplier effect that benefits entire learning communities.
39
 
40
- Karibu not only provides cutting-edge language learning tools but also contributes to the democratization of education in geographically isolated areas. Our commitment to open solutions ensures frugality, transparency, and local adaptability, making Karibu a truly transformative force in language education.
 
16
 
17
  The Karibu project is a collaboration between PleIAs, Bibliothèque Sans Frontières (BSF) and Kajou. Our platform delivers comprehensive educational activities across six CEFR proficiency levels (A1 to C2), making quality language learning accessible to all, even in offline environments through microSD card deployment. By combining reading comprehension, interactive exercises, and personalized learning paths, Karibu creates an immersive educational experience that adapts to each learner's needs.
18
 
19
+ ## Karibu Language Level Classifier
20
+ Karibu is a DeBERTa-based classifier that automatically assigns CEFR language proficiency levels (A1-C2) to French educational content.
21
+ ## Model Characteristics
22
 
23
+ - Architecture: DeBERTa with multi-head classification
24
+ - Base Model: PleIAs/celadon
25
+ - Model Size: Fine-tuned from DeBERTa-v3-small
26
+ - Output: 6 classification levels (A1, A2, B1, B2, C1, C2)
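
The six-level output listed above can be illustrated with a minimal sketch that turns one vector of raw classifier scores into a CEFR label. This is not the model's actual inference code: the label order in `CEFR_LEVELS`, the helper name `logits_to_level`, and the example scores are assumptions made for illustration. Check the `id2label` mapping in the published model configuration before relying on any particular order.

```python
import math

# Assumed label order (index 0 = A1 ... index 5 = C2); verify against the model config.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_level(logits):
    """Return the most likely CEFR level and its probability (hypothetical helper)."""
    if len(logits) != len(CEFR_LEVELS):
        raise ValueError(f"expected {len(CEFR_LEVELS)} scores, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CEFR_LEVELS[best], probs[best]


# Made-up scores for one text; the third entry is the largest, so the label is B1.
level, confidence = logits_to_level([0.1, 0.3, 2.5, 1.0, -0.5, -1.2])
print(level, round(confidence, 3))
```

Returning the probability alongside the label makes it possible to flag low-confidence texts for manual review instead of trusting every prediction equally.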
27
 
28
+ 🤖 [Explore the Celadon model](https://huggingface.co/PleIAs/celadon)
29
+
30
+
31
+ ## Training Details
32
+
33
+ - Training Data: 9,000 synthetic samples
34
+
35
+ - Source: French press articles + Wikimedia content
36
+ - Processing: Sequential text simplification using an open-source model (to come)
37
+ - Validation: 1,000 samples per level manually verified by BSF experts
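
The per-level validation described above implies drawing a fixed-size batch of texts for each CEFR level. The README does not say how those batches were selected, so the following is only a hypothetical sketch of one way to do it. The function name `sample_per_level`, the `(text, level)` tuple layout, and the toy corpus are all assumptions.

```python
import random
from collections import defaultdict


def sample_per_level(samples, per_level, seed=0):
    """Draw up to `per_level` texts for each level from (text, level) pairs.

    Hypothetical helper: a seeded generator keeps the review batch reproducible.
    """
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for text, level in samples:
        by_level[level].append(text)

    picked = {}
    for level, texts in by_level.items():
        # Never ask for more items than the level actually has.
        k = min(per_level, len(texts))
        picked[level] = rng.sample(texts, k)
    return picked


# Toy corpus standing in for the synthetic samples (made-up data).
corpus = [(f"text {i}", level)
          for i, level in enumerate(["A1"] * 5 + ["B1"] * 3 + ["C2"] * 1)]

batch = sample_per_level(corpus, per_level=2)
for level in sorted(batch):
    print(level, len(batch[level]))
```

With the real corpus, `per_level` would be 1,000, and a level with fewer texts than that would simply contribute everything it has.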
38
+
39
+ ## Topic Coverage
40
+ - solidarity, geography, African literature, agriculture, tourism, cultural events, African history, geopolitics, communication
41
+ Topic Filtering: Meta-Llama-3-8B-Instruct for content categorization
42
+ Annotation Method:
43
 
44
  🔍 [Explore the full dataset](https://huggingface.co/datasets/PleIAs/KaribuAI/viewer/default)
45
 
 
46
 
47
+ ## Levels
48
+ Manual verification using CEFR framework criteria
49
+ Statistical validation using Louvain word-level classification
50
+
51
+ ## Technical Integration
52
+
53
+ - Deployment: Offline-capable via microSD cards
54
+ - Format: H5P-compatible for interactive exercises
55
+ - Input Processing: Handles various text types (academic writing, press articles, emails, letters, stories)
56
 
 
57
 
58
+ ## Collaborators
59
 
60
+ - PleIAs: Technical development
61
+ - Bibliothèque Sans Frontières (BSF): Educational expertise
62
+ - Kajou: Distribution platform
63
 
 
64
 
 
 
65