---
base_model: arcee-ai/Meraj-Mini
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
license: apache-2.0
language:
- ar
- en
model-index:
- name: MawaredT1
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: wis-k/instruction-following-eval
split: train
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 41.99
name: averaged accuracy
source:
url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FMawaredT1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: SaylorTwift/bbh
split: test
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 31.9
name: normalized accuracy
source:
url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FMawaredT1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: lighteval/MATH-Hard
split: test
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 14.58
name: exact match
source:
url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FMawaredT1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
split: train
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 11.3
name: acc_norm
source:
url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FMawaredT1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 18.68
name: acc_norm
source:
url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FMawaredT1
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 41.31
name: accuracy
source:
url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FMawaredT1
name: Open LLM Leaderboard
---
![image](./image.webp)
# Bilingual Assistant Model Card
## Overview
This bilingual language model is designed to support seamless text generation and understanding in both Arabic (ar) and English (en). Fine-tuned from the `arcee-ai/Meraj-Mini` base model, it offers robust multilingual capabilities optimized for various applications such as conversational agents, content creation, and multilingual text analysis.
### Key Highlights
- **Multilingual Proficiency:** Designed to handle complex linguistic nuances in both Arabic and English, ensuring high-quality outputs in both languages.
- **Performance Optimization:** Achieved roughly 2x faster training using the memory-efficient optimizations of the [Unsloth](https://github.com/unslothai/unsloth) framework together with the Hugging Face TRL library.
- **Transformer-Based Architecture:** Builds on Qwen2 transformer layers to deliver strong performance in text generation and inference (see the quick-start sketch below).
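A minimal quick-start sketch follows. It assumes the repository ID `Daemontatox/MawaredT1` (taken from the leaderboard links in this card) and uses default generation settings rather than tuned values:

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub for text generation.
# The repo ID below is assumed from this card's leaderboard links.
generator = pipeline(
    "text-generation",
    model="Daemontatox/MawaredT1",
    torch_dtype="auto",
    device_map="auto",
)

# The model accepts prompts in either Arabic or English.
print(generator("اشرح مفهوم التعلم الآلي باختصار.", max_new_tokens=128)[0]["generated_text"])
print(generator("Briefly explain machine learning.", max_new_tokens=128)[0]["generated_text"])
```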
## Development Details
- **Developer:** Daemontatox
- **License:** Released under the Apache-2.0 license, ensuring open accessibility and flexibility for a wide range of use cases.
- **Base Model:** The model is a fine-tuned variant of `arcee-ai/Meraj-Mini`.
- **Frameworks Used:**
- [Unsloth](https://github.com/unslothai/unsloth): Enabled faster and more efficient training.
- Hugging Face TRL Library: Provided tools for reinforcement learning fine-tuning, enhancing model responsiveness and accuracy.
## Training Process
The fine-tuning process was conducted with a focus on:
- **Data Diversity:** Leveraged a bilingual corpus to ensure comprehensive language understanding across both supported languages.
- **Optimized Hardware Utilization:** Implemented Unsloth's accelerated training methods, significantly reducing resource consumption and training time.
- **Reinforcement Learning:** Used Hugging Face's TRL library to fine-tune the model's decision-making and response generation, particularly for conversational and contextual understanding (an illustrative setup is sketched after this list).
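The exact corpus and hyperparameters are not published, so the following is only an illustrative sketch of the Unsloth + TRL supervised fine-tuning pattern described above; the dataset name, LoRA settings, and training arguments are placeholders:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model through Unsloth's memory-efficient 4-bit path.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="arcee-ai/Meraj-Mini",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (common defaults, not the values actually used for this model).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Placeholder for the (unreleased) bilingual Arabic/English corpus.
dataset = load_dataset("my-bilingual-corpus", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=1000,
        output_dir="outputs",
    ),
)
trainer.train()
```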
## Applications
This model is suited for a variety of real-world applications, including:
1. **Conversational Agents:** Powering bilingual chatbots and virtual assistants for customer support and personal use (see the chat-template sketch after this list).
2. **Content Generation:** Assisting in drafting multilingual articles, social media posts, and creative writing.
3. **Translation Support:** Providing context-aware translations and summaries between Arabic and English.
4. **Education:** Enhancing learning platforms by offering bilingual educational content and interactive learning experiences.
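Since the base model is Qwen2-derived, its tokenizer ships a chat template, so a conversational call can look like this sketch (repo ID assumed as above, prompt purely illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Daemontatox/MawaredT1"  # assumed from this card's leaderboard links
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Format a bilingual chat turn with the tokenizer's built-in template.
messages = [
    {"role": "system", "content": "You are a helpful bilingual (Arabic/English) assistant."},
    {"role": "user", "content": "لخص الفوائد الرئيسية للنماذج اللغوية ثنائية اللغة."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```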
## Future Directions
Plans for extending the model's capabilities include:
- **Additional Language Support:** Exploring fine-tuning for additional languages.
- **Domain-Specific Training:** Specializing the model for industries such as healthcare, legal, and technical writing.
- **Optimization for Edge Devices:** Investigating quantization techniques to deploy the model on resource-constrained hardware such as mobile devices and IoT platforms (a hedged 4-bit example is sketched below).
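As a concrete sketch of that last point, a 4-bit NF4 load with bitsandbytes (an assumption about one possible deployment path, not a shipped configuration) could look like:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical 4-bit quantization config for memory-constrained hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/MawaredT1",  # assumed repo ID
    quantization_config=bnb_config,
    device_map="auto",
)
```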
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/Daemontatox__MawaredT1-details)!
Summarized results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FMawaredT1&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
| Metric |Value (%)|
|-------------------|--------:|
|**Average** | 26.63|
|IFEval (0-Shot) | 41.99|
|BBH (3-Shot) | 31.90|
|MATH Lvl 5 (4-Shot)| 14.58|
|GPQA (0-shot) | 11.30|
|MuSR (0-shot) | 18.68|
|MMLU-PRO (5-shot) | 41.31|