---
datasets:
- AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2
language:
- ar
base_model:
- UBC-NLP/ARBERTv2
pipeline_tag: fill-mask
---

**ARBERT_ArLAMA** is an Arabic language model obtained by further pre-training ARBERTv2 with a Masked Language Modeling (MLM) objective. The model leverages Knowledge Graphs (KGs) to capture semantic relations in Arabic text, aiming to improve vocabulary comprehension and performance on downstream tasks.

## Uses

### Direct Use

Filling masked tokens in Arabic text, particularly in contexts enriched with knowledge from KGs.

### Downstream Use

Can be further fine-tuned for Arabic NLP tasks that require semantic understanding, such as text classification or question answering (a minimal fine-tuning sketch appears at the end of this card).

## How to Get Started with the Model

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="AfnanTS/ARBERT_ArLAMA")
fill_mask("اللغة [MASK] مهمة جدا.")
```

## Training Details

### Training Data

Trained on the ArLAMA dataset, which is designed to represent Knowledge Graphs in natural language.

### Training Procedure

Continued pre-training of ARBERTv2 with Masked Language Modeling (MLM) to integrate KG-based knowledge.
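The exact training script is not included in this card. The snippet below is a minimal sketch of continued MLM pre-training with the Hugging Face `Trainer`, assuming the `AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2` dataset has a `train` split already tokenized for ARBERTv2; the hyperparameters are illustrative, not the ones used to produce this model.

```python
# Minimal sketch of continued MLM pre-training (illustrative hyperparameters,
# not the exact recipe used for ARBERT_ArLAMA).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/ARBERTv2")
model = AutoModelForMaskedLM.from_pretrained("UBC-NLP/ARBERTv2")

# Assumes the dataset is already tokenized for ARBERTv2 and has a "train" split.
dataset = load_dataset("AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2")

# Dynamic masking: 15% of tokens are masked on the fly for the MLM objective.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="arbert_arlama_mlm",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=5e-5,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    data_collator=collator,
)
trainer.train()
```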
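As a downstream-use example (referenced above), here is a hypothetical sketch of fine-tuning this model for Arabic text classification. The dataset name, label count, and hyperparameters are placeholders, not values associated with this model.

```python
# Hypothetical sketch: fine-tuning ARBERT_ArLAMA for Arabic text classification.
# Dataset name, label count, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("AfnanTS/ARBERT_ArLAMA")
model = AutoModelForSequenceClassification.from_pretrained(
    "AfnanTS/ARBERT_ArLAMA", num_labels=2  # placeholder label count
)

# Placeholder dataset assumed to have "text" and "label" columns.
dataset = load_dataset("your_arabic_classification_dataset")

def tokenize(batch):
    # Pad to a fixed length so the default data collator can batch examples.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="arbert_arlama_cls",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
)
trainer.train()
```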