iceman2434/xlm-roberta-base-fake-news-detection-tl

metadata

datasets:
  - jcblaise/fake_news_filipino
  - SEACrowd/ph_fake_news_corpus
language:
  - tl
  - en
base_model:
  - FacebookAI/xlm-roberta-base
pipeline_tag: text-classification
tags:
  - fake-news-detection
  - text-classification
  - tagalog
  - filipino
metrics:
  - accuracy
  - f1
  - precision
  - recall

Tagalog Fake News Detection Model

Overview

This model fine-tunes XLM-RoBERTa base (FacebookAI/xlm-roberta-base) to classify Tagalog/Filipino news text as real or fake. It reaches 95.46% accuracy on the evaluation set.
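A minimal usage sketch, assuming the checkpoint loads through the standard Hugging Face `transformers` text-classification pipeline (suggested by the `pipeline_tag` in the metadata, not confirmed in this card). The helper name `classify` is illustrative. The label strings the model returns are not documented here, so inspect the output rather than assuming which label means fake.

```python
MODEL_ID = "iceman2434/xlm-roberta-base-fake-news-detection-tl"


def classify(texts, model_id=MODEL_ID):
    """Return one {'label': ..., 'score': ...} dict per input text."""
    # Imported lazily so the module loads even where transformers is absent.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True keeps long articles within the model's input limit.
    return classifier(list(texts), truncation=True)
```

Calling `classify(["Halimbawang balita na susuriin ng modelo."])` (a made-up sample sentence) downloads the weights on first use and returns a list with one prediction per input.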

Dataset

Total Size: 18,522 samples
Composition: 50/50 split of real and fake news
Languages: Filipino, English

Dataset Split

Train Set: ~12,968 samples
Validation Set: ~2,784 samples
Test Set: ~2,770 samples
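The reported sizes work out to roughly a 70/15/15 split. The short check below only does that arithmetic on the figures listed above. The card does not say how the split was produced (for example, random or stratified by label), so nothing about the procedure is assumed.

```python
# Split sizes as reported in this card (approximate, per the "~" prefix).
splits = {"train": 12_968, "validation": 2_784, "test": 2_770}

total = sum(splits.values())
shares = {name: count / total for name, count in splits.items()}

print(f"total: {total}")  # total: 18522
for name, share in shares.items():
    # train: 70.0%, validation: 15.0%, test: 15.0%
    print(f"{name}: {share:.1%}")
```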

Performance Metrics (on Evaluation Set)

Accuracy: 95.46%
F1 Score: 95.40%
Precision: 95.40%
Recall: 95.40%
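For readers unfamiliar with these metrics, here is a sketch of how they are conventionally computed for a binary classifier from confusion-matrix counts. It assumes "fake" is the positive class and uses made-up counts. The card does not state which averaging scheme or positive class was used, so this may differ from how the figures above were obtained.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration only; these are NOT the model's results.
metrics = binary_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```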

Data Sources

The model was trained on a combined dataset from two primary sources:

- Fake News Filipino Dataset (jcblaise/fake_news_filipino): 3,206 rows used
- Philippine Fake News Corpus (SEACrowd/ph_fake_news_corpus): 15,312 rows used out of 22,458 available
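A quick arithmetic check on the row counts above, using only numbers from this card. The two sources sum to 18,518 rows, four fewer than the 18,522 total given in the Dataset section. The card does not explain the difference, nor how the subset of the Philippine Fake News Corpus was selected.

```python
# Row counts as reported in this card.
fake_news_filipino_used = 3_206
ph_corpus_used = 15_312
ph_corpus_available = 22_458

combined = fake_news_filipino_used + ph_corpus_used
ph_share_used = ph_corpus_used / ph_corpus_available

print(f"combined rows: {combined}")                # combined rows: 18518
print(f"corpus share used: {ph_share_used:.1%}")   # corpus share used: 68.2%
```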