datasets:
  - jcblaise/fake_news_filipino
  - SEACrowd/ph_fake_news_corpus
language:
  - tl
  - en
base_model:
  - FacebookAI/xlm-roberta-base
pipeline_tag: text-classification
tags:
  - fake-news-detection
  - text-classification
  - tagalog
  - filipino
metrics:
  - accuracy
  - f1
  - precision
  - recall

Tagalog Fake News Detection Model

Overview

This project fine-tunes the XLM-RoBERTa base model (FacebookAI/xlm-roberta-base) to detect fake news in Tagalog/Filipino text. The model reaches 95.46% accuracy on the evaluation set.
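A minimal inference sketch, assuming the model is published on the Hugging Face Hub and loads through the standard `transformers` text-classification pipeline. The repository ID below is a hypothetical placeholder, and the label names returned depend on the model's configuration, which this card does not state.

```python
# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-username/your-model-id"


def classify(texts, model_id=MODEL_ID):
    """Return one {'label': ..., 'score': ...} dict per input text."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True cuts inputs that exceed the model's maximum sequence length.
    return classifier(texts, truncation=True)


# Example call (downloads the model on first use):
# for result in classify(["Halimbawang balita na susuriin."]):
#     print(result["label"], round(result["score"], 4))
```

Check the model's `id2label` mapping before interpreting the labels, since which label corresponds to fake news is an assumption until confirmed there.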

Dataset

  • Total Size: 18,522 samples
  • Composition: 50/50 split of real and fake news
  • Languages: Filipino, English

Dataset Split

  • Train Set: ~12,968 samples
  • Validation Set: ~2,784 samples
  • Test Set: ~2,770 samples
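As a quick check on the split, the sketch below computes each subset's share of the total from the approximate counts listed above. The counts work out to roughly a 70/15/15 split; the variable names are illustrative only.

```python
# Approximate split sizes as listed in this card.
split_sizes = {"train": 12968, "validation": 2784, "test": 2770}

total = sum(split_sizes.values())

# Share of each split as a percentage of the total, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in split_sizes.items()}

for name, pct in shares.items():
    print(f"{name}: {split_sizes[name]} samples ({pct}%)")
```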

Performance Metrics (on Evaluation Set)

  • Accuracy: 95.46%
  • F1 Score: 95.40%
  • Precision: 95.40%
  • Recall: 95.40%
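For reference, the four metrics can be computed from binary predictions as sketched below. This is a plain-Python illustration and not the evaluation code used for this model; it assumes label 1 is the positive class, and the card does not state which class was treated as positive or whether the scores were averaged across classes.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example with four samples (not data from this model).
metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```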

Data Sources

The model was trained on a combined dataset from two primary sources:

  1. Fake News Filipino Dataset
    • 3,206 rows used
  2. Philippine Fake News Corpus
    • 15,312 rows used out of 22,458 available