Create README.md
README.md
ADDED
@@ -0,0 +1,30 @@
# Tagalog Fake News Detection Model

## Overview
This project implements a fake news detection model for Tagalog/Filipino, built on the XLM-RoBERTa base model. It reaches 95.46% accuracy on the evaluation set.

### Dataset
- Total Size: 18,522 samples
- Composition: 50/50 split of real and fake news
- Languages: Filipino and English

#### Dataset Split
- Train Set: ~12,968 samples (about 70%)
- Validation Set: ~2,784 samples (about 15%)
- Test Set: ~2,770 samples (about 15%)

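The split above works out to roughly 70/15/15. The README does not say how the split was produced, so the following is only a minimal sketch of one common approach: a stratified split that keeps the real/fake ratio the same in each subset. The function name `stratified_split`, the seed, and the fractions are assumptions for illustration, and the label list is a placeholder rather than the actual data.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.70, val_frac=0.15, seed=42):
    """Return (train, val, test) index lists that preserve per-class proportions.

    Hypothetical helper: the fractions and seed are illustrative assumptions,
    not values taken from the original project.
    """
    # Group sample indices by their label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * train_frac)
        n_val = round(len(indices) * val_frac)
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains after train and validation goes to test.
        test += indices[n_train + n_val:]
    return train, val, test


# Placeholder labels: a balanced list matching the stated total of 18,522.
labels = [0] * 9261 + [1] * 9261
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

The subset sizes produced this way will be close to, but not necessarily identical to, the approximate counts listed above, since those depend on the exact procedure the author used.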
### Performance Metrics (on Evaluation Set)
- Accuracy: 95.46%
- F1 Score: 95.40%
- Precision: 95.40%
- Recall: 95.40%


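For readers unfamiliar with these metrics, here is a small sketch of how accuracy, precision, recall, and F1 are computed for a binary classifier. It is not the project's evaluation code: the README does not state which class was treated as positive or whether the scores were averaged across classes, so treating label `1` (assumed here to mean "fake") as the positive class is an assumption, and the toy labels are invented for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example (assumed labels: 1 = fake, 0 = real).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a balanced dataset such as this one, precision, recall, and F1 often land close to each other and to accuracy, which is consistent with the figures reported above.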
## Data Sources
The model was trained on a combined dataset from two primary sources:

1. [Fake News Filipino Dataset](https://huggingface.co/datasets/jcblaise/fake_news_filipino)
   - 3,206 rows used

2. [Philippine Fake News Corpus](https://github.com/aaroncarlfernandez/Philippine-Fake-News-Corpus)
   - 15,312 rows used out of 22,458 available
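
The README does not describe how the two sources were merged or how the subset of the second corpus was chosen. As a rough illustration only, the sketch below shows one plausible way to combine several sources into a 50/50 dataset: concatenate the records, drop duplicate texts, and downsample the larger class. The function `combine_and_balance`, the label convention (`0` = real, `1` = fake), and the sample records are all hypothetical.

```python
import random


def combine_and_balance(sources, seed=42):
    """Merge (text, label) records, drop duplicate texts, balance the classes.

    Hypothetical helper: this is an assumed procedure, not the one
    documented by the original project.
    """
    seen = set()
    by_label = {0: [], 1: []}
    for records in sources:
        for text, label in records:
            # Normalize lightly so trivially different copies count as duplicates.
            key = text.strip().lower()
            if key in seen:
                continue
            seen.add(key)
            by_label[label].append((text, label))

    # Downsample both classes to the size of the smaller one.
    n = min(len(by_label[0]), len(by_label[1]))
    rng = random.Random(seed)
    balanced = rng.sample(by_label[0], n) + rng.sample(by_label[1], n)
    rng.shuffle(balanced)
    return balanced


# Invented placeholder records standing in for the two sources.
source_a = [("Balita A", 0), ("Balita B", 1), ("Balita C", 1)]
source_b = [("balita a", 0), ("Balita D", 0), ("Balita E", 1), ("Balita F", 1)]
combined = combine_and_balance([source_a, source_b])
print(len(combined))
```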