--- |
|
tags: |
|
- flair |
|
- text-classification |
|
language: |
|
- multilingual |
|
- en |
|
library_name: flair |
|
widget: |
|
- text: This is a gentle comment. |
|
license: mit |
|
pipeline_tag: text-classification |
|
--- |
|
|
|
# Offensive language detection |
|
|
|
## Tasks |
|
|
|
The model combines three classifiers, one for each of the three subtasks of the OLID dataset [1]:
|
|
|
- Subtask A (offensive language identification): OFF (offensive), NOT (not offensive)

- Subtask B (categorization of offense type): TIN (targeted insult or threat), UNT (untargeted)

- Subtask C (offense target identification): IND (individual), GRP (group), OTH (other)
|
|
|
The model was trained with [Flair NLP](https://github.com/flairNLP/flair) as a multi-task model.
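
A minimal sketch of how such a multi-task setup can be built with a recent Flair version follows. The data paths, label types, backbone name, and hyperparameters are illustrative assumptions, not the exact configuration used to train this model.

```python
from flair.datasets import ClassificationCorpus
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.nn.multitask import make_multitask_model_and_corpus
from flair.trainers import ModelTrainer

# Shared transformer backbone for all three subtasks (assumed backbone;
# the card does not state which pre-trained model was used).
embeddings = TransformerDocumentEmbeddings("xlm-roberta-base", fine_tune=True)

# Hypothetical data folders holding OLID in Flair's FastText classification
# format ("__label__OFF <text>"), one folder and label type per subtask.
tasks = []
for name in ("subtask_a", "subtask_b", "subtask_c"):
    corpus = ClassificationCorpus(f"olid/{name}", label_type=name)
    classifier = TextClassifier(
        embeddings,
        label_type=name,
        label_dictionary=corpus.make_label_dictionary(label_type=name),
    )
    tasks.append((classifier, corpus))

# Bundle the per-task classifiers into one model with a joint corpus.
multitask_model, multicorpus = make_multitask_model_and_corpus(tasks)

trainer = ModelTrainer(multitask_model, multicorpus)
trainer.fine_tune("resources/offensive-multitask", max_epochs=5)
```

Because all three classifiers share one document embedding, every subtask fine-tunes the same backbone, which is the point of the multi-task setup.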
|
|
|
Training data: [Offensive Language Identification Dataset](https://sites.google.com/site/offensevalsharedtask/olid) (OLID) V1.0 [1] |
|
Test data: test set from [Semi-Supervised Dataset for Offensive Language Identification](https://sites.google.com/site/offensevalsharedtask/solid) (SOLID) [2] |
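
Assuming the trained model is available on the Hugging Face Hub or as a local Flair model file, it can be loaded and applied as in the sketch below (the identifier is a placeholder, not the published model name):

```python
from flair.data import Sentence
from flair.models import MultitaskModel

# Placeholder identifier: substitute the actual Hub ID or a local model path.
model = MultitaskModel.load("<hub-id-or-path>")

sentence = Sentence("This is a gentle comment.")
model.predict(sentence)

# Each of the three subtask classifiers attaches its own label.
for label in sentence.labels:
    print(label)
```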
|
|
|
## Citation |
|
|
|
When using this model, please cite: |
|
|
|
> Gregor Wiedemann, Seid Muhie Yimam, and Chris Biemann. 2020. UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1638–1644, Barcelona (online). International Committee for Computational Linguistics. |
|
|
|
|
|
## Evaluation scores |
|
|
|
Evaluation was conducted on the English test set of SemEval-2020 Task 12, so the results are directly comparable to those reported in [3].
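
The per-class reports below appear to follow Flair's built-in evaluation output. A hedged sketch of how such a report can be reproduced, again with placeholder paths and label types:

```python
from flair.datasets import ClassificationCorpus
from flair.models import MultitaskModel

model = MultitaskModel.load("<hub-id-or-path>")  # placeholder identifier

# SOLID test split converted to Flair's classification format (placeholder path).
corpus = ClassificationCorpus("solid/subtask_a", label_type="subtask_a")

result = model.evaluate(corpus.test, gold_label_type="subtask_a")
print(result.detailed_results)  # micro/macro F1, accuracy, per-class table
```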
|
|
|
### Task A |
|
|
|
```
Results:
- F-score (micro) 0.9256
- F-score (macro) 0.9131
- Accuracy 0.9256

By class:
              precision    recall  f1-score   support

         NOT     0.9922    0.9042    0.9461      2807
         OFF     0.7976    0.9815    0.8800      1080

    accuracy                         0.9256      3887
   macro avg     0.8949    0.9428    0.9131      3887
weighted avg     0.9381    0.9256    0.9278      3887
```
|
|
|
### Task B |
|
```
Results:
- F-score (micro) 0.7138
- F-score (macro) 0.6408
- Accuracy 0.7138

By class:
              precision    recall  f1-score   support

         TIN     0.6826    0.9741    0.8027       850
         UNT     0.8947    0.3269    0.4789       572

    accuracy                         0.7138      1422
   macro avg     0.7887    0.6505    0.6408      1422
weighted avg     0.7679    0.7138    0.6724      1422
```
|
|
|
### Task C |
|
```
Results:
- F-score (micro) 0.8318
- F-score (macro) 0.6978
- Accuracy 0.8318

By class:
              precision    recall  f1-score   support

         IND     0.8703    0.9483    0.9076       580
         GRP     0.7216    0.6684    0.6940       190
         OTH     0.7143    0.3750    0.4918        80

    accuracy                         0.8318       850
   macro avg     0.7687    0.6639    0.6978       850
weighted avg     0.8223    0.8318    0.8207       850
```
|
|
|
---- |
|
## References
|
|
|
[1] Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019. Predicting the Type and Target of Offensive Posts in Social Media. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1415–1420, Minneapolis, Minnesota. Association for Computational Linguistics. |
|
|
|
[2] Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Marcos Zampieri, and Preslav Nakov. 2021. SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 915–928, Online. Association for Computational Linguistics. |
|
|
|
[3] Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Hamdy Mubarak, Leon Derczynski, Zeses Pitenis, and Çağrı Çöltekin. 2020. SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020). In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1425–1447, Barcelona (online). International Committee for Computational Linguistics. |