Create README.md
README.md
ADDED
@@ -0,0 +1,72 @@
---
license: apache-2.0
datasets:
- HiTZ/CONAN-EUS
language:
- en
metrics:
- bleu
library_name: transformers
pipeline_tag: text2text-generation
---

# Model Card for mT5-counternarrative-en

This is an [mT5-base](https://huggingface.co/google/mt5-base) text-to-text model fine-tuned to generate counternarratives against hate speech.
The model has been fine-tuned on the [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) splits of the
original [CONAN](https://aclanthology.org/P19-1271.pdf) dataset.

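The card does not include a usage snippet, so the following is a minimal sketch of how a fine-tuned mT5 checkpoint like this one could be loaded and queried with `transformers`. The repository identifier in `MODEL_ID` is a hypothetical placeholder (the card does not state it), and the generation settings (`num_beams=4`, `max_new_tokens=64`) and the absence of a task prefix are assumptions rather than the authors' configuration.

```python
from typing import Any, Tuple

# Hypothetical placeholder: replace with the actual Hub repository of this model.
MODEL_ID = "your-namespace/mT5-counternarrative-en"


def load_model(model_id: str = MODEL_ID) -> Tuple[Any, Any]:
    """Load the tokenizer and seq2seq model (downloads weights on first call)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate_counter_narrative(
    hate_speech: str, tokenizer: Any, model: Any, max_new_tokens: int = 64
) -> str:
    """Generate one counternarrative for an English hate speech input."""
    # Assumption: the raw hate speech text is the model input, with no task prefix.
    inputs = tokenizer(hate_speech, return_tensors="pt", truncation=True)
    # Assumption: beam search with 4 beams; tune to taste.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (requires network access and the real repository name):
# tokenizer, model = load_model()
# print(generate_counter_narrative("<hate speech text>", tokenizer, model))
```

Keeping loading and generation separate lets the same tokenizer and model be reused across many inputs.
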
CONAN-EUS was created by professionally translating all 6654 English hate speech (HS) and counternarrative (CN) pairs of the original [CONAN](https://aclanthology.org/P19-1271.pdf) dataset into
**Basque and Spanish**. For experimentation, we generated train, validation and test splits such that no HS-CN pair occurs in more than one split.

<table style="width:33%">
<tr>
<th>CONAN-EUS Splits</th>
<th>Total HS-CN Count</th>
</tr>
<tr>
<td>train</td>
<td>4833</td>
</tr>
<tr>
<td>validation</td>
<td>537</td>
</tr>
<tr>
<td>test</td>
<td>1278</td>
</tr>
</table>

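The card states that the splits were built so that no HS-CN pair appears in more than one of them, but it does not describe the procedure. As an illustration only, here is one way to obtain leakage-free splits, assuming that pairs are grouped by identical hate speech text and that whole groups are assigned to a single split. The function name, the grouping criterion, the seed, and the ratios (chosen to roughly match the proportions in the table) are all assumptions, not the authors' method.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (hate speech, counternarrative)


def split_by_hate_speech(
    pairs: Sequence[Pair],
    train_ratio: float = 0.73,  # assumed, roughly 4833 / 6648
    val_ratio: float = 0.08,    # assumed, roughly 537 / 6648
    seed: int = 0,
) -> Dict[str, List[Pair]]:
    """Split pairs so that each distinct hate speech text lands in exactly one split."""
    # Group all pairs that share the same hate speech text.
    groups: Dict[str, List[Pair]] = defaultdict(list)
    for hs, cn in pairs:
        groups[hs].append((hs, cn))

    # Shuffle the group keys deterministically, then cut by ratio.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(len(keys) * train_ratio)
    n_val = int(len(keys) * val_ratio)
    buckets = {
        "train": keys[:n_train],
        "validation": keys[n_train:n_train + n_val],
        "test": keys[n_train + n_val:],  # remainder goes to test
    }
    return {name: [p for k in ks for p in groups[k]] for name, ks in buckets.items()}
```

Because the ratios apply to groups rather than to individual pairs, the resulting split sizes only approximate the requested proportions.
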
- 📖 Paper: Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation. In LREC-COLING 2024.
- 💻 GitHub Repo (Data and Code): [https://github.com/ixa-ehu/conan-e/](https://github.com/ixa-ehu/conan-e/)

The CONAN (COunter NArratives through Nichesourcing) dataset was published by [Chung et al., 2019](https://aclanthology.org/P19-1271.pdf)
and is publicly available at [https://github.com/marcoguerini/CONAN](https://github.com/marcoguerini/CONAN).

## HS-CN example

The HS-CN pairs in the CONAN dataset concern Muslims and Islam.
The same HS-CN pair is shown below in Basque, Spanish and English:

| Language | HS | CN |
|----------|----|----|
| Basque | Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk. | Zer egin dute guretzat musulmanek? Beno, kafea, erlojuak, kamerak, higienea, unibertsitateak, tresna kirurgikoak, mapak, musika, aljebra. |
| Spanish | Los musulmanes no tienen nada útil que pueda enriquecer nuestra cultura. | ¿Qué han hecho los musulmanes por nosotros? Bueno, café, relojes, cámaras, higiene, universidades, instrumentos quirúrgicos, mapas, música, álgebra. |
| English | Muslims do not have anything useful that can enrich our culture. | What have Muslims ever done for us? Well, Coffee, Clocks, Cameras, Hygiene, Universities, Surgical Instruments, Maps, Music, Algebra. |

## Citation

If you use the model, please **cite the following paper**:

```bibtex
@inproceedings{bengoetxea-et-al-2024,
  title     = {{B}asque and {S}panish {C}ounter {N}arrative {G}eneration: {D}ata {C}reation and {E}valuation},
  author    = {Jaione Bengoetxea and Yi-Ling Chung and Marco Guerini and Rodrigo Agerri},
  year      = {2024},
  publisher = {Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)},
}
```

**Contact**: [Rodrigo Agerri](https://ragerri.github.io/)
HiTZ Center - Ixa, University of the Basque Country UPV/EHU