ragerri committed
Commit 94b2fe9 · verified · 1 Parent(s): 9a150bc

Create README.md

Files changed (1)
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - HiTZ/CONAN-EUS
+ language:
+ - en
+ metrics:
+ - bleu
+ library_name: transformers
+ pipeline_tag: text2text-generation
+ ---
+
+ # Model Card for mT5-counternarrative-en
+
+ This is an [mT5-base](https://huggingface.co/google/mt5-base) text-to-text model fine-tuned to generate counter-narratives (CN) against hate speech (HS).
+ The model was fine-tuned on the [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) splits of the
+ original [CONAN](https://aclanthology.org/P19-1271.pdf) dataset.
+
+ CONAN-EUS was created by professionally translating all 6654 English HS-CN pairs of the original [CONAN](https://aclanthology.org/P19-1271.pdf) dataset into
+ **Basque and Spanish**. For experimentation, we generated train, validation and test splits such that no HS-CN pair occurs in more than one split.
+
+ <table style="width:33%">
+ <tr>
+ <th>CONAN-EUS Splits</th>
+ <th>Total HS-CN Count</th>
+ </tr>
+ <tr>
+ <td>train</td>
+ <td>4833</td>
+ </tr>
+ <tr>
+ <td>validation</td>
+ <td>537</td>
+ </tr>
+ <tr>
+ <td>test</td>
+ <td>1278</td>
+ </tr>
+ </table>
+
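As a rough illustration of the split constraint described above, the sketch below checks that no HS-CN pair appears in more than one split and computes each split's share from the counts in the table. The helper name `overlapping_pairs` and the toy pairs are hypothetical; the actual column names and loading code of CONAN-EUS are not assumed here.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (hate speech, counter-narrative)

# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 4833, "validation": 537, "test": 1278}


def overlapping_pairs(splits: Dict[str, Iterable[Pair]]) -> Dict[Pair, List[str]]:
    """Return every pair found in more than one split, with the split names."""
    seen: Dict[Pair, set] = {}
    for name, pairs in splits.items():
        for pair in set(pairs):  # ignore duplicates inside a single split
            seen.setdefault(pair, set()).add(name)
    return {pair: sorted(names) for pair, names in seen.items() if len(names) > 1}


def split_shares(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's percentage of the total, rounded to one decimal."""
    total = sum(sizes.values())
    return {name: round(100 * n / total, 1) for name, n in sizes.items()}


# Toy data (hypothetical, not taken from the dataset).
toy = {
    "train": [("hs-1", "cn-1"), ("hs-2", "cn-2")],
    "test": [("hs-2", "cn-2"), ("hs-3", "cn-3")],
}
leaks = overlapping_pairs(toy)
shares = split_shares(SPLIT_SIZES)
```

An empty result from `overlapping_pairs` means the splits are disjoint at the pair level; in the toy data one pair is deliberately shared to show what a leak looks like.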
+ - 📖 Paper: [Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation](). In LREC-COLING 2024.
+ - 💻 GitHub Repo (Data and Code): [https://github.com/ixa-ehu/conan-e/](https://github.com/ixa-ehu/conan-e/)
+
+ The CONAN (COunter NArratives through Nichesourcing) dataset was published by [Chung et al., 2019](https://aclanthology.org/P19-1271.pdf)
+ and is publicly available at [https://github.com/marcoguerini/CONAN](https://github.com/marcoguerini/CONAN).
+
+ ## HS-CN example
+
+ The CONAN dataset contains HS-CN pairs about Muslims and Islam.
+ The following HS-CN pair is shown in Basque, Spanish and English:
+
+ | Language | HS | CN |
+ |----------|----|----|
+ | Basque | Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk. | Zer egin dute guretzat musulmanek? Beno, kafea, erlojuak, kamerak, higienea, unibertsitateak, tresna kirurgikoak, mapak, musika, aljebra. |
+ | Spanish | Los musulmanes no tienen nada útil que pueda enriquecer nuestra cultura. | ¿Qué han hecho los musulmanes por nosotros? Bueno, café, relojes, cámaras, higiene, universidades, instrumentos quirúrgicos, mapas, música, álgebra. |
+ | English | Muslims do not have anything useful that can enrich our culture. | What have Muslims ever done for us? Well, Coffee, Clocks, Cameras, Hygiene, Universities, Surgical Instruments, Maps, Music, Algebra. |
+
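A minimal sketch of how a counter-narrative might be generated with the `transformers` library named in the card's metadata. The repository id in `MODEL_ID`, the decoding settings, and the choice to pass the raw hate-speech text as input without any prefix are all assumptions, not details confirmed by this card. Running `demo()` requires `transformers`, `torch` and `sentencepiece`, and downloads the model weights.

```python
from typing import Callable

# Assumed repository id; replace it with the actual model repository.
MODEL_ID = "HiTZ/mt5-counter-narrative-en"


def load_generator(model_id: str = MODEL_ID, max_new_tokens: int = 64) -> Callable[[str], str]:
    """Load tokenizer and model once and return a text -> text function."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    def generate(text: str) -> str:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        # Beam search with 4 beams is an illustrative choice, not the authors' setting.
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return generate


def counter_narrative(hate_speech: str, generate: Callable[[str], str]) -> str:
    """Normalize whitespace in the input, then return the generated counter-narrative."""
    text = " ".join(hate_speech.split())
    if not text:
        raise ValueError("hate_speech must not be empty")
    return generate(text).strip()


def demo() -> None:
    """Generate a counter-narrative for the English example from the table above."""
    generate = load_generator()
    hs = "Muslims do not have anything useful that can enrich our culture."
    print(counter_narrative(hs, generate))
```

Passing the generator in as a function keeps model loading separate from input handling, so `counter_narrative` can be exercised with a stub before any weights are downloaded.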
+ If you use the model, please **cite the following paper**:
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{bengoetxea-et-al-2024,
+   title={{B}asque and {S}panish {C}ounter {N}arrative {G}eneration: {D}ata {C}reation and {E}valuation},
+   author={Jaione Bengoetxea and Yi-Ling Chung and Marco Guerini and Rodrigo Agerri},
+   year={2024},
+   booktitle={Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)},
+ }
+ ```
+
+ **Contact**: [Rodrigo Agerri](https://ragerri.github.io/)
+ HiTZ Center - Ixa, University of the Basque Country UPV/EHU
+