[**中文**](./README.md) | [**English**](./README_en.md)

# UniMC

Source code for the EMNLP 2022 paper [Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective](https://arxiv.org/abs/2210.08590).

![](./unimc.jpg)

## Update

- [2022-10-18] Released the preprint on arXiv.
- [2022-10-14] Released the code on GitHub.

## Requirements

Install the fengshen framework:

```shell
git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
cd Fengshenbang-LM
pip install --editable .
```

## Quick Start

You can refer to our [example.py](./example.py) script; simply feed the processed train, dev, and test data (in the format described in the Dataset section below) into the model.

```python
import argparse
from fengshen.pipelines.multiplechoice import UniMCPipelines

total_parser = argparse.ArgumentParser("TASK NAME")
total_parser = UniMCPipelines.piplines_args(total_parser)
args = total_parser.parse_args()

pretrained_model_path = 'IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese'
args.learning_rate = 2e-5
args.max_length = 512
args.max_epochs = 3
args.batchsize = 8
args.default_root_dir = './'
model = UniMCPipelines(args, model_path=pretrained_model_path)

train_data = []
dev_data = []
test_data = [{
    "texta": "就是废物,充电不进害得老子把主板烧了,客服不耐烦",
    "textb": "",
    "question": "",
    "choice": ["这是一条差评", "这是一条好评"],
    "answer": "这是一条差评",
    "label": 0,
    "id": 31
}]

if args.train:
    model.train(train_data, dev_data)
result = model.predict(test_data)
```
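Because `example.py` calls `model.predict` regardless of whether `args.train` is set, the same pipeline can also be used for zero-shot inference without any fine-tuning. The snippet below is a minimal sketch, assuming the same argument setup as the Quick Start code above; the structure of the returned `result` is not documented here, so it is simply printed for inspection.

```python
import argparse
from fengshen.pipelines.multiplechoice import UniMCPipelines

# Same setup as the Quick Start snippet above.
total_parser = argparse.ArgumentParser("TASK NAME")
total_parser = UniMCPipelines.piplines_args(total_parser)
args = total_parser.parse_args()
args.max_length = 512
args.batchsize = 8
args.default_root_dir = './'
model = UniMCPipelines(args, model_path='IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese')

# A news-classification query in the text-classification format from the Dataset section.
# Following example.py, the entry keeps its `answer`/`label` fields even at inference time.
test_data = [{
    "texta": "街头偶遇2018款长安CS35,颜值美炸!或售6万起,还买宝骏510?",
    "textb": "",
    "question": "下面新闻属于哪一个类别?",
    "choice": ["房产", "汽车", "教育", "军事"],
    "answer": "汽车",
    "label": 1,
    "id": 7759
}]

result = model.predict(test_data)  # zero-shot: model.train() is never called
print(result)                      # inspect the predictions returned by the pipeline
```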
## Pretrained Model

For the English model, we pre-trained on 14 multiple-choice datasets. For the Chinese models, we collected 48 datasets for pre-training. The pre-trained models are open-sourced on the HuggingFace Hub.

| Model | Link |
|:---------:|:--------------:|
| Erlangshen-UniMC-Albert-235M-English | [https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English) |
| Erlangshen-UniMC-RoBERTa-110M-Chinese | [https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) |
| Erlangshen-UniMC-RoBERTa-330M-Chinese | [https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) |
| Erlangshen-UniMC-MegatronBERT-1.3B-Chinese | [https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) |

## Experiments

### English

To evaluate UniMC in English, we pre-train the model on 14 multiple-choice datasets (see the paper for details) so that it acquires the ability to answer multiple-choice questions.

**Zero-shot**

| Model | T0 11B | GLaM 60B | FLAN 137B | PaLM 540B | UniMC 235M |
|---------|--------|----------|-----------|-----------|------------|
| ANLI R1 | 43.6 | 40.9 | 47.7 | 48.4 | **52.0** |
| ANLI R2 | 38.7 | 38.2 | 43.9 | 44.2 | **44.4** |
| ANLI R3 | 41.3 | 40.9 | 47.0 | 45.7 | **47.8** |
| CB | 70.1 | 33.9 | 64.1 | 51.8 | **75.7** |

### Chinese

To evaluate UniMC in Chinese, we pre-train the model on 13 supervised datasets, listed below:

| Task type | Task | # of options | Data size |
|---------|--------|----------|-----------|
| Multiple-choice | c3 | 4 | 11.8k |
| Multiple-choice | ClozeT | 2 | 0.7k |
| Multiple-choice | CMRC2019 | n | 11.4k |
| Multiple-choice | GCRC | 4 | 7.8k |
| Classification | DuEE-Fin | 12 | 4.3k |
| Classification | DuEE1.0 | 65 | 10.3k |
| Classification | Fudan | 20 | 19.6k |
| Classification | THUNEWS | 10 | 180k |
| NLI | CMNLI | 3 | 39k |
| NLI | SNLI | 3 | 545.8k |
| Paraphrase | AFQMC | 2 | 34.3k |
| Paraphrase | PAWS-X | 2 | 49k |
| Paraphrase | STS-B | 2 | 80k |

We then test UniMC on a benchmark widely used in the Chinese community, the 9 tasks of FewCLUE, and report performance on the test_public split.

**Few-shot**

| Model | eprstmt | csldcp | tnews | iflytek | ocnli | bustm | chid | csl | wsc | Avg |
|------------|------------|----------|-----------|----------|-----------|-----------|-----------|----------|-----------|-----------|
| Finetuning | 65.4 | 35.5 | 49 | 32.8 | 33 | 60.7 | 14.9 | 50 | 55.6 | 44.1 |
| PET | 86.7 | 51.7 | 54.5 | 46 | 44 | 56 | 61.2 | 59.4 | 57.5 | 57.44 |
| LM-BFF | 85.6 | 54.4 | 53 | 47.1 | 41.6 | 57.6 | 61.2 | 51.7 | 54.7 | 56.32 |
| P-tuning | 88.3 | 56 | 54.2 | **57.6** | 41.9 | 60.9 | 59.3 | **62.9** | 58.1 | 59.91 |
| EFL | 84.9 | 45 | 52.1 | 42.7 | 66.2 | 71.8 | 30.9 | 56.6 | 53 | 55.91 |
| [UniMC-RoBERTa-110M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | 88.64 | 54.08 | 54.32 | 48.6 | 66.55 | 73.76 | 67.71 | 52.54 | 59.92 | 62.86 |
| [UniMC-RoBERTa-330M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) | 89.53 | 57.3 | 54.25 | 50 | 70.59 | 77.49 | 78.09 | 55.73 | 65.16 | 66.46 |
| [UniMC-MegatronBERT-1.3B](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | **89.278** | **60.9** | **57.46** | 52.89 | **76.33** | **80.37** | **90.33** | 61.73 | **79.15** | **72.05** |

**Zero-shot**

| Model | eprstmt | csldcp | tnews | iflytek | ocnli | bustm | chid | csl | wsc | Avg |
|---------------|-----------|-----------|-----------|-----------|-----------|----------|----------|----------|-----------|-----------|
| GPT-zero | 57.5 | 26.2 | 37 | 19 | 34.4 | 50 | 65.6 | 50.1 | 50.3 | 43.4 |
| PET-zero | 85.2 | 12.6 | 26.1 | 26.6 | 40.3 | 50.6 | 57.6 | 52.2 | 54.7 | 45.1 |
| NSP-BERT | 86.9 | 47.6 | 51 | 41.6 | 37.4 | 63.4 | 52 | **64.4** | 59.4 | 55.96 |
| ZeroPrompt | - | - | - | 16.14 | 46.16 | - | - | - | 47.98 | - |
| Yuan1.0-13B | 88.13 | 38.99 | 57.47 | 38.82 | 48.13 | 59.38 | 86.14 | 50 | 38.99 | 56.22 |
| ERNIE3.0-240B | 88.75 | **50.97** | **57.83** | **40.42** | 53.57 | 64.38 | 87.13 | 56.25 | 53.46 | 61.41 |
| [UniMC-RoBERTa-110M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | 86.16 | 31.26 | 46.61 | 26.54 | 66.91 | 73.34 | 66.68 | 50.09 | 53.66 | 55.7 |
| [UniMC-RoBERTa-330M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) | 87.5 | 30.4 | 47.6 | 31.5 | 69.9 | 75.9 | 78.17 | 49.5 | 60.55 | 59.01 |
| [UniMC-MegatronBERT-1.3B](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | **88.79** | 42.06 | 55.21 | 33.93 | **75.57** | **79.5** | **89.4** | 50.25 | **66.67** | **64.53** |

## Dataset

We have defined the data format required by UniMC; you only need to convert your data into the formats below (a conversion sketch follows the examples).

### Text classification

```json
{
    "texta": "街头偶遇2018款长安CS35,颜值美炸!或售6万起,还买宝骏510?",
    "textb": "",
    "question": "下面新闻属于哪一个类别?",
    "choice": [
        "房产",
        "汽车",
        "教育",
        "军事"
    ],
    "answer": "汽车",
    "label": 1,
    "id": 7759
}
```

### Sentiment analysis

```json
{
    "texta": "就是废物,充电不进害得老子把主板烧了,客服不耐烦",
    "textb": "",
    "question": "",
    "choice": ["这是一条差评", "这是一条好评"],
    "answer": "这是一条差评",
    "label": 0,
    "id": 31
}
```

### Semantic matching

```json
{
    "texta": "不要借了我是试试看能否操作的",
    "textb": "",
    "question": "",
    "choice": ["不能理解为:借款审核期间能否取消借款", "可以理解为:借款审核期间能否取消借款"],
    "answer": "不能理解为:借款审核期间能否取消借款",
    "label": 0,
    "id": 0
}
```

### Natural language inference

```json
{
    "texta": "身上裹一件工厂发的棉大衣,手插在袖筒里",
    "textb": "",
    "question": "",
    "choice": ["不能推断出:身上至少一件衣服", "很难推断出:身上至少一件衣服", "可以推断出:身上至少一件衣服"],
    "answer": "可以推断出:身上至少一件衣服",
    "label": 2,
    "id": 0
}
```
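As a convenience, the sketch below shows one way to convert plain (text, label-index) pairs into the text-classification format above. The helper name `to_unimc_format` is illustrative and not part of the fengshen API; the only example sentence used is the one from the text-classification sample above.

```python
# Hypothetical helper (not part of fengshen): converts (text, label-index) pairs
# into the UniMC text-classification format shown above.
def to_unimc_format(samples, choices, question=""):
    data = []
    for idx, (text, label) in enumerate(samples):
        data.append({
            "texta": text,
            "textb": "",
            "question": question,        # e.g. "下面新闻属于哪一个类别?"
            "choice": list(choices),     # one option string per class
            "answer": choices[label],    # option corresponding to the gold label
            "label": label,              # index of the answer within `choice`
            "id": idx,
        })
    return data


# Usage: map a labelled headline onto four candidate categories.
choices = ["房产", "汽车", "教育", "军事"]
samples = [("街头偶遇2018款长安CS35,颜值美炸!或售6万起,还买宝骏510?", 1)]
train_data = to_unimc_format(samples, choices, question="下面新闻属于哪一个类别?")
```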
## Citation

If you find this repository helpful, please cite our work:

```text
@article{unimc,
  author  = {Ping Yang and Junjie Wang and Ruyi Gan and Xinyu Zhu and Lin Zhang and Ziwei Wu and Xinyu Gao and Jiaxing Zhang and Tetsuya Sakai},
  title   = {Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective},
  journal = {CoRR},
  volume  = {abs/2210.08590},
  year    = {2022}
}
```

## License

[Apache License 2.0](https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/LICENSE)