|
[**中文**](./README.md) | [**English**](./README_en.md) |
|
# UniMC |
|
|
|
EMNLP 2022 论文 《[Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective](https://arxiv.org/abs/2210.08590)》源码 |
|
|
|
 |
|
|
|
## Update |
|
- [2022-10-18] Release preprint in arXiv. |
|
- [2022-10-14] Release code in GitHub. |
|
|
|
## Requirements |
|
|
|
安装 fengshen 框架 |
|
|
|
```shell |
|
git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git |
|
cd Fengshenbang-LM |
|
pip install --editable . |
|
``` |
|
|
|
## Quick Start |
|
|
|
你可以参考我们的 [example.py](./example.py) 脚本,只需要将处理好的 train、dev、test 即输入模型即可。 |
|
```python |
|
import argparse |
|
from fengshen.pipelines.multiplechoice import UniMCPipelines |
|
|
|
total_parser = argparse.ArgumentParser("TASK NAME") |
|
total_parser = UniMCPipelines.piplines_args(total_parser) |
|
args = total_parser.parse_args() |
|
|
|
pretrained_model_path = 'IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese' |
|
args.learning_rate=2e-5 |
|
args.max_length=512 |
|
args.max_epochs=3 |
|
args.batchsize=8 |
|
args.default_root_dir='./' |
|
model = UniMCPipelines(args,model_path=pretrained_model_path) |
|
|
|
train_data = [] |
|
dev_data = [] |
|
test_data = [{ |
|
"texta": "就是废物,充电不进害得老子把主板烧了,客服不耐烦", |
|
"textb": "", |
|
"question": "", |
|
"choice": ["这是一条差评", "这是一条好评"], |
|
"answer": "这是一条差评", |
|
"label": 0, |
|
"id": 31 |
|
}] |
|
|
|
if args.train: |
|
model.train(train_data, dev_data) |
|
result = model.predict(test_data) |
|
``` |
|
## Pretrained Model |
|
对于英文模型,我们使用14份 multiplechoice 数据集进行了预训练。在中文模型中,我们已经收集了48份数据集对模型进行预训练,我们已经将预训练模型开源到 HuggingFace 社区当中。 |
|
|
|
| 模型 | 地址 | |
|
|:---------:|:--------------:| |
|
| Erlangshen-UniMC-Albert-235M-English | [https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English) | |
|
| Erlangshen-UniMC-RoBERTa-110M-Chinese | [https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | |
|
| Erlangshen-UniMC-RoBERTa-330M-Chinese | [https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) | |
|
| Erlangshen-UniMC-MegatronBERT-1.3B-Chinese | [https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | |
|
|
|
## Experiments |
|
|
|
|
|
### English |
|
|
|
为了测评 UniMC 的性能,在英文中,我们使用 14份 multiple-choice 数据集(具体数据参考原论文)来对模型进行预训练,使其具备做选择题的能力, |
|
|
|
**Zero-shot** |
|
| Model | T0 11B | GLaM 60B | FLAN 137B | PaLM 540B | UniMC 235M | |
|
|---------|--------|----------|-----------|-----------|------------| |
|
| ANLI R1 | 43.6 | 40.9 | 47.7 | 48.4 | **52.0** | |
|
| ANLI R2 | 38.7 | 38.2 | 43.9 | 44.2 | **44.4** | |
|
| ANLI R3 | 41.3 | 40.9 | 47.0 | 45.7 | **47.8** | |
|
| CB | 70.1 | 33.9 | 64.1 | 51.8 | **75.7** | |
|
### Chinese |
|
|
|
为了测评 UniMC 在中文场景下的性能我们使用 13份 有监督数据集来对模型进行预训练,预训练数据如下: |
|
| Task type | Task | # of option | Data size | |
|
|---------|--------|----------|-----------| |
|
| Multiple-choice | c3 | 4 | 11.8k | |
|
| Multiple-choice | ClozeT | 2 | 0.7k | |
|
| Multiple-choice | CMRC2019 | n | 11.4k | |
|
| Multiple-choice | GCRC | 4 | 7.8k | |
|
| Classification | DuEE-Fin | 12 | 4.3k | |
|
| Classification | DuEE1.0 | 65 | 10.3k | |
|
| Classification | Fudan | 20 | 19.6k | |
|
| Classification | THUNEWS | 10 | 180k | |
|
| NLI | CMNLI | 3 | 39k | |
|
| NLI | SNLI | 3 | 545.8k | |
|
| Paraphrace | AFQMC | 2 | 34.3k | |
|
| Paraphrace | PAWS-X | 2 | 49k | |
|
| Paraphrace | STS-B | 2 | 80k | |
|
|
|
我们使用中文领域常用的benchmark来测试UniMC的性能,具体是FewCLUE的9个任务,我们在 test_public 上测评模型的性能。 |
|
|
|
|
|
**Few-shot** |
|
| Model | eprstmt | csldcp | tnews | iflytek | ocnli | bustm | chid | csl | wsc | Avg | |
|
|------------|------------|----------|-----------|----------|-----------|-----------|-----------|----------|-----------|-----------| |
|
| Finetuning | 65.4 | 35.5 | 49 | 32.8 | 33 | 60.7 | 14.9 | 50 | 55.6 | 44.1 | |
|
| PET | 86.7 | 51.7 | 54.5 | 46 | 44 | 56 | 61.2 | 59.4 | 57.5 | 57.44 | |
|
| LM-BFF | 85.6 | 54.4 | 53 | 47.1 | 41.6 | 57.6 | 61.2 | 51.7 | 54.7 | 56.32 | |
|
| P-tuning | 88.3 | 56 | 54.2 | **57.6** | 41.9 | 60.9 | 59.3 | **62.9** | 58.1 | 59.91 | |
|
| EFL | 84.9 | 45 | 52.1 | 42.7 | 66.2 | 71.8 | 30.9 | 56.6 | 53 | 55.91 | |
|
| [UniMC-RoBERTa-110M](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | 88.64 | 54.08 | 54.32 | 48.6 | 66.55 | 73.76 | 67.71 | 52.54 | 59.92 | 62.86 | |
|
| [UniMC-RoBERTa-330M](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) | 89.53 | 57.3 | 54.25 | 50 | 70.59 | 77.49 | 78.09 | 55.73 | 65.16 | 66.46 | |
|
| [UniMC-MegatronBERT-1.3B](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | **89.278** | **60.9** | **57.46** | 52.89 | **76.33** | **80.37** | **90.33** | 61.73 | **79.15** | **72.05** | |
|
|
|
**Zero-shot** |
|
|
|
| Model | eprstmt | csldcp | tnews | iflytek | ocnli | bustm | chid | csl | wsc | Avg | |
|
|---------------|-----------|-----------|-----------|-----------|-----------|----------|----------|----------|-----------|-----------| |
|
| GPT-zero | 57.5 | 26.2 | 37 | 19 | 34.4 | 50 | 65.6 | 50.1 | 50.3 | 43.4 | |
|
| PET-zero | 85.2 | 12.6 | 26.1 | 26.6 | 40.3 | 50.6 | 57.6 | 52.2 | 54.7 | 45.1 | |
|
| NSP-BERT | 86.9 | 47.6 | 51 | 41.6 | 37.4 | 63.4 | 52 | **64.4** | 59.4 | 55.96 | |
|
| ZeroPrompt | - | - | - | 16.14 | 46.16 | - | - | - | 47.98 | - | |
|
| Yuan1.0-13B | 88.13 | 38.99 | 57.47 | 38.82 | 48.13 | 59.38 | 86.14 | 50 | 38.99 | 56.22 | |
|
| ERNIE3.0-240B | 88.75 | **50.97** | **57.83** | **40.42** | 53.57 | 64.38 | 87.13 | 56.25 | 53.46 | 61.41 | |
|
| [UniMC-RoBERTa-110M](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | 86.16 | 31.26 | 46.61 | 26.54 | 66.91 | 73.34 | 66.68 | 50.09 | 53.66 | 55.7 | |
|
| [UniMC-RoBERTa-330M](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) | 87.5 | 30.4 | 47.6 | 31.5 | 69.9 | 75.9 | 78.17 | 49.5 | 60.55 | 59.01 | |
|
| [UniMC-MegatronBERT-1.3B](https://huggingface.co./IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | **88.79** | 42.06 | 55.21 | 33.93 | **75.57** | **79.5** | **89.4** | 50.25 | **66.67** | **64.53** | |
|
|
|
|
|
|
|
## Dataset |
|
|
|
我们已经定义好了 UniMC 所需的数据格式,你只需要将数据转化为下面的数据格式即可: |
|
|
|
### 文本分类 |
|
```json |
|
{ |
|
"texta": "街头偶遇2018款长安CS35,颜值美炸!或售6万起,还买宝骏510?", |
|
"textb": "", |
|
"question": "下面新闻属于哪一个类别?", |
|
"choice": [ |
|
"房产", |
|
"汽车", |
|
"教育", |
|
"军事" |
|
], |
|
"answer": "汽车", |
|
"label": 1, |
|
"id": 7759 |
|
} |
|
|
|
``` |
|
|
|
### 情感分析 |
|
```json |
|
{ |
|
"texta": "就是废物,充电不进害得老子把主板烧了,客服不耐烦", |
|
"textb": "", |
|
"question": "", |
|
"choice": ["这是一条差评", "这是一条好评"], |
|
"answer": "这是一条差评", |
|
"label": 0, |
|
"id": 31 |
|
} |
|
|
|
``` |
|
|
|
### 语义匹配 |
|
```json |
|
{ |
|
"texta": "不要借了我是试试看能否操作的", |
|
"textb": "", |
|
"question": "", |
|
"choice": ["不能理解为:借款审核期间能否取消借款", "可以理解为:借款审核期间能否取消借款"], |
|
"answer": "不能理解为:借款审核期间能否取消借款", |
|
"label": 0, |
|
"id": 0 |
|
} |
|
|
|
``` |
|
|
|
### 自然语言推理 |
|
```json |
|
{ |
|
"texta": "身上裹一件工厂发的棉大衣,手插在袖筒里", |
|
"textb": "", |
|
"question": "", |
|
"choice": ["不能推断出:身上至少一件衣服", "很难推断出:身上至少一件衣服", "可以推断出:身上至少一件衣服"], |
|
"answer": "可以推断出:身上至少一件衣服", |
|
"label": 2, |
|
"id": 0 |
|
} |
|
|
|
``` |
|
|
|
|
|
## Citation |
|
如果你觉得本仓库帮助到了你,你可以使用下面方式引用我们的工作 |
|
|
|
```text |
|
@article{unimc, |
|
author = {Ping Yang and |
|
Junjie Wang and |
|
Ruyi Gan and |
|
Xinyu Zhu and |
|
Lin Zhang and |
|
Ziwei Wu and |
|
Xinyu Gao and |
|
Jiaxing Zhang and |
|
Tetsuya Sakai}, |
|
title = {Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective}, |
|
journal = {CoRR}, |
|
volume = {abs/2210.08590}, |
|
year = {2022} |
|
} |
|
``` |
|
|
|
## License |
|
|
|
[Apache License 2.0](https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/LICENSE) |
|
|
|
|