|
--- |
|
language:
- zh

inference:
  parameters:
    max_new_tokens: 128
    repetition_penalty: 25.0
    top_p: 0.9
    do_sample: True

license: apache-2.0
|
--- |
|
# Wenzhong-GPT2-110M: a Chinese model from [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
|
Wenzhong-GPT2-110M is the smallest model in the Wenzhong series: the base, 110M-parameter version of GPT-2.
|
|
|
## Usage |
|
|
|
### Load the model
|
|
|
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

hf_model_path = 'IDEA-CCNL/Wenzhong-GPT2-110M'

# Download the tokenizer and model weights from the Hugging Face Hub.
tokenizer = GPT2Tokenizer.from_pretrained(hf_model_path)
model = GPT2LMHeadModel.from_pretrained(hf_model_path)
```
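
For quick experiments, the same checkpoint can also be driven through the `pipeline` convenience API. This is a minimal sketch, not part of the original instructions; it simply reuses the sampling parameters declared in the metadata at the top of this card.

```python
from transformers import pipeline

# Build a text-generation pipeline around the same checkpoint.
generator = pipeline('text-generation', model='IDEA-CCNL/Wenzhong-GPT2-110M')

# Reuse the inference parameters from the card metadata above.
result = generator("北京是中国的",
                   max_new_tokens=128,
                   do_sample=True,
                   top_p=0.9,
                   repetition_penalty=25.0)
print(result[0]['generated_text'])
```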
|
|
|
### Generation
|
|
|
```python
question = "北京是中国的"
inputs = tokenizer(question, return_tensors='pt')

# Nucleus (top-p) sampling: draw 5 independent continuations,
# each capped at 150 tokens including the prompt.
generation_output = model.generate(**inputs,
                                   return_dict_in_generate=True,
                                   output_scores=True,
                                   max_length=150,
                                   # max_new_tokens=80,
                                   do_sample=True,
                                   top_p=0.6,
                                   # num_beams=5,
                                   eos_token_id=50256,  # id of <|endoftext|>
                                   pad_token_id=0,
                                   num_return_sequences=5)

# Decode each sequence and cut it off at the end-of-text marker.
for idx, sentence in enumerate(generation_output.sequences):
    print('next sentence %d:\n' % idx,
          tokenizer.decode(sentence).split('<|endoftext|>')[0])
    print('*' * 40)
```
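
The commented-out `num_beams=5` above hints at a deterministic alternative to sampling. Below is a minimal beam-search sketch with the same model; the parameter values are illustrative assumptions, not taken from this card.

```python
# Beam search keeps the 5 highest-scoring hypotheses at each step:
# deterministic and often more fluent, but less diverse, than sampling.
beam_output = model.generate(**inputs,
                             max_new_tokens=80,
                             num_beams=5,
                             early_stopping=True,
                             eos_token_id=50256,  # id of <|endoftext|>
                             pad_token_id=0)
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))
```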
|
|
|
## Citation |
|
If you find this resource useful, please cite the following repository in your paper.
|
```
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```