---
language:
  - zh
inference:
  parameters:
    max_new_tokens: 128
    repetition_penalty: 25
    top_p: 0.9
    do_sample: true
license: apache-2.0
---

# Wenzhong-GPT2-110M

Wenzhong-GPT2-110M is a Chinese model, one of the models of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).

Wenzhong-GPT2-110M is the member of the Wenzhong series with the fewest parameters (110M); it is the base version of GPT2.

## Usage

### Load model

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

hf_model_path = 'IDEA-CCNL/Wenzhong-GPT2-110M'
tokenizer = GPT2Tokenizer.from_pretrained(hf_model_path)
model = GPT2LMHeadModel.from_pretrained(hf_model_path)
```
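For quick experiments, the same checkpoint can also be driven through the high-level `pipeline` API. This is a minimal sketch added for illustration, not part of the original card:

```python
# Optional sketch: quick generation via the high-level pipeline API.
# Uses the same IDEA-CCNL/Wenzhong-GPT2-110M checkpoint loaded above.
from transformers import pipeline

generator = pipeline('text-generation', model='IDEA-CCNL/Wenzhong-GPT2-110M')
# Generation kwargs are forwarded to model.generate().
print(generator('北京是中国的', max_new_tokens=50, do_sample=True, top_p=0.9))
```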

### Generation

```python
question = "北京是中国的"
inputs = tokenizer(question, return_tensors='pt')
generation_output = model.generate(**inputs,
                                   return_dict_in_generate=True,
                                   output_scores=True,
                                   max_length=150,
                                   # max_new_tokens=80,
                                   do_sample=True,
                                   top_p=0.6,
                                   # num_beams=5,
                                   eos_token_id=50256,  # GPT2's <|endoftext|> token id
                                   pad_token_id=0,
                                   num_return_sequences=5)

# Decode each returned sequence and truncate at the end-of-text marker.
for idx, sentence in enumerate(generation_output.sequences):
    print('next sentence %d:\n' % idx,
          tokenizer.decode(sentence).split('<|endoftext|>')[0])
    print('*' * 40)
```
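The commented-out `num_beams=5` above hints at deterministic decoding. As an illustration (not part of the original card), here is a minimal beam-search sketch that reuses `model`, `tokenizer`, and `inputs` from the snippet above:

```python
# Beam-search sketch: deterministic decoding instead of top-p sampling.
# Assumes `model`, `tokenizer`, and `inputs` are defined as above.
beam_output = model.generate(**inputs,
                             max_new_tokens=80,
                             num_beams=5,
                             do_sample=False,
                             eos_token_id=50256,
                             pad_token_id=0,
                             early_stopping=True)
print(tokenizer.decode(beam_output[0]).split('<|endoftext|>')[0])
```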

## Citation

If you find this resource useful, please cite the following website in your paper.

```bibtex
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```