
飞雪连天射白鹿,笑书神侠倚碧鸳 (the couplet formed from the first characters of Jin Yong's fourteen novel titles)

Model description

AI-generated Jin Yong-style fiction: given an opening passage, the model continues the story.

How to use

Call the model with a pipeline:

>>> from transformers import AutoTokenizer, GPT2LMHeadModel, TextGenerationPipeline

>>> # Load the fine-tuned model (local checkpoint directory or Hub id)
>>> model_id = "jinyong-gpt2-finetuning"
>>> tokenizer = AutoTokenizer.from_pretrained(model_id)
>>> model = GPT2LMHeadModel.from_pretrained(model_id)
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> # Pad with the EOS token so generation does not warn about a missing pad token
>>> text_generator.model.config.pad_token_id = text_generator.model.config.eos_token_id

>>> # Opening passage to be continued
>>> sent = "这些雪花落下来,多么白,多么好看.过几天太阳出来,每一片 雪花都变得无影无踪.到得明年冬天,又有许很多多雪花,只不过已不是 今年这些雪花罢了。"
>>> text_generator(sent, max_length=108, do_sample=True)
[{'generated_text': '这些雪花落下来,多么白,多么好看.过几天太阳出来,每一片 雪花都变得无影无踪.到得明年冬天,又有许很多多雪花,只不过已不是 今年这些雪花罢了。 反正 老天爷 有眼 , 不知 哪里 是甚么 风 险 ?” 正 说到此处 , 突然 听得 谢逊 啸声 渐近 , 忍不住 张口 惊呼 , 一齐 向他 扑去 , 只听 谢逊 一声 怒吼 , 跟着 左手 用力 拍 出一掌 , 以 掌力 化开 。 众人 吃了一惊 , 同时 从 海 道 中 跃出 , 双双 倒退 。 张翠山和殷素素 对望一眼 , 均想 以 这两 大高手 之力 如何 抵挡 , 以 今日 之力 如何 攻敌 之'}]
>>> 

Here is how to use this model to get the features of a given text in PyTorch:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("supermy/jinyong-gpt2")
model = AutoModelForCausalLM.from_pretrained("supermy/jinyong-gpt2")
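
With the model and tokenizer loaded, hidden-state features can be taken from a forward pass. The sketch below is illustrative only; the input text and the choice of the last hidden layer are assumptions, not part of the original card:

import torch

text = "这些雪花落下来,多么白,多么好看。"  # any input text
inputs = tokenizer(text, return_tensors="pt")

# Forward pass without gradients, requesting the hidden states of every layer
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# outputs.hidden_states is a tuple of (num_layers + 1) tensors of shape
# (batch_size, sequence_length, hidden_size); take the last layer as features
features = outputs.hidden_states[-1]
print(features.shape)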

Training data

The model is trained on the collection of Jin Yong's novels covered by the couplet 【飞雪连天射白鹿,笑书神侠倚碧鸳】.

Statistics


Training procedure

Base model: GPT-2. Training environment: an NVIDIA GPU with 16 GB of memory.
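
The training logs below were produced by the Hugging Face Trainer. A rough, non-authoritative sketch of training arguments consistent with the logged values (per-device batch size 12, no gradient accumulation, epoch count raised to 368 across resumed runs); the output directory, learning rate, and every other value not visible in the logs are assumptions:

from transformers import TrainingArguments

# Settings mirroring the logged run; values not visible in the logs are guesses.
training_args = TrainingArguments(
    output_dir="jinyong-gpt2-finetuning",  # assumed checkpoint directory
    per_device_train_batch_size=12,        # "Instantaneous batch size per device = 12"
    gradient_accumulation_steps=1,         # "Gradient Accumulation steps = 1"
    num_train_epochs=368,                  # final epoch count after resuming twice
    learning_rate=5e-5,                    # consistent with the logged decay from ~4.97e-5
)

# Training was resumed from saved checkpoints, e.g.
# trainer.train(resume_from_checkpoint=True)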

BPE tokenization: vocab_size = 30000
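
One way to build a BPE vocabulary of this size is with the tokenizers library; the corpus path, minimum frequency, and special tokens below are placeholders, not taken from the original training setup:

from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer on the novel corpus (path is a placeholder)
bpe_tokenizer = ByteLevelBPETokenizer()
bpe_tokenizer.train(
    files=["jinyong_corpus.txt"],
    vocab_size=30000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
bpe_tokenizer.save_model("jinyong-gpt2-tokenizer")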

[INFO|trainer.py:1608] 2022-12-02 19:52:59,024 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-02 19:52:59,024 >>   Num examples = 9443
[INFO|trainer.py:1610] 2022-12-02 19:52:59,024 >>   Num Epochs = 108
[INFO|trainer.py:1611] 2022-12-02 19:52:59,024 >>   Instantaneous batch size per device = 12
[INFO|trainer.py:1612] 2022-12-02 19:52:59,024 >>   Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:1613] 2022-12-02 19:52:59,024 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-02 19:52:59,024 >>   Total optimization steps = 84996
[INFO|trainer.py:1616] 2022-12-02 19:52:59,025 >>   Number of trainable parameters = 124439808

[INFO|trainer.py:1608] 2022-12-03 21:44:00,182 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-03 21:44:00,182 >>   Num examples = 9443
[INFO|trainer.py:1610] 2022-12-03 21:44:00,182 >>   Num Epochs = 216
[INFO|trainer.py:1611] 2022-12-03 21:44:00,182 >>   Instantaneous batch size per device = 12
[INFO|trainer.py:1612] 2022-12-03 21:44:00,182 >>   Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:1613] 2022-12-03 21:44:00,182 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-03 21:44:00,182 >>   Total optimization steps = 169992
[INFO|trainer.py:1616] 2022-12-03 21:44:00,183 >>   Number of trainable parameters = 124439808
[INFO|trainer.py:1637] 2022-12-03 21:44:00,184 >>   Continuing training from checkpoint, will skip to saved global_step
[INFO|trainer.py:1638] 2022-12-03 21:44:00,184 >>   Continuing training from epoch 107
[INFO|trainer.py:1639] 2022-12-03 21:44:00,184 >>   Continuing training from global step 84500

[INFO|trainer.py:1608] 2022-12-05 07:36:13,626 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-05 07:36:13,626 >>   Num examples = 9443
[INFO|trainer.py:1610] 2022-12-05 07:36:13,626 >>   Num Epochs = 368
[INFO|trainer.py:1611] 2022-12-05 07:36:13,626 >>   Instantaneous batch size per device = 12
[INFO|trainer.py:1612] 2022-12-05 07:36:13,626 >>   Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:1613] 2022-12-05 07:36:13,626 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-05 07:36:13,626 >>   Total optimization steps = 289616
[INFO|trainer.py:1616] 2022-12-05 07:36:13,627 >>   Number of trainable parameters = 124439808
[INFO|trainer.py:1637] 2022-12-05 07:36:13,628 >>   Continuing training from checkpoint, will skip to saved global_step
[INFO|trainer.py:1638] 2022-12-05 07:36:13,628 >>   Continuing training from epoch 255
[INFO|trainer.py:1639] 2022-12-05 07:36:13,628 >>   Continuing training from global step 201000

{'loss': 8.0431, 'learning_rate': 4.970998635229893e-05, 'epoch': 0.64}
{'loss': 7.4867, 'learning_rate': 4.94158548637583e-05, 'epoch': 1.27}
{'loss': 7.322, 'learning_rate': 4.912172337521766e-05, 'epoch': 1.91}
......
{'loss': 3.901, 'learning_rate': 2.5010882865076008e-05, 'epoch': 108.01}
{'loss': 3.8959, 'learning_rate': 2.4863817120805686e-05, 'epoch': 108.64}
......
{'loss': 3.1625, 'learning_rate': 4.6090404254317857e-07, 'epoch': 214.1}
{'loss': 3.1592, 'learning_rate': 3.1413242976140055e-07, 'epoch': 214.74}
{'loss': 3.1625, 'learning_rate': 1.6706668549108195e-07, 'epoch': 215.37}
{'train_runtime': 72271.9602, 'train_samples_per_second': 28.222, 'train_steps_per_second': 2.352, 'train_loss': 1.7180436183842016, 'epoch': 216.0}
{'loss': 2.7087, 'learning_rate': 4.2642671675598036e-08, 'epoch': 367.85}
{'train_runtime': 74859.0808, 'train_samples_per_second': 46.421, 'train_steps_per_second': 3.869, 'train_loss': 0.8725239146935282, 'epoch': 368.0}
***** train metrics *****
  epoch                    =       368.0
  train_loss               =      0.8725
  train_runtime            = 20:47:39.08
  train_samples            =        9443
  train_samples_per_second =      46.421
  train_steps_per_second   =       3.869
12/06/2022 04:23:55 - INFO - __main__ - *** Evaluate ***
[INFO|trainer.py:2929] 2022-12-06 04:23:55,953 >> ***** Running Evaluation *****
[INFO|trainer.py:2931] 2022-12-06 04:23:55,953 >>   Num examples = 283
[INFO|trainer.py:2934] 2022-12-06 04:23:55,954 >>   Batch size = 12
100%|██████████| 24/24 [00:07<00:00,  3.20it/s]
[INFO|modelcard.py:449] 2022-12-06 04:24:04,760 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}, 'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.19599206157122803}]}
***** eval metrics *****
  epoch                   =      368.0
  eval_accuracy           =      0.196
  eval_loss               =     7.9524
  eval_runtime            = 0:00:07.87
  eval_samples            =        283
  eval_samples_per_second =      35.94
  eval_steps_per_second   =      3.048
  perplexity              =  2842.2766
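
The reported perplexity is the exponential of the evaluation loss, which can be checked directly:

import math

eval_loss = 7.9524
print(math.exp(eval_loss))  # ≈ 2842.28, matching the perplexity reported above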