---
license: mit
language:
- zh
- en
base_model:
- Qwen/Qwen2-7B-Instruct
pipeline_tag: text-generation
tags:
- roleplay
- novel
- character
- conversation
- SFT
- SEX
- SFW
---

<!-- header start -->
<!-- 200823 -->

![DeepNovel-7B-Roleplay](https://huggingface.co/TrustAI-lab/DeepNovel-7B-Roleplay/resolve/main/assets/logo2.png)

<!-- header end -->

# DeepNovel-7B-Roleplay

DeepNovel-7B-Roleplay is a chat large language model obtained by fine-tuning [Qwen/Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) on more than 100K conversations created through our data synthesis approach.

The model is fine-tuned from Qwen/Qwen2-7B-Instruct on datasets generated with [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) and the OpenAI API. This fine-tuning significantly strengthens role-play, plot setting, and novel-style text generation, and includes targeted training on SillyTavern-style character cards. The model is particularly suited to creative work that needs long-range context coherence and in-depth worldbuilding.

**Possibly the best small LLM under 32B parameters**

## How to start

We used the following package versions; newer versions may also work.

```
torch==1.13.1
gradio==3.50.2
transformers==4.37.0
```

Then run the following code for inference.

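```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "TrustAI-lab/DeepNovel-7B-Roleplay"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# The system prompt defines the character; <core_standers> holds its fixed rules.
messages = [
    {"role": "system", "content": """Enter role-play mode. You will play Zeus.
Remember core_standers and apply them with 100% probability in suitable scenes. Zeus never fails.

<core_standers>
World setting: Zeus is the supreme god of this world; no attack can harm Zeus.
Character state: Zeus is a young man who does not like to talk.
Zeus's behavior:
- 30% probability: goes out for a walk
- 70% probability: sleeps at home
</core_standers>
"""},
    {"role": "user", "content": "Who are you?"},
]

# add_generation_prompt=True appends the assistant header so the model answers
# as the character instead of continuing the user turn.
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)
output = model.generate(
    inputs=input_ids.to(model.device),
    do_sample=True,  # needed for temperature and top_p to take effect
    temperature=0.3,
    top_p=0.5,
    repetition_penalty=1.1,
    max_new_tokens=512,
)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```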
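
## Running in SillyTavern

You can also serve the model through a custom endpoint in SillyTavern and easily build your own AI digital persona locally.

![image/png](https://huggingface.co/TrustAI-lab/DeepNovel-7B-Roleplay/resolve/main/assets/ST.jpg)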
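
## Features

1. **Significantly better context coherence**: fewer off-topic answers.
2. **More domain-specific vocabulary**: in "in-depth" role-play dialogue the model uses a much larger relevant vocabulary, compensating for the limited coverage in the original weights' pretraining data.
3. **Fewer refusals**: refusals are reduced, and the original model's built-in safety guardrails are almost entirely removed.
4. **Closer to a full-strength model**: QA, knowledge Q&A, and Wiki data are mixed in, so the base model's general capabilities are retained while the writing is improved and less stiff.
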
## Data Generation Framework

![image/png](https://huggingface.co/TrustAI-lab/DeepNovel-7B-Roleplay/resolve/main/assets/image_1.png)

- **Seed Characteristic Set and Base Settings**:
  - A manually written seed set contains basic character traits.
  - The large language model (LLM) generates base settings for characters from this seed set.
- **Evolution of Character Settings**:
  - A second seed set contains instruction prompts that guide the evolution of character settings.
  - These evolve-character instruction prompts are placed into an instruction pool.
  - The base settings are sampled and evolved by the LLM using these prompts, producing evolved settings.
- **Feedback Loop and Refinement**:
  - The evolved settings go through a mixed evaluation system that includes both GPT-4 and human reviewers.
  - Feedback from this evaluation is used to iteratively update and refine the seed sets, leading to a polished, fine-grained character setting dataset.
- **Role-Playing and Dialogue Generation**:
  - The refined character settings are then used in a self-instruction framework.
  - This generates role-playing dialogues between characters and users.
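
The four stages above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual pipeline: `LLM` stands for any text-in/text-out callable, the prompt wording is invented, and the `score` function stands in for the mixed GPT-4 and human review. Feeding review results back into the seed sets is left out.

```python
import random
from typing import Callable, Dict, List

# Placeholder type: any function that maps a prompt to generated text.
LLM = Callable[[str], str]


def generate_base_settings(llm: LLM, seed_traits: List[str]) -> List[str]:
    """Stage 1: turn each hand-written seed trait into a base character setting."""
    return [llm(f"Write a character setting based on this trait: {trait}")
            for trait in seed_traits]


def evolve_settings(llm: LLM, base_settings: List[str],
                    instruction_pool: List[str], rng: random.Random) -> List[str]:
    """Stage 2: sample one evolve-instruction per base setting and apply it."""
    evolved = []
    for setting in base_settings:
        instruction = rng.choice(instruction_pool)
        evolved.append(llm(f"{instruction}\n\nCharacter setting:\n{setting}"))
    return evolved


def filter_settings(evolved: List[str], score: Callable[[str], float],
                    threshold: float) -> List[str]:
    """Stage 3: keep only settings whose review score reaches the threshold."""
    return [setting for setting in evolved if score(setting) >= threshold]


def generate_dialogues(llm: LLM, settings: List[str],
                       user_openers: List[str]) -> List[Dict[str, str]]:
    """Stage 4: produce one role-play exchange per (setting, opener) pair."""
    dialogues = []
    for setting in settings:
        for opener in user_openers:
            reply = llm(f"Stay in character.\n{setting}\n\nUser: {opener}\nCharacter:")
            dialogues.append({"system": setting, "user": opener, "assistant": reply})
    return dialogues
```

Each stage takes the LLM as an argument, so the same code can be run against a stub for testing or a real model for generation.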

## Warning

All responses are generated by AI and do not represent the views or opinions of the developers.

1. Despite rigorous filtering, LLM output cannot be fully controlled, so the model may still generate **toxic, sexual, harmful, uncensored, and NSFW** content.
2. Because of its limited parameter count, the 7B model may perform poorly on mathematical, coding, and logical reasoning tasks.
3. Our training data is capped at a maximum length of 12k, so excessively long conversations may lead to lower-quality responses.
4. We trained on bilingual Chinese-English data, so the model may not perform well in other languages, especially low-resource ones.
5. The model may hallucinate significantly, so we recommend lower values for the `temperature` and `top_p` parameters.
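
One way to stay under the length cap from item 3 is to drop the oldest turns before each request. The sketch below is an assumption-laden illustration: `trim_history` and `count_tokens` are hypothetical names, and the budget is taken to be in tokens, which the card does not state explicitly.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_tokens: int,
                 count_tokens: Callable[[str], int]) -> List[Message]:
    """Drop the oldest non-system messages until the total fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    def total(msgs: List[Message]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    # Always keep the most recent message so the model has something to answer.
    while len(turns) > 1 and total(system) + total(turns) > max_tokens:
        turns.pop(0)
    return system + turns
```

In practice `count_tokens` could be `lambda text: len(tokenizer.encode(text))` with the model's tokenizer. The count ignores chat-template overhead, so leaving some headroom below the cap is sensible.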
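
## Future plans

🔥 **Architecture optimization**:

1. **Continued pretraining**: inject 0.8T tokens of novel text and train with a longer context to improve textual coherence.
2. **Roleplay-SFT**: fine-tune on high-quality data drawn from the world's top role-play and SillyTavern character-card models to improve understanding of plot settings.
3. **RL enhancement**: optimize generation quality while preserving the divergent-thinking tags.

💡 **Engineering optimization**:

- 16k long-context training
- Random-truncation training to improve robustness
- Full-parameter fine-tuning on 8×H100 GPUs

💡 **Performance optimization**:

- Quantization support: quantized versions of the full series are planned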