File size: 1,186 Bytes
26c02f0
 
 
8014230
9d6957e
 
8014230
 
eb818d2
8014230
0e6972b
 
8014230
 
 
 
9d6957e
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
---
license: apache-2.0
---
We proudly announce this is the world first 128k context model based on RWKV architecture today, 2023-08-10.

This model trained with instructions datasets and chinese web novel and tradition wuxia, 
more trainning details would be updated.

Full finetuned using this repo to train 128k context model , 4*A800 40hours with 1.3B tokens.
https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh

![QQ图片20230810153529.jpg](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/d8ekmc4Lfhy2lYEdrRKXz.jpeg)

Using RWKV Runner https://github.com/josStorer/RWKV-Runner  to test this 

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5zDQVbGb-fX8Y8h98tUF0.png)

![微信截图_20230810142220.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/u2wA-l1UcW-Mt9KIoa_4q.png)

![4UYBX4RA0%8PA{1YSSK)AVW.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/gzr8Yt4JRkBz31-ieRSOE.png)

![QQ图片20230810143840.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/LgEjfHJ7XD7PlGM9b3RAf.png)