---
license: apache-2.0
---
We proudly announce the world's first 128k-context model based on the RWKV architecture, released today, 2023-08-10.
The model was fine-tuned on instruction datasets together with Chinese web novels and traditional wuxia fiction; more training details will be published later.
It was fully fine-tuned to a 128k context length using the script below, running on 4×A800 GPUs for about 40 hours over 1.3B tokens: https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh
Use RWKV Runner (https://github.com/josStorer/RWKV-Runner) to test the model.
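If you prefer scripting over the Runner GUI, the checkpoint can also be loaded with the official `rwkv` pip package. Below is a minimal inference sketch, assuming this is a World-series checkpoint (hence the `rwkv_vocab_v20230424` tokenizer); the checkpoint path and prompt are placeholders, not files shipped with this repo:

```python
# pip install rwkv
import os
os.environ["RWKV_JIT_ON"] = "1"  # enable the TorchScript kernels for faster inference

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Load the downloaded .pth checkpoint; "cuda fp16" keeps the whole model
# on one GPU. Use strategy="cpu fp32" if no GPU is available.
model = RWKV(model="path/to/this-model.pth", strategy="cuda fp16")

# World-series models use the built-in world tokenizer.
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
prompt = "Question: Summarize the following chapter.\n\nAnswer:"
print(pipeline.generate(prompt, token_count=200, args=args))
```

Because RWKV is an RNN at inference time, the 128k context is carried in the recurrent state rather than a fixed attention window, so long prompts are simply fed through token by token.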