---
license: apache-2.0
datasets:
- Norquinal/claude_multiround_chat_30k
- OpenLeecher/Teatime
---

We proudly announce the world's first 128k-context model based on the RWKV architecture, released today, 2023-08-10.

With the RWKV World tokenizer, multiple languages have a 1:1 tokenization ratio: one word maps to one token.
(https://github.com/BlinkDL/ChatRWKV/blob/2a13ddecd81f8fd615b6da3a8f1091a594689e30/tokenizer/rwkv_tokenizer.py#L163)

This model was trained on instruction datasets, Chinese web novels, and traditional wuxia fiction. More training details will be added later.

It has been tested on summarizing 85k tokens into 5 key points. The conversation files are in the example folders, and more cases are coming.

The 128k-context model was fully fine-tuned with the repo below, which took 40 hours on 4×A800 GPUs with 1.3B tokens.
https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh

![QQ图片20230810153529.jpg](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/d8ekmc4Lfhy2lYEdrRKXz.jpeg)

To test this model, use RWKV Runner (https://github.com/josStorer/RWKV-Runner). It needs only 16 GB of VRAM to run in fp16, or 8 GB in fp16i8. Use a temperature of 0.1-0.2 with top-p 0.7 for more precise answers; a temperature between 1 and 2.x gives more creative output.
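
To illustrate why a low temperature with top-p 0.7 gives more precise answers, here is a minimal sketch of temperature scaling followed by top-p (nucleus) filtering. It is a generic illustration in plain Python, not the actual sampler used by RWKV Runner; the function name `sample_top_p` and the toy logits are made up for this example.

```python
import math
import random


def sample_top_p(logits, temperature=0.2, top_p=0.7, rng=random):
    """Sample a token index using temperature scaling and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the smallest set of most-likely tokens whose cumulative
    # probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # random.choices renormalizes the weights of the kept tokens.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# Hypothetical scores for a 4-token vocabulary (not real model output).
toy_logits = [3.0, 2.0, 1.0, 0.0]
rng = random.Random(0)
precise = [sample_top_p(toy_logits, 0.2, 0.7, rng) for _ in range(200)]
creative = [sample_top_p(toy_logits, 1.5, 0.7, rng) for _ in range(200)]
print(sorted(set(precise)), sorted(set(creative)))
```

With these toy logits, a temperature of 0.2 leaves only the top token inside the top-p cut, so every sample is the same index. At 1.5 the distribution flattens enough that the second token also survives the cut, which is where the extra variety comes from.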

![微信截图_20230810162303.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/Ww45-WMngl4Jyt1OZDAa_.png)

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5zDQVbGb-fX8Y8h98tUF0.png)

![微信截图_20230810142220.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/u2wA-l1UcW-Mt9KIoa_4q.png)

![4UYBX4RA0%8PA{1YSSK)AVW.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/gzr8Yt4JRkBz31-ieRSOE.png)

![QQ图片20230810143840.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/LgEjfHJ7XD7PlGM9b3RAf.png)

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/b_6KCBdZKW7Q7HwipxE-l.png)

85k tokens test

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/F9unOJfhmJPXsciPHLsrl.png)

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/35j5C1QD_4cO-AjfxV7tl.png)

![微信截图_20230810201844.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5dVQrJxg05C0ww7_AVhrW.png)