BlinkDL posted an update 7 days ago
RWKV-7 "Goose" 0.4B, trained with ctx4k, automatically extrapolates to ctx32k+ and perfectly solves NIAH (needle in a haystack) at ctx16k 🤯 It is 100% RNN and attention-free, trained only on the Pile, with no finetuning. The training runs are replicable. Tested by our community: https://github.com/Jellyfish042/LongMamba