Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • 2412.09871 • Published Dec 13, 2024
Adaptive Length Image Tokenization via Recurrent Allocation • Paper • 2411.02393 • Published Nov 4, 2024
A failed experiment: Infini-Attention, and why we should keep trying? • Article • Published Aug 14, 2024