Papers - Multilingual - Encoders - Bytes Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Training - Bytes - Dynamic Patch Sizes Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Text - Dataset - Classification - Multitask - MMLU Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Text - Dataset - Coding - MBPP Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Text - Eval - Coding - Python Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Embeddings - Bytes - BPB - Larger Patches than BPE Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Text - Dataset - Datacomp-LM Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Embeddings - Bytes - Tokenizer Free Collection by matlok Dec 25, 2024 - MrT5: Dynamic Token Merging for Efficient Byte-level Language Models Paper • 2410.20771 • Published Oct 28, 2024 • 3; Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Training - Text - Datasets - Coding - GitHub Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Papers - Text - Character Level Transformers Collection by matlok Dec 25, 2024 - Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89