🚧"raw" pretrained smol_llama checkpoints - WIP 🚧
BEEspoke Data
community
AI & ML interests
'an LLM is only as good as the dataset it was trained on' - Sun Tzu
Organization Card
🐝📊💁
Collections
7
smol_llama 220M fine-tunes we did
BEE-spoke-data/smol_llama-220M-openhermes
Text Generation • Updated • 1.37k • 5
BEE-spoke-data/smol_llama-220M-open_instruct
Text Generation • Updated • 23 • 1
BEE-spoke-data/beecoder-220M-python
Text Generation • Updated • 15 • 2
BEE-spoke-data/zephyr-220m-sft-full
Text Generation • Updated • 1.24k • 1
spaces
1
models
52
BEE-spoke-data/pegasus-x-base-synthsumm_open-16k
Summarization • Updated • 110
BEE-spoke-data/tFINE-680m-e32-d16-gqa-flan
Text2Text Generation • Updated • 59
BEE-spoke-data/tFINE-680m-e32-d16-infinity_instruct-L2
Text2Text Generation • Updated • 18
BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e
Text2Text Generation • Updated • 11
BEE-spoke-data/tFINE-900m-instruct-orpo
Text2Text Generation • Updated • 60
BEE-spoke-data/smol_llama-220M-openhermes
Text Generation • Updated • 1.37k • 5
BEE-spoke-data/tFINE-900m-e16-d32-instruct
Text2Text Generation • Updated • 11
BEE-spoke-data/tFINE-900m-e16-d32-flan
Text2Text Generation • Updated • 9
BEE-spoke-data/slimpajama_tok-48128-BPE-forT5
Updated
BEE-spoke-data/claude-tokenizer-forT5
Updated
datasets
71
BEE-spoke-data/roastme-processed-chunks
Updated
BEE-spoke-data/TxT360-5M-sample-en
Viewer • Updated • 10M • 192 • 2
BEE-spoke-data/TxT360-500k-sample-no_cc
Viewer • Updated • 500k • 40
BEE-spoke-data/TxT360-1M-sample
Viewer • Updated • 1M • 91
BEE-spoke-data/survivorslib-law-books
Viewer • Updated • 49 • 52
BEE-spoke-data/roastme-filtered
Viewer • Updated • 78.8k • 52
BEE-spoke-data/taskweb
Viewer • Updated • 1.05M • 33
BEE-spoke-data/FLAN-compressed-plusplus
Viewer • Updated • 124M • 220 • 1
BEE-spoke-data/FLAN-compressed
Viewer • Updated • 338M • 102 • 1
BEE-spoke-data/synthsumm-comparisons
Viewer • Updated • 4.67k • 30