BEE-spoke-data/smol_llama-101M-GQA-GGUF

Quantized GGUF model files for smol_llama-101M-GQA from BEE-spoke-data

Name	Quant method	Size
smol_llama-101m-gqa.fp16.gguf	fp16	203.28 MB
smol_llama-101m-gqa.q2_k.gguf	q2_k	50.93 MB
smol_llama-101m-gqa.q3_k_m.gguf	q3_k_m	57.06 MB
smol_llama-101m-gqa.q4_k_m.gguf	q4_k_m	65.40 MB
smol_llama-101m-gqa.q5_k_m.gguf	q5_k_m	74.34 MB
smol_llama-101m-gqa.q6_k.gguf	q6_k	83.83 MB
smol_llama-101m-gqa.q8_0.gguf	q8_0	108.35 MB
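
To make the size trade-off in the table easier to compare, the following is a minimal sketch that expresses each file as a fraction of the fp16 file. The sizes are copied from the table above; the `relative_size` helper is a hypothetical name introduced only for this illustration.

```python
# File sizes in MB, copied from the table above.
sizes_mb = {
    "fp16": 203.28,
    "q2_k": 50.93,
    "q3_k_m": 57.06,
    "q4_k_m": 65.40,
    "q5_k_m": 74.34,
    "q6_k": 83.83,
    "q8_0": 108.35,
}


def relative_size(quant: str, baseline: str = "fp16") -> float:
    """Return the size of `quant` as a fraction of the `baseline` file."""
    return sizes_mb[quant] / sizes_mb[baseline]


# Print one line per file: name, size, and share of the fp16 size.
for name, size in sizes_mb.items():
    print(f"{name:8s} {size:7.2f} MB  {relative_size(name):.1%} of fp16")
```

The ratios do not track the nominal bit width exactly (q2_k is roughly a quarter of fp16, not an eighth). This is plausibly because k-quant schemes mix precisions across tensors and the file also carries metadata, but that explanation is an assumption and has not been checked against these files.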

Original Model Card:

smol_llama-101M-GQA

A small decoder-only model with 101M total parameters. This is the first version of the model.

768 hidden size, 6 layers
GQA (24 query heads, 8 key-value heads), context length 1024
trained from scratch
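
The attention geometry implied by these numbers can be worked out with a short sketch. It assumes the usual Llama convention that the per-head dimension is the hidden size divided by the number of query heads, and it assumes a 2-byte (fp16) KV cache; neither is stated in this card. `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
# Architecture values from the list above.
hidden_size = 768
num_layers = 6
num_query_heads = 24
num_kv_heads = 8
context_length = 1024

# Assumption: head_dim = hidden_size / num_query_heads (standard Llama layout).
head_dim = hidden_size // num_query_heads
# Under GQA, each key-value head is shared by this many query heads.
queries_per_kv_head = num_query_heads // num_kv_heads
# Width of the K (or V) projection output per token.
kv_width = num_kv_heads * head_dim


def kv_cache_bytes(seq_len: int, width: int = kv_width, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size: K and V tensors for every layer and position."""
    return 2 * num_layers * seq_len * width * bytes_per_value


gqa_bytes = kv_cache_bytes(context_length)
# For comparison: full multi-head attention would cache hidden_size values per token.
mha_bytes = kv_cache_bytes(context_length, width=hidden_size)

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv_head}")
print(f"KV cache at full context: {gqa_bytes / 2**20:.1f} MiB (GQA) "
      f"vs {mha_bytes / 2**20:.1f} MiB (MHA)")
```

Under these assumptions, sharing each key-value head across three query heads cuts the cache to one third of what full multi-head attention would need.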

Notes

This checkpoint is the 'raw' pre-trained model and has not been tuned for any specific task. In most cases it should be fine-tuned before use.

Checkpoints & Links

smol-er 81M parameter checkpoint with in/out embeddings tied: here
Fine-tuned on pypi to generate Python code - link
For the chat version of this model, please see here

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	25.32
ARC (25-shot)	23.55
HellaSwag (10-shot)	28.77
MMLU (5-shot)	24.24
TruthfulQA (0-shot)	45.76
Winogrande (5-shot)	50.67
GSM8K (5-shot)	0.83
DROP (3-shot)	3.39
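
As a quick consistency check, the reported average can be recomputed from the seven benchmark rows. This sketch assumes the average is an unweighted mean of the rows in the table, which matches the numbers here but is not stated in the card.

```python
# Scores copied from the table above (excluding the "Avg." row).
scores = {
    "ARC (25-shot)": 23.55,
    "HellaSwag (10-shot)": 28.77,
    "MMLU (5-shot)": 24.24,
    "TruthfulQA (0-shot)": 45.76,
    "Winogrande (5-shot)": 50.67,
    "GSM8K (5-shot)": 0.83,
    "DROP (3-shot)": 3.39,
}

# Assumption: the leaderboard average is a plain unweighted mean.
average = sum(scores.values()) / len(scores)
print(f"Recomputed average: {average:.2f}")
```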
