Introducing UNA-ThePitbull Series
Community Article
Published
June 1, 2024
We are happy to announce the release of our latest model, UNA-ThePitbull, the most powerful model below 70B in the industry. In this new generation, inspired by our previous Beagle series, we curated a model that balances EQ and IQ nicely. It was trained with some of the latest datasets, including:
Replete-AI/code_bagel_hermes-2.5
mlabonne/orpo-dpo-mix-40k
jondurbin/py-dpo-v0.1
The model is available on the Hub at fblgit/UNA-ThePitbull-21.4B-v2, and you can grab quantized versions, sponsored by @bartowski, at bartowski/UNA-ThePitbull-21.4B-v2-GGUF, fully compatible with Ollama, llama.cpp, etc.
Evaluations
Detailed Evaluation results can be found here
| Metric | Value |
|---|---|
| Avg. | 77.82 |
| AI2 Reasoning Challenge (25-Shot) | 77.73 |
| HellaSwag (10-Shot) | 91.79 |
| MMLU (5-Shot) | 68.25 |
| TruthfulQA (0-Shot) | 78.24 |
| Winogrande (5-Shot) | 87.37 |
| GSM8k (5-Shot) | 63.53 |
UNA
In this case we tried something new: alternating uniformity across both the MLP and attention layers, reducing computational requirements while keeping the result highly performant.
Bonus
We trained him under the following terms:
- ThePitbull-v1 as base: SFT, maxLR 1e-4, minLR 5e-5, for 1 epoch
- DPO, maxLR 1e-4, minLR 5e-5, for 1 epoch
You can continue the training by simply using a maxLR of 5e-5 and 0 warmup steps; this should minimize catastrophic forgetting in the model.
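The LR terms above can be sketched as a simple schedule helper. This is a minimal illustration, not the actual training code: the post only states the maxLR/minLR bounds and the warmup advice, so the cosine decay shape, the `total_steps` value, and the function name are assumptions for illustration.

```python
import math

def lr_schedule(step, total_steps, max_lr=1e-4, min_lr=5e-5, warmup_steps=0):
    """Decay from max_lr to min_lr over total_steps, with optional linear warmup.

    max_lr/min_lr defaults match the SFT and DPO phases described above.
    The cosine shape is an assumption -- the post gives only the LR bounds,
    not the scheduler type.
    """
    if warmup_steps and step < warmup_steps:
        # Linear warmup from 0 up to max_lr
        return max_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Original phases: starts at maxLR 1e-4, decays toward minLR 5e-5
print(lr_schedule(0, 1000))     # 1e-4 at the first step
print(lr_schedule(1000, 1000))  # 5e-5 at the last step

# Continued training as suggested: maxLR 5e-5 and 0 warmup steps
print(lr_schedule(0, 1000, max_lr=5e-5, warmup_steps=0))
```

Starting continued training at the previous run's minimum LR, with no warmup, keeps the update magnitudes small from the first step, which is the intuition behind the catastrophic-forgetting advice above.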
Remember, if you do so, please include a Pitbull picture on your model and cite us :) Have fun!