Introducing UNA-ThePitbull Series

Community Article Published June 1, 2024

UNA - ThePitbull 21.4B v2

We are happy to announce the release of our latest model, UNA-ThePitbull, the most powerful model below 70B in the industry. In this new generation, inspired by our previous Beagle series, we curated a model that nicely balances EQ and IQ. It was trained with some of the latest datasets, including:

  • Replete-AI/code_bagel_hermes-2.5
  • mlabonne/orpo-dpo-mix-40k
  • jondurbin/py-dpo-v0.1

The model is available on the Hub at fblgit/UNA-ThePitbull-21.4B-v2, and you can grab quantized versions, sponsored by @bartowski, at bartowski/UNA-ThePitbull-21.4B-v2-GGUF. These are fully compatible with Ollama, llama.cpp, etc.
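For readers who prefer the full-precision weights over the GGUF quants, here is a minimal sketch of loading the model with the `transformers` library. The helper name `generate`, the bfloat16 dtype, and the plain-prompt format are assumptions made for illustration, not settings documented for this model; check the model card for the recommended prompt template.

```python
MODEL_ID = "fblgit/UNA-ThePitbull-21.4B-v2"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model from the Hub and complete a single prompt.

    Hypothetical helper: it reloads the weights on every call, which is
    fine for a quick test but not for serving.
    """
    # Imported lazily so that defining this function stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
        device_map="auto",           # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Write a haiku about pitbulls.")` downloads the full checkpoint on first use, so expect a large download and significant GPU memory use for a 21.4B model in bf16.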

Evaluations

Detailed Evaluation results can be found here

Metric                               Value
Avg.                                 77.82
AI2 Reasoning Challenge (25-Shot)    77.73
HellaSwag (10-Shot)                  91.79
MMLU (5-Shot)                        68.25
TruthfulQA (0-shot)                  78.24
Winogrande (5-shot)                  87.37
GSM8k (5-shot)                       63.53
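As a quick sanity check, the reported average appears to be the unweighted mean of the six benchmark scores. The short sketch below recomputes it from the numbers in the table (the dictionary keys are just labels chosen for this example):

```python
# Scores copied from the evaluation table above.
scores = {
    "ARC (25-shot)": 77.73,
    "HellaSwag (10-shot)": 91.79,
    "MMLU (5-shot)": 68.25,
    "TruthfulQA (0-shot)": 78.24,
    "Winogrande (5-shot)": 87.37,
    "GSM8k (5-shot)": 63.53,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 77.82
```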

UNA

In this case, we tried something new: alternating uniformity across the layers of both MLP and Attention, which reduces computational requirements while keeping performance high.

Bonus

We trained him under these terms:

  • ThePitbull-v1 as base: SFT maxLR 1e-4 minLR 5e-5 for 1 Epoch
  • DPO maxLR 1e-4 minLR 5e-5 for 1 Epoch

You can continue the training by simply using a maxLR of 5e-5 and 0 warmup steps; this should minimize catastrophic forgetting in the model.
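To illustrate why this works, here is a minimal sketch of a learning-rate schedule that decays from maxLR to minLR. The article does not say which scheduler was used, so the cosine shape, the function name `cosine_lr`, the step counts, and the continuation minLR of 5e-6 are all assumptions for this example.

```python
import math


def cosine_lr(step, total_steps, max_lr, min_lr, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by cosine decay.

    Assumption: a cosine shape. The article only gives maxLR and minLR.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp up to max_lr during warmup.
        return max_lr * (step + 1) / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine interpolation from max_lr (progress=0) to min_lr (progress=1).
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


TOTAL = 1000  # hypothetical number of optimizer steps in one epoch

# Original SFT/DPO stage: maxLR 1e-4 decaying to minLR 5e-5.
end_of_stage = cosine_lr(TOTAL, TOTAL, max_lr=1e-4, min_lr=5e-5)

# Continued training: maxLR 5e-5, 0 warmup steps (min_lr here is hypothetical).
start_of_continuation = cosine_lr(0, TOTAL, max_lr=5e-5, min_lr=5e-6)
```

Under these assumptions the continuation starts at the same learning rate the previous stage ended on (5e-5), so there is no sudden jump in step size when training resumes.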

Remember if you do so, please include a Pitbull picture on your model and cite :) Have fun!