This is the first version of the Llama-3 upscaling. Version 2 is now out and does not have any of the issues that this version has. Please use version 2 instead, linked below:
Llama-3-13B
Thank you to Meta for the Meta-Llama-3-8B weights.
This is an upscaling of Llama-3-8B using the techniques created for Mistral-Evolved-11b-v0.1. The model has been upscaled from 8B to 13B parameters without any continued pretraining or fine-tuning.
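For illustration, below is a rough sketch of how this kind of depth upscaling can be done by duplicating decoder layers with transformers. The layer ranges, output path, and exact merge recipe are assumptions for the example only, not the recipe actually used for this model.

```python
# Sketch of depth upscaling by duplicating decoder layers.
# Assumption: keep layers 0-23, then append copies of layers 8-31;
# the real recipe for this model may differ.
import copy
import torch
from torch import nn
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.float16,
)

layers = base.model.layers  # Llama-3-8B has 32 decoder layers
new_layers = [layers[i] for i in range(0, 24)]
new_layers += [copy.deepcopy(layers[i]) for i in range(8, 32)]

# Re-index layer_idx so KV caching lines up with the new depth.
for idx, layer in enumerate(new_layers):
    layer.self_attn.layer_idx = idx

base.model.layers = nn.ModuleList(new_layers)
base.config.num_hidden_layers = len(new_layers)
base.save_pretrained("Llama-3-13B-upscaled")  # hypothetical output directory
```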
From testing, the model seems to work well in fp16, but shows some issues with 4-bit quantization via bitsandbytes.
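If you want to try it, here is a minimal loading sketch in fp16 with transformers; the repo id below is an assumed placeholder for this model's Hub path.

```python
# Minimal fp16 loading sketch. The repo id is an assumption; replace it
# with the actual Hub path for this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Replete-AI/Llama-3-13B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 works well in testing
    device_map="auto",
)

# 4-bit loading via bitsandbytes (BitsAndBytesConfig(load_in_4bit=True))
# showed some issues in testing, so fp16 is used here instead.
inputs = tokenizer("The meaning of life is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```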
The model that was used to create this one is linked below:
https://huggingface.co./meta-llama/Meta-Llama-3-8B
- Llama-3-13B
Metric | Value |
---|---|
Avg. | 54.61 |
AI2 Reasoning Challenge (25-Shot) | 52.99 |
HellaSwag (10-Shot) | 80.66 |
MMLU (5-Shot) | 62.12 |
TruthfulQA (0-shot) | 39.28 |
Winogrande (5-shot) | 70.72 |
GSM8k (5-shot) | 21.91 |
- Original Meta-Llama-3-8B
Metric | Value |
---|---|
Avg. | 62.87 |
AI2 Reasoning Challenge (25-Shot) | 59.47 |
HellaSwag (10-Shot) | 82.09 |
MMLU (5-Shot) | 66.69 |
TruthfulQA (0-shot) | 43.90 |
Winogrande (5-shot) | 77.35 |
GSM8k (5-shot) | 45.34 |