---
language:
- en
license: llama3.2
tags:
- enigma
- valiant
- valiant-labs
- llama
- llama-3.2
- llama-3.2-instruct
- llama-3.2-instruct-3b
- llama-3
- llama-3-instruct
- llama-3-instruct-3b
- 3b
- code
- code-instruct
- python
- conversational
- chat
- instruct
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- sequelbox/Tachibana
- sequelbox/Supernova
pipeline_tag: text-generation
model_type: llama
model-index:
- name: Llama3.2-3B-Enigma
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 47.75
      name: strict accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.2-3B-Enigma
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 18.81
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.2-3B-Enigma
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 6.65
      name: exact match
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.2-3B-Enigma
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 1.45
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.2-3B-Enigma
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 4.54
      name: acc_norm
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.2-3B-Enigma
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 15.41
      name: accuracy
    source:
      url: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.2-3B-Enigma
      name: Open LLM Leaderboard
---
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64f267a8a4f79a118e0fcc89/it7MY5MyLCLpFQev5dUis.jpeg)
Enigma is a code-instruct model built on Llama 3.2 3B.
- High-quality code-instruct performance with the Llama 3.2 Instruct chat format
- Finetuned on synthetic code-instruct data generated with Llama 3.1 405B. [Find the current version of the dataset here!](https://huggingface.co./datasets/sequelbox/Tachibana)
- Overall chat performance supplemented with [generalist synthetic data.](https://huggingface.co./datasets/sequelbox/Supernova)
## Version
This is the **2024-09-30** release of Enigma for Llama 3.2 3B, enhancing code-instruct and general chat capabilities.
Help us and recommend Enigma to your friends! We're excited for more Enigma releases in the future.
## Prompting Guide
Enigma uses the [Llama 3.2 Instruct](https://huggingface.co./meta-llama/Llama-3.2-3B-Instruct) prompt format. The example script below can be used as a starting point for general chat:
```python
import transformers
import torch

model_id = "ValiantLabs/Llama3.2-3B-Enigma"

# Load the model in bfloat16 and let transformers place it on available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Messages use the Llama 3.2 Instruct chat format; the pipeline applies
# the chat template automatically.
messages = [
    {"role": "system", "content": "You are Enigma, a highly capable code assistant."},
    {"role": "user", "content": "Can you explain virtualization to me?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=1024,
)

# The last element of generated_text is the assistant's reply.
print(outputs[0]["generated_text"][-1])
```
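For finer control over prompting, the tokenizer's chat template renders the same Llama 3.2 Instruct format as a raw string. A minimal sketch, assuming the tokenizer ships with the standard chat template:

```python
from transformers import AutoTokenizer

# Load the tokenizer, which carries the Llama 3.2 Instruct chat template.
tokenizer = AutoTokenizer.from_pretrained("ValiantLabs/Llama3.2-3B-Enigma")

messages = [
    {"role": "system", "content": "You are Enigma, a highly capable code assistant."},
    {"role": "user", "content": "Can you explain virtualization to me?"},
]

# Render the messages to the raw prompt string; add_generation_prompt appends
# the assistant header that cues the model to respond.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```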
## The Model
Enigma is built on top of Llama 3.2 3B Instruct, using high-quality code-instruct data and general chat data in the Llama 3.2 Instruct prompt style to supplement overall performance.
Our current version of Enigma is trained on code-instruct data from [sequelbox/Tachibana](https://huggingface.co./datasets/sequelbox/Tachibana) and general chat data from [sequelbox/Supernova.](https://huggingface.co./datasets/sequelbox/Supernova)
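Both datasets are public, so the training data can be inspected directly. A minimal sketch using the `datasets` library, assuming each dataset exposes a default `train` split:

```python
from datasets import load_dataset

# Code-instruct data (Tachibana) and generalist chat data (Supernova).
tachibana = load_dataset("sequelbox/Tachibana", split="train")
supernova = load_dataset("sequelbox/Supernova", split="train")

# Print one record from each to see the schema.
print(tachibana[0])
print(supernova[0])
```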
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/VCJ8Fmefd8cdVhXSSxJiD.jpeg)
Enigma is created by [Valiant Labs.](http://valiantlabs.ca/)
[Check out our HuggingFace page for Shining Valiant 2 and our other Build Tools models for creators!](https://huggingface.co./ValiantLabs)
[Follow us on X for updates on our models!](https://twitter.com/valiant_labs)
We care about open source.
For everyone to use.
We encourage others to finetune further from our models.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_ValiantLabs__Llama3.2-3B-Enigma)
| Metric |Value|
|-------------------|----:|
|Avg. |15.77|
|IFEval (0-Shot) |47.75|
|BBH (3-Shot) |18.81|
|MATH Lvl 5 (4-Shot)| 6.65|
|GPQA (0-shot) | 1.45|
|MuSR (0-shot) | 4.54|
|MMLU-PRO (5-shot) |15.41|
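The reported average matches the unweighted mean of the six benchmark scores above, as a quick check confirms:

```python
# Unweighted mean of the six Open LLM Leaderboard scores reported above.
scores = [47.75, 18.81, 6.65, 1.45, 4.54, 15.41]
print(f"{sum(scores) / len(scores):.2f}")  # 15.77
```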