license: apache-2.0
datasets:
  - totally-not-an-llm/EverythingLM-data-V3
language:
  - en
library_name: transformers

Trained for 2 epochs on the EverythingLM-data-V3 dataset.

This model uses the alpaca prompt format:

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Instruction

### Input:
Input

### Response:
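As a minimal sketch, the template above can be filled in with plain string formatting. The helper name `build_prompt` and the single trailing newline after `### Response:` are assumptions made for illustration, not something the card specifies.

```python
# Alpaca-style template copied from the model card.
# Assumption: one newline follows "### Response:" so generation starts on a fresh line.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the template (hypothetical helper, not part of any library)."""
    return ALPACA_TEMPLATE.format(instruction=instruction, input=input_text)


if __name__ == "__main__":
    print(build_prompt("Summarize the text.", "Axolotl is a fine-tuning tool."))
```

The resulting string is what would be tokenized and passed to the model; the model's completion is expected to follow the `### Response:` line.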

Built with Axolotl

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 35.58 |
| ARC (25-shot)       | 42.75 |
| HellaSwag (10-shot) | 71.72 |
| MMLU (5-shot)       | 27.16 |
| TruthfulQA (0-shot) | 34.26 |
| Winogrande (5-shot) | 66.3  |
| GSM8K (5-shot)      | 1.52  |
| DROP (3-shot)       | 5.35  |
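
The reported Avg. is consistent with an unweighted mean of the seven benchmark scores listed above. The short check below assumes that equal-weight averaging is how the value was produced; the variable names are illustrative only.

```python
# Benchmark scores as listed in the results above.
scores = {
    "ARC (25-shot)": 42.75,
    "HellaSwag (10-shot)": 71.72,
    "MMLU (5-shot)": 27.16,
    "TruthfulQA (0-shot)": 34.26,
    "Winogrande (5-shot)": 66.3,
    "GSM8K (5-shot)": 1.52,
    "DROP (3-shot)": 5.35,
}

# Assumption: every benchmark carries equal weight.
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(round(average, 2))  # 35.58
```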