metadata
base_model:
- nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
- meta-llama/Llama-3.1-70B-Instruct
- ValiantLabs/Llama3.1-70B-ShiningValiant2
language:
- en
library_name: transformers
license: llama3.1
tags:
- llama
- llama3.1
- llama3
- meta
- 70b
- science
- physics
- biology
- chemistry
- compsci
- computer-science
- engineering
- logic
- rationality
- advanced
- expert
- technical
- conversational
- chat
- instruct
- mergekit
- merge
pipeline_tag: text-generation
model_type: llama
model-index:
- name: sequelbox/Llama3.1-70B-PlumChat
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: Winogrande (5-Shot)
type: Winogrande
args:
num_few_shot: 5
metrics:
- type: acc
value: 85
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: ARC Challenge (25-Shot)
type: arc-challenge
args:
num_few_shot: 25
metrics:
- type: acc_norm
value: 67.41
name: normalized accuracy
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU College Biology (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 93.75
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU High School Biology (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 91.94
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU Conceptual Physics (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 82.13
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU College Physics (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 60.78
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU High School Physics (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 62.25
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU College Chemistry (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 56
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU High School Chemistry (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 73.4
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU Astronomy (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 89.47
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU College Computer Science (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 64
name: acc
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU High School Computer Science (5-Shot)
type: MMLU
args:
num_few_shot: 5
metrics:
- type: acc
value: 90
name: acc
PlumChat 70b
This is a merge of pre-trained language models created using mergekit.
Merge Details
Shining Valiant 2 + Nemotron for high quality general chat, science-instruct, and complex query performance.
Merge Method
This model was merged using the della merge method using meta-llama/Llama-3.1-70B-Instruct as a base.
Models Merged
The following models were included in the merge:
Configuration
The following YAML configuration was used to produce this model:
merge_method: della
dtype: bfloat16
parameters:
normalize: true
models:
- model: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
parameters:
density: 0.5
weight: 0.3
- model: ValiantLabs/Llama3.1-70B-ShiningValiant2
parameters:
density: 0.5
weight: 0.25
base_model: meta-llama/Llama-3.1-70B-Instruct