---
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
library_name: transformers
tags:
- meta
- llama-3
- pytorch
- mergekit
- merge
license: llama3
license_link: LICENSE
pipeline_tag: text-generation
widget:
- example_title: Hello
  messages:
  - role: user
    content: Hey my name is Corwin! How are you?
- example_title: Hellriding out of Amber
  messages:
  - role: system
    content: You are a helpful and honest assistant. Please, respond concisely and truthfully.
  - role: user
    content: Can you recommend a good destination for a hellride out of Amber?
inference:
  parameters:
    max_new_tokens: 300
    stop:
    - <|end_of_text|>
    - <|eot_id|>
model-index:
- name: grimjim/llama-3-experiment-v1-9B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 66.41
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=grimjim/llama-3-experiment-v1-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 78.56
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=grimjim/llama-3-experiment-v1-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.71
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=grimjim/llama-3-experiment-v1-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 50.7
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=grimjim/llama-3-experiment-v1-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 75.93
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=grimjim/llama-3-experiment-v1-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.88
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=grimjim/llama-3-experiment-v1-9B
      name: Open LLM Leaderboard
---
|
# llama-3-experiment-v1-9B |
|
|
|
This is an experimental merge, replicating additional layers to the model without post-merge healing. |
|
There is damage to the model, but it appears to be tolerable as is; the performance difference in benchmarks from the original 8B Instruct model does not appear to be significant. |
|
The resulting impact on narrative text completion may also be of interest. |
|
|
|
Light testing performed with instruct prompting and the following sampler settings: |
|
- temp=1 and minP=0.02 |
|
- temp=1 and smoothing factor=0.33 |
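
Since the merge inherits the base model's Llama 3 Instruct chat template (hence the `<|eot_id|>` and `<|end_of_text|>` stop tokens listed in the metadata above), instruct prompting follows the usual format. A minimal sketch, purely illustrative; in practice `tokenizer.apply_chat_template` produces this string for you:

```python
# Illustrative sketch of the Llama 3 Instruct prompt format inherited from
# the base model; tokenizer.apply_chat_template handles this in practice.
def build_prompt(messages):
    """Render a list of {role, content} dicts into a Llama 3 Instruct prompt."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model generates the reply;
    # generation should stop on <|eot_id|> or <|end_of_text|>.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful and honest assistant."},
    {"role": "user", "content": "Can you recommend a good destination for a hellride out of Amber?"},
]
prompt = build_prompt(messages)
```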

Full weights: [grimjim/llama-3-experiment-v1-9B](https://huggingface.co./grimjim/llama-3-experiment-v1-9B)

GGUF quants: [grimjim/llama-3-experiment-v1-9B-GGUF](https://huggingface.co./grimjim/llama-3-experiment-v1-9B-GGUF)

This is a merge of the pre-trained language model meta-llama/Meta-Llama-3-8B-Instruct, created using [mergekit](https://github.com/cg123/mergekit).

Built with Meta Llama 3.

## Merge Details

### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:

* meta-llama/Meta-Llama-3-8B-Instruct

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
- sources:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    layer_range: [0, 12]
- sources:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
```
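
The passthrough method simply stacks the listed slices, so the layer layout of the merged model can be sketched directly from the config above (illustrative only; mergekit performs the actual weight copying):

```python
# Illustrative sketch (not mergekit code): expand the layer_range entries
# from the YAML config into the source-layer index of each merged layer.
slices = [(0, 12), (8, 32)]  # layer_range entries, half-open [start, end)

merged_layers = []
for start, end in slices:
    merged_layers.extend(range(start, end))

# 12 + 24 = 36 layers in total, versus 32 in the 8B base model.
total = len(merged_layers)

# Base-model layers 8-11 appear twice; those duplicated layers are where
# the extra ~1B parameters of the "9B" come from.
duplicated = sorted({layer for layer in merged_layers
                     if merged_layers.count(layer) > 1})
```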