metadata

language:
  - en
license: other
library_name: transformers
tags:
  - generated_from_trainer
base_model:
  - Qwen/Qwen2.5-7B-Instruct
datasets:
  - Magpie-Align/Magpie-Qwen2.5-Pro-300K-Filtered
license_name: qwen
license_link: https://huggingface.co./Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
model-index:
  - name: cybertron-v4-qw7B-MGS
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: HuggingFaceH4/ifeval
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 62.64
            name: strict accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: BBH
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 37.04
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: hendrycks/competition_math
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 27.72
            name: exact match
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 8.05
            name: acc_norm
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 13.2
            name: acc_norm
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 38.59
            name: accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
          name: Open LLM Leaderboard

cybertron-v4-qw7B-MGS

Introducing: cybertron-v4 based on Qwen2.5 7B SFT over Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1

Training procedure

1 Epoch as usual.

Training hyperparameters

The following hyperparameters were used during training:

seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 128
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
num_epochs: 1

Training results

Training Loss	Epoch	Step	Validation Loss
0.7405	0.0007	1	0.5760
0.6146	0.0502	71	0.5045
0.5908	0.1003	142	0.4930
0.5669	0.1505	213	0.4854
0.5575	0.2007	284	0.4811
0.535	0.2508	355	0.4765
0.5161	0.3010	426	0.4736
0.5268	0.3511	497	0.4726
0.5119	0.4013	568	0.4701
0.5329	0.4515	639	0.4687
0.5167	0.5016	710	0.4673
0.5105	0.5518	781	0.4660
0.5203	0.6020	852	0.4653
0.5035	0.6521	923	0.4646
0.4903	0.7023	994	0.4641
0.5031	0.7525	1065	0.4628
0.5147	0.8026	1136	0.4629
0.5037	0.8528	1207	0.4620
0.5029	0.9029	1278	0.4620
0.492	0.9531	1349	0.4621

Framework versions

PEFT 0.13.2
Transformers 4.45.2
Pytorch 2.3.0+cu121
Datasets 3.0.1
Tokenizers 0.20.1

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	31.21
IFEval (0-Shot)	62.64
BBH (3-Shot)	37.04
MATH Lvl 5 (4-Shot)	27.72
GPQA (0-shot)	8.05
MuSR (0-shot)	13.20
MMLU-PRO (5-shot)	38.59