license: other
library_name: transformers
base_model:
  - Qwen/Qwen2.5-3B
datasets:
  - BAAI/Infinity-Instruct
license_name: qwen-research
license_link: https://huggingface.co./Qwen/Qwen2.5-3B/blob/main/LICENSE
pipeline_tag: text-generation
model-index:
  - name: Qwen2.5-3B-Infinity-Instruct-0625
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: HuggingFaceH4/ifeval
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 35.58
            name: strict accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: BBH
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 26.91
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: hendrycks/competition_math
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 2.04
            name: exact match
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 2.57
            name: acc_norm
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 8.13
            name: acc_norm
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 24.43
            name: accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
          name: Open LLM Leaderboard

Model Card for Qwen2.5-3B-Infinity-Instruct-0625

Model Details

This model is Qwen/Qwen2.5-3B fine-tuned on the 0625 subset of the BAAI/Infinity-Instruct dataset. It is the model fine-tuned in the accompanying blog post, which has more details.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jlzhou/Qwen2.5-3B-Infinity-Instruct-0625"

# Load the weights in their stored dtype; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and open the assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# generate() returns the prompt tokens followed by the completion; keep only the new tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
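
The list comprehension near the end of the snippet is easy to misread. For a decoder-only model, `generate` returns each prompt's token IDs followed by the newly generated ones, so slicing at `len(input_ids)` leaves only the completion. A minimal sketch with plain Python lists standing in for tensors; the token IDs and the helper name `strip_prompt_tokens` are made up for illustration:

```python
def strip_prompt_tokens(input_batch, output_batch):
    """Drop the echoed prompt from the front of each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_batch, output_batch)
    ]

# Hypothetical token IDs: the prompt is 3 tokens long and the model adds 2 more.
prompt_ids = [[101, 102, 103]]
full_output = [[101, 102, 103, 7, 8]]

completion = strip_prompt_tokens(prompt_ids, full_output)
print(completion)  # [[7, 8]]
```

Without this step, the decoded response would begin with the system and user messages.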

Training Details

Training Data

This model was trained on the 0625 subset of BAAI/Infinity-Instruct: https://huggingface.co./datasets/BAAI/Infinity-Instruct

Training Hyperparameters

This model follows the recommended hyperparameters from https://huggingface.co./BAAI/Infinity-Instruct-3M-0625-Qwen2-7B#training-details

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---:|
| Avg. | 16.61 |
| IFEval (0-Shot) | 35.58 |
| BBH (3-Shot) | 26.91 |
| MATH Lvl 5 (4-Shot) | 2.04 |
| GPQA (0-shot) | 2.57 |
| MuSR (0-shot) | 8.13 |
| MMLU-PRO (5-shot) | 24.43 |
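
The Avg. row appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values reported above:

```python
# Scores copied from the results listed above.
scores = {
    "IFEval (0-Shot)": 35.58,
    "BBH (3-Shot)": 26.91,
    "MATH Lvl 5 (4-Shot)": 2.04,
    "GPQA (0-shot)": 2.57,
    "MuSR (0-shot)": 8.13,
    "MMLU-PRO (5-shot)": 24.43,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 16.61
```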