---
license: other
library_name: transformers
base_model:
- Qwen/Qwen2.5-3B
datasets:
- BAAI/Infinity-Instruct
license_name: qwen-research
license_link: https://huggingface.co./Qwen/Qwen2.5-3B/blob/main/LICENSE
pipeline_tag: text-generation
model-index:
- name: Qwen2.5-3B-Infinity-Instruct-0625
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 35.58
      name: strict accuracy
    source:
      url: >-
        https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 26.91
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 2.04
      name: exact match
    source:
      url: >-
        https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 2.57
      name: acc_norm
    source:
      url: >-
        https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.13
      name: acc_norm
    source:
      url: >-
        https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 24.43
      name: accuracy
    source:
      url: >-
        https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
      name: Open LLM Leaderboard
---
# Model Card for Qwen2.5-3B-Infinity-Instruct-0625
## Model Details

This model was fine-tuned from [Qwen/Qwen2.5-3B](https://huggingface.co./Qwen/Qwen2.5-3B) on the [BAAI/Infinity-Instruct](https://huggingface.co./datasets/BAAI/Infinity-Instruct) dataset (subset 0625). You can find more details about the training run in the accompanying blog post.
## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jlzhou/Qwen2.5-3B-Infinity-Instruct-0625"

# Load the fine-tuned model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat-formatted prompt
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a completion and strip the prompt tokens from the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
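For a quick smoke test, the same chat can also be run through the high-level `pipeline` API. This is a minimal sketch, assuming a recent `transformers` release in which text-generation pipelines accept chat-style message lists directly.

```python
from transformers import pipeline

# Minimal sketch: recent transformers text-generation pipelines
# accept chat-style message lists directly.
pipe = pipeline(
    "text-generation",
    model="jlzhou/Qwen2.5-3B-Infinity-Instruct-0625",
    torch_dtype="auto",
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language model."},
]
out = pipe(messages, max_new_tokens=512)
# The pipeline returns the whole conversation; the last turn is the reply.
print(out[0]["generated_text"][-1]["content"])
```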
## Training Details

### Training Data

This model was trained on [BAAI/Infinity-Instruct](https://huggingface.co./datasets/BAAI/Infinity-Instruct) (subset 0625).

### Training Hyperparameters

Training follows the hyperparameters recommended in the [Infinity-Instruct-3M-0625-Qwen2-7B training details](https://huggingface.co./BAAI/Infinity-Instruct-3M-0625-Qwen2-7B#training-details).
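The linked card lists the exact values. For illustration only, a fine-tuning run of this shape could be launched with TRL's `SFTTrainer` along the lines of the sketch below; every hyperparameter value shown is a placeholder rather than the setting actually used, and the `conversations`-to-`messages` conversion assumes the dataset's ShareGPT-style `from`/`value` fields.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hedged sketch only: hyperparameter values are placeholders, not the
# settings actually used. See the linked BAAI card for the real ones.
train_dataset = load_dataset("BAAI/Infinity-Instruct", "0625", split="train")

def to_messages(example):
    # Assumed: Infinity-Instruct stores turns as ShareGPT-style
    # {"from": ..., "value": ...} dicts under a "conversations" key.
    role_map = {"system": "system", "human": "user", "gpt": "assistant"}
    return {"messages": [
        {"role": role_map.get(turn["from"], "user"), "content": turn["value"]}
        for turn in example["conversations"]
    ]}

train_dataset = train_dataset.map(to_messages, remove_columns=train_dataset.column_names)

config = SFTConfig(
    output_dir="qwen2.5-3b-infinity-instruct-0625",
    num_train_epochs=3,             # placeholder
    per_device_train_batch_size=1,  # placeholder
    gradient_accumulation_steps=16, # placeholder
    learning_rate=1e-5,             # placeholder
    bf16=True,
)
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-3B",
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```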
## Open LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard?query=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625).
| Metric              | Value |
|---------------------|------:|
| Avg.                | 16.61 |
| IFEval (0-Shot)     | 35.58 |
| BBH (3-Shot)        | 26.91 |
| MATH Lvl 5 (4-Shot) |  2.04 |
| GPQA (0-shot)       |  2.57 |
| MuSR (0-shot)       |  8.13 |
| MMLU-PRO (5-shot)   | 24.43 |
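To reproduce results of this kind locally, one option is the `lm-evaluation-harness` Python API that backs the leaderboard. This is a hedged sketch: the task name `leaderboard_ifeval` is taken from the harness's leaderboard task group and may differ across harness versions, and locally obtained scores may not match the leaderboard's exact setup.

```python
import lm_eval

# Hedged sketch: evaluate the model on one leaderboard task with the
# lm-evaluation-harness Python API. Task names come from the harness's
# "leaderboard" group and may vary across versions.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jlzhou/Qwen2.5-3B-Infinity-Instruct-0625,dtype=auto",
    tasks=["leaderboard_ifeval"],
    batch_size="auto",
)
print(results["results"])
```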