metadata
base_model: unsloth/qwen2.5-coder-1.5b-instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
license: apache-2.0
language:
- en
datasets:
- Daemontatox/math_conv
library_name: transformers
model-index:
- name: Zirel_1.5
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: wis-k/instruction-following-eval
split: train
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 41.68
name: averaged accuracy
source:
url: >-
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: SaylorTwift/bbh
split: test
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 15.08
name: normalized accuracy
source:
url: >-
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: lighteval/MATH-Hard
split: test
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 11.33
name: exact match
source:
url: >-
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
split: train
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 1.34
name: acc_norm
source:
url: >-
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 3.33
name: acc_norm
source:
url: >-
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 12.71
name: accuracy
source:
url: >-
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
name: Open LLM Leaderboard
Zireal 1.5 - Fast Reasoning Model
Developed by: Daemontatox
Finetuned from: unsloth/qwen2.5-coder-1.5b-instruct
License: Apache 2.0
Overview
Zireal 1.5 is a fast, efficient reasoning model designed for structured problem-solving and mathematical inference. It has been fine-tuned using GRPO (General Reinforcement Policy Optimization) on 24,000 high-quality mathematical examples, making it highly effective for step-by-step reasoning and logic-based tasks.
Features
- Optimized for fast, structured reasoning with minimal computational overhead.
- GRPO-trained for superior decision-making in mathematical contexts.
- Lightweight yet highly capable, leveraging Qwen2.5's instruction-tuned efficiency.
- Ideal for logic, algebra, arithmetic, and structured problem-solving.
Usage
You can load Zireal 1.5 using transformers
:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "Daemontatox/Zireal-1.5" # Replace with actual model name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
inputs = tokenizer("Solve: 3x - 7 = 11", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Intended Use
- Mathematical reasoning (algebra, arithmetic, logic-based problems).
- Step-by-step structured problem-solving for computational tasks.
- Lightweight inference for fast, efficient reasoning applications.
Limitations
- Primarily designed for structured reasoning rather than open-ended text generation.
- Best suited for logic and mathematics rather than creative or conversational AI.
Acknowledgments
Zireal 1.5 is part of the Zireal model series, focusing on efficient and necessary reasoning. It is built on Qwen2.5 and optimized using Unsloth for high-performance inference.
๐ Hugging Face Model Card (Replace with actual link)
๐ License: Apache 2.0
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/Daemontatox__Zirel_1.5-details)!
Summarized results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FZirel_1.5&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
| Metric |Value (%)|
|-------------------|--------:|
|**Average** | 14.24|
|IFEval (0-Shot) | 41.68|
|BBH (3-Shot) | 15.08|
|MATH Lvl 5 (4-Shot)| 11.33|
|GPQA (0-shot) | 1.34|
|MuSR (0-shot) | 3.33|
|MMLU-PRO (5-shot) | 12.71|