Zirel_1.5 / README.md
Daemontatox's picture
Adding Evaluation Results (#1)
7e76368 verified
metadata
base_model: unsloth/qwen2.5-coder-1.5b-instruct
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2
  - trl
license: apache-2.0
language:
  - en
datasets:
  - Daemontatox/math_conv
library_name: transformers
model-index:
  - name: Zirel_1.5
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: wis-k/instruction-following-eval
          split: train
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 41.68
            name: averaged accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: SaylorTwift/bbh
          split: test
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 15.08
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: lighteval/MATH-Hard
          split: test
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 11.33
            name: exact match
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          split: train
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 1.34
            name: acc_norm
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 3.33
            name: acc_norm
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 12.71
            name: accuracy
        source:
          url: >-
            https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FZirel_1.5
          name: Open LLM Leaderboard

Zireal 1.5 - Fast Reasoning Model

Developed by: Daemontatox
Finetuned from: unsloth/qwen2.5-coder-1.5b-instruct
License: Apache 2.0

Overview

Zireal 1.5 is a fast, efficient reasoning model designed for structured problem-solving and mathematical inference. It has been fine-tuned using GRPO (General Reinforcement Policy Optimization) on 24,000 high-quality mathematical examples, making it highly effective for step-by-step reasoning and logic-based tasks.

Features

  • Optimized for fast, structured reasoning with minimal computational overhead.
  • GRPO-trained for superior decision-making in mathematical contexts.
  • Lightweight yet highly capable, leveraging Qwen2.5's instruction-tuned efficiency.
  • Ideal for logic, algebra, arithmetic, and structured problem-solving.

Usage

You can load Zireal 1.5 using transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Daemontatox/Zireal-1.5"  # Replace with actual model name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Solve: 3x - 7 = 11", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Intended Use

  • Mathematical reasoning (algebra, arithmetic, logic-based problems).
  • Step-by-step structured problem-solving for computational tasks.
  • Lightweight inference for fast, efficient reasoning applications.

Limitations

  • Primarily designed for structured reasoning rather than open-ended text generation.
  • Best suited for logic and mathematics rather than creative or conversational AI.

Acknowledgments

Zireal 1.5 is part of the Zireal model series, focusing on efficient and necessary reasoning. It is built on Qwen2.5 and optimized using Unsloth for high-performance inference.


๐Ÿ”— Hugging Face Model Card (Replace with actual link)
๐Ÿ“œ License: Apache 2.0

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/Daemontatox__Zirel_1.5-details)!
Summarized results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FZirel_1.5&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!

|      Metric       |Value (%)|
|-------------------|--------:|
|**Average**        |    14.24|
|IFEval (0-Shot)    |    41.68|
|BBH (3-Shot)       |    15.08|
|MATH Lvl 5 (4-Shot)|    11.33|
|GPQA (0-shot)      |     1.34|
|MuSR (0-shot)      |     3.33|
|MMLU-PRO (5-shot)  |    12.71|