QuantFactory Banner

QuantFactory/Qwen2.5-7B-HomerAnvita-NerdMix-GGUF

This is quantized version of ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix created using llama.cpp

Original Model Card

ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix

ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix is an advanced language model meticulously crafted by merging five pre-trained models using the powerful mergekit framework. This fusion leverages the Model Stock merge method to combine the creative prowess of Qandora, the instructive capabilities of Qwen-Instruct-Fusion, the sophisticated blending of HomerSlerp1, the mathematical precision of Cybertron-MGS, and the uncensored expertise of Qwen-Nerd. The resulting model excels in creative text generation, contextual understanding, technical reasoning, and dynamic conversational interactions.

🚀 Merged Models

This model merge incorporates the following:

  • bunnycore/Qandora-2.5-7B-Creative: Specializes in creative text generation, enhancing the model's ability to produce imaginative and diverse content.

  • allknowingroger/HomerSlerp1-7B: Utilizes spherical linear interpolation (SLERP) to blend model weights smoothly, ensuring a harmonious integration of different model attributes.

  • sethuiyer/Qwen2.5-7B-Anvita: Focuses on instruction-following capabilities, improving the model's performance in understanding and executing user commands.

  • fblgit/cybertron-v4-qw7B-MGS: Enhances mathematical reasoning and precision, enabling the model to handle complex computational tasks effectively.

  • jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0: Provides uncensored expertise and robust technical knowledge, making the model suitable for specialized technical support and information retrieval.

  • newsbang/Homer-v0.5-Qwen2.5-7B: Acts as the foundational conversational model, providing robust language comprehension and generation capabilities.

🧩 Merge Configuration

The configuration below outlines how the models are merged using the Model Stock method. This approach ensures a balanced and effective integration of the unique strengths from each source model.

# Merge configuration for ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix using Model Stock

models:
  - model: bunnycore/Qandora-2.5-7B-Creative
  - model: allknowingroger/HomerSlerp1-7B
  - model: sethuiyer/Qwen2.5-7B-Anvita
  - model: fblgit/cybertron-v4-qw7B-MGS
  - model: jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
merge_method: model_stock
base_model: newsbang/Homer-v0.5-Qwen2.5-7B
normalize: false
int8_mask: true
dtype: bfloat16

Key Parameters

  • Merge Method (merge_method): Utilizes the Model Stock method, as described in Model Stock, to effectively combine multiple models by leveraging their strengths.

  • Models (models): Specifies the list of models to be merged:

    • bunnycore/Qandora-2.5-7B-Creative: Enhances creative text generation.
    • allknowingroger/HomerSlerp1-7B: Facilitates smooth blending of model weights using SLERP.
    • sethuiyer/Qwen2.5-7B-Anvita: Improves instruction-following capabilities.
    • fblgit/cybertron-v4-qw7B-MGS: Enhances mathematical reasoning and precision.
    • jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0: Provides uncensored technical expertise.
  • Base Model (base_model): Defines the foundational model for the merge, which is newsbang/Homer-v0.5-Qwen2.5-7B in this case.

  • Normalization (normalize): Set to false to retain the original scaling of the model weights during the merge.

  • INT8 Mask (int8_mask): Enabled (true) to apply INT8 quantization masking, optimizing the model for efficient inference without significant loss in precision.

  • Data Type (dtype): Uses bfloat16 to maintain computational efficiency while ensuring high precision.

🏆 Performance Highlights

  • Creative Text Generation: Enhanced ability to produce imaginative and diverse content suitable for creative writing, storytelling, and content creation.

  • Instruction Following: Improved performance in understanding and executing user instructions, making the model more responsive and accurate in task execution.

  • Mathematical Reasoning: Enhanced capability to handle complex computational tasks with high precision, suitable for technical and analytical applications.

  • Uncensored Technical Expertise: Provides robust technical knowledge without content restrictions, making it ideal for specialized technical support and information retrieval.

  • Optimized Inference: INT8 masking and bfloat16 data type contribute to efficient computation, enabling faster response times without compromising quality.

🎯 Use Case & Applications

ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix is designed to excel in environments that demand a combination of creative generation, precise instruction following, mathematical reasoning, and technical expertise. Ideal applications include:

  • Creative Writing Assistance: Aiding authors and content creators in generating imaginative narratives, dialogues, and descriptive text.

  • Interactive Storytelling and Role-Playing: Enhancing dynamic and engaging interactions in role-playing games and interactive storytelling platforms.

  • Educational Tools and Tutoring Systems: Providing detailed explanations, answering questions, and assisting in educational content creation with contextual understanding.

  • Technical Support and Customer Service: Offering accurate and contextually relevant responses in technical support scenarios, improving user satisfaction.

  • Content Generation for Marketing: Creating compelling and diverse marketing copy, social media posts, and promotional material with creative flair.

  • Mathematical Problem Solving: Assisting in solving complex mathematical problems and providing step-by-step explanations for educational purposes.

  • Technical Documentation and Analysis: Generating detailed technical documents, reports, and analyses with high precision and clarity.

📝 Usage

To utilize ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix, follow the steps below:

Installation

First, install the necessary libraries:

pip install -qU transformers accelerate

Example Code

Below is an example of how to load and use the model for text generation:

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Initialize the pipeline
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

# Print the generated text
print(outputs[0]["generated_text"])

Notes

  • Fine-Tuning: This merged model may require fine-tuning to optimize performance for specific applications or domains.

  • Resource Requirements: Ensure that your environment has sufficient computational resources, especially GPU-enabled hardware, to handle the model efficiently during inference.

  • Customization: Users can adjust parameters such as temperature, top_k, and top_p to control the creativity and diversity of the generated text.

📜 License

This model is open-sourced under the Apache-2.0 License.

💡 Tags

  • merge
  • mergekit
  • model_stock
  • Qwen
  • Homer
  • Anvita
  • Nerd
  • ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
  • bunnycore/Qandora-2.5-7B-Creative
  • allknowingroger/HomerSlerp1-7B
  • sethuiyer/Qwen2.5-7B-Anvita
  • fblgit/cybertron-v4-qw7B-MGS
  • jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
  • newsbang/Homer-v0.5-Qwen2.5-7B

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 34.17
IFEval (0-Shot) 77.08
BBH (3-Shot) 36.58
MATH Lvl 5 (4-Shot) 29.53
GPQA (0-shot) 9.28
MuSR (0-shot) 14.41
MMLU-PRO (5-shot) 38.13
Downloads last month
208
GGUF
Model size
7.62B params
Architecture
qwen2

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for QuantFactory/Qwen2.5-7B-HomerAnvita-NerdMix-GGUF

Evaluation results