
This is a WASM-recompiled build of the model at https://huggingface.co./PowerInfer/SmallThinker-3B-Preview

Original description:

datasets:
  - PowerInfer/QWQ-LONGCOT-500K
  - PowerInfer/LONGCOT-Refine-500K
base_model:
  - Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
language:
  - en
library_name: transformers

SmallThinker-3B-preview

We introduce SmallThinker-3B-preview, a new model fine-tuned from Qwen2.5-3B-Instruct.

Benchmark Performance

| Model | AIME24 | AMC23 | GAOKAO2024_I | GAOKAO2024_II | MMLU_STEM | AMPS_Hard | math_comp |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen2.5-3B-Instruct | 6.67 | 45 | 50 | 35.8 | 59.8 | - | - |
| SmallThinker | 16.667 | 57.5 | 64.2 | 57.1 | 68.2 | 70 | 46.8 |
| GPT-4o | 9.3 | - | - | - | 64.2 | 57 | 50 |

Limitation: Due to SmallThinker's current weaknesses in instruction following, we adopt a more lenient evaluation for math_comp: a response counts as correct if it contains the right answer, without requiring it to follow the specified AAAAA format.
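For concreteness, such a lenient check might simply compare the last answer found anywhere in the response against the gold answer instead of enforcing the output format. The helper below is a hypothetical sketch of that idea, not the evaluation script actually used for these numbers.

```python
import re

def lenient_match(response: str, gold: str) -> bool:
    """Hypothetical lenient check: take the last \\boxed{...} or (A)-(E) style
    answer in the response (or the last line as a fallback) and compare it to
    the gold answer, ignoring the required output format entirely."""
    candidates = re.findall(r"\\boxed\{([^}]*)\}|\(([A-Ea-e])\)", response)
    flat = [group for pair in candidates for group in pair if group]
    last = flat[-1] if flat else response.strip().splitlines()[-1]
    return last.strip().lower() == gold.strip().lower()

print(lenient_match(r"... therefore the answer is \boxed{C}.", "C"))            # True
print(lenient_match("I suspect (B), but on reflection it must be (D).", "D"))   # True
```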

Intended Use Cases

SmallThinker is designed for the following use cases:

  1. Edge Deployment: Its small size makes it ideal for deployment on resource-constrained devices.
  2. Draft Model for QwQ-32B-Preview: SmallThinker can serve as a fast and efficient draft model for the larger QwQ-32B-Preview model. In our llama.cpp tests, speculative decoding with SmallThinker as the draft raised throughput from 40 tokens/s to about 70 tokens/s, roughly a 75% speedup (see the sketch below).
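The numbers above come from llama.cpp's speculative decoding. As a rough illustration of the same draft-model idea with the transformers library instead, the sketch below uses SmallThinker as the assistant model for QwQ-32B-Preview; the prompt, dtype, and device settings are assumptions for illustration, not configurations from this card.

```python
# Sketch: assisted (speculative) generation in transformers, with SmallThinker
# drafting tokens that the larger QwQ-32B-Preview model then verifies.
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "Qwen/QwQ-32B-Preview"               # large target model
draft_id = "PowerInfer/SmallThinker-3B-Preview"  # small draft model (this repo's original)

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id, torch_dtype="auto", device_map="auto")
draft = AutoModelForCausalLM.from_pretrained(draft_id, torch_dtype="auto", device_map="auto")

inputs = tokenizer("What is the sum of the first 100 positive integers?",
                   return_tensors="pt").to(target.device)

# assistant_model turns on assisted generation: the draft proposes several tokens
# per step and the target accepts or rejects them, which is the source of the speedup.
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```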

Training Details

The model was trained using 8 H100 GPUs with a global batch size of 16. The specific configuration is as follows:

neat_packing: true
cutoff_len: 16384
per_device_train_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1.0e-5
num_train_epochs: 3
lr_scheduler_type: cosine
warmup_ratio: 0.02
bf16: true
ddp_timeout: 180000000
weight_decay: 0.0

The SFT (Supervised Fine-Tuning) process was conducted in two phases:

  1. First Phase:

    • Used only the PowerInfer/QWQ-LONGCOT-500K dataset
    • Trained for 1.5 epochs
  2. Second Phase:

    • Combined training with the PowerInfer/QWQ-LONGCOT-500K and PowerInfer/LONGCOT-Refine-500K datasets
    • Continued training for 2 additional epochs
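The configuration keys listed earlier match the LLaMA-Factory format, though this card does not name the training framework. Assuming that format, the two phases could be sketched as the per-phase settings below; the dataset identifiers are hypothetical placeholders, not registered dataset names.

```yaml
# Phase 1 (hypothetical sketch, LLaMA-Factory-style keys)
stage: sft
model_name_or_path: Qwen/Qwen2.5-3B-Instruct
dataset: qwq_longcot_500k                        # placeholder for PowerInfer/QWQ-LONGCOT-500K
num_train_epochs: 1.5

# Phase 2: resume from the phase-1 checkpoint with both datasets
dataset: qwq_longcot_500k,longcot_refine_500k    # placeholders for both datasets
num_train_epochs: 2
```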

Limitations & Disclaimer

Please be aware of the following limitations:

  • Language Limitation: The model has been trained only on English-language datasets, so its capabilities in other languages remain limited.
  • Limited Knowledge: Due to limited SFT data and the model's relatively small scale, its reasoning capabilities are constrained by its knowledge base.
  • Unpredictable Outputs: The model may produce unexpected outputs due to its size and probabilistic generation paradigm. Users should exercise caution and validate the model's responses.
  • Repetition Issue: The model tends to repeat itself when answering high-difficulty questions. Please increase the repetition_penalty to mitigate this issue.
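As a quick illustration of the last point, the snippet below is a minimal inference sketch with the transformers pipeline; the prompt and the repetition_penalty value of 1.1 are illustrative assumptions, not settings recommended by the authors.

```python
# Minimal sketch: raise repetition_penalty above 1.0 to discourage the model
# from repeating itself on hard questions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="PowerInfer/SmallThinker-3B-Preview",
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Prove that the square root of 2 is irrational."
outputs = generator(prompt, max_new_tokens=512, repetition_penalty=1.1)
print(outputs[0]["generated_text"])
```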