🎉 Model Card for Torque_1.5B_TINY_0.1 🚀
Part of the Torque Series (TINY, MED, LARGE).
🔎 Model Details
📝 Model Description
Torque_1.5B_TINY_0.1 is a lightweight 1.5B-parameter Transformer-based language model that uses a two-stage Reinforcement Learning (RL) approach to strengthen its chain-of-thought reasoning. Trained on curated datasets spanning math, code, and QA, it is aimed at resource-conscious users who need strong reasoning and generation without large-scale hardware.
- Developed by: The Mind Expansion Network
- Funded by: Internal R&D
- Shared by: MindExpander
- Model type: Autoregressive LLM
- Language(s) (NLP): Primarily English
- License: MIT (subject to the license terms of the Qwen/Llama base models)
- Finetuned from model: Qwen2.5 / Llama3-based
🌐 Model Sources
- Repository: TheMindExpansionNetwork/Torque_1.5B_TINY_0.1
- Paper: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv:2501.12948)
- Demo: Coming soon
📦 Uses
🎯 Direct Use
- Conversational AI 🤖: Chatbots, helpdesks, or personal assistants with moderate resource requirements.
- Academic & Research 📚: Quick iteration on math, code, or logic tasks.
- Prototyping 🧪: Use in early-stage projects to validate feasibility before scaling to bigger models.
🔧 Downstream Use
- Fine-Tuning: Users can fine-tune Torque_1.5B_TINY_0.1 for specialized tasks, e.g., domain-specific QA or custom logic (see the LoRA sketch after this list).
- Integration: Embedding into toolchains or auto-agent stacks for advanced multi-step problem-solving.
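For the fine-tuning path mentioned above, the sketch below shows one common approach: attaching LoRA adapters with the peft library and training with the Hugging Face Trainer. The toy dataset, hyperparameters, and target_modules values are illustrative assumptions, not the recipe used to build Torque_1.5B_TINY_0.1.

```python
# Minimal LoRA fine-tuning sketch (illustrative; dataset and hyperparameters are placeholders).
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Attach small LoRA adapters instead of updating all 1.5B parameters.
# target_modules is an assumption; check the checkpoint's module names before use.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Toy domain-specific QA data; replace with your own corpus.
raw = Dataset.from_dict({"text": ["Q: What is 2 + 2?\nA: 4", "Q: Capital of France?\nA: Paris"]})
tokenized = raw.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="torque-lora",
        per_device_train_batch_size=2,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("torque-lora")
```

The adapter weights saved at the end are small and can be loaded on top of the base checkpoint, which keeps the downstream-use workflow lightweight.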
🚫 Out-of-Scope Use
- Large-Scale Enterprise Deployments: for workloads that require higher capacity and accuracy, see Torque_14B_MED_0.2 or Torque_32B_LARGE_0.3.
⚖️ Bias, Risks, and Limitations
- Potential Hallucinations: As with all LLMs, the model might produce plausible-sounding but incorrect answers.
- Bias: The training data could contain hidden biases.
- Security: Malicious prompts can lead to undesirable outputs.
✅ Recommendations
- Always Validate: Verify critical code outputs or logic steps.
- Filter Output: Use content filtering for sensitive applications.
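As a concrete illustration of the filtering recommendation, the sketch below wraps generation in a simple keyword-based post-filter. The BLOCKLIST and filter_output helper are hypothetical and not part of the model or the transformers library; production deployments should rely on a dedicated moderation model or service.

```python
# Hypothetical post-generation content filter (illustrative only).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True).half().cuda()

# Placeholder blocklist; real deployments should use a proper moderation layer.
BLOCKLIST = ["example banned phrase"]

def filter_output(text: str) -> str:
    """Return the text unchanged unless it contains a blocked phrase."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "[response withheld by content filter]"
    return text

prompt = "Summarize best practices for password storage."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.6)
print(filter_output(tokenizer.decode(outputs[0], skip_special_tokens=True)))
```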
🚀 How to Get Started with the Model
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model, then move the model to GPU in fp16 to reduce memory use.
model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True).half().cuda()

# Tokenize a prompt and generate up to 256 new tokens with light sampling.
prompt = "Explain the significance of Moore's law in modern computing."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
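For interactive, chatbot-style use, responses can also be streamed token by token. The sketch below continues from the quickstart snippet above (reusing model and tokenizer) and relies on transformers' built-in TextStreamer.

```python
# Stream tokens to stdout as they are generated (reuses model/tokenizer from above).
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
chat_prompt = "List three practical uses of a lightweight reasoning model."
chat_inputs = tokenizer(chat_prompt, return_tensors="pt").to("cuda")
model.generate(**chat_inputs, max_new_tokens=256, do_sample=True, temperature=0.6, streamer=streamer)
```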
🏗️ Training Details
⚙️ Training Data
- 800k Curated Samples: Balanced coverage of math problem-solving, coding tasks, and domain QA.
- Sources: Legally compliant open datasets + proprietary in-house data.
🔨 Training Procedure
🔍 Preprocessing
- Tokenization with an extended vocabulary to accommodate math symbols and code tokens.
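As a rough illustration of that step, the sketch below shows the generic transformers pattern for extending a tokenizer vocabulary and resizing the embedding matrix to match. The specific tokens added here are hypothetical; the actual extensions used during training are not documented in this card.

```python
# Generic pattern for extending a tokenizer vocabulary (tokens shown are hypothetical).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Example math/code symbols; the real training pipeline's additions are not published here.
new_tokens = ["\\frac", "\\sqrt", "<code>", "</code>"]
num_added = tokenizer.add_tokens(new_tokens)

# Grow the embedding table so the new token ids map to real rows.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocab size is now {len(tokenizer)}")
```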
⚡ Training Hyperparameters
- Precision: Mixed precision (fp16)
- Batch Size: [More Information Needed]
- Max Steps: [More Information Needed]
- Learning Rate: [More Information Needed]
🏎️ Speeds, Sizes, Times
- [More Information Needed]
✅ Evaluation
🔎 Testing Data, Factors & Metrics
📂 Testing Data
- Internal test sets for math, code, and QA.
📊 Factors
- Variety of topics and complexity to ensure broad coverage.
🎯 Metrics
- Accuracy: Pass@1 on math/code benchmarks (see the pass@k sketch after this list)
- Fluency (Human eval for QA tasks)
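Pass@1 figures of this kind are commonly computed with the unbiased pass@k estimator from the code-generation evaluation literature: with n samples per problem and c of them correct, pass@k = 1 - C(n-c, k) / C(n, k). A minimal sketch (the (n, c) pairs below are hypothetical):

```python
# Unbiased pass@k estimator commonly used for code/math benchmarks (illustrative).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = samples generated per problem, c = samples that passed, k = evaluation budget."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: average pass@1 over three problems with 4 samples each.
results = [(4, 1), (4, 0), (4, 3)]  # hypothetical (n, c) pairs per problem
print(sum(pass_at_k(n, c, 1) for n, c in results) / len(results))
```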
📈 Results
- MATH-500: ~83.9% Pass@1
- CodeForces Rating: ~954
- LiveCodeBench: 16.9% Pass@1
🔬 Model Examination
Pending.
🌱 Environmental Impact
- Hardware Type: Single GPU node for final distillation
- Hours used: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
🔧 Technical Specifications
🏗️ Model Architecture and Objective
- 1.5B Parameter Transformer, autoregressive generation, specialized in chain-of-thought.
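A quick way to confirm these architecture details from the released checkpoint is to inspect its config and count parameters; the snippet below assumes only standard transformers APIs.

```python
# Inspect the released checkpoint's configuration and parameter count.
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
print(config)  # hidden size, number of layers/heads, vocab size, etc.

model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params / 1e9:.2f}B")  # expected to be roughly 1.5B
```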
🏭 Compute Infrastructure
- Hardware: [More Information Needed]
- Software: PyTorch, Transformers, CUDA
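A short sanity check for this software stack (no specific versions are pinned in this card, so the snippet only reports what is installed locally):

```python
# Report the installed stack; this card does not pin specific versions.
import torch
import transformers

print("PyTorch:", torch.__version__)
print("Transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```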
📖 Citation
@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
title={{DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning}},
author={DeepSeek-AI and collaborators},
year={2025},
eprint={2501.12948},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.12948}
}
🗒️ Glossary
- Chain-of-Thought: The model’s ability to articulate intermediate steps in a reasoning process.
📚 More Information
Torque_1.5B_TINY_0.1 is part of a wider “Torque Series” of LLMs, including Torque_14B_MED_0.2 and Torque_32B_LARGE_0.3, designed to work synergistically or stand alone. See the Torque Series Card for an overview.
🤝 Model Card Authors
- MindExpander, TheMindExpansionNetwork
📬 Model Card Contact
- Email: [email protected]