🎉 Model Card for Torque_1.5B_TINY_0.1 🚀
Part of the Torque Series (TINY, MED, LARGE).
🔎 Model Details
📝 Model Description
Torque_1.5B_TINY_0.1 is a lightweight 1.5B-parameter Transformer-based language model that uses a two-stage Reinforcement Learning (RL) approach to strengthen its chain-of-thought reasoning. Trained on curated datasets spanning math, code, and QA, it is aimed at resource-conscious users who need strong reasoning and generation without large-scale hardware.
- Developed by: The Mind Expansion Network
- Funded by: Internal R&D
- Shared by: MindExpander
- Model type: Autoregressive LLM
- Language(s) (NLP): Primarily English
- License: MIT (subject to the license terms of the Qwen/Llama base models)
- Finetuned from model: Qwen2.5 / Llama3-based
🌐 Model Sources
- Repository: TheMindExpansionNetwork/Torque_1.5B_TINY_0.1
- Paper: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv:2501.12948)
- Demo: Coming soon
📦 Uses
🎯 Direct Use
- Conversational AI 🤖: Chatbots, helpdesks, or personal assistants with moderate resource requirements.
- Academic & Research 📚: Quick iteration on math, code, or logic tasks.
- Prototyping 🧪: Use in early-stage projects to validate feasibility before scaling to bigger models.
🔧 Downstream Use
- Fine-Tuning: Users can fine-tune Torque_1.5B_TINY_0.1 for specialized tasks, e.g., domain-specific QA or custom logic (see the LoRA sketch after this list).
- Integration: Embedding into toolchains or auto-agent stacks for advanced multi-step problem-solving.
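For the fine-tuning path mentioned above, the sketch below shows one common approach: attaching LoRA adapters with the peft library and training with the Hugging Face Trainer. The toy dataset, hyperparameters, and target_modules values are illustrative assumptions, not the recipe used to build Torque_1.5B_TINY_0.1.

```python
# Minimal LoRA fine-tuning sketch (illustrative; dataset and hyperparameters are placeholders).
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Attach small LoRA adapters instead of updating all 1.5B parameters.
# target_modules is an assumption; check the checkpoint's module names before use.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Toy domain-specific QA data; replace with your own corpus.
raw = Dataset.from_dict({"text": ["Q: What is 2 + 2?\nA: 4", "Q: Capital of France?\nA: Paris"]})
tokenized = raw.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="torque-lora",
        per_device_train_batch_size=2,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("torque-lora")
```

The adapter weights saved at the end are small and can be loaded on top of the base checkpoint, which keeps the downstream-use workflow lightweight.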
🚫 Out-of-Scope Use
- Large-Scale Enterprise Deployments: for workloads that require higher capacity and accuracy, see Torque_14B_MED_0.2 or Torque_32B_LARGE_0.3.
⚖️ Bias, Risks, and Limitations
- Potential Hallucinations: As with all LLMs, the model might produce plausible-sounding but incorrect answers.
- Bias: The training data could contain hidden biases.
- Security: Malicious prompts can lead to undesirable outputs.
✅ Recommendations
- Always Validate: Verify critical code outputs or logic steps.
- Filter Output: Use content filtering for sensitive applications.
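As a concrete illustration of the filtering recommendation, the sketch below wraps generation in a simple keyword-based post-filter. The BLOCKLIST and filter_output helper are hypothetical and not part of the model or the transformers library; production deployments should rely on a dedicated moderation model or service.

```python
# Hypothetical post-generation content filter (illustrative only).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True).half().cuda()

# Placeholder blocklist; real deployments should use a proper moderation layer.
BLOCKLIST = ["example banned phrase"]

def filter_output(text: str) -> str:
    """Return the text unchanged unless it contains a blocked phrase."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "[response withheld by content filter]"
    return text

prompt = "Summarize best practices for password storage."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.6)
print(filter_output(tokenizer.decode(outputs[0], skip_special_tokens=True)))
```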
🚀 How to Get Started with the Model
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model, then move the model to GPU in fp16 to reduce memory use.
model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True).half().cuda()

# Tokenize a prompt and generate up to 256 new tokens with light sampling.
prompt = "Explain the significance of Moore's law in modern computing."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
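For interactive, chatbot-style use, responses can also be streamed token by token. The sketch below continues from the quickstart snippet above (reusing model and tokenizer) and relies on transformers' built-in TextStreamer.

```python
# Stream tokens to stdout as they are generated (reuses model/tokenizer from above).
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
chat_prompt = "List three practical uses of a lightweight reasoning model."
chat_inputs = tokenizer(chat_prompt, return_tensors="pt").to("cuda")
model.generate(**chat_inputs, max_new_tokens=256, do_sample=True, temperature=0.6, streamer=streamer)
```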
🏗️ Training Details
⚙️ Training Data
- 800k Curated Samples: Balanced coverage of math problem-solving, coding tasks, and domain QA.
- Sources: Legally compliant open datasets + proprietary in-house data.
🔨 Training Procedure
🔍 Preprocessing
- Tokenization with an extended vocabulary to accommodate math symbols and code tokens.
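As a rough illustration of that step, the sketch below shows the generic transformers pattern for extending a tokenizer vocabulary and resizing the embedding matrix to match. The specific tokens added here are hypothetical; the actual extensions used during training are not documented in this card.

```python
# Generic pattern for extending a tokenizer vocabulary (tokens shown are hypothetical).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Example math/code symbols; the real training pipeline's additions are not published here.
new_tokens = ["\\frac", "\\sqrt", "<code>", "</code>"]
num_added = tokenizer.add_tokens(new_tokens)

# Grow the embedding table so the new token ids map to real rows.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocab size is now {len(tokenizer)}")
```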
⚡ Training Hyperparameters
- Precision: Mixed precision (fp16)
- Batch Size: [More Information Needed]
- Max Steps: [More Information Needed]
- Learning Rate: [More Information Needed]
🏎️ Speeds, Sizes, Times
- [More Information Needed]
✅ Evaluation
🔎 Testing Data, Factors & Metrics
📂 Testing Data
- Internal test sets for math, code, and QA.
📊 Factors
- Variety of topics and complexity to ensure broad coverage.
🎯 Metrics
- Accuracy: Pass@1 on math/code benchmarks (see the pass@k sketch after this list)
- Fluency (Human eval for QA tasks)
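Pass@1 figures of this kind are commonly computed with the unbiased pass@k estimator from the code-generation evaluation literature: with n samples per problem and c of them correct, pass@k = 1 - C(n-c, k) / C(n, k). A minimal sketch (the (n, c) pairs below are hypothetical):

```python
# Unbiased pass@k estimator commonly used for code/math benchmarks (illustrative).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = samples generated per problem, c = samples that passed, k = evaluation budget."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: average pass@1 over three problems with 4 samples each.
results = [(4, 1), (4, 0), (4, 3)]  # hypothetical (n, c) pairs per problem
print(sum(pass_at_k(n, c, 1) for n, c in results) / len(results))
```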
📈 Results
- MATH-500: ~83.9% Pass@1
- CodeForces Rating: ~954
- LiveCodeBench: 16.9% Pass@1
🔬 Model Examination
Pending.
🌱 Environmental Impact
- Hardware Type: Single GPU node for final distillation
- Hours used: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
🔧 Technical Specifications
🏗️ Model Architecture and Objective
- 1.5B Parameter Transformer, autoregressive generation, specialized in chain-of-thought.
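A quick way to confirm these architecture details from the released checkpoint is to inspect its config and count parameters; the snippet below assumes only standard transformers APIs.

```python
# Inspect the released checkpoint's configuration and parameter count.
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "TheMindExpansionNetwork/Torque_1.5B_TINY_0.1"
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
print(config)  # hidden size, number of layers/heads, vocab size, etc.

model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params / 1e9:.2f}B")  # expected to be roughly 1.5B
```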
🏭 Compute Infrastructure
- Hardware: [More Information Needed]
- Software: PyTorch, Transformers, CUDA
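A short sanity check for this software stack (no specific versions are pinned in this card, so the snippet only reports what is installed locally):

```python
# Report the installed stack; this card does not pin specific versions.
import torch
import transformers

print("PyTorch:", torch.__version__)
print("Transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```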
📖 Citation
@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
title={{DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning}},
author={DeepSeek-AI and collaborators},
year={2025},
eprint={2501.12948},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.12948}
}
🗒️ Glossary
- Chain-of-Thought: The model’s ability to articulate intermediate steps in a reasoning process.
📚 More Information
Torque_1.5B_TINY_0.1 is part of a wider “Torque Series” of LLMs, including Torque_14B_MED_0.2 and Torque_32B_LARGE_0.3, designed to work synergistically or stand alone. See the Torque Series Card for an overview.
🤝 Model Card Authors
- MindExpander, TheMindExpansionNetwork
📬 Model Card Contact
- Email: [email protected]