Model Description
This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit, specifically tailored for mental health counseling tasks. It has been trained on the Amod/mental_health_counseling_conversations dataset for 10 epochs using two H100 GPUs.
Key Features
- Base Model: Built on DeepSeek-R1-Distill-Llama-8B, a Llama-architecture model trained to reproduce the reasoning behavior of DeepSeek-R1.
- Distillation: The base model was produced by fine-tuning an 8B-parameter Llama model on reasoning data generated by the much larger DeepSeek-R1, which makes it far cheaper to run than DeepSeek-R1 itself.
- Quantization: Employs Unsloth's dynamic 4-bit quantization for a reduced memory footprint and faster inference.
- Domain Specialization: Fine-tuned on a dataset of mental health counseling conversations, enhancing its ability to understand and respond to mental health-related queries.
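To give a rough sense of what 4-bit quantization buys, the sketch below estimates the memory needed for the weights alone at 16-bit and 4-bit precision. It assumes a nominal 8 billion parameters and ignores the KV cache, activations, and quantization metadata. Dynamic 4-bit schemes keep some layers at higher precision, so the 4-bit figure should be read as a lower bound rather than a measurement of this model.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumption: nominal "8B" parameter count; the exact figure differs slightly.
NUM_PARAMS = 8e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:  ~{int4_gib:.1f} GiB (lower bound)")
```

Under these assumptions the weights shrink by a factor of four, which is what allows an 8B model to fit on a single consumer-class GPU.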
Training Details
- Dataset: Amod/mental_health_counseling_conversations, containing 3,512 Q&A pairs from counseling platforms.
- Training Duration: 10 epochs
- Hardware: Two H100 GPUs
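The figures above are enough to estimate the length of the training run once a batch configuration is chosen. The sketch below does that arithmetic. The per-device batch size and gradient accumulation steps are hypothetical values, since this card does not report them, and the calculation assumes data-parallel training over all 3,512 pairs with no held-out split.

```python
import math

NUM_EXAMPLES = 3512  # Q&A pairs in the dataset (from the card)
NUM_EPOCHS = 10      # from the card
NUM_GPUS = 2         # two H100 GPUs (from the card)

# Hypothetical values, NOT reported in this model card.
PER_DEVICE_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4


def optimizer_steps(num_examples, epochs, gpus, per_device_batch, grad_accum):
    """Return (effective batch size, steps per epoch, total optimizer steps)."""
    effective_batch = gpus * per_device_batch * grad_accum
    # The last partial batch of each epoch still costs one optimizer step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs


effective_batch, steps_per_epoch, total_steps = optimizer_steps(
    NUM_EXAMPLES, NUM_EPOCHS, NUM_GPUS, PER_DEVICE_BATCH_SIZE, GRAD_ACCUM_STEPS
)
print(effective_batch, steps_per_epoch, total_steps)
```

With these illustrative settings the effective batch size is 16, giving 220 steps per epoch and 2,200 optimizer steps in total. Ten passes over a dataset this small is a lot of repetition, so overfitting is worth checking on held-out conversations.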
Potential Applications
This model could be particularly useful for:
- Prototyping mental health chatbots
- Assisting in mental health research
- Providing initial screening or support in mental health contexts
Limitations and Ethical Considerations
While this model has been trained on mental health counseling data, it's crucial to note:
- It should not replace professional mental health care or diagnosis.
- The model may have biases or limitations based on its training data.
- Ethical use and privacy considerations are paramount when dealing with sensitive mental health information.
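As one small illustration of the privacy point, the sketch below masks e-mail addresses and phone-like numbers in user text before it is logged or stored. The regular expressions and the `redact` helper are hypothetical and deliberately simple. They will miss names, addresses, and many other identifiers, so this is a starting point for a prototype and not a substitute for a proper de-identification process.

```python
import re

# Illustrative patterns only; real de-identification needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


sample = "I can't sleep. Reach me at jane.doe@example.com or +1 (555) 010-9999."
print(redact(sample))
# I can't sleep. Reach me at [EMAIL] or [PHONE].
```

Applying the redaction before anything is written to disk keeps raw identifiers out of logs, which is easier than scrubbing them afterwards.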
Model tree for vivirocks/Wayfair-Garage
Base model
deepseek-ai/DeepSeek-R1-Distill-Llama-8B