Fine-Tuned DeepSeek R1 Model
This repository contains a fine-tuned version of the DeepSeek R1 distilled language model (DeepSeek-R1-Distill-Llama-8B). The fine-tuning was performed using a dataset derived from a CSV file, enabling the model to specialize in tasks related to the specific context of the dataset.
Model Details
- Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
- Fine-Tuning Framework: Unsloth and Hugging Face Transformers
- Dataset: 141 rows of input-output pairs derived from a CSV file
- Objective: Enhance the model's capability to generate accurate and contextually appropriate responses for tasks specific to the provided dataset.
Dataset
The dataset used for fine-tuning contains conversational data structured as follows:
- Input: User queries or prompts
- Output: Model-generated responses or target answers
Example Entry
```json
{
  "conversations": [
    { "from": "human", "value": "<input-text>" },
    { "from": "gpt", "value": "<output-text>" }
  ]
}
```
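The `from`/`value` keys follow the ShareGPT convention, while Hugging Face chat templates expect `role`/`content` pairs. A minimal sketch of that mapping (the role names are the standard ShareGPT/Hugging Face ones; the helper itself is illustrative):

```python
# Map a ShareGPT-style entry to the role/content format used by
# Hugging Face chat templates ("human" -> "user", "gpt" -> "assistant").
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}

def sharegpt_to_messages(entry):
    return [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in entry["conversations"]
    ]

entry = {
    "conversations": [
        {"from": "human", "value": "<input-text>"},
        {"from": "gpt", "value": "<output-text>"},
    ]
}
print(sharegpt_to_messages(entry))
```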
Fine-Tuning Process
Preprocessing:
- Converted the CSV file into a JSON format compatible with the model using the ShareGPT template.
- Applied tokenization and ensured compatibility with the model's chat template.
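The CSV-to-ShareGPT conversion can be sketched as follows. The column names `input` and `output` are assumed from the dataset description above; adjust them to match the actual CSV header:

```python
import csv
import json

def csv_to_sharegpt(csv_path, json_path):
    """Convert a CSV of input/output pairs into ShareGPT-style JSON.

    Column names "input" and "output" are assumptions based on the
    dataset description; change them to match the real CSV header.
    """
    records = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            records.append({
                "conversations": [
                    {"from": "human", "value": row["input"]},
                    {"from": "gpt", "value": row["output"]},
                ]
            })
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return records
```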
Training Configuration:
- Epochs: 30
- Batch Size: 2 (per device)
- Gradient Accumulation: 4 steps
- Optimizer: AdamW with 8-bit precision
- Learning Rate: 2e-4
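With a per-device batch size of 2 and 4 gradient-accumulation steps, the effective batch size is 8. The configuration above can be summarized in one place; the key names below mirror the common Hugging Face `TrainingArguments` convention, but the dict itself is only an illustrative summary, not a drop-in config object:

```python
# Training hyperparameters from the configuration above.
# Key names mirror Hugging Face TrainingArguments; this dict is an
# illustrative summary, not an actual config object.
training_config = {
    "num_train_epochs": 30,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_8bit",
    "learning_rate": 2e-4,
}

# Effective batch size = per-device batch * gradient-accumulation steps.
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)
```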
Hardware:
- Training was conducted on a single GPU.
Frameworks:
- Unsloth for chat template handling and training
- Hugging Face Transformers for model fine-tuning
Installation and Setup
Prerequisites
- Python 3.8+
- Install dependencies:

```bash
pip install torch transformers datasets unsloth
```
Usage
To use the fine-tuned model, load it with the Hugging Face Transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("path_to_your_finetuned_model")
tokenizer = AutoTokenizer.from_pretrained("path_to_your_finetuned_model")

# Generate a response
input_text = "<your input>"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
Inference Example

The generation steps above can be wrapped in a small helper:

```python
def get_response(text):
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

response = get_response("What is the weather like today?")
print(response)
```
Results
The fine-tuned model achieved:
- Improved Response Quality: The model generates responses closely aligned with the target dataset.
- Faster Convergence: Optimized for a small dataset with minimal overfitting.
Limitations
- Dataset Size: The model was fine-tuned on a small dataset (141 rows), which may limit generalization to other tasks.
- Domain-Specific: Performance is optimal for the domain represented by the dataset.
Acknowledgments
Special thanks to the open-source AI community for providing tools like Unsloth and Hugging Face Transformers. Their contributions make fine-tuning large language models accessible to all.
Example Chat Conversation
Below is an example conversation showcasing the capabilities of the fine-tuned model. The model is optimized for customer service in the FMCG (fast-moving consumer goods) industry and supports use cases like recommending healthy snacks.
Input Data Example
| Field | Value |
|---|---|
| doc_id | cs001 |
| last_update | 2024-12-06 |
| use_case | customer service |
| topic | FMCG |
| stage | awareness |
| input | Halo, aku cari camilan sehat nih. ("Hi, I'm looking for healthy snacks.") |
| language | bahasa (Indonesian) |
| intent | menyapa dan memulai pencarian produk (greet and start a product search) |
| input_emotion | penasaran (curious) |
| instruction | Sapa pelanggan dengan hangat, tawarkan beberapa rekomendasi camilan sehat, dan ajak mereka untuk memilih kategori tertentu. (Greet the customer warmly, offer a few healthy snack recommendations, and invite them to pick a specific category.) |
| output | Hai Kak! Aku [ai_agent_name] 😊. Lagi cari camilan sehat ya? Kita punya banyak pilihan nih, mulai dari granola bar, kacang panggang, sampai keripik sayur. Kakak lagi butuh yang praktis buat dibawa atau buat ngemil di rumah? 🛍️🍴 ("Hi! I'm [ai_agent_name] 😊. Looking for healthy snacks? We have lots of options, from granola bars and roasted nuts to veggie chips. Do you need something practical to take along, or for snacking at home? 🛍️🍴") |
| output_tone | ramah, santai (friendly, casual) |
| metadata/product_category/0 | snacks |
| metadata/dynamic_placeholders/0 | [ai_agent_name] |
Example Conversation
Human:
Halo, aku cari camilan sehat nih. ("Hi, I'm looking for healthy snacks.")
Assistant:
Hai Kak! Aku SnackBot 😊. Lagi cari camilan sehat ya? Kita punya banyak pilihan nih, mulai dari granola bar, kacang panggang, sampai keripik sayur. Kakak lagi butuh yang praktis buat dibawa atau buat ngemil di rumah? 🛍️🍴 ("Hi! I'm SnackBot 😊. Looking for healthy snacks? We have lots of options, from granola bars and roasted nuts to veggie chips. Do you need something practical to take along, or for snacking at home? 🛍️🍴")
This demonstrates the model's ability to:
- Understand intent: identify the user's need for healthy snacks.
- Generate relevant responses: provide a list of options in a friendly tone.
- Utilize placeholders dynamically: replace `[ai_agent_name]` with a suitable agent name, e.g., SnackBot.
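Placeholder substitution can also be done as a simple post-processing step. The `[ai_agent_name]` placeholder syntax comes from the dataset; the helper below is an illustrative sketch, not part of the model:

```python
# Fill dynamic placeholders such as [ai_agent_name] in a generated response.
# The bracket syntax comes from the dataset; this helper is illustrative.
def fill_placeholders(text, values):
    for name, value in values.items():
        text = text.replace(f"[{name}]", value)
    return text

raw = "Hai Kak! Aku [ai_agent_name] 😊."
print(fill_placeholders(raw, {"ai_agent_name": "SnackBot"}))
# -> Hai Kak! Aku SnackBot 😊.
```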
For more usage examples, refer to the instructions in the dataset or try interacting with the model directly!
License
This project is licensed under the MIT License.
Feel free to raise any issues or contribute improvements to this repository!
Uploaded model
- Developed by: ahsanf
- License: apache-2.0
- Finetuned from model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit

This Llama-based model was trained 2x faster with Unsloth and Hugging Face's TRL library.