Fine-Tuned DeepSeek R1 Model

This repository contains a fine-tuned version of the DeepSeek-R1-Distill-Llama-8B language model. The fine-tuning was performed on a dataset derived from a CSV file, specializing the model for tasks in that dataset's domain.

Model Details

  • Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit (4-bit quantized)
  • Fine-Tuning Framework: Unsloth and Hugging Face Transformers
  • Dataset: 141 rows of input-output pairs derived from a CSV file
  • Objective: Enhance the model's capability to generate accurate and contextually appropriate responses for tasks specific to the provided dataset.

Dataset

The dataset used for fine-tuning contains conversational data structured as follows:

  • Input: User queries or prompts
  • Output: Model-generated responses or target answers

Example Entry

{
  "conversations": [
    { "from": "human", "value": "<input-text>" },
    { "from": "gpt", "value": "<output-text>" }
  ]
}

Fine-Tuning Process

  1. Preprocessing:

    • Converted the CSV file into a JSON format using the ShareGPT template (a conversion sketch follows this list).
    • Applied tokenization and ensured compatibility with the model's chat template.
  2. Training Configuration (a trainer sketch follows this list):

    • Epochs: 30
    • Batch Size: 2 (per device)
    • Gradient Accumulation: 4 steps
    • Optimizer: AdamW with 8-bit precision
    • Learning Rate: 2e-4
  3. Hardware:

    • Training was conducted on a single GPU.
  4. Frameworks:

    • Unsloth for memory-efficient fine-tuning
    • Hugging Face Transformers and TRL for training utilities
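A minimal sketch of the CSV-to-ShareGPT conversion described in step 1. The file names are placeholders, and the input/output column names follow the Input Data Example later in this README; adjust them to your CSV.

import json
import pandas as pd

# Load the raw CSV (file name is a placeholder).
df = pd.read_csv("dataset.csv")

# Map each row's input/output pair onto the ShareGPT structure shown above.
records = []
for _, row in df.iterrows():
    records.append({
        "conversations": [
            {"from": "human", "value": row["input"]},
            {"from": "gpt", "value": row["output"]},
        ]
    })

with open("dataset_sharegpt.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)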
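And a sketch of the training setup using the hyperparameters listed in step 2, with Unsloth and TRL's SFTTrainer (the API shown matches the older TRL releases used in Unsloth notebooks). The sequence length and LoRA settings are assumptions not stated in this card.

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base model named in this card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules follow common Unsloth defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Render the ShareGPT conversations into plain text via the chat template.
dataset = load_dataset("json", data_files="dataset_sharegpt.json", split="train")

def to_text(example):
    messages = [
        {"role": "user" if m["from"] == "human" else "assistant",
         "content": m["value"]}
        for m in example["conversations"]
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        num_train_epochs=30,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()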

Installation and Setup

Prerequisites

  • Python 3.8+
  • Install dependencies:
    pip install torch transformers datasets unsloth
    

Usage

To use the fine-tuned model, load it with the Hugging Face Transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("path_to_your_finetuned_model")
tokenizer = AutoTokenizer.from_pretrained("path_to_your_finetuned_model")

# Generate a response
input_text = "<your input>"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
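Because the base checkpoint is a 4-bit Unsloth quantization, loading through Unsloth can be more memory-efficient. A minimal sketch (the model path is the same placeholder as above, and the sequence length is an assumption):

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="path_to_your_finetuned_model",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path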

Inference Example

Wrapping the generation steps above into a small helper:

def get_response(text):
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(get_response("What is the weather like today?"))

Results

The fine-tuned model achieved:

  • Improved Response Quality: Generated responses closely match the style and content of the target dataset.
  • Fast Convergence: The training setup is tuned for a small dataset while limiting overfitting.

Limitations

  • Dataset Size: The model was fine-tuned on a small dataset (141 rows), which may limit generalization to other tasks.
  • Domain-Specific: Performance is optimal for the domain represented by the dataset.

Acknowledgments

Special thanks to the open-source AI community for providing tools like Unsloth and Hugging Face Transformers. Their contributions make fine-tuning large language models accessible to all.


Example Chat Conversation

Below is an example conversation showcasing the capabilities of the fine-tuned model. The model is optimized for customer service in the FMCG (fast-moving consumer goods) industry and supports use cases like recommending healthy snacks.

Input Data Example

  • doc_id: cs001
  • last_update: 2024-12-06
  • use_case: customer service
  • topic: FMCG
  • stage: awareness
  • input: Halo, aku cari camilan sehat nih. ("Hi, I'm looking for healthy snacks.")
  • language: bahasa (Indonesian)
  • intent: menyapa dan memulai pencarian produk (greet the customer and start a product search)
  • input_emotion: penasaran (curious)
  • instruction: Sapa pelanggan dengan hangat, tawarkan beberapa rekomendasi camilan sehat, dan ajak mereka untuk memilih kategori tertentu. ("Greet the customer warmly, offer a few healthy snack recommendations, and invite them to choose a category.")
  • output: Hai Kak! Aku [ai_agent_name] 😊. Lagi cari camilan sehat ya? Kita punya banyak pilihan nih, mulai dari granola bar, kacang panggang, sampai keripik sayur. Kakak lagi butuh yang praktis buat dibawa atau buat ngemil di rumah? 🛍️🍴 ("Hi! I'm [ai_agent_name] 😊. Looking for healthy snacks? We have lots of options, from granola bars and roasted nuts to veggie chips. Do you need something easy to take along, or to snack on at home? 🛍️🍴")
  • output_tone: ramah, santai (friendly, casual)
  • metadata/product_category/0: snacks
  • metadata/dynamic_placeholders/0: [ai_agent_name]

Example Conversation

Human:

Halo, aku cari camilan sehat nih.
("Hi, I'm looking for healthy snacks.")

Assistant:

Hai Kak! Aku SnackBot 😊. Lagi cari camilan sehat ya? Kita punya banyak pilihan nih, mulai dari granola bar, kacang panggang, sampai keripik sayur. Kakak lagi butuh yang praktis buat dibawa atau buat ngemil di rumah? 🛍️🍴
("Hi! I'm SnackBot 😊. Looking for healthy snacks? We have lots of options, from granola bars and roasted nuts to veggie chips. Do you need something easy to take along, or to snack on at home? 🛍️🍴")


This demonstrates the model's ability to:

  1. Understand intent: Identify the user's need for healthy snacks.
  2. Generate relevant responses: Provide a list of options in a friendly tone.
  3. Utilize placeholders dynamically: Replace [ai_agent_name] with a suitable agent name, e.g., SnackBot (see the sketch below).
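A minimal sketch of the placeholder substitution, assuming a simple string-replacement step applied after generation (the placeholder map is illustrative):

# Fill dynamic placeholders in a generated response.
response = "Hai Kak! Aku [ai_agent_name] 😊. Lagi cari camilan sehat ya?"
placeholders = {"[ai_agent_name]": "SnackBot"}  # illustrative mapping

for token, value in placeholders.items():
    response = response.replace(token, value)

print(response)  # -> "Hai Kak! Aku SnackBot 😊. Lagi cari camilan sehat ya?"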

For more usage examples, refer to the instructions in the dataset or try interacting with the model directly!

License

This project is licensed under the Apache 2.0 License (see the model metadata below).


Feel free to raise any issues or contribute improvements to this repository!

Uploaded model

  • Developed by: ahsanf
  • License: apache-2.0
  • Finetuned from model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
