Fine-Tuned DeepSeek R1 Model

This repository contains a fine-tuned version of the DeepSeek-R1-Distill-Llama-8B language model. The fine-tuning was performed on a dataset derived from a CSV file, specializing the model for tasks in that dataset's domain.

Model Details

  • Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit (4-bit quantized)
  • Fine-Tuning Framework: Unsloth and Hugging Face Transformers
  • Dataset: 141 rows of input-output pairs derived from a CSV file
  • Objective: Enhance the model's capability to generate accurate and contextually appropriate responses for tasks specific to the provided dataset.

Dataset

The dataset used for fine-tuning contains conversational data structured as follows:

  • Input: User queries or prompts
  • Output: Model-generated responses or target answers

Example Entry

{
  "conversations": [
    { "from": "human", "value": "<input-text>" },
    { "from": "gpt", "value": "<output-text>" }
  ]
}

Fine-Tuning Process

  1. Preprocessing:

    • Converted the CSV file into a JSON format using the ShareGPT template (a conversion sketch follows this list).
    • Applied tokenization and ensured compatibility with the model's chat template.
  2. Training Configuration (a trainer sketch follows this list):

    • Epochs: 30
    • Batch Size: 2 (per device)
    • Gradient Accumulation: 4 steps
    • Optimizer: AdamW with 8-bit precision
    • Learning Rate: 2e-4
  3. Hardware:

    • Training was conducted on a single GPU.
  4. Frameworks:

    • Unsloth for memory-efficient fine-tuning
    • Hugging Face Transformers and TRL for training utilities
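A minimal sketch of the CSV-to-ShareGPT conversion described in step 1. The file names are placeholders, and the input/output column names follow the Input Data Example later in this README; adjust them to your CSV.

import json
import pandas as pd

# Load the raw CSV (file name is a placeholder).
df = pd.read_csv("dataset.csv")

# Map each row's input/output pair onto the ShareGPT structure shown above.
records = []
for _, row in df.iterrows():
    records.append({
        "conversations": [
            {"from": "human", "value": row["input"]},
            {"from": "gpt", "value": row["output"]},
        ]
    })

with open("dataset_sharegpt.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)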
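And a sketch of the training setup using the hyperparameters listed in step 2, with Unsloth and TRL's SFTTrainer (the API shown matches the older TRL releases used in Unsloth notebooks). The sequence length and LoRA settings are assumptions not stated in this card.

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base model named in this card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules follow common Unsloth defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Render the ShareGPT conversations into plain text via the chat template.
dataset = load_dataset("json", data_files="dataset_sharegpt.json", split="train")

def to_text(example):
    messages = [
        {"role": "user" if m["from"] == "human" else "assistant",
         "content": m["value"]}
        for m in example["conversations"]
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        num_train_epochs=30,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()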

Installation and Setup

Prerequisites

  • Python 3.8+
  • Install dependencies:
    pip install torch transformers datasets unsloth
    

Usage

To use the fine-tuned model, load it with the Hugging Face Transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("path_to_your_finetuned_model")
tokenizer = AutoTokenizer.from_pretrained("path_to_your_finetuned_model")

# Generate a response
input_text = "<your input>"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
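Because the base checkpoint is a 4-bit Unsloth quantization, loading through Unsloth can be more memory-efficient. A minimal sketch (the model path is the same placeholder as above, and the sequence length is an assumption):

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="path_to_your_finetuned_model",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path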

Inference Example

Wrapping the generation steps above into a small helper:

def get_response(text):
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(get_response("What is the weather like today?"))

Results

The fine-tuned model achieved:

  • Improved Response Quality: Generated responses closely match the style and content of the target dataset.
  • Fast Convergence: The training setup is tuned for a small dataset while limiting overfitting.

Limitations

  • Dataset Size: The model was fine-tuned on a small dataset (141 rows), which may limit generalization to other tasks.
  • Domain-Specific: Performance is optimal for the domain represented by the dataset.

Acknowledgments

Special thanks to the open-source AI community for providing tools like Unsloth and Hugging Face Transformers. Their contributions make fine-tuning large language models accessible to all.


Example Chat Conversation

Below is an example conversation showcasing the capabilities of the fine-tuned model. The model is optimized for customer service in the FMCG (fast-moving consumer goods) industry and supports use cases like recommending healthy snacks.

Input Data Example

  • doc_id: cs001
  • last_update: 2024-12-06
  • use_case: customer service
  • topic: FMCG
  • stage: awareness
  • input: Halo, aku cari camilan sehat nih. ("Hi, I'm looking for healthy snacks.")
  • language: bahasa (Indonesian)
  • intent: menyapa dan memulai pencarian produk (greet the customer and start a product search)
  • input_emotion: penasaran (curious)
  • instruction: Sapa pelanggan dengan hangat, tawarkan beberapa rekomendasi camilan sehat, dan ajak mereka untuk memilih kategori tertentu. ("Greet the customer warmly, offer a few healthy snack recommendations, and invite them to choose a category.")
  • output: Hai Kak! Aku [ai_agent_name] 😊. Lagi cari camilan sehat ya? Kita punya banyak pilihan nih, mulai dari granola bar, kacang panggang, sampai keripik sayur. Kakak lagi butuh yang praktis buat dibawa atau buat ngemil di rumah? 🛍️🍴 ("Hi! I'm [ai_agent_name] 😊. Looking for healthy snacks? We have lots of options, from granola bars and roasted nuts to veggie chips. Do you need something easy to take along, or to snack on at home? 🛍️🍴")
  • output_tone: ramah, santai (friendly, casual)
  • metadata/product_category/0: snacks
  • metadata/dynamic_placeholders/0: [ai_agent_name]

Example Conversation

Human:

Halo, aku cari camilan sehat nih.
("Hi, I'm looking for healthy snacks.")

Assistant:

Hai Kak! Aku SnackBot 😊. Lagi cari camilan sehat ya? Kita punya banyak pilihan nih, mulai dari granola bar, kacang panggang, sampai keripik sayur. Kakak lagi butuh yang praktis buat dibawa atau buat ngemil di rumah? 🛍️🍴
("Hi! I'm SnackBot 😊. Looking for healthy snacks? We have lots of options, from granola bars and roasted nuts to veggie chips. Do you need something easy to take along, or to snack on at home? 🛍️🍴")


This demonstrates the model's ability to:

  1. Understand intent: Identify the user's need for healthy snacks.
  2. Generate relevant responses: Provide a list of options in a friendly tone.
  3. Utilize placeholders dynamically: Replace [ai_agent_name] with a suitable agent name, e.g., SnackBot (see the sketch below).
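A minimal sketch of the placeholder substitution, assuming a simple string-replacement step applied after generation (the placeholder map is illustrative):

# Fill dynamic placeholders in a generated response.
response = "Hai Kak! Aku [ai_agent_name] 😊. Lagi cari camilan sehat ya?"
placeholders = {"[ai_agent_name]": "SnackBot"}  # illustrative mapping

for token, value in placeholders.items():
    response = response.replace(token, value)

print(response)  # -> "Hai Kak! Aku SnackBot 😊. Lagi cari camilan sehat ya?"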

For more usage examples, refer to the instructions in the dataset or try interacting with the model directly!

License

This project is licensed under the Apache 2.0 License (see the model metadata below).


Feel free to raise any issues or contribute improvements to this repository!

Uploaded model

  • Developed by: ahsanf
  • License: apache-2.0
  • Finetuned from model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
