---
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.1-8B
pipeline_tag: text-generation
---
# Cat1.0

## Overview
Cat1.0 is a fine-tuned version of the Llama-3.1-8B base model, optimized for roleplaying, logic, and reasoning tasks. Trained iteratively on human-AI chat logs, it performs well across a wide range of chat scenarios.
## Model Specifications

- **Parameters:** 8 billion (8B)
- **Precision:** bf16 (Brain Floating Point 16-bit)
- **Fine-Tuning Method:** LoRA (Low-Rank Adaptation)
- **LoRA Rank:** 32
- **LoRA Alpha:** 64
- **Learning Rate:** 0.0008
- **Training Epochs:** 4
- **Datasets Used:**
  - cat1.0 Roleplay Dataset
  - cat1.0 Reasoning and Logic Dataset
- **Fine-Tuning Approach:** Iterative fine-tuning using self-chat logs
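
For reference, these hyperparameters map onto a standard PEFT LoRA configuration roughly as follows. This is a minimal sketch, not the actual training script; the `target_modules` list in particular is an assumption:

```python
# Minimal sketch of a LoRA setup matching the hyperparameters above.
# The target_modules list is an assumption, not Cat1.0's confirmed config.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    torch_dtype=torch.bfloat16,  # bf16 precision, as listed above
)

lora = LoraConfig(
    r=32,            # LoRA rank
    lora_alpha=64,   # LoRA alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
# Train for 4 epochs at a learning rate of 8e-4 (0.0008) with your trainer of choice.
```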
## Recommended Settings

To achieve optimal performance with this model, I recommend the following settings:

- **Temperature:** `1.1`
- **Min P:** `0.05`

**Note:** Due to the nature of the fine-tuning, setting the temperature to `1.1` or higher helps prevent the model from repeating itself and encourages more creative and coherent responses.
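
Outside the WebUI, the same values can be set through the standard `transformers` generation config (a minimal sketch; `min_p` support requires a reasonably recent `transformers` release):

```python
from transformers import GenerationConfig

# Recommended sampling settings for Cat1.0.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=1.1,  # 1.1 or higher to discourage repetition
    min_p=0.05,       # drop tokens below 5% of the top token's probability
)
# Pass to model.generate(..., generation_config=gen_config)
```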
## Usage Instructions

I recommend using the oobabooga text-generation-webui for an optimal experience. Load the model in `bf16` precision and enable FlashAttention-2 for improved performance.
### Installation Steps

1. **Clone the WebUI repository:**

   ```bash
   git clone https://github.com/oobabooga/text-generation-webui
   cd text-generation-webui
   ```

2. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

3. **Download the model:** Download the fine-tuned model from Hugging Face and place it in the `models` directory.

4. **Launch the WebUI:**

   ```bash
   python server.py --bf16 --flash-attention
   ```
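
If you prefer loading the model directly with `transformers` instead of the WebUI, the equivalent precision and attention settings look roughly like this (a sketch; the repo id is a placeholder, and FlashAttention-2 requires the `flash-attn` package and a supported GPU):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/cat1.0"  # placeholder -- substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # bf16 precision
    attn_implementation="flash_attention_2",  # requires flash-attn installed
    device_map="auto",
)
```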
## Sample Prompt Formats

You can interact with the model using either chat format or chat-instruct format. Here's an example:

```
Ryan is a computer engineer who works at Intel.

Ryan: Hey, how's it going Natalie?
Natalie: Good, how are things going with you, Ryan?
Ryan: Great, I'm doing just great.
```
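
Programmatically, the same persona-plus-dialogue prompt can be passed as a plain string together with the recommended sampling settings (a sketch; the repo id is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/cat1.0"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Persona line followed by alternating chat turns, as in the example above.
prompt = (
    "Ryan is a computer engineer who works at Intel.\n\n"
    "Ryan: Hey, how's it going Natalie?\n"
    "Natalie: Good, how are things going with you, Ryan?\n"
    "Ryan: Great, I'm doing just great.\n"
    "Natalie:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.1,   # recommended settings from above
    min_p=0.05,
    max_new_tokens=128,
)
# Print only the newly generated continuation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```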
## Model Capabilities

Below are some examples showcasing the model's performance in various roleplay scenarios:

### Roleplay Examples

### Text Generation Example
## Limitations and Tips

While this model excels in chat and roleplaying scenarios, it isn't perfect. If you notice the model repeating itself or providing less coherent responses:

- **Increase the temperature:** Setting the temperature higher (≥ `1.1`) can help generate more diverse and creative outputs.
- **Adjust the `min_p` setting:** Keeping `min_p` at or above `0.05` filters out very low-probability tokens, enhancing response quality.
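
For intuition, min-p sampling keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, so a value of `0.05` prunes the long tail of unlikely tokens. A minimal sketch of the filter (illustrative only, not the WebUI's internal implementation):

```python
import torch

def min_p_filter(logits: torch.Tensor, min_p: float = 0.05) -> torch.Tensor:
    """Mask out tokens whose probability is below min_p * the top token's probability."""
    probs = torch.softmax(logits, dim=-1)
    threshold = min_p * probs.max(dim=-1, keepdim=True).values
    return logits.masked_fill(probs < threshold, float("-inf"))
```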
## Acknowledgments

- **oobabooga text-generation-webui:** A powerful interface for running and interacting with language models. [GitHub Repository](https://github.com/oobabooga/text-generation-webui)
- **Hugging Face:** For hosting the model and providing a platform for collaboration. [Website](https://huggingface.co)
- **Meta:** For pre-training the Llama-3.1-8B base model that was used for fine-tuning. [Model Card](https://huggingface.co/meta-llama/Llama-3.1-8B)
For any issues or questions, please open an issue in this repository.