license: mit
datasets:
- Intel/orca_dpo_pairs
SOLAR-10B-Nectar-Orca-DPO-LoRA-Jawade
Overview
This model is a DPO-optimized and aligned version of the upstage/SOLAR-10.7B-Instruct-v1.0
model. It was trained with LoRA on a mixture of the Berkeley-nest Nectar dataset and the Intel Orca DPO dataset.
How to Use This Model
To use the model bhavinjawade/SOLAR-10B-OrcaDPO-Jawade, follow these steps:
Import and Load the Model and Tokenizer: Begin by importing the model and tokenizer. Load them using the from_pretrained method.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("bhavinjawade/SOLAR-10B-OrcaDPO-Jawade")
    tokenizer = AutoTokenizer.from_pretrained("bhavinjawade/SOLAR-10B-OrcaDPO-Jawade")
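As a rough guide to the hardware this step needs, the sketch below estimates the memory taken by the weights alone. It assumes about 10.7 billion parameters, a figure inferred from the base model's name rather than stated on this card. It also shows one possible way to request 16-bit weights at load time. The helper names `estimate_weight_memory_gb` and `load_in_half_precision` are hypothetical, `device_map="auto"` assumes the accelerate package is installed, and argument names may differ between library versions.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the weights only, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def load_in_half_precision(model_id: str):
    """Hypothetical helper: load the model with 16-bit weights.

    Imports are deferred so the estimate above works without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # 2 bytes per parameter instead of 4
        device_map="auto",          # assumption: requires the accelerate package
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return model, tokenizer


# Assumed parameter count, taken from the "10.7B" in the base model name.
NUM_PARAMS = 10.7e9
fp32_gb = estimate_weight_memory_gb(NUM_PARAMS, 4)
fp16_gb = estimate_weight_memory_gb(NUM_PARAMS, 2)
print(f"float32 weights: ~{fp32_gb:.1f} GB, float16 weights: ~{fp16_gb:.1f} GB")
```

These figures cover the weights only; activations and the generation cache need additional memory on top.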
Format the Prompt: Format the chat input as a list of messages, each with a role ('system' or 'user') and content.

    message = [
        {"role": "system", "content": "You are a helpful assistant chatbot."},
        {"role": "user", "content": "Is the universe real? or is it a simulation? whats your opinion?"}
    ]
    prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)
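Malformed message lists tend to fail late, inside the template rendering. A small check before calling `apply_chat_template` can catch them earlier. The sketch below is one way to do that in plain Python. `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and accepting an 'assistant' role (for multi-turn history) is an assumption that goes beyond the two roles named above.

```python
# "assistant" is an assumption for multi-turn chats; the card names only system and user.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if the chat input is not a list of role/content dicts."""
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            raise ValueError(f"message {i} is not a dict")
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has an unsupported role: {msg.get('role')!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            raise ValueError(f"message {i} needs non-empty string content")
    return messages


message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "Is the universe real?"},
]
validate_messages(message)  # returns the list unchanged when it is well formed
```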
Create a Pipeline: Set up a pipeline for text generation with the loaded model and tokenizer. The transformers module itself must be imported for this call.

    import transformers

    pipeline = transformers.pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
    )
Generate Text: Use the pipeline to generate a sequence of text based on the prompt. You can adjust parameters like temperature and top_p for different styles of responses.

    sequences = pipeline(
        prompt,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        num_return_sequences=1,
        max_length=200,
    )
    print(sequences[0]['generated_text'])
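Two details of the call above are worth knowing. `max_length=200` counts the prompt tokens as well as the generated ones, and by default the text-generation pipeline returns the prompt followed by the completion. The sketch below shows one possible variant that budgets only new tokens with `max_new_tokens` and strips the echoed prompt with `return_full_text=False`. The wrapper `generate_reply` is a hypothetical helper, not part of the model or the library.

```python
def generate_reply(pipe, prompt, max_new_tokens=200, **sampling):
    """Hypothetical helper: call a text-generation pipeline and return only the reply.

    `pipe` is expected to behave like the pipeline object created in the previous step.
    """
    params = {
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.9,
        "num_return_sequences": 1,
        "max_new_tokens": max_new_tokens,  # budget for generated tokens only
        "return_full_text": False,         # drop the echoed prompt from the output
    }
    params.update(sampling)  # caller overrides, e.g. temperature=0.2
    sequences = pipe(prompt, **params)
    return sequences[0]["generated_text"].strip()


# Usage with the objects from the steps above (not executed here):
# print(generate_reply(pipeline, prompt, temperature=0.2))
```

Lower temperature values make the output more deterministic, while higher values increase variety.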
With this setup, you can use the bhavinjawade/SOLAR-10B-OrcaDPO-Jawade model to generate responses to chat inputs.
License
- Type: MIT License
- Details: This license permits reuse, modification, and distribution for both private and commercial purposes under the terms of the MIT License.
Model Details
- Base Model: SOLAR-10.7B-Instruct-v1.0
- Organization: Upstage
- Training Dataset: Intel/orca_dpo_pairs
- Technique Used: LoRA (Low-Rank Adaptation)
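LoRA keeps each pretrained weight matrix frozen and learns a low-rank update made of two small factors, so the trainable parameters per adapted matrix number rank × (d_out + d_in) rather than d_out × d_in. The arithmetic below illustrates the saving. The dimensions and rank are illustrative values only, since this card does not state the rank or which layers were adapted, and `lora_trainable_params` is a hypothetical helper.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative values only: neither the rank nor the adapted layers are given on this card.
d_out, d_in, rank = 4096, 4096, 16
full = d_out * d_in
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full matrix: {full:,} params, LoRA update: {lora:,} params ({lora / full:.2%})")
```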