---
library_name: peft
base_model: gpt2-xl
license: mit
language:
- en
---

# Chadgpt gpt2-xl

## Colab Example

https://colab.research.google.com/drive/1a_mBaqtufEDfJXQVokrap7D7Gmv5gtCK?usp=sharing

## Install Prerequisite

```bash
!pip install -q git+https://github.com/huggingface/peft.git
!pip install transformers
!pip install -U accelerate
!pip install bitsandbytes  # Install bitsandbytes for 8-bit inference of the model
```

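Before loading the model, you can optionally confirm the installs succeeded by importing the core packages and printing their versions; this is just a quick environment check, not a required step (bitsandbytes itself is only exercised later, when the model is loaded in 8-bit).

```python
# Optional: confirm the core packages are importable and print their versions
import peft
import transformers
import accelerate

print("peft:", peft.__version__)
print("transformers:", transformers.__version__)
print("accelerate:", accelerate.__version__)
```
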
## Download Model

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "danjie/Chadgpt-gpt2-xl"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model (gpt2-xl) in 8-bit to reduce GPU memory usage
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map='auto'
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
```

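As an optional sanity check (assuming the cell above ran without errors), you can put the model in evaluation mode and ask PEFT to report how many parameters the adapter adds on top of the frozen gpt2-xl base:

```python
# Optional sanity check after loading the adapter
model.eval()                         # disable dropout for inference
model.print_trainable_parameters()   # reports adapter vs. total parameter counts
print(config.base_model_name_or_path)  # expected: "gpt2-xl"
```
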
## Inference

```python
def talk_with_llm(prompt: str) -> str:
    # Encode the prompt and move the tensors to the same device as the model
    encoded_input = tokenizer(prompt, return_tensors='pt')
    encoded_input = {k: v.to(model.device) for k, v in encoded_input.items()}

    # Generate up to 64 new tokens and decode the full sequence
    output = model.generate(**encoded_input, max_new_tokens=64)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    return response

talk_with_llm("<User> Your sentence \n<Assistant>")
```
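
The example call above suggests a `<User> ... \n<Assistant>` prompt template, and `tokenizer.decode` returns the prompt together with the generated continuation. As an illustrative convenience (the `ask` helper and the splitting logic below are assumptions, not part of the released model), you can wrap `talk_with_llm` so it returns only the assistant's reply:

```python
def ask(question: str) -> str:
    # Hypothetical wrapper: build the prompt in the template shown above and
    # keep only the text generated after the final "<Assistant>" tag
    prompt = f"<User> {question} \n<Assistant>"
    full_response = talk_with_llm(prompt)
    return full_response.split("<Assistant>")[-1].strip()

print(ask("How are you today?"))
```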