szürkemarha-mistral v1
Ez az első (teszt) verziója egy magyar nyelvű instrukciókövető modellnek.
Használat
Ebben a repoban van egy app.py
script, ami egy gradio felületet csinál a kényelmesebb használathoz.
Vagy kódból valahogy így:
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
BASE_MODEL = "mistralai/Mistral-7B-v0.1"
LORA_WEIGHTS = "boapps/szurkemarha-mistral"
device = "cuda"
try:
if torch.backends.mps.is_available():
device = "mps"
except:
pass
nf4_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_use_double_quant=True,
bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, quantization_config=nf4_config)
model = PeftModel.from_pretrained(
model, LORA_WEIGHTS, torch_dtype=torch.float16, force_download=True
)
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Melyik megyében található az alábbi város?
### Input:
Pécs
### Response:"""
inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to(device)
generation_config = GenerationConfig(
temperature=0.1,
top_p=0.75,
top_k=40,
num_beams=4,
)
with torch.no_grad():
generation_output = model.generate(
input_ids=input_ids,
generation_config=generation_config,
return_dict_in_generate=True,
output_scores=True,
max_new_tokens=256,
)
s = generation_output.sequences[0]
output = tokenizer.decode(s)
print(output.split("### Response:")[1].strip())
- Downloads last month
- 7
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.