llama-2-7b-chat-MEDS-12

This is a llama-2-7b-chat-hf model fine-tuned with QLoRA (4-bit precision) on the s200862/medical_qa_meds dataset, an adaptation of medalpaca/medical_meadow_wikidoc_patient_information reformatted to match Llama-2's instruction format.
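
For context, Llama-2's single-turn chat format wraps each question in [INST] tags and places the answer after the closing tag. The sketch below illustrates that conversion; the helper name and the sample answer text are illustrative, not taken from the dataset:

# Hypothetical helper showing the Llama-2 single-turn instruction format.
def to_llama2_example(question: str, answer: str) -> str:
    # <s> ... </s> delimit one training sequence; [INST] ... [/INST] wrap the user turn.
    return f"<s>[INST] {question} [/INST] {answer} </s>"

print(to_llama2_example(
    "What causes Allergy?",
    "An allergy is an immune response to a substance that is harmless to most people.",
))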

🔧 Training

It was trained on-premises in a Jupyter notebook on a single NVIDIA RTX A4000 GPU with 16 GB of VRAM and 16 GB of system RAM.
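
The exact training hyperparameters are not published here. For readers who want to reproduce a comparable setup, the sketch below shows a typical QLoRA configuration with bitsandbytes and peft that fits a 7B model into 16 GB of VRAM; every value in it (rank, alpha, target modules) is an assumption, not the configuration used for this model.

# pip install transformers peft bitsandbytes accelerate
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base model in 4-bit NF4 so it fits in 16 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; only these are updated during fine-tuning.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed target layers
    task_type="CAUSAL_LM",
))
model.print_trainable_parameters()

Training itself can then proceed with any standard causal-LM trainer over the instruction-formatted dataset.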

💻 Usage

The model is intended to answer medical questions.

# pip install transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "s200862/llama-2-7b-chat-MEDS-12"
prompt = "What causes Allergy?"

# Load the tokenizer and build a text-generation pipeline in fp16,
# letting accelerate place the model across available devices.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the question in Llama-2's [INST] instruction format and sample a completion.
sequences = pipeline(
    f'<s>[INST] {prompt} [/INST]',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")