|
--- |
|
title: web-md-llama2-7b-3000 |
|
tags: |
|
- healthcare |
|
- NLP |
|
- dialogues |
|
- LLM |
|
- fine-tuned |
|
license: unknown |
|
datasets: |
|
- Kabatubare/medical-guanaco-3000 |
|
--- |
|
|
|
# Medical3000 Model Card |
|
|
|
This is the model card for web-md-llama2-7b-3000, a fine-tuned version of Llama-2-7B aimed specifically at medical dialogues.
|
|
|
Covered areas:

- **General Medicine**: Basic medical advice, symptoms, and general treatments.

- **Cardiology**: Questions related to heart disease and blood circulation.

- **Neurology**: Topics around brain health and neurological disorders.

- **Gastroenterology**: Issues related to the digestive system.

- **Oncology**: Questions about different types of cancer and their treatments.

- **Endocrinology**: Topics related to hormones, diabetes, and the thyroid.

- **Orthopedics**: Bone health and joint issues.

- **Pediatrics**: Child health, vaccinations, and growth and development.

- **Mental Health**: Depression, anxiety, stress, and other mental health issues.

- **Women's Health**: Pregnancy, menstrual health, and menopause.
|
|
|
## Model Details |
|
|
|
### Base Model |
|
|
|
- **Name**: Llama-2-7B |
|
|
|
### Fine-tuned Model |
|
|
|
- **Name**: web-md-llama2-7b-3000 |
|
- **Fine-tuned on**: Kabatubare/medical-guanaco-3000 |
|
- **Description**: This model is fine-tuned to specialize in medical dialogues and healthcare applications. |
|
|
|
### Architecture and Training Parameters |
|
|
|
#### Architecture |
|
|
|
- **LoRA Attention Dimension**: 64 |
|
- **LoRA Alpha Parameter**: 16 |
|
- **LoRA Dropout**: 0.1 |
|
- **Precision**: 4-bit (bitsandbytes) |
|
- **Quantization Type**: nf4 |
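
For reference, these settings map onto a PEFT `LoraConfig` and a bitsandbytes `BitsAndBytesConfig` roughly as follows. This is a minimal sketch: the compute dtype and task type are assumptions, as the card does not state them.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization via bitsandbytes, matching the precision settings above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: not stated in the card
)

# LoRA adapter configuration matching the dimensions listed above.
lora_config = LoraConfig(
    r=64,              # LoRA attention dimension
    lora_alpha=16,     # LoRA alpha parameter
    lora_dropout=0.1,  # LoRA dropout
    task_type="CAUSAL_LM",  # assumption: causal LM fine-tuning
)
```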
|
|
|
#### Training Parameters |
|
|
|
- **Epochs**: 3 |
|
- **Batch Size**: 4 |
|
- **Gradient Accumulation Steps**: 1 |
|
- **Max Gradient Norm**: 0.3 |
|
- **Learning Rate**: 3e-4 |
|
- **Weight Decay**: 0.001 |
|
- **Optimizer**: paged_adamw_32bit |
|
- **LR Scheduler**: cosine |
|
- **Warmup Ratio**: 0.03 |
|
- **Logging Steps**: 25 |
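
In `transformers`, these hyperparameters correspond to a `TrainingArguments` configuration along these lines (a sketch; the output directory is a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    max_grad_norm=0.3,
    learning_rate=3e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    logging_steps=25,
)
```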
|
|
|
## Datasets |
|
|
|
|
|
|
### Fine-tuning Dataset |
|
|
|
- **Name**: Kabatubare/medical-guanaco-3000 |
|
- **Description**: A reduced and balanced dataset curated from a larger medical dialogue corpus derived from 24,000 WebMD question-and-answer sessions. It covers a broad range of medical topics and is suitable for training healthcare chatbots and for medical NLP research.
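
The dataset can be pulled directly from the Hub with the `datasets` library. A minimal sketch follows; the split and column layout are not documented here, so inspect the returned object:

```python
from datasets import load_dataset

# Download the fine-tuning dataset from the Hugging Face Hub.
dataset = load_dataset("Kabatubare/medical-guanaco-3000")
print(dataset)  # inspect available splits and columns
```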
|
|
|
## Usage |
|
|
|
A minimal example with `transformers` (the Hub repository id below is an assumption based on the model and author names above; substitute the actual path):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub path; replace with the model's actual repository id.
model_id = "Kabatubare/web-md-llama2-7b-3000"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Use the model for inference
prompt = "What are common symptoms of iron deficiency?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```