---
language:
- en
library_name: peft
pipeline_tag: text-generation
tags:
- medical
license: cc-by-nc-4.0
---

# MedFalcon v2 40b LoRA - Final

![img.png](img.png)

## Model Description

This is a model release at `1 epoch`. For evaluation use only! Limitations:
* Do not use this model to treat patients! Treat AI content as if you wrote it!

### Architecture
`nmitchko/medfalcon-v2-40b-lora` is a LoRA adapter for a large language model, fine-tuned specifically for medical domain tasks.
It is based on [`Falcon-40b`](https://huggingface.co/tiiuae/falcon-40b) at 40 billion parameters.

The primary goal of this model is to improve question-answering and medical dialogue tasks.
It was trained using [LoRA](https://arxiv.org/abs/2106.09685), specifically [QLoRA](https://github.com/artidoro/qlora), to reduce memory footprint.

See Training Parameters for more info. This LoRA supports 4-bit and 8-bit modes.

### Requirements

```
bitsandbytes>=0.39.0
peft
transformers
accelerate
```
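
Optionally, a quick environment sanity check (a minimal sketch, not part of the original card) that the required libraries are importable and recent enough:

```python
# Optional sanity check: confirm the libraries import and that
# bitsandbytes meets the >=0.39.0 minimum needed for 4-bit support.
import bitsandbytes
import peft
import transformers

print(bitsandbytes.__version__)  # expect >= 0.39.0
print(peft.__version__)
print(transformers.__version__)
```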

Steps to load this model:
1. Load the base model using `transformers`
2. Apply the LoRA adapter using `peft`

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch
from peft import PeftModel

# Base model and LoRA adapter repositories on the Hugging Face Hub
model_id = "tiiuae/falcon-40b"
LoRA = "nmitchko/medfalcon-v2-40b-lora"

# If you want 8-bit (or 4-bit) loading, set the appropriate flag
load_8bit = True

tokenizer = AutoTokenizer.from_pretrained(model_id)

# 8-bit quantization requires a device map to place layers on GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=load_8bit,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)

# Apply the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(model, LoRA)

# The model is already instantiated and placed on devices,
# so the pipeline only needs the model and tokenizer
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

sequences = pipeline(
    "What does the drug ceftriaxone do?\nDoctor:",
    max_length=200,
    do_sample=True,
    top_k=40,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
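
The card notes that this LoRA also supports 4-bit mode. Below is a minimal 4-bit loading sketch, assuming `transformers`' `BitsAndBytesConfig` (available alongside `bitsandbytes>=0.39.0`); the NF4 settings shown are common QLoRA defaults, not values confirmed by this card:

```python
# Minimal 4-bit loading sketch. NOTE: the NF4 quantization settings
# below are common QLoRA defaults, not values stated in this card.
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

model_id = "tiiuae/falcon-40b"
LoRA = "nmitchko/medfalcon-v2-40b-lora"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NF4, as used by QLoRA
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype for 4-bit layers
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, LoRA)
```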

## Training Parameters

The model was trained for 1 epoch on a custom, unreleased dataset named `medconcat`.
`medconcat` contains only human-generated content and weighs in at over 100 MiB of raw text.

| Item          | Amount | Units |
|---------------|--------|-------|
| LoRA Rank     | 64     | ~     |
| LoRA Alpha    | 16     | ~     |
| Learning Rate | 1e-4   | ~     |
| Dropout       | 5      | %     |
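
For reference, the table above maps onto a `peft` `LoraConfig` roughly as sketched below. The learning rate belongs to the optimizer rather than the LoRA config, and `target_modules` (Falcon's fused `query_key_value` projection, a common choice for Falcon QLoRA runs) is an assumption not stated in this card:

```python
# Sketch of a peft LoraConfig matching the table above.
# NOTE: target_modules is an assumption (Falcon's fused attention
# projection is commonly targeted); it is not stated in this card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                                # LoRA rank, from the table
    lora_alpha=16,                       # LoRA alpha, from the table
    lora_dropout=0.05,                   # 5% dropout, from the table
    target_modules=["query_key_value"],  # assumption, see note above
    bias="none",
    task_type="CAUSAL_LM",
)
```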