Update README.md
README.md
CHANGED
@@ -9,24 +9,33 @@ pipeline_tag: text-generation
|
|
9 |
base_model: Intel/neural-chat-7b-v3-1
|
10 |
---
|
11 |
|
12 |
-
13 |
|
14 |
-
<!-- Provide a quick summary of what the model is/does. -->
|
15 |
|
16 |
|
17 |
-
|
18 |
-
## Model Details
|
19 |
-
|
20 |
### Model Description
|
21 |
|
22 |
<!-- Provide a longer summary of what this model is. -->
|
23 |
|
24 |
-
|
25 |
-
|
26 |
-
- **Developed by:** [More Information Needed]
|
27 |
-
- **Funded by [optional]:** [More Information Needed]
|
28 |
-
- **Shared by [optional]:** [More Information Needed]
|
29 |
-
- **Model type:** [More Information Needed]
|
30 |
- **Language(s) (NLP):** [English]
|
31 |
- **License:** [CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/)
|
32 |
- **Finetuned from model [optional]:** [Intel/neural-chat-7b-v3-1](https://huggingface.co/Intel/neural-chat-7b-v3-1)
|
@@ -35,9 +44,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
|
|
35 |
|
36 |
<!-- Provide the basic links for the model. -->
|
37 |
|
38 |
-
- **Repository:** [More Information Needed]
|
39 |
- **Paper [optional]:** [InMD-X](http://arxiv.org/abs/2402.11883)
|
40 |
-
- **Demo [optional]:** [More Information Needed]
|
41 |
|
42 |
## Uses
|
43 |
```python
|
@@ -53,146 +60,86 @@ tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
|
|
53 |
# Load the Lora model
|
54 |
model = PeftModel.from_pretrained(model, peft_model_id)
|
55 |
|
56 |
-
|
57 |
-
58 |
```
|
59 |
-
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
|
60 |
-
|
61 |
-
### Experimental setup
|
62 |
-
- **Ubuntu 22.04.3 LTS**
|
63 |
-
- **GPU - NVIDIA A100(40GB)**
|
64 |
-
- **Python**: 3.10.12
|
65 |
-
- **Pytorch**:2.1.1+cu118
|
66 |
-
- **Transformer**:4.37.0.dev0
|
67 |
-
|
68 |
-
|
69 |
-
## How to Get Started with the Model
|
70 |
-
|
71 |
-
Use the code below to get started with the model.
|
72 |
-
|
73 |
-
[More Information Needed]
|
74 |
-
|
75 |
-
## Training Details
|
76 |
-
|
77 |
-
### Training Data
|
78 |
-
|
79 |
-
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
|
80 |
-
|
81 |
-
[More Information Needed]
|
82 |
-
|
83 |
-
### Training Procedure
|
84 |
-
|
85 |
-
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
|
86 |
-
|
87 |
-
#### Preprocessing [optional]
|
88 |
-
|
89 |
-
[More Information Needed]
|
90 |
-
|
91 |
-
|
92 |
-
#### Training Hyperparameters
|
93 |
-
|
94 |
-
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
|
95 |
-
|
96 |
-
#### Speeds, Sizes, Times [optional]
|
97 |
-
|
98 |
-
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
|
99 |
-
|
100 |
-
[More Information Needed]
|
101 |
-
|
102 |
-
## Evaluation
|
103 |
-
|
104 |
-
<!-- This section describes the evaluation protocols and provides the results. -->
|
105 |
-
|
106 |
-
### Testing Data, Factors & Metrics
|
107 |
|
108 |
-
#### Testing Data
|
109 |
|
110 |
-
|
|
|
111 |
|
112 |
-
|
|
|
|
|
|
|
|
|
|
|
113 |
|
114 |
-
#### Factors
|
115 |
|
116 |
-
|
|
|
117 |
|
118 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
119 |
|
120 |
-
#### Metrics
|
121 |
|
122 |
-
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
|
123 |
|
124 |
-
|
125 |
-
|
126 |
-
### Results
|
127 |
-
|
128 |
-
[More Information Needed]
|
129 |
-
|
130 |
-
#### Summary
|
131 |
-
|
132 |
-
|
133 |
-
|
134 |
-
## Model Examination [optional]
|
135 |
-
|
136 |
-
<!-- Relevant interpretability work for the model goes here -->
|
137 |
-
|
138 |
-
[More Information Needed]
|
139 |
-
|
140 |
-
## Environmental Impact
|
141 |
-
|
142 |
-
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
|
143 |
-
|
144 |
-
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
|
145 |
-
|
146 |
-
- **Hardware Type:** [More Information Needed]
|
147 |
-
- **Hours used:** [More Information Needed]
|
148 |
-
- **Cloud Provider:** [More Information Needed]
|
149 |
-
- **Compute Region:** [More Information Needed]
|
150 |
-
- **Carbon Emitted:** [More Information Needed]
|
151 |
-
|
152 |
-
## Technical Specifications [optional]
|
153 |
-
|
154 |
-
### Model Architecture and Objective
|
155 |
-
|
156 |
-
[More Information Needed]
|
157 |
-
|
158 |
-
### Compute Infrastructure
|
159 |
|
160 |
-
|
|
|
|
|
|
|
|
|
|
|
161 |
|
162 |
-
#### Hardware
|
163 |
|
164 |
-
[More Information Needed]
|
165 |
|
166 |
-
|
167 |
|
168 |
-
|
|
|
169 |
|
170 |
-
##
|
|
|
171 |
|
172 |
-
<!--
|
173 |
|
174 |
**BibTeX:**
|
|
|
175 |
|
176 |
-
[More Information Needed]
|
177 |
-
|
178 |
-
**APA:**
|
179 |
-
|
180 |
-
[More Information Needed]
|
181 |
-
|
182 |
-
## Glossary [optional]
|
183 |
-
|
184 |
-
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
|
185 |
-
|
186 |
-
[More Information Needed]
|
187 |
-
|
188 |
-
## More Information [optional]
|
189 |
-
|
190 |
-
[More Information Needed]
|
191 |
-
|
192 |
-
## Model Card Authors [optional]
|
193 |
-
|
194 |
-
[More Information Needed]
|
195 |
-
|
196 |
-
## Model Card Contact
|
197 |
|
198 |
-
|
|
|
|
9 |
base_model: Intel/neural-chat-7b-v3-1
|
10 |
---
|
11 |
|
12 |
## InMD-X: Large Language Models for Internal Medicine Doctors

We introduce InMD-X, a collection of multiple large language models specifically designed to cater to the unique characteristics and demands of Internal Medicine Doctors (IMD). InMD-X represents a groundbreaking development in natural language processing, offering a suite of language models fine-tuned for various aspects of the internal medicine field. These models encompass a wide range of medical sub-specialties, enabling IMDs to perform more efficient and accurate research, diagnosis, and documentation. InMD-X’s versatility and adaptability make it a valuable tool for improving the healthcare industry, enhancing communication between healthcare professionals, and advancing medical research. Each model within InMD-X is meticulously tailored to address specific challenges faced by IMDs, ensuring the highest level of precision and comprehensiveness in clinical text analysis and decision support.

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Model type:** [CausalLM]
- **Language(s) (NLP):** [English]
- **License:** [CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/)
- **Finetuned from model [optional]:** [Intel/neural-chat-7b-v3-1](https://huggingface.co/Intel/neural-chat-7b-v3-1)
|
|
|
<!-- Provide the basic links for the model. -->

- **Paper [optional]:** [InMD-X](http://arxiv.org/abs/2402.11883)

## Uses
|
50 |
```python
# ... (earlier lines of this block are not shown in the diff hunk; they are
# expected to define `peft_model_id`, the base `model`, and `tokenizer`)
import transformers  # provides transformers.pipeline, used below

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",  # if you have a GPU
)


def inference(pipeline, question, answer_only=False):
    """Generate a one-sentence answer to `question` (`answer_only` is currently unused)."""
    sequences = pipeline(
        "Answer the next question in one sentence.\n" + question,
        do_sample=True,
        top_k=10,
        top_p=0.9,
        temperature=0.2,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=500,  # increase to allow longer sequences
    )

    answers = []
    for seq in sequences:
        # Keep only the text generated after the question and drop newlines
        answer = seq["generated_text"].split(question)[-1].replace("\n", "")
        answers.append(answer)
    return answers


question = "What is the association between long-term beta-blocker use after myocardial infarction (MI) and the risk of reinfarction and death?"
answers = inference(pipeline, question)
print(answers)
```
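
The post-processing step inside `inference` can be checked without loading the model. The sketch below is only an illustration: `extract_answer` and the mock pipeline output are hypothetical names and data introduced here, not part of the original card. It applies the same `split(...)[-1].replace("\n", "")` logic to a hand-written string.

```python
def extract_answer(generated_text: str, question: str) -> str:
    """Return the text after the last occurrence of `question`, newlines removed."""
    return generated_text.split(question)[-1].replace("\n", "")


# Hypothetical pipeline output: a list of dicts with a "generated_text" key,
# where the generated text echoes the prompt before the answer.
prompt_prefix = "Answer the next question in one sentence.\n"
question = "What does LoRA stand for?"
mock_sequences = [
    {"generated_text": prompt_prefix + question + "\nLow-Rank Adaptation."},
]

answers = [extract_answer(seq["generated_text"], question) for seq in mock_sequences]
print(answers)  # ['Low-Rank Adaptation.']
```

Two caveats follow from this logic: if the question is not echoed verbatim in the output, the whole generated text (prompt included) is returned, and removing newlines joins multi-line answers without inserting spaces.
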
### List of LoRA config
Based on [Parameter-Efficient Fine-Tuning (PEFT)](https://github.com/huggingface/peft)

Parameter | PT | SFT
:------:| :------:| :------:
r | 8 | 8
lora alpha | 32 | 32
lora dropout | 0.05 | 0.05
target | q, k, v, o, up, down, gate | q, k, v, o, up, down, gate
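
To get a feel for what these settings imply, the sketch below estimates the number of trainable LoRA weights. It assumes the base model uses Mistral-7B-style dimensions (hidden size 4096, MLP size 14336, key/value width 1024, 32 layers); these numbers are assumptions and are not stated in the card. Only `r` and `lora alpha` come from the table.

```python
# Assumed Mistral-7B-style dimensions (not stated in the model card).
HIDDEN = 4096          # hidden size
INTERMEDIATE = 14336   # MLP intermediate size
KV_DIM = 1024          # key/value projection width (8 KV heads x 128)
NUM_LAYERS = 32

R = 8            # LoRA rank, from the table
LORA_ALPHA = 32  # LoRA alpha, from the table

# (in_features, out_features) of each targeted linear layer.
TARGETS = {
    "q": (HIDDEN, HIDDEN),
    "k": (HIDDEN, KV_DIM),
    "v": (HIDDEN, KV_DIM),
    "o": (HIDDEN, HIDDEN),
    "gate": (HIDDEN, INTERMEDIATE),
    "up": (HIDDEN, INTERMEDIATE),
    "down": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """LoRA adds A (r x in) and B (out x r), i.e. r * (in + out) weights."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update in standard LoRA

print(per_layer)  # 655360
print(total)      # 20971520
print(scaling)    # 4.0
```

Under these assumptions the adapters add roughly 21 million trainable weights, a small fraction of the 7B base model. Dropout (0.05) affects training only and adds no parameters.
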
### List of Training arguments
Based on [Transformer Reinforcement Learning (TRL)](https://github.com/huggingface/trl)

Parameter | PT | SFT
:------:| :------:| :------:
train epochs | 3 | 1
per device train batch size | 1 | 1
optimizer | adamw_hf | adamw_hf
evaluation strategy | no | no
learning rate | 1e-4 | 1e-4
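
As a rough illustration of what these arguments mean in practice, the sketch below stores the table values under the keyword names used by `transformers.TrainingArguments` (the mapping is an assumption) and estimates the number of optimizer steps. The dataset size and the helper `optimizer_steps` are hypothetical, and gradient accumulation is assumed to be 1.

```python
import math

# Table values keyed by `transformers.TrainingArguments` keyword names
# (the mapping from table rows to keywords is an assumption).
TRAINING_ARGS = {
    "PT": {
        "num_train_epochs": 3,
        "per_device_train_batch_size": 1,
        "optim": "adamw_hf",
        "evaluation_strategy": "no",
        "learning_rate": 1e-4,
    },
    "SFT": {
        "num_train_epochs": 1,
        "per_device_train_batch_size": 1,
        "optim": "adamw_hf",
        "evaluation_strategy": "no",
        "learning_rate": 1e-4,
    },
}


def optimizer_steps(num_examples: int, args: dict, num_devices: int = 1) -> int:
    """Approximate optimizer steps, assuming no gradient accumulation."""
    batch = args["per_device_train_batch_size"] * num_devices
    steps_per_epoch = math.ceil(num_examples / batch)
    return steps_per_epoch * args["num_train_epochs"]


# Hypothetical dataset of 1,000 examples on a single GPU.
print(optimizer_steps(1000, TRAINING_ARGS["PT"]))   # 3000
print(optimizer_steps(1000, TRAINING_ARGS["SFT"]))  # 1000
```

With a per-device batch size of 1, every example triggers one optimizer step per epoch, so the three-epoch PT stage takes three times as many steps as the one-epoch SFT stage on a dataset of the same size.
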
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Experimental setup
- **OS**: Ubuntu 22.04.3 LTS
- **GPU**: NVIDIA A100 (40GB)
- **Python**: 3.10.12
- **PyTorch**: 2.1.1+cu118
- **Transformers**: 4.37.0.dev0
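
To compare a local environment against the versions listed above, a small standard-library check along these lines may help. It is a sketch: the helper names are hypothetical, and it assumes the packages are installed under the distribution names `torch` and `transformers`.

```python
import platform
from importlib import metadata

# Versions listed in the experimental setup.
EXPECTED_PYTHON = "3.10.12"
EXPECTED = {"torch": "2.1.1+cu118", "transformers": "4.37.0.dev0"}


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report() -> dict:
    """Map each component to an (expected, found) pair."""
    report = {"python": (EXPECTED_PYTHON, platform.python_version())}
    for package, expected in EXPECTED.items():
        report[package] = (expected, installed_version(package))
    return report


for name, (expected, found) in environment_report().items():
    status = "ok" if found == expected else "differs"
    print(f"{name}: expected {expected}, found {found} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the listed versions only document the setup the authors report.
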
## Limitations

InMD-X is a collection of separate, segmented models. These models have not yet been integrated, so each one currently stands alone.
Because no suitable benchmarks exist yet, the segmented models have not been adequately evaluated. Future work will develop new benchmarks and integrate the models to enable an objective evaluation.

## Non-commercial use
These models are available exclusively for research purposes and are not intended for commercial use.

<!-- ## Citation

**BibTeX:**
-->

## INMED DATA
INMED DATA is developing large language models (LLMs) specifically tailored for medical applications. For more information, please visit our website [TBD].