Update README.md
# Model Details

Model Developers: Sogang University SGEconFinlab (<https://sc.sogang.ac.kr/aifinlab/>)

### Model Description
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model.eval()

import re
def gen(x):
    inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)
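The snippet imports `re`, presumably to strip the prompt template from the generated text before returning the answer. A minimal, self-contained sketch of that post-processing step (the helper name `extract_answer` is an assumption, not part of the original code; in the prompt template, `### 질문:` means `### Question:` and `### 답변:` means `### Answer:`):

```python
import re

def extract_answer(generated: str) -> str:
    # Return only the text after the answer marker '### 답변:' ('### Answer:').
    # If the marker is absent, fall back to the full string.
    match = re.search(r"### 답변:\s*(.*)", generated, flags=re.S)
    return match.group(1).strip() if match else generated.strip()

# Example with a string shaped like the model's prompt template.
sample = "### 질문: 물가란 무엇인가요?\n\n### 답변: 여러 상품과 서비스 가격의 평균적인 수준입니다."
print(extract_answer(sample))
```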
## Training Details

First, we loaded the base model quantized to 4 bits. Quantization sharply reduces the memory needed to store the model's weights and intermediate computation results, which makes the model practical to train and deploy in memory-constrained environments and can also yield faster inference.

Then, we fine-tuned the quantized model by training LoRA adapters on top of it, using the hyperparameters listed below.
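The memory saving from 4-bit loading can be illustrated with a back-of-the-envelope calculation (a toy illustration, not the training code; the parameter count assumes the base model's 10.7B parameters and compares fp16 against 4-bit storage):

```python
# Rough memory footprint of model weights at different precisions.
PARAMS = 10_700_000_000  # 10.7B parameters

def gigabytes(n_params: int, bits_per_param: float) -> float:
    # bits -> bytes (/8) -> gigabytes (/1e9)
    return n_params * bits_per_param / 8 / 1e9

fp16_gb = gigabytes(PARAMS, 16)  # half-precision weights
int4_gb = gigabytes(PARAMS, 4)   # 4-bit quantized weights

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.2f} GB")
```

This ignores activations, optimizer state, and quantization metadata, but the 4x reduction in weight storage is the point.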
### Training Data

1. 한국은행: 경제금융용어 700선 (Bank of Korea: 700 Selected Economic and Financial Terms) (<https://www.bok.or.kr/portal/bbs/B0000249/view.do?nttId=235017&menuNo=200765>)
#### Training Hyperparameters

|Hyperparameter|SGEcon/KoSOLAR-10.7B-v0.2_fin_v4|
|---|---|
|LoRA method|LoRA|
|load in 4 bit|True|
|learning rate|1e-5|
|lr scheduler|linear|
|lora alpha|16|
|lora rank|16|
|lora dropout|0.05|
|optim|paged_adamw_32bit|
|target_modules|"q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"|
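The table maps directly onto a `peft`/`transformers` configuration. A minimal sketch of that mapping, assuming the standard `LoraConfig`, `BitsAndBytesConfig`, and `TrainingArguments` APIs (this is our reading of the table, not the exact training script; `output_dir` is a placeholder):

```python
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# |load in 4 bit|True|
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

lora_config = LoraConfig(
    r=16,                # |lora rank|16|
    lora_alpha=16,       # |lora alpha|16|
    lora_dropout=0.05,   # |lora dropout|0.05|
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj", "lm_head"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="outputs",          # placeholder
    learning_rate=1e-5,            # |learning rate|1e-5|
    lr_scheduler_type="linear",    # |lr scheduler|linear|
    optim="paged_adamw_32bit",     # |optim|paged_adamw_32bit|
)
```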
## Evaluation