SGEcon committed on
Commit 63bab77 · verified · 1 Parent(s): 758d33d

Update README.md

Files changed (1)
  1. README.md +16 -10
README.md CHANGED
@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 
 
 # Model Details
- Model Developers: Sogang University SGEconFinlab
+ Model Developers: Sogang University SGEconFinlab (<https://sc.sogang.ac.kr/aifinlab/>)
 
 
 ### Model Description
@@ -37,7 +37,8 @@ If you wish to use the original data rather than our training data, please conta
 tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
 model.eval()
 
- -------
+
+ -----
 import re
 def gen(x):
     inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)
@@ -78,8 +79,10 @@ If you wish to use the original data rather than our training data, please conta
 
 
 ## Training Details
+ First, we loaded the base model quantized to 4 bits. Quantization greatly reduces the memory needed to store the model's weights and intermediate computation results, which helps when deploying the model in memory-constrained environments, and it can also speed up inference.
+ Then, we fine-tuned the quantized model with LoRA using the hyperparameters listed below.
 
-
+
 ### Training Data
 
 1. 한국은행: 경제금융용어 700선(<https://www.bok.or.kr/portal/bbs/B0000249/view.do?nttId=235017&menuNo=200765>)
@@ -98,13 +101,16 @@ If you wish to use the original data rather than our training data, please conta
 
 #### Training Hyperparameters
 
- - Lora
- 1. r=16,
- lora_alpha=16,
- target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"], # this is different by models
- lora_dropout=0.05,
- bias="none",
- task_type="CAUSAL_LM"
+ |Hyperparameter|SGEcon/KoSOLAR-10.7B-v0.2_fin_v4|
+ |---|---|
+ |LoRA method|LoRA|
+ |load in 4 bit|True|
+ |learning rate|1e-5|
+ |lr scheduler|linear|
+ |lora alpha|16|
+ |lora rank|16|
+ |lora dropout|0.05|
+ |optim|paged_adamw_32bit|
+ |target_modules|"q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"|
 
 ## Evaluation
 
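The Training Details paragraph and the hyperparameter table added above describe a 4-bit quantized load followed by LoRA fine-tuning. The sketch below shows one way to wire those settings together; only the values that appear in the diff (4-bit loading, LoRA rank/alpha/dropout, bias, task type, target modules, learning rate, linear scheduler, paged_adamw_32bit) are taken from the README, while the base checkpoint name, quantization dtype, batch size, and epoch count are assumptions rather than the authors' actual training script.

```python
# A minimal sketch, not the authors' script: 4-bit quantized load plus LoRA
# fine-tuning configured with the values from the hyperparameter table above.
# Assumed: base checkpoint name, nf4/bfloat16 quantization settings, batch size, epochs.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "yanolja/KoSOLAR-10.7B-v0.2"  # assumed base checkpoint

# 4-bit load, as described in the Training Details paragraph.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # "load in 4 bit: True" in the table
    bnb_4bit_quant_type="nf4",              # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA settings from the table and the previous README revision.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj", "lm_head"],
)
model = get_peft_model(model, lora_config)

# Optimizer and scheduler from the table; batch size and epochs are assumptions.
# These arguments would be passed to a Trainer (or trl's SFTTrainer) together
# with the instruction-formatted training data.
training_args = TrainingArguments(
    output_dir="kosolar-fin-lora",
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    logging_steps=10,
)
```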
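The second hunk also shows the first lines of a `gen()` helper used for inference. Below is a minimal, self-contained reconstruction under stated assumptions: the README's own snippet loads the model through a PEFT config (`config.base_model_name_or_path`), whereas this sketch loads the repository model directly by name, and the generation settings and answer-extraction regex are guesses rather than the authors' code.

```python
# A hedged reconstruction of the gen() helper shown partially in the diff.
# The prompt format ("### 질문:" / "### 답변:", i.e. question/answer) follows
# the README; the generation settings and answer extraction are assumptions.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SGEcon/KoSOLAR-10.7B-v0.2_fin_v4"  # repository named in the table header
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

def gen(x):
    # Build the question prompt in the format the model was trained on.
    prompt = f"### 질문: {x}\n\n### 답변:"
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=256,               # assumed generation length
            do_sample=True,
            temperature=0.7,                  # assumed sampling settings
            pad_token_id=tokenizer.eos_token_id,
        )
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    # Keep only the answer portion after the "### 답변:" marker.
    match = re.search(r"### 답변:\s*(.*)", text, flags=re.DOTALL)
    return match.group(1).strip() if match else text

print(gen("통화정책이란 무엇인가요?"))  # "What is monetary policy?"
```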