SGEcon committed on
Commit 63bab77 · verified · 1 Parent(s): 758d33d

Update README.md

Files changed (1)
  1. README.md +16 -10
README.md CHANGED
@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 
 
 # Model Details
- Model Developers: Sogang University SGEconFinlab
+ Model Developers: Sogang University SGEconFinlab (<https://sc.sogang.ac.kr/aifinlab/>)
 
 
 ### Model Description
@@ -37,7 +37,8 @@ If you wish to use the original data rather than our training data, please conta
 tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
 model.eval()
 
- -------
+
+ -----
 import re
 def gen(x):
     inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)
@@ -78,8 +79,10 @@ If you wish to use the original data rather than our training data, please conta
 
 
 ## Training Details
+ First, we loaded the base model quantized to 4 bits. Quantization greatly reduces the memory needed to store the model's weights and intermediate computation results, which helps when deploying the model in memory-constrained environments, and it can also speed up inference.
+ Then, we fine-tuned the quantized model with LoRA using the hyperparameters listed below.
 
-
+
 ### Training Data
 
 1. 한국은행: 경제금융용어 700선(<https://www.bok.or.kr/portal/bbs/B0000249/view.do?nttId=235017&menuNo=200765>)
@@ -98,13 +101,16 @@ If you wish to use the original data rather than our training data, please conta
 
 #### Training Hyperparameters
 
- - Lora
- 1. r=16,
- lora_alpha=16,
- target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"], # this is different by models
- lora_dropout=0.05,
- bias="none",
- task_type="CAUSAL_LM"
+ |Hyperparameter|SGEcon/KoSOLAR-10.7B-v0.2_fin_v4|
+ |---|---|
+ |LoRA method|LoRA|
+ |load in 4 bit|True|
+ |learning rate|1e-5|
+ |lr scheduler|linear|
+ |lora alpha|16|
+ |lora rank|16|
+ |lora dropout|0.05|
+ |optim|paged_adamw_32bit|
+ |target_modules|"q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"|
 
 ## Evaluation
 
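The Training Details paragraph and the hyperparameter table added above describe a 4-bit quantized load followed by LoRA fine-tuning. The sketch below shows one way to wire those settings together; only the values that appear in the diff (4-bit loading, LoRA rank/alpha/dropout, bias, task type, target modules, learning rate, linear scheduler, paged_adamw_32bit) are taken from the README, while the base checkpoint name, quantization dtype, batch size, and epoch count are assumptions rather than the authors' actual training script.

```python
# A minimal sketch, not the authors' script: 4-bit quantized load plus LoRA
# fine-tuning configured with the values from the hyperparameter table above.
# Assumed: base checkpoint name, nf4/bfloat16 quantization settings, batch size, epochs.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "yanolja/KoSOLAR-10.7B-v0.2"  # assumed base checkpoint

# 4-bit load, as described in the Training Details paragraph.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # "load in 4 bit: True" in the table
    bnb_4bit_quant_type="nf4",              # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA settings from the table and the previous README revision.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj", "lm_head"],
)
model = get_peft_model(model, lora_config)

# Optimizer and scheduler from the table; batch size and epochs are assumptions.
# These arguments would be passed to a Trainer (or trl's SFTTrainer) together
# with the instruction-formatted training data.
training_args = TrainingArguments(
    output_dir="kosolar-fin-lora",
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    logging_steps=10,
)
```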
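The second hunk also shows the first lines of a `gen()` helper used for inference. Below is a minimal, self-contained reconstruction under stated assumptions: the README's own snippet loads the model through a PEFT config (`config.base_model_name_or_path`), whereas this sketch loads the repository model directly by name, and the generation settings and answer-extraction regex are guesses rather than the authors' code.

```python
# A hedged reconstruction of the gen() helper shown partially in the diff.
# The prompt format ("### 질문:" / "### 답변:", i.e. question/answer) follows
# the README; the generation settings and answer extraction are assumptions.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SGEcon/KoSOLAR-10.7B-v0.2_fin_v4"  # repository named in the table header
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

def gen(x):
    # Build the question prompt in the format the model was trained on.
    prompt = f"### 질문: {x}\n\n### 답변:"
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=256,               # assumed generation length
            do_sample=True,
            temperature=0.7,                  # assumed sampling settings
            pad_token_id=tokenizer.eos_token_id,
        )
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    # Keep only the answer portion after the "### 답변:" marker.
    match = re.search(r"### 답변:\s*(.*)", text, flags=re.DOTALL)
    return match.group(1).strip() if match else text

print(gen("통화정책이란 무엇인가요?"))  # "What is monetary policy?"
```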