---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---
## Model Details

**Model Developers:** Sogang University SGEconFinlab

### Model Description
This model is a language model specialized in economics and finance, fine-tuned on a range of economics- and finance-related data. The data sources are listed below. We are not releasing the training data itself, as it was collected for research and policy purposes; if you wish to use the original data rather than our processed training data, please contact the respective original authors directly for permission.
- **Developed by:** Sogang University SGEconFinlab (https://sc.sogang.ac.kr/aifinlab/)
- **Language(s) (NLP):** Korean, English
- **License:** apache-2.0
- **Base Model:** yanolja/KoSOLAR-10.7B-v0.2
## How to Get Started with the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

peft_model_id = "SGEcon/KoSOLAR-10.7B-v0.2_fin_v4"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model with 4-bit NF4 quantization so it fits on a single GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, quantization_config=bnb_config, device_map={"": 0})
# Attach the LoRA adapter to the quantized base model
model = PeftModel.from_pretrained(model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model.eval()
```
```python
def gen(x):
    # The adapter was trained on a "### 질문: ... / ### 답변: ..." prompt template
    inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)
    # Move inputs to the GPU if one is available
    inputs = {k: v.to(device="cuda" if torch.cuda.is_available() else "cpu") for k, v in inputs.items()}
    gened = model.generate(
        **inputs,
        max_new_tokens=256,
        num_return_sequences=4,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
        temperature=0.9,
        top_p=0.8,
        top_k=50
    )
    complete_answers = []
    for gen_seq in gened:
        decoded = tokenizer.decode(gen_seq, skip_special_tokens=True).strip()
        # Keep only the text after the first "### 답변:" marker
        first_answer_start_idx = decoded.find("### 답변:") + len("### 답변:")
        temp_answer = decoded[first_answer_start_idx:].strip()
        # If the model starts a second "### 답변:" turn, truncate there
        second_answer_start_idx = temp_answer.find("### 답변:")
        if second_answer_start_idx != -1:
            complete_answer = temp_answer[:second_answer_start_idx].strip()
        else:
            complete_answer = temp_answer  # no second marker, keep the whole answer
        complete_answers.append(complete_answer)
    return complete_answers
```
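A minimal usage sketch (the example question below is illustrative and not part of the original card):

```python
# Ask an economics question in Korean ("What is inflation?"); gen() returns
# the four sampled answers requested via num_return_sequences=4.
answers = gen("인플레이션이란 무엇인가요?")
for i, answer in enumerate(answers, 1):
    print(f"[{i}] {answer}")
```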
## Training Details

### Training Data
- Bank of Korea: 700 Economic and Financial Terms (https://www.bok.or.kr/portal/bbs/B0000249/view.do?nttId=235017&menuNo=200765)
- Financial Supervisory Service: Financial Terms Dictionary from the FINE financial consumer information portal (https://fine.fss.or.kr/fine/fnctip/fncDicary/list.do?menuNo=900021)
- KDI Economic Information Center: Current Affairs Glossary (https://eiec.kdi.re.kr/material/wordDic.do)
- The Korea Economic Daily / Hankyung.com: Hankyung Dictionary of Economic Terms (https://terms.naver.com/list.naver?cid=42107&categoryId=42107), Today's TESAT (https://www.tesat.or.kr/bbs.frm.list/tesat_study?s_cateno=1), Today's Junior TESAT (https://www.tesat.or.kr/bbs.frm.list/tesat_study?s_cateno=5), Saenggeul Saenggeul Hankyung (https://sgsg.hankyung.com/tesat/study)
- Ministry of SMEs and Startups / Government of the Republic of Korea: Ministry of SMEs and Startups specialized terminology (https://terms.naver.com/list.naver?cid=42103&categoryId=42103)
- Go Seong-sam / Beommun Publishing: Dictionary of Accounting and Tax Terms (https://terms.naver.com/list.naver?cid=51737&categoryId=51737)
- Word index from Mankiw's Principles of Economics, 8th edition
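Most of the sources above are term-definition dictionaries. The preprocessing pipeline is not published, but the inference code's prompt template suggests entries were rendered into a question/answer format; the sketch below is a hypothetical illustration of that step, where the question phrasing and the sample entry are our own assumptions:

```python
# Hypothetical rendering of a glossary entry into the "### 질문 / ### 답변"
# template used at inference time. The question phrasing and the sample entry
# are illustrative; the actual training-data preprocessing is not released.
def to_training_example(term: str, definition: str) -> str:
    return f"### 질문: {term}에 대해 설명해주세요.\n\n### 답변: {definition}"

print(to_training_example("기준금리", "중앙은행이 금융기관과 거래할 때 기준이 되는 정책 금리."))
```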
### Training Procedure

#### Training Hyperparameters
- LoRA
  - r = 16
  - lora_alpha = 16
  - target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"] (target module names differ between model architectures)
  - lora_dropout = 0.05
  - bias = "none"
  - task_type = "CAUSAL_LM"
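For reference, the values above map onto a peft `LoraConfig` roughly as follows (a sketch of how the listed hyperparameters fit the peft API; the full training script is not published):

```python
from peft import LoraConfig

# LoRA configuration assembled from the hyperparameters listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    # Target module names differ between model architectures.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj", "lm_head"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```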
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
[More Information Needed]
### Results
[More Information Needed]