Update README.md

758d33d verified about 1 year ago

4.87 kB

	---
	library_name: transformers
	license: apache-2.0
	pipeline_tag: text-generation
	---


	# Model Details
	Model Developers: Sogang University SGEconFinlab


	### Model Description

	This model is a language model specialized in economics and finance. This was learned with various economic/finance-related data.
	The data sources are listed below, and we are not releasing the data we trained on because it was used for research/policy purposes.
	If you wish to use the original data rather than our training data, please contact the original author directly for permission to use it.

	- Developed by: Sogang University SGEconFinlab(<https://sc.sogang.ac.kr/aifinlab/>)
	- Language(s) (NLP): Ko/En
	- License: apache-2.0
	- Base Model: yanolja/KoSOLAR-10.7B-v0.2


	## How to Get Started with the Model


	peft_model_id = "SGEcon/KoSOLAR-10.7B-v0.2_fin_v4"
	config = PeftConfig.from_pretrained(peft_model_id)
	bnb_config = BitsAndBytesConfig(
	load_in_4bit=True,
	bnb_4bit_use_double_quant=True,
	bnb_4bit_quant_type="nf4",
	bnb_4bit_compute_dtype=torch.bfloat16
	)
	model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, quantization_config=bnb_config, device_map={"":0})
	model = PeftModel.from_pretrained(model, peft_model_id)
	tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
	model.eval()

	-------
	import re
	def gen(x):
	inputs = tokenizer(f"### 질문: {x}\n\n### 답변:", return_tensors='pt', return_token_type_ids=False)

	# Move data to GPU (if available)
	inputs = {k: v.to(device="cuda" if torch.cuda.is_available() else "cpu") for k, v in inputs.items()}

	gened = model.generate(
	**inputs,
	max_new_tokens=256,
	early_stopping=True,
	num_return_sequences=4,
	do_sample=True,
	eos_token_id=tokenizer.eos_token_id,
	temperature=0.9,
	top_p=0.8,
	top_k=50
	)

	complete_answers = []
	for gen_seq in gened:
	decoded = tokenizer.decode(gen_seq, skip_special_tokens=True).strip()

	# Extract only the text after the string "### 답변:"
	first_answer_start_idx = decoded.find("### 답변:") + len("### 답변:")
	temp_answer = decoded[first_answer_start_idx:].strip()

	# Extract only text up to the second "### 답변:" string
	second_answer_start_idx = temp_answer.find("### 답변:")
	if second_answer_start_idx != -1:
	complete_answer = temp_answer[:second_answer_start_idx].strip()
	else:
	complete_answer = temp_answer # 두 번째 "### 답변:"이 없는 경우 전체 답변 반환

	complete_answers.append(complete_answer)

	return complete_answers


	## Training Details


	### Training Data

	1. 한국은행: 경제금융용어 700선(<https://www.bok.or.kr/portal/bbs/B0000249/view.do?nttId=235017&menuNo=200765>)
	2. 금융감독원: 금융소비자 정보 포털 파인 금융용어사전(<https://fine.fss.or.kr/fine/fnctip/fncDicary/list.do?menuNo=900021>)
	3. KDI 경제정보센터: 시사 용어사전(<https://eiec.kdi.re.kr/material/wordDic.do>)
	4. 한국경제신문/한경닷컴: 한경경제용어사전(<https://terms.naver.com/list.naver?cid=42107&categoryId=42107>), 오늘의 TESAT(<https://www.tesat.or.kr/bbs.frm.list/tesat_study?s_cateno=1>), 오늘의 주니어 TESAT(<https://www.tesat.or.kr/bbs.frm.list/tesat_study?s_cateno=5>), 생글생글한경(<https://sgsg.hankyung.com/tesat/study>)
	5. 중소벤처기업부/대한민국정부: 중소벤처기업부 전문용어(<https://terms.naver.com/list.naver?cid=42103&categoryId=42103>)
	6. 고성삼/법문출판사: 회계·세무 용어사전(<https://terms.naver.com/list.naver?cid=51737&categoryId=51737>)
	7. 맨큐의 경제학 8판 Word Index


	### Training Procedure

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->


	#### Training Hyperparameters

	- Lora
	1. r=16,
	lora_alpha=16,
	target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"], # this is different by models
	lora_dropout=0.05,
	bias="none",
	task_type="CAUSAL_LM"

	## Evaluation

	<!-- This section describes the evaluation protocols and provides the results. -->

	### Testing Data, Factors & Metrics

	#### Testing Data

	<!-- This should link to a Dataset Card if possible. -->

	[More Information Needed]

	### Results

	[More Information Needed]

	#### Summary


	## Citation [optional]

	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->