---
library_name: peft
base_model: KT-AI/midm-bitext-S-7B-inst-v1
---

# Model Card for Model ID

## Model Details

### Goal: given a prompt that contains a review from the Naver movie review (NSMC) dataset, the model generates the text "긍정" (positive) or "부정" (negative) as its prediction

#### Experiments: 2,100 samples from the train dataset and 1,000 samples from the valid dataset were used for fine-tuning

- At 1,900 steps, accuracy was generally in the upper 80% range (about 85%); from 2,000 steps onward it approached 90%.
- Reducing the sequence length from 384 to 312 shortened training time (`trainer.train`) but also lowered accuracy.
- `gradient_accumulation_steps` was set to 2, so that gradients computed on each mini-batch are accumulated into the global gradient over n steps and then applied in a single update. This gives the effect of training with a larger batch. Other similar adjustments were also tried.

## Accuracy analysis

### valid_dataset (accuracy on the 1,000 test samples)

| | Actual positive (TP) | Actual negative (TN) |
|:-------------:|:-----:|:----:|
| Predicted positive (PP) | 438 | 70 |
| Predicted negative (PN) | 29 | 463 |

**Accuracy: 0.901** (901 of 1,000 samples classified correctly)

### Model Description

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** KT-AI/midm-bitext-S-7B-inst-v1

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.
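
The following is a minimal sketch of how an adapter like this one might be loaded and queried, not the author's confirmed procedure. The base model ID and the 4-bit settings are taken from this card. Everything else is an assumption made for illustration: `ADAPTER_ID` is a placeholder for the actual adapter repository, the prompt wording in `build_prompt` is hypothetical (the template used during fine-tuning is not documented here), and `parse_label`, `load_model`, and `classify` are illustrative helper names.

```python
BASE_MODEL_ID = "KT-AI/midm-bitext-S-7B-inst-v1"  # from this card's metadata
ADAPTER_ID = "your-namespace/your-adapter"        # hypothetical placeholder


def build_prompt(review: str) -> str:
    """Hypothetical prompt; the template used in training is not documented."""
    return (
        "Classify the following movie review as 긍정 (positive) or 부정 (negative).\n"
        f"Review: {review}\n"
        "Answer:"
    )


def parse_label(generated: str) -> str:
    """Return whichever label keyword appears first in the generated text."""
    pos, neg = generated.find("긍정"), generated.find("부정")
    if pos == -1 and neg == -1:
        return "unknown"
    if neg == -1 or (pos != -1 and pos < neg):
        return "긍정"
    return "부정"


def load_model():
    """Load the 4-bit base model and attach the adapter (needs a GPU)."""
    # Heavy imports are kept local so the helpers above work without them.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Mirrors the quantization settings listed under "Training procedure".
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=False,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        quantization_config=bnb_config,
        device_map="auto",
        trust_remote_code=True,
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def classify(review: str, tokenizer, model, max_new_tokens: int = 4) -> str:
    """Generate a few tokens after the prompt and map them to a label."""
    import torch

    inputs = tokenizer(build_prompt(review), return_tensors="pt")
    input_ids = inputs["input_ids"].to(model.device)
    attention_mask = inputs["attention_mask"].to(model.device)
    with torch.no_grad():
        output = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            max_new_tokens=max_new_tokens,
        )
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output[0][input_ids.shape[1]:]
    return parse_label(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

With a real adapter ID filled in, usage would look like `tokenizer, model = load_model()` followed by `classify("정말 재미있는 영화였다", tokenizer, model)`. If the prompt used here differs from the one used in fine-tuning, accuracy will likely be lower than the figure reported in this card.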
[More Information Needed]

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: bfloat16

### Framework versions

- PEFT 0.7.1
- PEFT 0.7.0
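
As a cross-check on the accuracy figure reported in the accuracy analysis near the top of this card, the four counts from the confusion table can be turned into standard metrics. The sketch below assumes the table rows are predictions (PP, PN) and the columns are actual labels (TP, TN), so that 438 is true positives, 70 false positives, 29 false negatives, and 463 true negatives. The function name `confusion_metrics` is illustrative, and the precision, recall, and F1 values are derived here rather than reported by the author.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts as read from the table in this card (row/column reading is an assumption).
metrics = confusion_metrics(tp=438, fp=70, fn=29, tn=463)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Under this reading of the table, the accuracy comes out to 901 / 1000 = 0.901, matching the reported value. The false-positive count (70) is larger than the false-negative count (29), which suggests the model leans slightly toward predicting "긍정".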