---
library_name: peft
base_model: facebook/mcontriever-msmarco
language:
- ko
---

# smartPatent-mContriever-lora

This model is a LoRA adapter for `facebook/mcontriever-msmarco`, fine-tuned for a custom Korean patent retrieval system.

### Training Data

Two types of training data are used:

- queries generated automatically with GPT-4
- patent titles paired with their existing patent abstracts

### Usage

```python
from peft import PeftConfig, PeftModel
from transformers import AutoModel, AutoTokenizer


def get_model(peft_model_name):
    # Read the adapter config to find the base model it was trained on
    config = PeftConfig.from_pretrained(peft_model_name)
    base_model = AutoModel.from_pretrained(config.base_model_name_or_path)
    # Attach the LoRA weights, then merge them into the base model
    model = PeftModel.from_pretrained(base_model, peft_model_name)
    model = model.merge_and_unload()
    model.eval()
    return model


# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('facebook/mcontriever-msmarco')
model = get_model('hanseokOh/smartPatent-mContriever-lora')
```

### Info

- **Developed by:** hanseokOh
- **Model type:** information retriever
- **Language(s) (NLP):** Korean
- **Finetuned from model:** mContriever-msmarco

### Model Sources

- **Repository:** https://github.com/hanseokOh/PatentSearch
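
### Example: Computing Embeddings

The Usage section stops after loading the tokenizer and model. The sketch below shows one way to turn text into embeddings and score a query against documents. It assumes this adapter keeps the mean pooling over token embeddings described in the base Contriever model card; this card does not confirm the pooling strategy. The helper names `mean_pooling` and `embed` are illustrative and not part of this repository.

```python
import torch


def mean_pooling(token_embeddings, attention_mask):
    """Average token embeddings while ignoring padded positions."""
    mask = attention_mask.unsqueeze(-1).bool()
    # Zero out padding tokens so they do not contribute to the sum
    summed = token_embeddings.masked_fill(~mask, 0.0).sum(dim=1)
    # Number of real tokens per sequence (at least 1 to avoid division by zero)
    counts = attention_mask.sum(dim=1, keepdim=True).clamp(min=1)
    return summed / counts


def embed(texts, tokenizer, model):
    """Hypothetical helper: encode a list of strings into one vector each."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])


# With `tokenizer` and `model` loaded as in the Usage section, and
# `queries` / `documents` being lists of strings (assumed inputs):
# query_embs = embed(queries, tokenizer, model)
# doc_embs = embed(documents, tokenizer, model)
# scores = query_embs @ doc_embs.T  # dot-product relevance, one row per query
```

Higher scores indicate a closer match under this assumed dot-product scoring; sorting each row of `scores` in descending order gives a ranking of the documents for that query.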