---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
base_model: unsloth/mistral-7b-v0.3-bnb-4bit
---

# Uploaded model

- **Developed by:** jingwang
- **License:** apache-2.0
- **Finetuned from model :** unsloth/mistral-7b-v0.3-bnb-4bit

This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.


# install dependencies in Google Colab

```shell
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
```

# inference
```python

from unsloth import FastLanguageModel
from typing import Any, Dict
import json  # used to serialize the answer/citation target
import torch

class FormatPrompt_QA_with_citation():
    '''format prompt class'''
    def __init__(self, eos_token:str='</s>') -> None:
        self.inputs = ['context','question'] # required input fields
        self.outputs = ['answer', 'citation'] #  for training, and model inference output fields
        self.eos_token = eos_token

    def __call__(self, instance: Dict[str, Any]) -> str:
        '''
        function call operator 
        Args:
            instance: dictionary with keys 'context' and 'question', and optionally 'answer' and 'citation'
        Returns:
            prompt: formatted prompt
        '''
        return self.formatting_prompt_func(instance)
    
    def formatting_prompt_func(self, instance: dict) -> str:
        '''format prompt for domain specific QA
        note this is for fine-tuning pre-trained model,
        if starting with an instruct-tuned model, use `tokenizer.apply_chat_template(messages)` instead
        '''

        assert all(item in instance for item in self.inputs), f"instance must have {self.inputs}!"
        
        prompt = f"""<s> [INST] Context: {str(instance["context"])}\
        Question: {str(instance["question"])} 
        Answer: [/INST]"""

        if ('answer' in instance):
            if ('citation' in instance):
                answer = {"answer":str(instance['answer']), "citation":str(instance['citation'])}
            else:
                answer = {"answer":str(instance['answer']), "citation":""}
            prompt += json.dumps(answer, ensure_ascii=False) + self.eos_token # json format
        else:
            pass
        return prompt
```

```python

formatting_func = FormatPrompt_QA_with_citation()

# pull model from huggingface
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "jingwang/mistral_qa_citation",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)


# inference
FastLanguageModel.for_inference(model)

example = {'context': 'John Gadsby Chapman , The Baptism of Pocahontas (1840). A copy is on display in the Rotunda of the United States Capitol . During her stay at Henricus, Pocahontas met John Rolfe. Rolfe\'s English-born wife Sarah Hacker and child Bermuda had died on the way to Virginia after the wreck of the ship Sea Venture on the Summer Isles, now known as Bermuda. He established the Virginia plantation Varina Farms , where he cultivated a new strain of tobacco . Rolfe was a pious man and agonized over the potential moral repercussions of marrying a heathen, though in fact Pocahontas had accepted the Christian faith and taken the baptismal name Rebecca. In a long letter to the governor requesting permission to wed her, he expressed his love for Pocahontas and his belief that he would be saving her soul. He wrote that he was: motivated not by the unbridled desire of carnal affection, but for the good of this plantation, for the honor of our country, for the Glory of God, for my own salvation... namely Pocahontas, to whom my hearty and best thoughts are, and have been a long time so entangled, and enthralled in so intricate a labyrinth that I was even a-wearied to unwind myself thereout. [41] The couple were married on April 5, 1614, by chaplain Richard Buck , probably at Jamestown. For two years they lived at Varina Farms, across the James River from Henricus. Their son Thomas was born in January 1615. [42] The marriage created a climate of peace between the Jamestown colonists and Powhatan\'s tribes; it endured for eight years as the "Peace of Pocahontas". [43] In 1615, Ralph Hamor wrote, "Since the wedding we have had friendly commerce and trade not only with Powhatan but also with his subjects round about us." [44] The marriage was controversial in the British court at the time because "a commoner" had "the audacity" to marry a "princess." [45] [46]',
  'question': 'Who did Pocahontas marry?',
  #'answer': 'Pocahontas married John Rolfe',
  #'citation': 'The couple were married on April 5, 1614, by chaplain Richard Buck , probably at Jamestown.'
}



inputs = tokenizer([formatting_func(example)],  return_tensors="pt", padding=False).to(model.device)
input_length = inputs.input_ids.shape[-1]

with torch.no_grad():
  output = model.generate(**inputs,
                          do_sample=False, # greedy decoding; sampling parameters such as temperature are not used
                          max_new_tokens=1024,
                          pad_token_id=tokenizer.eos_token_id,
                          use_cache=False,
                          )
  response = tokenizer.decode(
                  output[0][input_length:], # response only, remove the prompt
                  skip_special_tokens=True,
                  )
  print(response)

```

```
>> {"answer": "Pocahontas married John Rolfe", "citation": "In a long letter to the governor requesting permission to wed her, he expressed his love for Pocahontas and his belief that he would be saving her soul. He wrote that he was: motivated not by the unbridled desire of carnal affection, but for the good of this plantation, for the honor of our country, for the Glory of God, for my own salvation... namely Pocahontas"}

```
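
# parsing the response

The model is trained to reply with a JSON object holding `answer` and `citation`, so the decoded text can usually be loaded with `json.loads`. The sketch below is one possible way to do that; `parse_qa_response` and `citation_in_context` are hypothetical helper names, not part of this model or of Unsloth. It assumes a reply that is not valid JSON (for example, one cut off by `max_new_tokens`) should be kept as a plain answer with an empty citation.

```python
import json
from typing import Dict


def parse_qa_response(response: str) -> Dict[str, str]:
    """Parse the model's JSON reply into a dict with 'answer' and 'citation'.

    Hypothetical helper: if the reply is not a valid JSON object, the raw
    text is returned as the answer with an empty citation.
    """
    text = response.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return {"answer": text, "citation": ""}
    if not isinstance(parsed, dict):
        return {"answer": text, "citation": ""}
    return {
        "answer": str(parsed.get("answer", "")),
        "citation": str(parsed.get("citation", "")),
    }


def citation_in_context(citation: str, context: str) -> bool:
    """Return True if the citation is a non-empty verbatim substring of the context."""
    return bool(citation) and citation in context


# Usage with a short, made-up reply and context
reply = '{"answer": "Pocahontas married John Rolfe", "citation": "During her stay at Henricus, Pocahontas met John Rolfe."}'
context = "During her stay at Henricus, Pocahontas met John Rolfe. He established Varina Farms."
result = parse_qa_response(reply)
print(result["answer"])
print(citation_in_context(result["citation"], context))
```

The substring check is deliberately strict: it only confirms that the citation was copied verbatim from the context, and says nothing about whether the cited passage actually supports the answer.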