---
language:
- en
---

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the reward model and its tokenizer
model_path = "reciprocate/mistral-7b-rm"
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
reward_fn = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    truncation=True,
    batch_size=8,
    max_length=4096,
    device=0,
)

# Two candidate conversations for the same question:
# a hedged, incorrect answer and a correct one
chats = [[
    {"role": "user", "content": "When was the battle at Waterloo?"},
    {"role": "assistant", "content": "I think it was in 1983, but please double-check that when you have a chance."}
], [
    {"role": "user", "content": "When was the battle at Waterloo?"},
    {"role": "assistant", "content": "The battle at Waterloo took place on June 18, 1815."}
]]

# Render each chat with the model's chat template, then score
output = reward_fn([tokenizer.apply_chat_template(chat, tokenize=False) for chat in chats])
scores = [x["score"] for x in output]
scores
```
```
>>> [0.2586347758769989, 0.6663259267807007]
```

```python
import numpy as np

# optionally normalize with the mean and std computed on the training data
scores = (np.array(scores) - 2.01098) / 1.69077
```
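The normalization step above can be sketched as a small standalone helper, using the mean and std values from the snippet (the function name and constants here are illustrative, not part of the model's API):

```python
import numpy as np

# Training-data statistics taken from the snippet above
REWARD_MEAN = 2.01098
REWARD_STD = 1.69077

def normalize_scores(scores):
    """Standardize raw reward scores using the training-data mean and std."""
    return (np.array(scores) - REWARD_MEAN) / REWARD_STD

# Raw scores from the example above; both lie below the training mean,
# so the normalized values are negative, but their ordering is preserved.
normalized = normalize_scores([0.2586347758769989, 0.6663259267807007])
```

Normalizing this way keeps reward magnitudes on a comparable scale across batches, which is a common preprocessing step before using scores in RLHF-style training.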