weqweasdas committed
Commit fc1057c · 1 Parent(s): 028c23e
Update README.md
README.md CHANGED
@@ -8,7 +8,6 @@
 
 The reward model is trained from the base model [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it).
 
-The training process is identical to [RM-Gemma-7B](https://huggingface.co/weqweasdas/RM-Gemma-7B) but with a max-length of 4096 thanks to more GPU resources.
 
 ## Model Details
 
@@ -48,11 +47,11 @@ We train the model for one epoch with a learning rate of 5e-6, batch size 256, c
 
 ```python
 from transformers import AutoTokenizer, pipeline
-rm_tokenizer = AutoTokenizer.from_pretrained("weqweasdas/RM-Gemma-7B
+rm_tokenizer = AutoTokenizer.from_pretrained("weqweasdas/RM-Gemma-7B")
 device = 0 # accelerator.device
 rm_pipe = pipeline(
     "sentiment-analysis",
-    model="weqweasdas/RM-Gemma-7B
+    model="weqweasdas/RM-Gemma-7B",
     #device="auto",
     device=device,
     tokenizer=rm_tokenizer,
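The hunk ends mid-snippet because the diff shows only changed lines and their nearby context. For reference, here is a minimal sketch of how the corrected snippet is typically completed and used to score a conversation with the reward model; the `pipe_kwargs` values, the `apply_chat_template` formatting, and the example chat are illustrative assumptions, not part of this commit:

```python
# Minimal sketch, assuming the standard transformers text-classification
# pipeline API; pipe_kwargs and the example chat are illustrative,
# not taken from this commit.
from transformers import AutoTokenizer, pipeline

rm_tokenizer = AutoTokenizer.from_pretrained("weqweasdas/RM-Gemma-7B")
device = 0  # accelerator.device
rm_pipe = pipeline(
    "sentiment-analysis",   # single-head classifier: the score is the reward
    model="weqweasdas/RM-Gemma-7B",
    device=device,
    tokenizer=rm_tokenizer,
)

pipe_kwargs = {
    "return_all_scores": True,    # keep raw scores instead of a predicted label
    "function_to_apply": "none",  # return the logit directly, no softmax/sigmoid
    "batch_size": 1,
}

# Format a chat with the tokenizer's chat template, then read the scalar reward.
chat = [
    {"role": "user", "content": "Explain quantum computing in simple terms."},
    {"role": "assistant", "content": "Quantum computers use qubits, which can hold several states at once."},
]
test_text = rm_tokenizer.apply_chat_template(chat, tokenize=False)
reward = rm_pipe([test_text], **pipe_kwargs)[0][0]["score"]
print(reward)
```

With `function_to_apply="none"`, the pipeline returns the unnormalized output of the reward head, which is the quantity you would compare across candidate responses.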