feat: add how-to
README.md
CHANGED
model-index:
- name: trial2
  results: []
license: apache-2.0
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

## mistral-2b-base

Welcome to my model card!

This model's features are:

- trained on Japanese text
- trained in two stages: patch level and token level
- suppresses unknown-word generation by using byte fallback in the SentencePiece tokenizer and converting it to the Hugging Face Tokenizers format (see the sketch after this list)
- uses Mistral 2B
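
A rough sketch of that byte-fallback setup (not the repository's actual training script; the corpus path, vocabulary size, model type, and the Llama-style converter below are illustrative assumptions):

```python
# Hypothetical sketch: train a SentencePiece model with byte fallback,
# then convert it to the fast Hugging Face Tokenizers format.
# "corpus.txt" and vocab_size are placeholders, not the actual values.
import sentencepiece as spm
from transformers import LlamaTokenizerFast

spm.SentencePieceTrainer.train(
    input="corpus.txt",          # placeholder training corpus
    model_prefix="spm_ja",       # writes spm_ja.model / spm_ja.vocab
    vocab_size=32000,            # placeholder vocabulary size
    model_type="unigram",
    byte_fallback=True,          # unseen characters decompose into byte pieces
    character_coverage=0.9995,   # a common setting for Japanese corpora
)

# Building a fast tokenizer from the slow SentencePiece model runs the
# slow-to-fast conversion; Llama's converter is one way to do it.
tokenizer = LlamaTokenizerFast(vocab_file="spm_ja.model")
tokenizer.save_pretrained("tokenizer_out")
```

With byte fallback, characters outside the learned vocabulary are split into raw bytes instead of mapping to `<unk>`, which is what suppresses unknown-word generation.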

Take it easy! ("Yukkuri shite ittene!")

<!-- ## Intended uses & limitations

More information needed
-->

## How to use the model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_NAME = "ce-lery/mistral-2b-base"
torch.set_float32_matmul_precision('high')

# Fall back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    trust_remote_code=True,
).to(device)

prompt = "自然言語処理とは、"  # "Natural language processing is ..."

inputs = tokenizer(prompt, add_special_tokens=True, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        max_new_tokens=4096,
        do_sample=True,
        early_stopping=False,
        top_p=0.95,
        top_k=50,
        temperature=0.7,
        no_repeat_ngram_size=2,
        num_beams=3,
    )

print(outputs.tolist()[0])                 # raw token ids
outputs_txt = tokenizer.decode(outputs[0])
print(outputs_txt)                         # decoded text
```
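
The original snippet also imported `TextStreamer`; if you want tokens printed as they are generated rather than after the full sequence finishes, a streamer can be attached to the same `generate` call. A minimal sketch (streamers do not support beam search, so this variant drops `num_beams`):

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
with torch.no_grad():
    model.generate(
        inputs["input_ids"],
        max_new_tokens=256,
        do_sample=True,
        top_p=0.95,
        temperature=0.7,
        streamer=streamer,
    )
```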

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

Please refer to [here]().

### Framework versions

- Transformers 4.46.2
- Pytorch 2.4.0a0+f70bd71a48.nv24.06
- Datasets 2.20.0
- Tokenizers 0.20.3