ce-lery committed on
Commit ddd7d53 · verified · 1 Parent(s): 6e76adb

feat: add how to

Files changed (1)
  1. README.md +60 -4
README.md CHANGED
@@ -6,19 +6,72 @@ tags:
  model-index:
  - name: trial2
  results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->


- ## Model description

- More information needed

- ## Intended uses & limitations

  More information needed

  ## Training and evaluation data

@@ -26,6 +79,8 @@ More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
@@ -42,6 +97,7 @@ The following hyperparameters were used during training:

  ### Training results


  ### Framework versions
@@ -49,4 +105,4 @@ The following hyperparameters were used during training:
  - Transformers 4.46.2
  - Pytorch 2.4.0a0+f70bd71a48.nv24.06
  - Datasets 2.20.0
- - Tokenizers 0.20.3

  model-index:
  - name: trial2
  results: []
+ license: apache-2.0
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->


+ ## mistral-2b-base

+ Welcome to my model card!
+
+ This model's features are:
+
+ - trained on Japanese text
+ - trained in two stages: patch level and token level
+ - suppresses unknown-word generation by using byte fallback in the SentencePiece tokenizer and converting it to the Hugging Face Tokenizers format
+ - uses Mistral 2B

+ Yukkuri shite ittene! (Take it easy!)
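
The byte-fallback behavior listed above can be illustrated with a minimal sketch. The `byte_fallback_encode` helper and its toy vocabulary are hypothetical, not part of this repository: the idea is that a piece missing from the vocabulary is emitted as one `<0xXX>` token per UTF-8 byte, in SentencePiece's notation, instead of collapsing to `<unk>`.

```python
def byte_fallback_encode(piece, vocab):
    # Known pieces map to themselves; unknown pieces fall back to
    # one token per UTF-8 byte, written in SentencePiece's <0xXX>
    # style, so no input ever becomes <unk>.
    if piece in vocab:
        return [piece]
    return [f"<0x{b:02X}>" for b in piece.encode("utf-8")]

# "語" (U+8A9E) is not in this toy vocabulary, so it decomposes
# into its three UTF-8 bytes.
print(byte_fallback_encode("語", {"hello", "world"}))
# → ['<0xE8>', '<0xAA>', '<0x9E>']
```

Because every Unicode string is some sequence of bytes, this guarantees full coverage of Japanese text even with a modest vocabulary.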
+
+ <!-- ## Intended uses & limitations

  More information needed
+ -->
+
+ ## How to use the model
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ MODEL_NAME = "ce-lery/mistral-2b-base"
+ torch.set_float32_matmul_precision('high')
+
+ # Fall back to CPU when no GPU is available
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
+ model = AutoModelForCausalLM.from_pretrained(MODEL_NAME,
+                                              trust_remote_code=True,
+                                              ).to(device)
+
+ prompt = "自然言語処理とは、"  # "Natural language processing is ..."
+
+ inputs = tokenizer(prompt, add_special_tokens=True, return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     outputs = model.generate(
+         inputs["input_ids"],
+         max_new_tokens=4096,
+         do_sample=True,
+         early_stopping=False,
+         top_p=0.95,
+         top_k=50,
+         temperature=0.7,
+         no_repeat_ngram_size=2,
+         num_beams=3,
+     )
+
+ print(outputs.tolist()[0])
+ outputs_txt = tokenizer.decode(outputs[0])
+ print(outputs_txt)
+ ```
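
For intuition about the `top_p=0.95` setting in the snippet above: nucleus (top-p) sampling keeps only the smallest set of highest-probability tokens whose cumulative mass reaches `p`, then renormalizes before drawing. The `top_p_filter` function below is an illustrative sketch, not the actual `transformers` implementation:

```python
def top_p_filter(probs, p=0.95):
    # Sort tokens by probability, keep the smallest prefix whose
    # cumulative probability reaches p, then renormalize the survivors.
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for tok, pr in items:
        kept.append((tok, pr))
        total += pr
        if total >= p:
            break
    return {tok: pr / total for tok, pr in kept}

# With p=0.95, the lowest-probability token "d" is dropped and
# the remaining mass is renormalized to sum to 1.
print(sorted(top_p_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05})))
# → ['a', 'b', 'c']
```

Lowering `p` trims more of the low-probability tail, trading diversity for safety; `top_k` and `temperature` apply the same idea via a fixed cutoff and a sharpened distribution, respectively.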

  ## Training and evaluation data

  ## Training procedure
+
+
  ### Training hyperparameters

  The following hyperparameters were used during training:
 

  ### Training results

+ Please refer to [here]().


  ### Framework versions

  - Transformers 4.46.2
  - Pytorch 2.4.0a0+f70bd71a48.nv24.06
  - Datasets 2.20.0
+ - Tokenizers 0.20.3