taishi-i committed
Commit 3b4a75e
Parent: d4bfeca

fix README.md

Files changed (1): README.md (+26 -20)
README.md CHANGED
@@ -21,7 +21,7 @@ Python 3.7+ on Linux or macOS is required.
 
 
 ```bash
-$ pip install nagisa_bert
+pip install nagisa_bert
 ```
 
 ## Usage
@@ -29,13 +29,16 @@ $ pip install nagisa_bert
 This model is available in Transformer's pipeline method.
 
 ```python
->>> from transformers import pipeline
->>> from nagisa_bert import NagisaBertTokenizer
+from transformers import pipeline
+from nagisa_bert import NagisaBertTokenizer
 
->>> text = "nagisaで[MASK]できるモデルです"
->>> tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
->>> fill_mask = pipeline("fill-mask", model='taishi-i/nagisa_bert', tokenizer=tokenizer)
->>> print(fill_mask(text))
+text = "nagisaで[MASK]できるモデルです"
+tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
+fill_mask = pipeline("fill-mask", model='taishi-i/nagisa_bert', tokenizer=tokenizer)
+print(fill_mask(text))
+```
+
+```python
 [{'score': 0.1385931372642517,
  'sequence': 'nagisa で 使用 できる モデル です',
  'token': 8092,
@@ -61,18 +64,21 @@ This model is available in Transformer's pipeline method.
 Tokenization and vectorization.
 
 ```python
->>> from transformers import BertModel
->>> from nagisa_bert import NagisaBertTokenizer
-
->>> text = "nagisaで[MASK]できるモデルです"
->>> tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
->>> tokens = tokenizer.tokenize(text)
->>> print(tokens)
-['na', '##g', '##is', '##a', 'で', '[MASK]', 'できる', 'モデル', 'です']
-
->>> model = BertModel.from_pretrained("taishi-i/nagisa_bert")
->>> h = model(**tokenizer(text, return_tensors="pt")).last_hidden_state
->>> print(h)
+from transformers import BertModel
+from nagisa_bert import NagisaBertTokenizer
+
+text = "nagisaで[MASK]できるモデルです"
+tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
+tokens = tokenizer.tokenize(text)
+print(tokens)
+# ['na', '##g', '##is', '##a', 'で', '[MASK]', 'できる', 'モデル', 'です']
+
+model = BertModel.from_pretrained("taishi-i/nagisa_bert")
+h = model(**tokenizer(text, return_tensors="pt")).last_hidden_state
+print(h)
+```
+
+```python
 tensor([[[-0.2912, -0.6818, -0.4097, ..., 0.0262, -0.3845, 0.5816],
         [ 0.2504, 0.2143, 0.5809, ..., -0.5428, 1.1805, 1.8701],
         [ 0.1890, -0.5816, -0.5469, ..., -1.2081, -0.2341, 1.0215],
@@ -108,4 +114,4 @@ You can find here a list of the notebooks on Japanese NLP using pre-trained mode
 | [Feature-extraction](https://github.com/taishi-i/nagisa_bert/blob/develop/notebooks/feature_extraction-japanese_bert_models.ipynb) | How to use the pipeline function in transformers to extract features from Japanese text. |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/taishi-i/nagisa_bert/blob/develop/notebooks/feature_extraction-japanese_bert_models.ipynb)|
 | [Embedding visualization](https://github.com/taishi-i/nagisa_bert/blob/develop/notebooks/embedding_visualization-japanese_bert_models.ipynb) | Show how to visualize embeddings from Japanese pre-trained models. |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/taishi-i/nagisa_bert/blob/develop/notebooks/embedding_visualization_japanese_bert_models.ipynb)|
 | [How to fine-tune a model on text classification](https://github.com/taishi-i/nagisa_bert/blob/develop/notebooks/text_classification-amazon_reviews_ja.ipynb) | Show how to fine-tune a pretrained model on a Japanese text classification task. |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/taishi-i/nagisa_bert/blob/develop/notebooks/text_classification-amazon_reviews_ja.ipynb)|
-| [How to fine-tune a model on text classification with csv files](https://github.com/taishi-i/nagisa_bert/blob/develop/notebooks/text_classification-csv_files.ipynb) | Show how to preprocess the data and fine-tune a pretrained model on a Japanese text classification task. |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/taishi-i/nagisa_bert/blob/develop/notebooks/text_classification-csv_files.ipynb)|
+| [How to fine-tune a model on text classification with csv files](https://github.com/taishi-i/nagisa_bert/blob/develop/notebooks/text_classification-csv_files.ipynb) | Show how to preprocess the data and fine-tune a pretrained model on a Japanese text classification task. |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/taishi-i/nagisa_bert/blob/develop/notebooks/text_classification-csv_files.ipynb)|
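For reference, the usage change drops the `>>>` REPL prompts so the snippet can be pasted directly into a script, and splits the printed output into its own block. A minimal runnable sketch of the updated fill-mask example (assuming `nagisa_bert` and its `transformers` dependency are installed; the loop over candidates is illustrative and not part of the README):

```python
from transformers import pipeline
from nagisa_bert import NagisaBertTokenizer

text = "nagisaで[MASK]できるモデルです"
tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
fill_mask = pipeline("fill-mask", model="taishi-i/nagisa_bert", tokenizer=tokenizer)

# The pipeline returns a list of candidate dicts with 'score', 'token',
# 'token_str', and 'sequence' keys; the README output shows '使用' ("use")
# as the top completion for the [MASK] slot.
for candidate in fill_mask(text):
    print(candidate["sequence"], candidate["score"])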
 
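```

The tokenization and vectorization snippet gets the same script-style treatment. A sketch of what the updated block computes, with shape comments added (the hidden size of 768 is an assumption based on a standard BERT-base configuration, not stated in the diff):

```python
import torch
from transformers import BertModel
from nagisa_bert import NagisaBertTokenizer

text = "nagisaで[MASK]できるモデルです"
tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")

# WordPiece tokenization; the README shows "nagisa" split into subwords:
# ['na', '##g', '##is', '##a', 'で', '[MASK]', 'できる', 'モデル', 'です']
print(tokenizer.tokenize(text))

# Encode (adds [CLS]/[SEP]) and run the encoder; last_hidden_state is a
# (batch, sequence_length, hidden_size) tensor of contextual vectors.
model = BertModel.from_pretrained("taishi-i/nagisa_bert")
with torch.no_grad():  # inference only, no gradients needed
    h = model(**tokenizer(text, return_tensors="pt")).last_hidden_state
print(h.shape)  # torch.Size([1, 11, 768]) under the assumed BERT-base config
```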