---
language: ja
license: mit
datasets:
  - wikipedia
---

# nagisa_bert

A BERT model for nagisa. The model is available in Transformers 🤗.

A tokenizer for nagisa_bert is available here.

## Install

To use this model, the nagisa_bert Python library must be installed. You can install it with pip.

Python 3.7+ on Linux or macOS is required.

```
$ pip install nagisa_bert
```

## Usage

This model can be used with the Transformers `pipeline` method.

```python
>>> from transformers import pipeline
>>> from nagisa_bert import NagisaBertTokenizer

>>> text = "nagisaで[MASK]できるモデルです"
>>> tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
>>> fill_mask = pipeline("fill-mask", model='taishi-i/nagisa_bert', tokenizer=tokenizer)
>>> print(fill_mask(text))
[{'score': 0.1385931372642517,
  'sequence': 'nagisa で 使用 できる モデル です',
  'token': 8092,
  'token_str': '使 用'},
 {'score': 0.11947669088840485,
  'sequence': 'nagisa で 利用 できる モデル です',
  'token': 8252,
  'token_str': '利 用'},
 {'score': 0.04910655692219734,
  'sequence': 'nagisa で 作成 できる モデル です',
  'token': 9559,
  'token_str': '作 成'},
 {'score': 0.03792576864361763,
  'sequence': 'nagisa で 購入 できる モデル です',
  'token': 9430,
  'token_str': '購 入'},
 {'score': 0.026893319562077522,
  'sequence': 'nagisa で 入手 できる モデル です',
  'token': 11273,
  'token_str': '入 手'}]
```
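Internally, a fill-mask pipeline softmaxes the masked-LM logits at the `[MASK]` position and returns the highest-scoring tokens. The following is only an illustrative sketch of that scoring step, using dummy logits and a hypothetical four-word vocabulary (not the model's real vocabulary), assuming NumPy:

```python
import numpy as np

# Dummy logits at the [MASK] position; real ones come from the masked-LM head.
logits = np.array([2.0, 0.5, 1.0, 3.0], dtype=np.float64)
vocab = ["使用", "入手", "作成", "利用"]  # hypothetical tiny vocabulary

# Numerically stable softmax over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Take the top-k (probability, token) pairs, highest probability first.
top_k = sorted(zip(probs, vocab), reverse=True)[:2]
print(top_k)
```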

Tokenization and vectorization.

```python
>>> from transformers import BertModel
>>> from nagisa_bert import NagisaBertTokenizer

>>> text = "nagisaで[MASK]できるモデルです"
>>> tokenizer = NagisaBertTokenizer.from_pretrained("taishi-i/nagisa_bert")
>>> tokens = tokenizer.tokenize(text)
>>> print(tokens)
['na', '##g', '##is', '##a', 'で', '[MASK]', 'できる', 'モデル', 'です']

>>> model = BertModel.from_pretrained("taishi-i/nagisa_bert")
>>> h = model(**tokenizer(text, return_tensors="pt")).last_hidden_state
>>> print(h)
tensor([[[-0.2912, -0.6818, -0.4097,  ...,  0.0262, -0.3845,  0.5816],
         [ 0.2504,  0.2143,  0.5809,  ..., -0.5428,  1.1805,  1.8701],
         [ 0.1890, -0.5816, -0.5469,  ..., -1.2081, -0.2341,  1.0215],
         ...,
         [-0.4360, -0.2546, -0.2824,  ...,  0.7420, -0.2904,  0.3070],
         [-0.6598, -0.7607,  0.0034,  ...,  0.2982,  0.5126,  1.1403],
         [-0.2505, -0.6574, -0.0523,  ...,  0.9082,  0.5851,  1.2625]]],
       grad_fn=<NativeLayerNormBackward0>)
```
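`last_hidden_state` gives one vector per token. If a single fixed-size sentence vector is needed, one common model-agnostic approach (not something this README prescribes) is mean pooling over the non-padding tokens, weighted by the tokenizer's attention mask. A minimal sketch with dummy values, assuming NumPy; `hidden` and `mask` stand in for the model output and the tokenizer's `attention_mask`:

```python
import numpy as np

# Dummy stand-ins: hidden states of shape (batch, seq_len, hidden_dim) and
# an attention mask where 1 marks real tokens and 0 marks padding.
hidden = np.arange(2 * 4 * 3, dtype=np.float64).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],
                 [1, 1, 0, 0]], dtype=np.float64)

# Zero out padding positions, then average over the valid tokens only.
masked = hidden * mask[:, :, None]
sentence_vecs = masked.sum(axis=1) / mask.sum(axis=1, keepdims=True)
print(sentence_vecs.shape)  # one vector per sentence: (2, 3)
```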