stockmark/gpt-neox-japanese-1.4b
This repository provides a GPT-NeoX-based model with 1.4B parameters, pre-trained on a Japanese corpus of about 20B tokens. The model was developed by Stockmark Inc.
How to use
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use torch.bfloat16 on GPUs that support it (e.g. A100) and torch.float16 on older generations
torch_dtype = (
    torch.bfloat16
    if torch.cuda.is_available()
    and hasattr(torch.cuda, "is_bf16_supported")
    and torch.cuda.is_bf16_supported()
    else torch.float16
)

model = AutoModelForCausalLM.from_pretrained(
    "stockmark/gpt-neox-japanese-1.4b", device_map="auto", torch_dtype=torch_dtype
)
tokenizer = AutoTokenizer.from_pretrained("stockmark/gpt-neox-japanese-1.4b")

inputs = tokenizer("自然言語処理は", return_tensors="pt").to(model.device)
with torch.no_grad():
    tokens = model.generate(
        **inputs,
        max_new_tokens=128,
        repetition_penalty=1.1,
    )

output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```
Example:
- LoRA tuning: https://huggingface.co./stockmark/gpt-neox-japanese-1.4b/blob/main/notebooks/LoRA.ipynb
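For reference, the sketch below shows a typical LoRA setup for this model using the PEFT library. The rank, alpha, dropout, and other hyperparameters are illustrative assumptions and may differ from the values used in the notebook above.

```python
# Minimal LoRA sketch using the PEFT library. Hyperparameters are assumptions,
# not the values used in the official notebook.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("stockmark/gpt-neox-japanese-1.4b")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # LoRA rank (assumed)
    lora_alpha=16,                       # scaling factor (assumed)
    lora_dropout=0.05,                   # dropout on the LoRA layers (assumed)
    target_modules=["query_key_value"],  # GPT-NeoX attention projection
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```

After wrapping, the model can be fine-tuned with the standard Trainer; only the adapter weights are updated, which keeps the memory footprint much smaller than full fine-tuning.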
Training dataset
- Japanese Web Corpus (ja): 8.6B tokens (This dataset will not be released.)
- Wikipedia (ja): 0.88B tokens
- CC100 (ja): 10.5B tokens
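The in-house Japanese Web Corpus is not released, but the two public portions can be loaded with the datasets library. The repository IDs and config names below are assumptions about datasets currently on the Hub, not the exact snapshots used for pre-training.

```python
# Hedged sketch: loading the public parts of the corpus. Snapshot and config
# names are assumptions; the versions used for pre-training are not specified.
from datasets import load_dataset

wiki_ja = load_dataset("wikimedia/wikipedia", "20231101.ja", split="train")
cc100_ja = load_dataset("cc100", lang="ja", split="train", streaming=True)  # stream to avoid a full download
```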
Training setting
- Trained using HuggingFace Trainer and DeepSpeed (ZeRO-2)
- 8 A100 GPUs (40GB) at ABCI (AI Bridging Cloud Infrastructure)
- Mixed Precision (BF16)
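A minimal sketch of this setup with the Trainer and a ZeRO-2 DeepSpeed config follows. The config file name, batch sizes, learning rate, and dataset variable are placeholders, since the actual training scripts and hyperparameters are not published.

```python
# Hedged sketch of the reported setup (Trainer + DeepSpeed ZeRO-2, BF16).
# All hyperparameters and file names are placeholders, not the actual values.
from transformers import AutoConfig, AutoModelForCausalLM, Trainer, TrainingArguments

# Pre-training starts from a freshly initialized model, not from pretrained weights.
config = AutoConfig.from_pretrained("stockmark/gpt-neox-japanese-1.4b")
model = AutoModelForCausalLM.from_config(config)

training_args = TrainingArguments(
    output_dir="gpt-neox-japanese-1.4b",
    bf16=True,                         # mixed precision, as reported
    deepspeed="ds_config_zero2.json",  # ZeRO-2 config file (placeholder name)
    per_device_train_batch_size=8,     # placeholder
    gradient_accumulation_steps=4,     # placeholder
    learning_rate=1.0e-4,              # placeholder
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_corpus,  # a pre-tokenized datasets.Dataset (placeholder)
)
trainer.train()
```

With the DeepSpeed integration, the script would typically be launched across the 8 GPUs with the `deepspeed` launcher (or `torchrun --nproc_per_node=8`).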
License
Developed by
Stockmark Inc.
Author