---
tags:
- generated_from_trainer
model-index:
- name: mpt-mini-shakespeare
  results: []
---
# mpt-mini-shakespeare
This model was trained from scratch on https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt.
## Model description
The configuration and code are adapted from mosaicml/mpt-7b-storywriter, with the configuration parameters scaled down to produce a very tiny model.
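For illustration, such a configuration can be derived by loading the upstream config and shrinking it. This is a sketch only: the exact tiny values used for this model are not listed in this card, so the numbers below are placeholders.

```python
from transformers import AutoConfig

# Load the upstream MPT config and shrink it to a tiny model.
# The values below are illustrative placeholders, not the exact
# parameters used for mpt-mini-shakespeare.
config = AutoConfig.from_pretrained(
    "mosaicml/mpt-7b-storywriter", trust_remote_code=True
)
config.d_model = 128      # hidden size (the 7B model uses 4096)
config.n_heads = 4        # attention heads (the 7B model uses 32)
config.n_layers = 4       # transformer blocks (the 7B model uses 32)
config.max_seq_len = 512  # context length (storywriter allows 65536)
```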
## Intended uses & limitations
This model is intended only to aid debugging of a GGML port of mpt-7b-storywriter.
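A minimal sketch of loading the model for a side-by-side check against a GGML port; the repo id `mpt-mini-shakespeare` is assumed here, so substitute the actual repository id or a local path.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tiny model and run a short deterministic generation whose
# output (or intermediate logits) can be compared against the GGML port.
# "mpt-mini-shakespeare" is an assumed repo id, not necessarily exact.
model = AutoModelForCausalLM.from_pretrained(
    "mpt-mini-shakespeare", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("mpt-mini-shakespeare")

inputs = tokenizer("ROMEO:", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(output[0]))
```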
## Training and evaluation data
The single tinyshakespeare text file (linked above) provides both the training and validation data.
## Training procedure
The tinyshakespeare text file is split into paragraphs, and the same paragraph dataset serves as both the training and validation set, as sketched below.
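One way to build that dataset, assuming paragraphs are separated by blank lines in the raw file (the splitting details here are an assumption, not the exact preprocessing used):

```python
import requests
from datasets import Dataset

URL = ("https://raw.githubusercontent.com/karpathy/char-rnn/"
       "master/data/tinyshakespeare/input.txt")

# Split the raw text on blank lines into paragraphs; the same dataset
# is reused as the validation set, as described above.
text = requests.get(URL).text
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
dataset = Dataset.from_dict({"text": paragraphs})
train_dataset = eval_dataset = dataset
```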
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 1
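For reference, these values map onto `transformers` `TrainingArguments` roughly as follows. The output directory is a placeholder; the optimizer matches the library default, and the total train batch size of 256 falls out of 32 per device times 8 accumulation steps.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters as TrainingArguments.
# Adam with betas=(0.9, 0.999) and eps=1e-8 is the default optimizer;
# 32 per-device batch * 8 accumulation steps = 256 total train batch.
args = TrainingArguments(
    output_dir="mpt-mini-shakespeare",  # hypothetical path
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=1,
)
```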
### Training results
Mediocre, as expected.
### Framework versions
- Transformers 4.28.0
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3