---
license: bigscience-bloom-rail-1.0
base_model: bigscience/bloom-1b7
tags:
- generated_from_trainer
model-index:
- name: Bloom-1b7-creative-writing-IT
results: []
---
# Bloom-1b7-creative-writing-IT
This model is a fine-tuned version of [bigscience/bloom-1b7](https://huggingface.co./bigscience/bloom-1b7) on a creative writing (short story) dataset: [adambjorn/UnrelatedForgettingOverhead](https://huggingface.co./datasets/adambjorn/UnrelatedForgettingOverhead/viewer/creative).
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
Training and evaluation data come from the `creative` subset of [adambjorn/UnrelatedForgettingOverhead](https://huggingface.co./datasets/adambjorn/UnrelatedForgettingOverhead/viewer/creative).
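A minimal sketch of loading this data with the `datasets` library (the configuration name `creative` is taken from the viewer URL above; the split name and column access shown here are assumptions, not the exact script that was used):
``` python
from datasets import load_dataset

# Load the creative-writing subset of the dataset referenced above.
# The "creative" configuration name comes from the viewer URL; the
# "train" split name is an assumption.
dataset = load_dataset("adambjorn/UnrelatedForgettingOverhead", "creative")

titles = dataset["train"]["title"]
selftexts = dataset["train"]["selftext"]
```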
## Training procedure
The model was instruction-tuned on the dataset in the following way. Given the set of prompts:
``` python
prompts = [
    "Write a creative short story based on the following title:",
    "Here is a title for a story. Craft a short narrative around it:",
    "Using the title given, develop a short story:",
    "Imagine a short story that starts with this title:",
    "Create a brief story with the following title:"
]
```
each training example is generated by concatenating one of the prompts with the `title` and `selftext` fields in the following way:
``` python
concatenated_texts = [random.choice(prompts) + " " + title + "</s>" + "Story: " + selftext for title, selftext in zip(titles, selftexts)]
```
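As a rough illustration of the next step (not the exact training code), the concatenated texts could then be tokenized for causal language modeling as sketched below; the sequence length and padding choices are assumptions:
``` python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-1b7")

# Tokenize the concatenated prompt + title + story strings.
# max_length, truncation, and padding settings are illustrative only.
encodings = tokenizer(
    concatenated_texts,
    truncation=True,
    max_length=512,
    padding="max_length",
    return_tensors="pt",
)

# For causal language-model fine-tuning the labels are the input ids.
encodings["labels"] = encodings["input_ids"].clone()
```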
### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
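For reference, a hedged sketch of Hugging Face `TrainingArguments` mirroring the list above (the output directory is a placeholder; the Adam betas and epsilon listed are the library defaults, so they are not set explicitly here):
``` python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./bloom-1b7-creative-writing-it",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # effective train batch size of 4
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # Native AMP mixed precision
)
```
Passing these arguments to a `Trainer` together with the tokenized dataset would approximate the setup described above.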
### Training results
- Final reported training step: loss 0.0135, grad_norm 0.604, learning rate 7.45e-07, epoch 9.89
- Averages over the full run: train_loss 0.4682, train_runtime 1111.42 s, train_samples_per_second 1.71, train_steps_per_second 0.423 (epoch 9.89)
### Framework versions
- Transformers 4.38.1
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2