metadata
license: bigscience-bloom-rail-1.0
tags:
- text generation
- generated_from_trainer
- email generation
- email
- emailgen
datasets:
- aeslc
- postbot/multi-emails-100k
widget:
- text: >-
Good Morning Professor Beans,
Hope you are doing well. I just wanted to reach out and ask if
differential calculus will be on the exam
example_title: email to prof
- text: |-
嘿<NAME>
感谢你注册我的每周通讯。在我们开始之前,你必须确认你的电子邮件地址。
example_title: 通讯
- text: >-
Hi <NAME>,
I hope this email finds you well. I wanted to reach out and ask about
office hours
example_title: office hours
- text: >-
Grüße <NAME>,
Ich hoffe, du hattest einen schönen Abend beim Wurstessen der Firma. Ich
melde mich, weil
example_title: Wurstessen festival
- text: |-
Guten Morgen Harold,
ich habe mich gefragt, wann die nächste
example_title: event
- text: URGENT - I need the TPS reports
example_title: URGENT
- text: |-
Hoi Archibald,
ik hoop dat deze e-mail je goed doet.
example_title: e-mails die je vinden
- text: |-
Hello there.
I just wanted to reach out and check in to
example_title: checking in
- text: >-
Hello <NAME>,
I hope this email finds you well. I wanted to reach out and see if you've
enjoyed your time with us
example_title: work well
- text: >-
Hi <NAME>,
I hope this email finds you well. I wanted to reach out and see if we
could catch up
example_title: catch up
- text: >-
Jestem <NAME>,
Właśnie wprowadziłem się do obszaru i chciałem dotrzeć i uzyskać kilka
szczegółów na temat tego, gdzie mogę dostać artykuły spożywcze i
example_title: zakupy spożywcze
parameters:
min_length: 32
max_length: 128
no_repeat_ngram_size: 2
do_sample: true
temperature: 0.2
top_k: 20
top_p: 0.95
repetition_penalty: 3.5
length_penalty: 0.9
bloom-1b1-emailgen - v1
This model is a fine-tuned version of bigscience/bloom-1b1 on the postbot/multi-emails-100k dataset.
It achieves the following results on the evaluation set:
- Loss: 1.7397
Model description
More information needed
Intended uses & limitations
⚠️ This model did not have any of the original layers frozen during training. ⚠️
- This is still under investigation, but the model likely needs some layers frozen during fine-tuning so that it retains its multilingual capabilities while learning to write emails.
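For the intended use (continuing the opening lines of an email), a minimal sketch with the `transformers` text-generation pipeline might look like the following. The generation settings are copied from the widget parameters in this card's metadata; the Hub ID `postbot/bloom-1b1-emailgen` and the helper name `generate_email` are assumptions for illustration, not something this card states.

```python
# Assumption: the checkpoint is published under this Hub ID; adjust if it differs.
MODEL_ID = "postbot/bloom-1b1-emailgen"

# Generation settings copied from the widget parameters in this card's metadata.
# Note: length_penalty only influences beam-based decoding, so with sampling
# and a single beam it may be ignored or produce a warning.
GENERATION_KWARGS = {
    "min_length": 32,
    "max_length": 128,
    "no_repeat_ngram_size": 2,
    "do_sample": True,
    "temperature": 0.2,
    "top_k": 20,
    "top_p": 0.95,
    "repetition_penalty": 3.5,
    "length_penalty": 0.9,
}


def generate_email(prompt: str, model_id: str = MODEL_ID) -> str:
    """Hypothetical helper: return the prompt plus a sampled continuation."""
    # Imported lazily so the settings above can be inspected without
    # transformers/torch installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    return outputs[0]["generated_text"]


# Example call (downloads the model weights on first use):
# print(generate_email("Good Morning Professor Beans,\nHope you are doing well."))
```

Because sampling is enabled, repeated calls with the same prompt can return different continuations.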
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2.0
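To make the schedule-related values above concrete, here is a rough sketch of a cosine learning-rate schedule with linear warmup, plus the effective batch size arithmetic. It assumes 512 total optimizer steps (the last step in the training results table), assumes the warmup step count is the ceiling of ratio times total steps, and uses half a cosine cycle; the function names are hypothetical and this is not the trainer's actual code.

```python
import math

BASE_LR = 7e-05       # learning_rate
WARMUP_RATIO = 0.03   # lr_scheduler_warmup_ratio
TOTAL_STEPS = 512     # assumption: last step listed in the training results table


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Number of warmup steps, assuming ceil(total_steps * ratio)."""
    return math.ceil(total_steps * ratio)


def lr_at(step: int, base_lr: float = BASE_LR,
          total_steps: int = TOTAL_STEPS, ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    n_warmup = warmup_steps(total_steps, ratio)
    if step < n_warmup:
        return base_lr * step / max(1, n_warmup)
    progress = (step - n_warmup) / max(1, total_steps - n_warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_processes: int = 1) -> int:
    """Samples consumed per optimizer step."""
    return per_device * accumulation_steps * num_processes


if __name__ == "__main__":
    print("warmup steps:", warmup_steps(TOTAL_STEPS, WARMUP_RATIO))
    for step in (0, 16, 264, 512):
        print(step, lr_at(step))
    print("effective batch size:", effective_batch_size(2, 64))
```

The reported total_train_batch_size of 128 matches 2 × 64 with a process multiplier of 1. The card does not state the GPU count, so `num_processes` is left as a parameter.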
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.8465 | 1.0 | 256 | 1.8656 |
| 1.4903 | 2.0 | 512 | 1.7396 |
details
***** eval metrics *****
epoch = 2.0
eval_loss = 1.7397
eval_runtime = 0:04:27.41
eval_samples = 4216
eval_samples_per_second = 15.766
eval_steps_per_second = 15.766
perplexity = 5.6956
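The perplexity and throughput figures above follow directly from the other metrics: perplexity is the exponential of the evaluation loss, and samples per second is the sample count divided by the runtime. A small sketch to check this, using only the rounded values printed in this card (the variable names are illustrative):

```python
import math

eval_loss = 1.7397
eval_samples = 4216
eval_runtime_s = 4 * 60 + 27.41  # 0:04:27.41 expressed in seconds

# Perplexity of a language model is exp(mean cross-entropy loss).
perplexity = math.exp(eval_loss)

# Throughput is the number of evaluated samples divided by wall-clock time.
samples_per_second = eval_samples / eval_runtime_s

print(f"perplexity ~ {perplexity:.4f}")
print(f"samples/s  ~ {samples_per_second:.3f}")
```

Since the inputs are rounded, the results should land close to the reported values rather than match them to the last digit. With eval_batch_size 1, steps per second and samples per second coincide.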
Framework versions
- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1