---
language:
  - en
tags:
  - pytorch
  - causal-lm
  - pythia
license: apache-2.0
datasets:
  - Anthropic/hh-rlhf
---

Pythia-1b supervised fine-tuned with the trlX library on the helpful subset of the Anthropic hh-rlhf dataset for 1 epoch.
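
For reference, the model can be loaded with the Hugging Face `transformers` library. The sketch below is illustrative only: the generation settings are not tuned, and the `Human:`/`Assistant:` prompt format is an assumption based on the hh-rlhf data, not a documented requirement.

```python
# Minimal usage sketch (illustrative settings, not the evaluation configuration).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lomahony/pythia-1b-helpful-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Prompt format assumed from the Anthropic hh-rlhf dataset convention.
prompt = "Human: How do I get started with baking bread?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```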

Checkpoints are also uploaded.

Fully reproducible fine-tuning code is available on GitHub.

A wandb log of the training run is also available.

See Pythia-1b for model details (paper).

See further details of these models in the paper Attributing Mode Collapse in the Fine-Tuning of Large Language Models.

If you find these models helpful, you can cite them as follows:

```bibtex
@inproceedings{o2024attributing,
  title={Attributing Mode Collapse in the Fine-Tuning of Large Language Models},
  author={O'Mahony, Laura and Grinsztajn, Leo and Schoelkopf, Hailey and Biderman, Stella},
  booktitle={ICLR 2024, Mathematical and Empirical Understanding of Foundation Models (ME-FoMo) workshop},
  year={2024}
}
```

Zero-shot evaluation results (lm-evaluation-harness): `hf (pretrained=lomahony/pythia-1b-helpful-sft), gen_kwargs: (None), limit: None, num_fewshot: 0, batch_size: 16`

| Tasks          | Version | Filter | n-shot | Metric          | Value   | Stderr   |
|----------------|---------|--------|--------|-----------------|---------|----------|
| arc_challenge  | 1       | none   | 0      | acc             | 0.2543  | ± 0.0127 |
|                |         | none   | 0      | acc_norm        | 0.2739  | ± 0.0130 |
| arc_easy       | 1       | none   | 0      | acc             | 0.5724  | ± 0.0102 |
|                |         | none   | 0      | acc_norm        | 0.4941  | ± 0.0103 |
| boolq          | 2       | none   | 0      | acc             | 0.6199  | ± 0.0085 |
| hellaswag      | 1       | none   | 0      | acc             | 0.3819  | ± 0.0048 |
|                |         | none   | 0      | acc_norm        | 0.4736  | ± 0.0050 |
| lambada_openai | 1       | none   | 0      | perplexity      | 7.1374  | ± 0.2014 |
|                |         | none   | 0      | acc             | 0.5626  | ± 0.0069 |
| openbookqa     | 1       | none   | 0      | acc             | 0.2040  | ± 0.0180 |
|                |         | none   | 0      | acc_norm        | 0.3140  | ± 0.0208 |
| piqa           | 1       | none   | 0      | acc             | 0.7138  | ± 0.0105 |
|                |         | none   | 0      | acc_norm        | 0.6997  | ± 0.0107 |
| sciq           | 1       | none   | 0      | acc             | 0.8400  | ± 0.0116 |
|                |         | none   | 0      | acc_norm        | 0.7620  | ± 0.0135 |
| wikitext       | 2       | none   | 0      | word_perplexity | 16.9719 | N/A      |
|                |         | none   | 0      | byte_perplexity | 1.6981  | N/A      |
|                |         | none   | 0      | bits_per_byte   | 0.7639  | N/A      |
| winogrande     | 1       | none   | 0      | acc             | 0.5343  | ± 0.0140 |
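
The table above is the standard output of EleutherAI's lm-evaluation-harness. A hedged sketch of how such a run could be reproduced through the harness's Python API, assuming a recent `lm-eval` release that exposes `simple_evaluate`:

```python
# Sketch: reproduce the zero-shot evaluation with lm-evaluation-harness.
# Assumes `pip install lm-eval` (>= 0.4); settings mirror the header above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=lomahony/pythia-1b-helpful-sft",
    tasks=[
        "arc_challenge", "arc_easy", "boolq", "hellaswag", "lambada_openai",
        "openbookqa", "piqa", "sciq", "wikitext", "winogrande",
    ],
    num_fewshot=0,
    batch_size=16,
)
print(results["results"])  # per-task metrics, as tabulated above
```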

Five-shot evaluation results (lm-evaluation-harness): `hf (pretrained=lomahony/pythia-1b-helpful-sft), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 16`

| Tasks          | Version | Filter | n-shot | Metric          | Value   | Stderr   |
|----------------|---------|--------|--------|-----------------|---------|----------|
| arc_challenge  | 1       | none   | 5      | acc             | 0.2628  | ± 0.0129 |
|                |         | none   | 5      | acc_norm        | 0.2918  | ± 0.0133 |
| arc_easy       | 1       | none   | 5      | acc             | 0.6040  | ± 0.0100 |
|                |         | none   | 5      | acc_norm        | 0.5816  | ± 0.0101 |
| boolq          | 2       | none   | 5      | acc             | 0.5963  | ± 0.0086 |
| hellaswag      | 1       | none   | 5      | acc             | 0.3780  | ± 0.0048 |
|                |         | none   | 5      | acc_norm        | 0.4719  | ± 0.0050 |
| lambada_openai | 1       | none   | 5      | perplexity      | 10.2584 | ± 0.2936 |
|                |         | none   | 5      | acc             | 0.4832  | ± 0.0070 |
| openbookqa     | 1       | none   | 5      | acc             | 0.1980  | ± 0.0178 |
|                |         | none   | 5      | acc_norm        | 0.3220  | ± 0.0209 |
| piqa           | 1       | none   | 5      | acc             | 0.7057  | ± 0.0106 |
|                |         | none   | 5      | acc_norm        | 0.7095  | ± 0.0106 |
| sciq           | 1       | none   | 5      | acc             | 0.8980  | ± 0.0096 |
|                |         | none   | 5      | acc_norm        | 0.9000  | ± 0.0095 |
| wikitext       | 2       | none   | 5      | word_perplexity | 16.9719 | N/A      |
|                |         | none   | 5      | byte_perplexity | 1.6981  | N/A      |
|                |         | none   | 5      | bits_per_byte   | 0.7639  | N/A      |
| winogrande     | 1       | none   | 5      | acc             | 0.5446  | ± 0.0140 |
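
The 5-shot run differs from the zero-shot sketch above only in the few-shot setting; with the same assumed API it would be:

```python
# Same sketch as above, with 5 in-context examples per task.
results_5shot = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=lomahony/pythia-1b-helpful-sft",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag", "lambada_openai",
           "openbookqa", "piqa", "sciq", "wikitext", "winogrande"],
    num_fewshot=5,
    batch_size=16,
)
```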