hf (dtype=bfloat16,use_cache=True,pretrained=./checkpoint-1400/,max_length=2048), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 16

| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|-------------------------------------|-------|------|-----:|--------|---|-----:|---|-----:|
|leaderboard_gpqa | N/A| | | | | | | |
| - leaderboard_gpqa_diamond | 1|none | 0|acc_norm||0.3030|± |0.0327|
| - leaderboard_gpqa_extended | 1|none | 0|acc_norm||0.3004|± |0.0196|
| - leaderboard_gpqa_main | 1|none | 0|acc_norm||0.2969|± |0.0216|
|leaderboard_musr | N/A| | | | | | | |
| - leaderboard_musr_murder_mysteries | 1|none | 0|acc_norm||0.5400|± |0.0316|
| - leaderboard_musr_object_placements| 1|none | 0|acc_norm||0.3203|± |0.0292|
| - leaderboard_musr_team_allocation | 1|none | 0|acc_norm||0.4080|± |0.0311|

hf (dtype=bfloat16,use_cache=True,pretrained=./checkpoint-1400/,max_length=768), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 128

|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match||0.5974|± |0.0135|
| | |strict-match | 5|exact_match||0.5921|± |0.0135|
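
Each row reports a point estimate (Value) and its standard error (Stderr). As a rough reading aid, the sketch below turns a few of those pairs into approximate 95% intervals using a normal approximation (value ± 1.96 × stderr). The helper name `confidence_interval` and the choice of z = 1.96 are illustrative assumptions, not part of the evaluation harness output.

```python
# Minimal sketch: approximate 95% intervals from the Value and Stderr columns.
# Assumption: a normal approximation is adequate; z = 1.96 is our choice.

def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for value ± z * stderr."""
    half_width = z * stderr
    return (value - half_width, value + half_width)

# (Value, Stderr) pairs copied from the tables above.
results = {
    "leaderboard_gpqa_diamond": (0.3030, 0.0327),
    "leaderboard_musr_murder_mysteries": (0.5400, 0.0316),
    "gsm8k (flexible-extract)": (0.5974, 0.0135),
    "gsm8k (strict-match)": (0.5921, 0.0135),
}

for task, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task}: {value:.4f} [{low:.4f}, {high:.4f}]")
```

Rows whose intervals overlap heavily, such as the two gsm8k filters, should not be read as meaningfully different from each other.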