base_model:
  - TouchNight/Ministral-8B-Instruct-2410-HF

Compared with the prince-canuma version, this version is smaller after quantization, and its accuracy is one percentage point higher.

In my own ERP testing, this model also performed better.

vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---:|---|---:|---|---:|---:|
| gsm8k | 3 | flexible-extract | 5 | exact_match | 0.820 | ± 0.0243 |
| | | strict-match | 5 | exact_match | 0.816 | ± 0.0246 |
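
The Stderr column can be roughly reproduced from the accuracy and the number of evaluated problems. The sketch below assumes the harness reports the sample standard deviation of the per-problem 0/1 scores divided by sqrt(n), with n = 250 taken from `limit: 250.0`; the helper name `binomial_stderr` is hypothetical and not part of any library.

```python
import math


def binomial_stderr(accuracy: float, n: int) -> float:
    """Standard error of the mean of n 0/1 scores.

    Assumption: sample variance (n - 1 in the denominator) is used,
    which is what reproduces the values reported in this README.
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


# Accuracies from the float16 run of the base model, 250 problems each.
flexible = binomial_stderr(0.820, 250)
strict = binomial_stderr(0.816, 250)

print(f"flexible-extract: {flexible:.4f}")  # reported as 0.0243
print(f"strict-match:     {strict:.4f}")    # reported as 0.0246
```

With only 250 problems, one standard error is roughly 2.5 percentage points, which is worth keeping in mind when reading small differences between runs.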

vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---:|---|---:|---|---:|---:|
| gsm8k | 3 | flexible-extract | 5 | exact_match | 0.804 | ± 0.0252 |
| | | strict-match | 5 | exact_match | 0.804 | ± 0.0252 |

vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float32), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---:|---|---:|---|---:|---:|
| gsm8k | 3 | flexible-extract | 5 | exact_match | 0.820 | ± 0.0243 |
| | | strict-match | 5 | exact_match | 0.816 | ± 0.0246 |

vllm (pretrained=/root/autodl-tmp/output,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---:|---|---:|---|---:|---:|
| gsm8k | 3 | flexible-extract | 5 | exact_match | 0.816 | ± 0.0246 |
| | | strict-match | 5 | exact_match | 0.812 | ± 0.0248 |

vllm (pretrained=/root/autodl-tmp/output,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---:|---|---:|---|---:|---:|
| gsm8k | 3 | flexible-extract | 5 | exact_match | 0.796 | ± 0.0255 |
| | | strict-match | 5 | exact_match | 0.792 | ± 0.0257 |
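
To put the gaps between runs in context, the difference between two scores can be divided by their combined standard error. This is a minimal sketch, assuming that `/root/autodl-tmp/output` is the quantized checkpoint (the logs above do not say so explicitly) and treating the two runs as independent, which is a simplification because both use the same 250 problems. The function `z_score` is a hypothetical helper.

```python
import math


def z_score(acc_a: float, se_a: float, acc_b: float, se_b: float) -> float:
    """Accuracy difference in units of the combined standard error.

    Assumption: the two runs are treated as independent samples.
    """
    return (acc_a - acc_b) / math.sqrt(se_a**2 + se_b**2)


# flexible-extract (value, stderr) pairs copied from the results above.
base_fp16 = (0.820, 0.0243)
quant_fp16 = (0.816, 0.0246)   # assumed to be the quantized model
base_bf16 = (0.804, 0.0252)
quant_bf16 = (0.796, 0.0255)   # assumed to be the quantized model

z_fp16 = z_score(*base_fp16, *quant_fp16)
z_bf16 = z_score(*base_bf16, *quant_bf16)

print(f"float16:  z = {z_fp16:.2f}")
print(f"bfloat16: z = {z_bf16:.2f}")
```

Under these assumptions both values come out well below 1, so on this 250-problem subset the `output` model and the base model score within noise of each other on gsm8k.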