File size: 8,834 Bytes
578a2e6 ab7a191 578a2e6 7dcd27c 673525f 9279c3a 673525f ab7a191 7dcd27c ab7a191 7dcd27c f77c6a5 ab7a191 f77c6a5 b7d61af f77c6a5 7dcd27c e79a3da 7dcd27c e79a3da 7dcd27c ab7a191 1f7a3d3 ab7a191 dc2636c f77c6a5 dc2636c f77c6a5 dc2636c |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 |
---
license: llama2
metrics:
- code_eval
library_name: transformers
tags:
- code
model-index:
- name: WizardCoder-Python-34B-V1.0
results:
- task:
type: text-generation
dataset:
type: openai_humaneval
name: HumanEval
metrics:
- name: pass@1
type: pass@1
value: 0.732
verified: false
---
<p align="center">
π€ <a href="https://huggingface.co./WizardLM" target="_blank">HF Repo</a> β’π± <a href="https://github.com/nlpxucan/WizardLM" target="_blank">Github Repo</a> β’ π¦ <a href="https://twitter.com/WizardLM_AI" target="_blank">Twitter</a> β’ π <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> β’ π <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> β’ π <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> <br>
</p>
<p align="center">
π Join our <a href="https://discord.gg/VZjjHtWrKs" target="_blank">Discord</a>
</p>
## News
- π₯π₯π₯[2023/08/26] We released **WizardCoder-Python-34B-V1.0** , which achieves the **73.2 pass@1** and surpasses **GPT4 (2023/03/15)**, **ChatGPT-3.5**, and **Claude2** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
- [2023/06/16] We released **WizardCoder-15B-V1.0** , which achieves the **57.3 pass@1** and surpasses **Claude-Plus (+6.8)**, **Bard (+15.3)** and **InstructCodeT5+ (+22.3)** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
βNote: There are two HumanEval results of GPT4 and ChatGPT-3.5. The 67.0 and 48.1 are reported by the official GPT4 Report (2023/03/15) of [OpenAI](https://arxiv.org/abs/2303.08774). The 82.0 and 72.5 are tested by ourselves with the latest API (2023/08/26).
| Model | Checkpoint | Paper | HumanEval | MBPP | Demo | License |
| ----- |------| ---- |------|-------| ----- | ----- |
| WizardCoder-Python-34B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardCoder-Python-34B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 73.2 | 61.2 | [Demo](http://47.103.63.15:50085/) | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
| WizardCoder-15B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardCoder-15B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 59.8 |50.6 | -- | <a href="https://huggingface.co./spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
| WizardCoder-Python-13B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardCoder-Python-13B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 64.0 | 55.6 | -- | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
| WizardCoder-3B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardCoder-3B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 34.8 |37.4 | [Demo](http://47.103.63.15:50086/) | <a href="https://huggingface.co./spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
| WizardCoder-1B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardCoder-1B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 23.8 |28.6 | -- | <a href="https://huggingface.co./spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
- Our **WizardMath-70B-V1.0** model slightly outperforms some closed-source LLMs on the GSM8K, including **ChatGPT 3.5**, **Claude Instant 1** and **PaLM 2 540B**.
- Our **WizardMath-70B-V1.0** model achieves **81.6 pass@1** on the [GSM8k Benchmarks](https://github.com/openai/grade-school-math), which is **24.8** points higher than the SOTA open-source LLM, and achieves **22.7 pass@1** on the [MATH Benchmarks](https://github.com/hendrycks/math), which is **9.2** points higher than the SOTA open-source LLM.
<font size=4>
| Model | Checkpoint | Paper | GSM8k | MATH |Online Demo| License|
| ----- |------| ---- |------|-------| ----- | ----- |
| WizardMath-70B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardMath-70B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **81.6** | **22.7** |[Demo](http://47.103.63.15:50083/)| <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 </a> |
| WizardMath-13B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardMath-13B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **63.9** | **14.0** |[Demo](http://47.103.63.15:50082/)| <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 </a> |
| WizardMath-7B-V1.0 | π€ <a href="https://huggingface.co./WizardLM/WizardMath-7B-V1.0" target="_blank">HF Link</a> | π <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **54.9** | **10.7** | [Demo ](http://47.103.63.15:50080/)| <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 </a>|
</font>
- [08/09/2023] We released **WizardLM-70B-V1.0** model. Here is [Full Model Weight](https://huggingface.co./WizardLM/WizardLM-70B-V1.0).
<font size=4>
| <sup>Model</sup> | <sup>Checkpoint</sup> | <sup>Paper</sup> |<sup>MT-Bench</sup> | <sup>AlpacaEval</sup> | <sup>GSM8k</sup> | <sup>HumanEval</sup> | <sup>License</sup>|
| ----- |------| ---- |------|-------| ----- | ----- | ----- |
| <sup>**WizardLM-70B-V1.0**</sup> | <sup>π€ <a href="https://huggingface.co./WizardLM/WizardLM-70B-V1.0" target="_blank">HF Link</a> </sup>|<sup>π**Coming Soon**</sup>| <sup>**7.78**</sup> | <sup>**92.91%**</sup> |<sup>**77.6%**</sup> | <sup> **50.6**</sup>|<sup> <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 License </a></sup> |
| <sup>WizardLM-13B-V1.2</sup> | <sup>π€ <a href="https://huggingface.co./WizardLM/WizardLM-13B-V1.2" target="_blank">HF Link</a> </sup>| | <sup>7.06</sup> | <sup>89.17%</sup> |<sup>55.3%</sup> | <sup>36.6 </sup>|<sup> <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 License </a></sup> |
| <sup>WizardLM-13B-V1.1</sup> |<sup> π€ <a href="https://huggingface.co./WizardLM/WizardLM-13B-V1.1" target="_blank">HF Link</a> </sup> | | <sup>6.76</sup> |<sup>86.32%</sup> | | <sup>25.0 </sup>| <sup>Non-commercial</sup>|
| <sup>WizardLM-30B-V1.0</sup> | <sup>π€ <a href="https://huggingface.co./WizardLM/WizardLM-30B-V1.0" target="_blank">HF Link</a></sup> | | <sup>7.01</sup> | | | <sup>37.8 </sup>| <sup>Non-commercial</sup> |
| <sup>WizardLM-13B-V1.0</sup> | <sup>π€ <a href="https://huggingface.co./WizardLM/WizardLM-13B-V1.0" target="_blank">HF Link</a> </sup> | | <sup>6.35</sup> | <sup>75.31%</sup> | | <sup> 24.0 </sup> | <sup>Non-commercial</sup>|
| <sup>WizardLM-7B-V1.0 </sup>| <sup>π€ <a href="https://huggingface.co./WizardLM/WizardLM-7B-V1.0" target="_blank">HF Link</a> </sup> |<sup> π <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> </sup>| | | |<sup>19.1 </sup>|<sup> Non-commercial</sup>|
</font>
## Comparing WizardCoder-Python-34B-V1.0 with Other LLMs.
π₯ The following figure shows that our **WizardCoder-Python-34B-V1.0 attains the second position in this benchmark**, surpassing GPT4 (2023/03/15, 73.2 vs. 67.0), ChatGPT-3.5 (73.2 vs. 72.5) and Claude2 (73.2 vs. 71.2).
<p align="center" width="100%">
<a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/WizardCoder/imgs/compare_sota.png" alt="WizardCoder" style="width: 96%; min-width: 300px; display: block; margin: auto;"></a>
</p>
## Prompt Format
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:
```
## Inference Demo Script
We provide the inference demo code [here](https://github.com/nlpxucan/WizardLM/tree/main/demo).
## Citation
Please cite the repo if you use the data or code in this repo.
```
@misc{luo2023wizardcoder,
title={WizardCoder: Empowering Code Large Language Models with Evol-Instruct},
author={Ziyang Luo and Can Xu and Pu Zhao and Qingfeng Sun and Xiubo Geng and Wenxiang Hu and Chongyang Tao and Jing Ma and Qingwei Lin and Daxin Jiang},
year={2023},
}
``` |