BePilot: An AI Programming Assistant for Compiler Backend Development
This project provides BePilot-1.5B/7B, models fine-tuned from QWen2.5-Coder-1.5B/7B to improve the efficiency of manual compiler backend development.
1. Dependencies
conda env create -f bepilot.yml
2. Fine-Tuning Process
We performed LoRA fine-tuning on QWen2.5-Coder-1.5B/7B using the training and validation sets of ComBack++.
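LoRA keeps the pretrained weights frozen and learns a low-rank additive update in their place. A minimal numeric sketch of that idea is below; the toy matrix sizes, rank, and alpha are illustrative assumptions, not the actual QWen2.5-Coder configuration used here.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 8, 8, 2      # toy dimensions; r is the LoRA rank (hypothetical)
alpha = 4                     # LoRA scaling factor (hypothetical)

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialized

# Forward pass with LoRA: y = W x + (alpha / r) * B A x
x = rng.standard_normal(d_in)
y = W @ x + (alpha / r) * (B @ (A @ x))

# With B zero-initialized, the adapter starts as a no-op,
# so training begins from the pretrained model's behavior:
assert np.allclose(y, W @ x)

# After training, the low-rank update can be merged into W for inference,
# so the fine-tuned model costs nothing extra at serving time:
W_merged = W + (alpha / r) * (B @ A)
```

Only `A` and `B` receive gradients, which is why LoRA fine-tuning of a 1.5B/7B model is far cheaper than full fine-tuning.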
3. Inference
Our LoRA fine-tuned BePilot-1.5B is saved in ./BePilot-1.5B, and BePilot-7B is saved in ./BePilot-7B.
Specify the model parameters and batch size for inference in ./run_inference.sh, then run:
bash ./run_inference.sh
The BePilot-generated code will be saved in ./Res.
On the ComBack++ test set, BePilot-1.5B/7B surpasses the accuracy of mainstream code LLMs with similar parameter counts.
GCC

| Model | Stmt.Comp. C++ EM | Stmt.Comp. C++ ED | Stmt.Comp. C++ R-L | Stmt.Comp. MD EM | Stmt.Comp. MD ED | Stmt.Comp. MD R-L | Next.Sugg. C++ EM | Next.Sugg. C++ ED | Next.Sugg. C++ R-L | Next.Sugg. MD EM | Next.Sugg. MD ED | Next.Sugg. MD R-L | Code.Gen. C++ CB | Code.Gen. C++ ED | Code.Gen. C++ R-L | Pro.Rep. C++ EM | Pro.Rep. C++ ED | Pro.Rep. C++ R-L | Pro.Rep. MD EM | Pro.Rep. MD ED | Pro.Rep. MD R-L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeekCoder-1.3B | 0 | 5.44 | 0.01 | 0 | 5.01 | 0 | 0 | 7.57 | 0.01 | 0 | 7.28 | 0.01 | 19.57 | 26.61 | 0.06 | 0 | 13.96 | 0.05 | 0 | 11.77 | 0.03 |
| YiCoder-1.5B | 0 | 1.03 | 0.00 | 0 | 1.50 | 0.01 | 0 | 1.63 | 0.01 | 0 | 2.38 | 0.01 | 23.22 | 5.58 | 0.01 | 0 | 1.63 | 0.01 | 0 | 1.63 | 0.01 |
| QWen2.5-Coder-1.5B | 0 | 3.76 | 0.02 | 0 | 5.50 | 0.03 | 0.04 | 4.11 | 0.02 | 0 | 6.20 | 0.03 | 25.74 | 19.68 | 0.04 | 0 | 7.36 | 0.02 | 0 | 9.03 | 0.02 |
| CodeLLaMA-7B | 0 | 0.03 | 0.00 | 0 | 0 | 0 | 0 | 2.62 | 0.01 | 0 | 0.95 | 0 | 0.25 | 0.48 | 0 | 0 | 0.04 | 0 | 0 | 0.21 | 0 |
| CodeGemma-7B | 0 | 5.36 | 0.03 | 0 | 5.81 | 0.02 | 0 | 9.45 | 0.02 | 0 | 8.72 | 0.01 | 25.74 | 26.89 | 0.08 | 0 | 18.84 | 0.09 | 0 | 18.58 | 0.10 |
| QWen2.5-Coder-7B | 0 | 10.06 | 0.03 | 0 | 16.41 | 0.07 | 0 | 18.08 | 0.06 | 0 | 22.01 | 0.13 | 22.96 | 27.89 | 0.07 | 0 | 18.63 | 0.04 | 0 | 16.94 | 0.03 |
| BePilot-1.5B | 57.55 | 77.98 | 0.87 | 74.03 | 88.45 | 0.94 | 60.11 | 74.24 | 0.76 | 58.70 | 84.81 | 0.88 | 40.09 | 48.77 | 0.50 | 29.10 | 71.62 | 0.74 | 29.31 | 78.77 | 0.80 |
| BePilot-7B | 69.26 | 84.78 | 0.88 | 78.07 | 90.62 | 0.94 | 68.82 | 80.45 | 0.81 | 68.58 | 88.35 | 0.89 | 51.97 | 59.50 | 0.60 | 35.77 | 73.21 | 0.77 | 35.46 | 80.98 | 0.83 |

LLVM

| Model | Stmt.Comp. C++ EM | Stmt.Comp. C++ ED | Stmt.Comp. C++ R-L | Stmt.Comp. TD EM | Stmt.Comp. TD ED | Stmt.Comp. TD R-L | Next.Sugg. C++ EM | Next.Sugg. C++ ED | Next.Sugg. C++ R-L | Next.Sugg. TD EM | Next.Sugg. TD ED | Next.Sugg. TD R-L | Code.Gen. C++ CB | Code.Gen. C++ ED | Code.Gen. C++ R-L | Pro.Rep. C++ EM | Pro.Rep. C++ ED | Pro.Rep. C++ R-L | Pro.Rep. TD EM | Pro.Rep. TD ED | Pro.Rep. TD R-L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeekCoder-1.3B | 0 | 5.03 | 0 | 0 | 8.48 | 0.01 | 0 | 7.45 | 0.01 | 0 | 7.99 | 0.01 | 22.36 | 25.22 | 0.06 | 0 | 14.47 | 0.05 | 0 | 13.23 | 0.06 |
| YiCoder-1.5B | 0 | 1.56 | 0.01 | 0 | 6.09 | 0.03 | 0 | 2.93 | 0.01 | 0 | 5.21 | 0.03 | 23.69 | 6.59 | 0.01 | 0 | 2.08 | 0.01 | 0 | 0.94 | 0.01 |
| QWen2.5-Coder-1.5B | 0 | 3.59 | 0.02 | 0 | 7.52 | 0.05 | 0.02 | 4.57 | 0.01 | 0 | 5.50 | 0.03 | 27.17 | 18.96 | 0.04 | 0 | 7.79 | 0.02 | 0 | 11.33 | 0.05 |
| CodeLLaMA-7B | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 2.85 | 0.01 | 0 | 5.14 | 0.04 | 0.27 | 1.52 | 0.02 | 0 | 0.08 | 0 | 0 | 0.02 | 0 |
| CodeGemma-7B | 0 | 5.02 | 0.02 | 0 | 7.93 | 0.03 | 0 | 10.72 | 0.02 | 0 | 11.20 | 0.01 | 29.37 | 25.35 | 0.08 | 0 | 20.18 | 0.08 | 0 | 18.79 | 0.11 |
| QWen2.5-Coder-7B | 0 | 8.75 | 0.02 | 0 | 17.84 | 0.06 | 0 | 19.16 | 0.07 | 0 | 27.45 | 0.20 | 25.20 | 28.80 | 0.08 | 0 | 20.11 | 0.05 | 0 | 22.68 | 0.08 |
| BePilot-1.5B | 62.11 | 80.79 | 0.90 | 60.46 | 83.81 | 0.89 | 57.39 | 73.16 | 0.75 | 51.30 | 70.60 | 0.76 | 41.12 | 49.25 | 0.49 | 26.64 | 71.63 | 0.76 | 7.02 | 74.44 | 0.74 |
| BePilot-7B | 73.90 | 87.53 | 0.91 | 68.09 | 86.89 | 0.90 | 64.69 | 78.34 | 0.79 | 57.51 | 75.97 | 0.81 | 47.08 | 55.18 | 0.57 | 33.19 | 71.68 | 0.78 | 7.23 | 70.72 | 0.73 |
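For intuition, the EM and ED columns can be computed as below. This is a common formulation (exact-match rate, plus an edit similarity derived from Levenshtein distance); whether ComBack++ uses exactly this normalization is an assumption, not something stated in this repo.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def edit_similarity(pred: str, ref: str) -> float:
    """ED as a percentage: 100 * (1 - distance / max length)."""
    if not pred and not ref:
        return 100.0
    return 100.0 * (1 - levenshtein(pred, ref) / max(len(pred), len(ref)))

def exact_match(preds: list[str], refs: list[str]) -> float:
    """EM: percentage of predictions identical to their references."""
    hits = sum(p == r for p, r in zip(preds, refs))
    return 100.0 * hits / len(refs)
```

Under this reading, the near-zero EM scores for general-purpose code LLMs mean they almost never reproduce a backend statement verbatim, while their nonzero ED scores reflect partial character-level overlap.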