---
license: llama3.1
---
# Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

This repository contains the models and datasets used in the paper *"Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"*.
## Models

The `ckpt` folder contains the 16 LoRA adapters that were fine-tuned for this research:
- 6 Basic Executors
- 3 Executor Composers
- 7 Aligners

The base model used for fine-tuning all of the above is [LLaMA 3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B).
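For illustration, here is a minimal sketch of loading one adapter on top of the base model with `transformers` and `peft`, assuming a local checkout of this repository. The adapter path `ckpt/add_executor` is a hypothetical placeholder; substitute the actual subfolder name of the adapter you want from `ckpt`.

```python
# Minimal loading sketch (assumes transformers, peft, and torch are installed).
# "ckpt/add_executor" is a hypothetical adapter path -- replace it with the
# actual subfolder name of the adapter you want from the ckpt/ directory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach one of the fine-tuned LoRA adapters to the frozen base model.
model = PeftModel.from_pretrained(base_model, "ckpt/add_executor")
model.eval()
```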
## Datasets

The datasets used for evaluating all models can be found in the `datasets/raw` folder.
## Usage

Please refer to the [GitHub page](https://github.com/NJUDeepEngine/CAEF) for details.
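As a rough sketch only: once an adapter is loaded as above, inference goes through the standard `transformers` generation API. Note that CAEF's executors operate on a structured, step-by-step state representation defined in the GitHub repository, not free-form questions, so the prompt below is purely a placeholder.

```python
# Placeholder prompt -- the real input format (executor state transitions)
# is defined by the CAEF framework in the GitHub repository.
prompt = "..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```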
## Citation

If you use CAEF for your research, please cite our [paper](https://arxiv.org/abs/2410.07896):

```bibtex
@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines},
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896},
}
```