---
license: llama3.1
---
# Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines
This repository contains the models and datasets used in the paper *"Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"*.
## Models
The `ckpt` folder contains 16 LoRA adapters that were fine-tuned for this research:
- 6 Basic Executors
- 3 Executor Composers
- 7 Aligners
The base model used for fine-tuning all of the above is [LLaMA 3.1-8B](https://huggingface.co./meta-llama/Llama-3.1-8B).
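As a rough sketch, one of these LoRA adapters can be applied on top of the base model using the `peft` library; the adapter path below is illustrative (see the repository's `ckpt` folder for the actual adapter names), and access to the gated LLaMA 3.1-8B weights is required:

```python
# Sketch: loading one of the LoRA adapters onto the LLaMA 3.1-8B base model.
# Requires the `transformers` and `peft` packages; the adapter path is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

# "ckpt/<adapter-name>" stands in for one of the 16 adapter folders.
model = PeftModel.from_pretrained(base, "ckpt/<adapter-name>")
model.eval()
```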
## Datasets
The datasets used for evaluating all models can be found in the `datasets/raw` folder.
## Usage
Please refer to the [GitHub page](https://github.com/NJUDeepEngine/CAEF) for details.
## Citation
If you use CAEF for your research, please cite our [paper](https://arxiv.org/abs/2410.07896):
```bibtex
@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines},
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896},
}
```