---

license: llama3.1
---


# Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

This repository contains the models and datasets used in the paper *"Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"*.

## Models

The `ckpt` folder contains 16 LoRA adapters that were fine-tuned for this research:

- 6 Basic Executors
- 3 Executor Composers
- 7 Aligners

The base model used for fine-tuning all of the above is [LLaMA 3.1-8B](https://huggingface.co./meta-llama/Llama-3.1-8B).
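
For reference, below is a minimal sketch of attaching one of these adapters to the base model with PEFT. The adapter path `ckpt/add_executor` is a hypothetical example; replace it with the path to an actual adapter directory downloaded from the `ckpt` folder.

```python
# Minimal sketch: load the LLaMA 3.1-8B base model and attach one LoRA adapter.
# "ckpt/add_executor" is a placeholder for a locally downloaded adapter directory.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach a fine-tuned LoRA adapter (e.g. one of the basic executors).
model = PeftModel.from_pretrained(base_model, "ckpt/add_executor")
```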


## Datasets

The datasets used for evaluating all models can be found in the `datasets/raw` folder.
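
If you only need the evaluation data, a minimal sketch using `huggingface_hub` is shown below; the repository ID is a placeholder and should be replaced with this repo's actual ID.

```python
# Minimal sketch: download only the raw evaluation datasets from this repository.
# "<this-repo-id>" is a placeholder; substitute the actual Hugging Face repo ID.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="<this-repo-id>",
    allow_patterns=["datasets/raw/*"],  # fetch only the raw evaluation datasets
)
print(local_dir)
```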

## Usage

Please refer to the [GitHub repository](https://github.com/NJUDeepEngine/CAEF) for setup and usage details.

## Citation

If you use CAEF in your research, please cite our [paper](https://arxiv.org/abs/2410.07896):
```bibtex
@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines},
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896},
}
```