---
license: gpl-3.0
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---

Because of LLaMA's license constraints, this model is for research and learning only. 
Please strictly respect LLaMA's usage policy. 
We are not allowed to publish LLaMA weights, even finetuned ones, but we can publish the difference: a patch to be applied to the original files. 
The encryption is a simple XOR between files, which ensures that only people who have access to the original weights (from completely legal sources, of course) can transform them into the finetuned weights. 
You can find the decryption code at https://github.com/LianjiaTech/BELLE/tree/main/models.
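
To illustrate why an XOR patch can only be undone by someone who holds the original weights, here is a minimal sketch of byte-wise XOR. It is not the project's `decrypt.py`; the function name `xor_bytes` and the key-repetition behaviour are assumptions made for illustration only.

```python
import itertools


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of `data` with the matching byte of `key`.

    Assumption for this sketch: the key is repeated if it is shorter
    than the data. The real decrypt.py may handle lengths differently.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, itertools.cycle(key)))


# Toy stand-ins for the finetuned file and the original LLaMA file.
finetuned = b"finetuned weights"
original = b"base"

encrypted = xor_bytes(finetuned, original)   # what gets published
recovered = xor_bytes(encrypted, original)   # what a key holder rebuilds
print(recovered == finetuned)
```

XOR is its own inverse, so applying the same key twice returns the input. Without the key file, the published bytes alone do not reveal the finetuned weights.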


# GPTQ-for-LLaMa

## Welcome
If you find this model helpful, please *like* this model and star us on https://github.com/LianjiaTech/BELLE !

## Model description
8-bit quantization of [BELLE-LLAMA-7B-2M](https://huggingface.co/BelleGroup/BELLE-LLAMA-7B-2M-enc) using [GPTQ](https://arxiv.org/abs/2210.17323).

GPTQ is a state-of-the-art one-shot weight quantization method.

The inference code can be found in our GitHub project repository: https://github.com/LianjiaTech/BELLE/tree/main/gptq.

In general, we recommend 8-bit quantization with a group size of 128.
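
To make "bits" and "group size" concrete, the following sketch quantizes a list of weights group by group, storing one scale and one offset per group. Note that it uses plain round-to-nearest, not GPTQ's error-compensating procedure, and the function names are hypothetical rather than taken from the GPTQ-for-LLaMa code.

```python
def quantize_groupwise(weights, bits=8, group_size=128):
    """Round-to-nearest quantization with one (scale, offset) pair per group.

    Illustration only: GPTQ chooses the integer codes differently, but the
    storage layout idea (integer codes plus per-group parameters) is similar.
    """
    levels = (1 << bits) - 1  # 255 for 8-bit, 15 for 4-bit
    groups = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to 1.0 when the group is constant to avoid dividing by zero.
        scale = (hi - lo) / levels or 1.0
        codes = [round((w - lo) / scale) for w in group]
        groups.append((scale, lo, codes))
    return groups


def dequantize_groupwise(groups):
    """Rebuild approximate float weights from the per-group codes."""
    return [lo + scale * c for scale, lo, codes in groups for c in codes]


weights = [i / 100 for i in range(-128, 128)]  # 256 toy weights
groups = quantize_groupwise(weights, bits=8, group_size=128)
restored = dequantize_groupwise(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

A smaller group size gives each scale fewer weights to cover, which tends to lower the rounding error at the cost of storing more per-group parameters.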

**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**

## Model list

| Model name | File size | GPU memory usage |
| --- | --- | --- |
| llama7b-2m | 26G | ~15G |
| llama7b-2m-8bit-128g.pt | 6.8G | ~8.9G |
| llama7b-2m-4bit-128g.pt | 3.8G | ~5.6G |

## Check md5
1. After you git clone this model, compute the MD5 checksums of the encrypted files and compare them with the values below:
```
md5sum ./*
340aa9ee27fa7931ccbabcc30f2f8a27  ./config.json.db303d8f096e427bd21ff97bb169c84fb3ae11336a644e3da3506419d44f6429.enc
f9b33d359f17a437f6c24b4de6f2272e  ./generation_config.json.fd7ff399e5568cc21a0a8414f43df88ef7c424995b9b97a90563165d2cf79efd.enc
591a2ecabc03530ba70663784fddb0e5  ./llama7b-2m-4bit-128g.pt.8576bae21290e9e75a60f38a6010709255656b19330a0df9a4bf50e1ee83fc51.enc
65926fddcd56be59b0bebf97f1518106  ./llama7b-2m-8bit-128g.pt.44227e0ee3633967c555ed9ba7a89f340955545f6e32f7d5dfdc28603f6e27d2.enc
1ab707fa9b0c4be294fd0b867d73e919  ./special_tokens_map.json.44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a.enc
ff291fcfa4e0048ca4ff262312faad83  ./tokenizer_config.json.ef7ef410b9b909949e96f172b17cbf7c68b11761c632715fa05a6088c0c2b9ac.enc
39ec1b33fbf9a0934a8ae0f9a24c7163  ./tokenizer.model.9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347.enc
```

2. Decrypt the files using the scripts in https://github.com/LianjiaTech/BELLE/tree/main/models

You can use the following Bash command.
Replace "/path/to_encrypted" with the directory that holds the encrypted files, 
"/path/to_original_llama_7B" with the directory that holds the original LLaMA 7B weights, 
and "/path/to_finetuned_model" with the directory where the decrypted model should be saved.

```bash
mkdir -p /path/to_finetuned_model
for f in "/path/to_encrypted"/*; \
    do if [ -f "$f" ]; then \
       python3 decrypt.py "$f" "/path/to_original_llama_7B/consolidated.00.pth" "/path/to_finetuned_model/"; \
    fi; \
done
```

After running the command above, you will obtain the following files.

```
./config.json
./generation_config.json
./llama7b-2m-4bit-128g.pt
./llama7b-2m-8bit-128g.pt
./special_tokens_map.json
./tokenizer_config.json
./tokenizer.model
```

3. Check md5sum

To confirm that the files were recovered completely, verify their MD5 checksums.
The expected checksums are:
```
md5sum ./*
32490e7229fb82c643e3a7b8d04a6c4b  ./config.json
2917a1cafb895cf57e746cfd7696bfe5  ./generation_config.json
856cb1e00b6837f71b8d77f8b44ee5a5  ./llama7b-2m-4bit-128g.pt
a35a44e6ff57e672f649635cf966f5bd  ./llama7b-2m-8bit-128g.pt
99914b932bd37a50b983c5e7c90ae93b  ./special_tokens_map.json
5526ad31f4928acb5219e295e5ff81ce  ./tokenizer_config.json
eeec4125e9c7560836b4873b6f8e3025  ./tokenizer.model
```

## Limitations
A few issues remain in the model trained on the current base model and data:

1. The model might generate factual errors when asked to follow instructions related to facts.

2. It occasionally generates harmful responses, since it still struggles to identify potentially harmful instructions.

3. It needs improvement in reasoning and coding.

Because of these limitations, we require that developers use the open-sourced code, data, model, and any other artifacts generated by this project for research purposes only. Commercial use and other potentially harmful use cases are not allowed.

## Citation

Please cite us when using our code, data or model.

```
@misc{BELLE,
  author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```

Please also cite the original LLaMA, Stanford Alpaca, and Self-Instruct papers!