chenshake/Llama-2-7b-hf-GGUF

Llama-2-7b-hf, converted to GGUF format.
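A quick sanity check on a downloaded file is to read its GGUF header. The sketch below is only an illustration, not part of the original notebooks. It assumes GGUF version 2 or later, where the header is the 4-byte magic `GGUF`, a little-endian 32-bit version, and two 64-bit counts. The function name `read_gguf_header` is made up for this example.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) for a GGUF file.

    Assumes GGUF v2+ (64-bit counts); raises ValueError otherwise.
    """
    with open(path, "rb") as f:
        head = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(head) < 24 or head[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<" = little-endian without padding, I = uint32, Q = uint64
    version, tensor_count, kv_count = struct.unpack("<IQQ", head[4:24])
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled in this sketch")
    return version, tensor_count, kv_count
```

Calling `read_gguf_header` on one of the `.gguf` files from this repository should return a small version number and non-zero counts; a `ValueError` suggests a truncated or non-GGUF download.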

Notebook: quantize-llama-2-models-using-gguf

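I used the author's Colab notebook with a few adjustments. Make sure to select a T4 GPU runtime; otherwise the conversion fails with an error.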
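For readers who want to reproduce the conversion outside Colab, the usual llama.cpp flow is roughly: convert the Hugging Face checkpoint to an FP16 GGUF file, then quantize it. The sketch below only builds the command lines and runs nothing. The script name `convert_hf_to_gguf.py`, the binary name `llama-quantize`, the quantization types `Q4_K_M` and `Q5_K_M`, and all file names are assumptions. The tool names have changed between llama.cpp versions, and the notebook may use different ones.

```python
import shlex


def build_commands(model_dir, out_prefix,
                   methods=("Q4_K_M", "Q5_K_M"),             # assumed quant types
                   convert_script="convert_hf_to_gguf.py",   # assumed script name
                   quantize_bin="./llama-quantize"):          # assumed binary name
    """Return the argv lists for an FP16 conversion plus one quantize step per method."""
    fp16_path = f"{out_prefix}.fp16.gguf"
    commands = [
        ["python", convert_script, model_dir,
         "--outtype", "f16", "--outfile", fp16_path],
    ]
    for method in methods:
        commands.append([quantize_bin, fp16_path,
                         f"{out_prefix}.{method}.gguf", method])
    return commands


# Hypothetical paths, for illustration only.
for cmd in build_commands("Llama-2-7b-hf", "llama-2-7b"):
    print(shlex.join(cmd))
```

Keeping the commands as plain lists makes it easy to pass them to `subprocess.run` once the paths match your local llama.cpp checkout.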

Inference test using the quantized GGUF model. Notebook: 量化大模型进行推理测试 (Inference testing with a quantized large model)
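As a rough idea of what such an inference test can look like, here is a minimal sketch using the `llama-cpp-python` package. This is an assumption: the linked notebook may use a different runtime. The model file name in the comment is hypothetical, and the helper names `load_model` and `complete` are invented for this example. Llama-2-7b-hf is a base model, so the prompt is a plain text completion rather than a chat template.

```python
def load_model(model_path, n_ctx=2048, n_gpu_layers=-1):
    """Load a GGUF model with llama-cpp-python (imported lazily)."""
    from llama_cpp import Llama  # requires: pip install llama-cpp-python
    # n_gpu_layers=-1 asks llama.cpp to offload all layers to the GPU if one is available.
    return Llama(model_path=model_path, n_ctx=n_ctx, n_gpu_layers=n_gpu_layers)


def complete(llm, prompt, max_tokens=64, temperature=0.7):
    """Run one completion and return only the generated text."""
    result = llm(prompt, max_tokens=max_tokens, temperature=temperature)
    return result["choices"][0]["text"]


# Usage (hypothetical file name; adjust to the file you downloaded):
# llm = load_model("llama-2-7b.Q4_K_M.gguf")
# print(complete(llm, "The capital of France is"))
```

Separating loading from generation keeps the model in memory across prompts, which matters because loading a 7B model takes far longer than a short completion.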

Downloads last month: 17

Format: GGUF
Model size: 6.74B params
Architecture: llama
Quantizations: 4-bit, 5-bit
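The parameter count gives a back-of-the-envelope lower bound on file size: parameters times bits per weight, divided by 8. The sketch below applies that formula to the 6.74B figure above. It is an estimate only; real GGUF files are typically somewhat larger, since quantized formats also store per-block scales and may keep some tensors at higher precision.

```python
N_PARAMS = 6.74e9  # from the model card


def estimate_size_gb(n_params, bits_per_weight):
    """Naive size estimate in GB (10^9 bytes), ignoring quantization overhead."""
    return n_params * bits_per_weight / 8 / 1e9


for bits in (4, 5, 16):
    print(f"{bits}-bit: ~{estimate_size_gb(N_PARAMS, bits):.2f} GB")
```

By this estimate, the 4-bit and 5-bit files come to roughly 3.4 GB and 4.2 GB, against about 13.5 GB for FP16 weights.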
