metadata
pipeline_tag: text-generation
inference: false
tags:
- facebook
- meta
- llama
- llama-2
- mlx
CodeLlama
Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 34 billion parameters. This model is designed for general code synthesis and understanding. This is the repository for the 7B Python fine-tuned model, in npz format suitable for use in Apple's MLX framework.
Weights have been converted to float16 from the original bfloat16 type, because numpy is not compatible with bfloat16 out of the box.
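To make the dtype change concrete, here is a minimal standard-library sketch of how a single bfloat16 value maps onto float16. It is not the script used to convert this repository's weights, and the helper names are hypothetical. It relies only on two facts: a bfloat16 value is the upper 16 bits of an IEEE 754 float32, and the struct format code "e" is IEEE 754 half precision (float16).

```python
import struct


def bfloat16_bits_to_float(bits: int) -> float:
    """Decode a 16-bit bfloat16 pattern by zero-padding it to a float32."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_float16_bits(value: float) -> int:
    """Encode a Python float as an IEEE 754 half-precision bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def float16_bits_to_float(bits: int) -> float:
    """Decode an IEEE 754 half-precision bit pattern."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def bfloat16_to_float16_bits(bits: int) -> int:
    """Illustrative helper (hypothetical name): re-encode one bfloat16 value as float16."""
    return float_to_float16_bits(bfloat16_bits_to_float(bits))


if __name__ == "__main__":
    for pattern in (0x3F80, 0x4049):
        value = bfloat16_bits_to_float(pattern)
        print(hex(pattern), value, hex(bfloat16_to_float16_bits(pattern)))
```

bfloat16 keeps 7 explicit mantissa bits and float16 keeps 10, so values inside float16's normal range survive the conversion exactly. The trade-off is range: float16 cannot store magnitudes above 65504, while bfloat16 shares float32's much wider exponent range.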
How to use with MLX.
# Install mlx, mlx-examples, huggingface-cli
pip install mlx
pip install huggingface_hub hf_transfer
git clone https://github.com/ml-explore/mlx-examples.git
# Download model
export HF_HUB_ENABLE_HF_TRANSFER=1
huggingface-cli download --local-dir CodeLlama-7b-Python-mlx mlx-llama/CodeLlama-7b-Python-mlx
# Run example
python mlx-examples/llama/llama.py CodeLlama-7b-Python-mlx/CodeLlama-7b-Python.npz CodeLlama-7b-Python-mlx/tokenizer.model "def fibonacci("
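After downloading, it can be useful to confirm that the archive really holds float16 arrays. The sketch below is one way to do that without loading the weights into memory. It assumes the file is a standard NumPy .npz archive (a zip of .npy members) and reads only each member's header. The function name npz_dtypes is hypothetical and not part of mlx or mlx-examples.

```python
import ast
import struct
import zipfile


def npz_dtypes(path):
    """Map each array name in an .npz archive to its dtype string, e.g. '<f2'."""
    dtypes = {}
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            with archive.open(member) as fh:
                # Every .npy member starts with a 6-byte magic string.
                if fh.read(6) != b"\x93NUMPY":
                    continue
                major, _minor = fh.read(2)
                # Format 1.0 stores the header length in 2 bytes, later versions in 4.
                if major == 1:
                    (header_len,) = struct.unpack("<H", fh.read(2))
                else:
                    (header_len,) = struct.unpack("<I", fh.read(4))
                # The header is a Python dict literal with 'descr', 'fortran_order', 'shape'.
                header = ast.literal_eval(fh.read(header_len).decode("latin1"))
                name = member[:-4] if member.endswith(".npy") else member
                dtypes[name] = header["descr"]
    return dtypes
```

Called as npz_dtypes("CodeLlama-7b-Python-mlx/CodeLlama-7b-Python.npz"), this would be expected to report '<f2' (little-endian float16) for the converted weights. The array names depend on how the archive was written.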
Please refer to the original model card for details on Code Llama.