
Uploaded Model

  • Model: ar08/tinyllama-nerd-gguf
  • Developed by: ar08
  • License: apache-2.0

USAGE

To use this model, follow the steps below:

  1. Install the necessary packages:

    # Install llama-cpp-python (required for the example below)
    pip install llama-cpp-python
    
    # Optional: install transformers from source (only needed for transformers versions <= v4.34)
    pip install git+https://github.com/huggingface/transformers.git
    
    # Optional: install accelerate
    pip install accelerate
    
  2. Instantiate the model:

    from llama_cpp import Llama
    
    # Path to the downloaded GGUF file
    my_model_path = "your_downloaded_model_name/path"
    # Context window size in tokens
    CONTEXT_SIZE = 512
    
    # Load the model
    model = Llama(model_path=my_model_path, n_ctx=CONTEXT_SIZE)
    
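The generation helper in the next step stops on the tokens "Q" and "\n", which suggests a simple Q/A-style prompt layout. The exact template this model expects is not documented here, so the following formatting helper is a hypothetical sketch under that assumption:

```python
# Hypothetical prompt template inferred from the stop tokens ("Q", "\n")
# used below; not documented behavior of this model.
def format_qa_prompt(user_prompt: str) -> str:
    # Prefix the question and leave the cursor after "A:" so the model
    # completes the answer.
    return f"Q: {user_prompt}\nA:"
```

If the model was trained on a different chat template, adjust the helper (and the stop tokens) to match.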
  3. Generate text from a prompt:

    def generate_text_from_prompt(user_prompt, max_tokens=100, temperature=0.3, top_p=0.1, echo=True, stop=["Q", "\n"]):
        # Run inference; generation halts at a stop token or after max_tokens
        model_output = model(
            user_prompt,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
            echo=echo,
            stop=stop,
        )
    
        # Extract the generated text from the completion dict
        return model_output["choices"][0]["text"].strip()
    
    if __name__ == "__main__":
        my_prompt = "What do you think about the inclusion policies in Tech companies?"
        model_response = generate_text_from_prompt(my_prompt)
        print(model_response)
    
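The call to `model(...)` returns an OpenAI-style completion dictionary, which is why the helper reads `model_output["choices"][0]["text"]`. A sketch of the relevant fields, using fabricated illustrative values so the extraction step can be shown without loading a model:

```python
# Illustrative (fabricated) completion dict in the shape llama-cpp-python
# returns; only the fields the helper above uses are shown.
sample_output = {
    "choices": [
        {
            "text": " A: They vary by company.\n",  # generated text (includes the prompt when echo=True)
            "finish_reason": "stop",                # "stop" (hit a stop token) or "length" (hit max_tokens)
        }
    ],
}

# Same extraction step as in generate_text_from_prompt
text = sample_output["choices"][0]["text"].strip()
```

Checking `finish_reason` before using the text is a cheap way to detect completions that were cut off by `max_tokens`.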

Model Details

  • Format: GGUF (4-bit quantized)
  • Model size: 1.1B params
  • Architecture: llama

