# Model description

This is a GGUF version of the [Meta-Llama-3-8B-OpenOrca](https://huggingface.co/MuntasirHossain/Meta-Llama-3-8B-OpenOrca) model, which is itself a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) trained on 1.5k subsamples of the [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) dataset.

This LLM follows the popular ChatML prompt template.
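
For reference, a ChatML prompt is laid out as shown below; the system message is whatever you choose to pass in (the inference code later in this README uses "You are a helpful AI assistant.").

````
<|im_start|>system
{system message}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
````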

# How to use

````
# Download the Q4_K_M.gguf (or Q6_K.gguf) file of the
# MuntasirHossain/Meta-Llama-3-8B-OpenOrca-GGUF model (notebook shell syntax)
!huggingface-cli download MuntasirHossain/Meta-Llama-3-8B-OpenOrca-GGUF Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False

from llama_cpp import Llama

llm = Llama(
    model_path="./Q4_K_M.gguf",  # path the file was downloaded to above
    n_ctx=0,                     # input text context length, 0 = read from the model
    verbose=False,
)

# Define a function for inference
def llm_response(input_text="", max_tokens=256):
    system_prompt = "You are a helpful AI assistant."
    # Build a ChatML-formatted prompt
    prompt = (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{input_text}<|im_end|>\n"
        f"<|im_start|>assistant"
    )
    output = llm(
        prompt,
        max_tokens=max_tokens,
        stop=["<|im_end|>"],
    )
    return output

# Generate a model response
input_text = "Explain artificial general intelligence (AGI) in a few lines."
result = llm_response(input_text)
print(result["choices"][0]["text"])
````
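
Alternatively, llama-cpp-python can apply the chat template itself via its `create_chat_completion` API, so you don't have to assemble the ChatML string by hand. A minimal sketch, assuming the same Q4_K_M.gguf file downloaded above:

````
from llama_cpp import Llama

# chat_format="chatml" tells llama-cpp-python to build ChatML prompts internally
llm = Llama(model_path="./Q4_K_M.gguf", chat_format="chatml", n_ctx=0, verbose=False)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Explain artificial general intelligence (AGI) in a few lines."},
    ],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
````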