Models: 11,305

Active filters: llama-cpp

arrio/Gemma-2-9B-Chinese-Chat-Q2_K-GGUF

Text Generation • Updated Jul 11, 2024 • 14

yichen0104/ReluLLaMA-7B-Q4_K_M-GGUF

Updated Jul 11, 2024 • 2

Fizzarolli/writer-8b-Q4_K_S-GGUF

Updated Jul 11, 2024

mchl914/Llama-3-Taiwan-8B-Instruct-Q8_0-GGUF

Text Generation • Updated Jul 11, 2024 • 1

mchl914/Llama3-TAIDE-LX-8B-Chat-Alpha1-Q8_0-GGUF

Updated Jul 11, 2024 • 1

qizc/Phi-3-mini-4k-instruct-Q2_K-GGUF

Text Generation • Updated Jul 11, 2024 • 17

MisterSP/AlphaMist7B-slr-v4-slow2-Q4_K_M-GGUF

Updated Jul 11, 2024 • 5 • 1

martintomov/Qwen2-7B-Instruct-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024

jackchoucn/Gemma-2-9B-Chinese-Chat-Q8_0-GGUF

Text Generation • Updated Jul 11, 2024 • 2

amirm/Meta-Llama-3-8B-Instruct-Q2_K-GGUF

Text Generation • Updated Jul 11, 2024 • 4

amirm/Meta-Llama-3-8B-Q2_K-GGUF

Text Generation • Updated Jul 11, 2024 • 1

Stark2008/GutenLaserPi-Q6_K-GGUF

Updated Jul 11, 2024 • 1

Stark2008/Qwen1.5-14B-Chat-Q3_K_S-GGUF

Text Generation • Updated Jul 11, 2024 • 1

vahhab70/CodeQwen1.5-7B-Chat-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024 • 2

HeRksTAn/Meta-Llama-3-8B-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024

sdkramer10/Meta-Llama-3-8B-Instruct-Q4_K_M-GGUF

Text Generation • Updated Jul 12, 2024 • 18

Kolapsicle/llama-3-nvidia-ChatQA-1.5-8B-Q5_K_M-GGUF

Text Generation • Updated Jul 11, 2024 • 4

netcat420/MFANNv0.17-Q4_K_M-GGUF

Updated Jul 11, 2024

MugenYume/TinyHermes-phi-3-mini-4k-instruct-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024 • 1

Stark2008/Qwen1.5-14B-Chat-Q3_K_M-GGUF

Text Generation • Updated Jul 11, 2024 • 2

Stark2008/Qwen1.5-14B-Chat-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024 • 2

Stark2008/HebGutenLaser-Q6_K-GGUF

Updated Jul 11, 2024

NikolayKozloff/Replete-Coder-Instruct-8b-Merged-Q8_0-GGUF

Updated Jul 11, 2024 • 11 • 1

jeiku/Zephyr_beta_32k_7B-Q5_K_S-GGUF

Updated Jul 11, 2024 • 1 • 1

HeRksTAn/Mistral-7B-v0.1-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024 • 1

HeRksTAn/Meta-Llama-3-8B-Instruct-Q4_K_S-GGUF

Text Generation • Updated Jul 11, 2024 • 20

HeRksTAn/Meta-Llama-3-8B-Instruct-Q4_K_M-GGUF

Text Generation • Updated Jul 11, 2024

archit28/bge-large-en-v1.5-Q4_K_S-GGUF

Feature Extraction • Updated Jul 12, 2024 • 1

archit28/Qwen2-1.5B-Q4_K_S-GGUF

Text Generation • Updated Jul 12, 2024 • 1

moleniuk/Meta-Llama-3-8B-Q4_K_M-GGUF

Text Generation • Updated Jul 12, 2024