Models: 8,671

Active filters: 8-bit

neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a8

Text Generation • Updated Oct 10 • 12.9k • 16

neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8

Text Generation • Updated Oct 23 • 686 • 2

MaziyarPanahi/gemma-2-2b-it-GGUF

Text Generation • Updated Aug 1 • 2.22M • 8

anokimchen/sd-turbo-openvino-8bit-no-calibration

Text-to-Image • Updated Aug 7 • 1

anokimchen/sd-turbo-openvino-8bit-GPT4vision-calibrated

Text-to-Image • Updated Aug 7 • 1

shuyuej/Mistral-Nemo-Instruct-2407-GPTQ-INT8

Updated Aug 7 • 72 • 1

alpindale/Meta-Llama-3.1-70B-Instruct-GPTQ-INT8

Updated Aug 13 • 280 • 2

FuturisticVibes/Rocinante-12B-v1.1-8.0bpw-h8-exl2

Updated Aug 23 • 20 • 1

Statuo/Celeste-v1.9-8bpw-EXL2

Text Generation • Updated Aug 17 • 28 • 1

MaziyarPanahi/SmolLM-1.7B-Instruct-v0.2-GGUF

Text Generation • Updated Aug 18 • 842 • 7

MaziyarPanahi/Phi-3.5-mini-instruct-GGUF

Text Generation • Updated Aug 20 • 2.21M • 6

neuralmagic/SmolLM-1.7B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 66 • 1

KhanhVan/Vistral-7B-Chat-gguf1

Text Generation • Updated Aug 24 • 30 • 1

Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8

Image-Text-to-Text • Updated Sep 21 • 3.5k • 11

FuturisticVibes/ArliAI-RPMax-12B-v1.1-8.0bpw-h8-exl2

Updated Sep 1 • 9 • 2

jadechoghari/aya-23-8B-quantized

Text Generation • Updated Sep 1 • 39 • 3

MaziyarPanahi/Yi-Coder-9B-Chat-GGUF

Text Generation • Updated Sep 4 • 2.22M • 2

MaziyarPanahi/DeepSeek-V2.5-GGUF

Text Generation • Updated Sep 11 • 28k • 4

HF1BitLLM/Llama3-8B-1.58-100B-tokens

Text Generation • Updated Sep 19 • 3.76k • 165

MaziyarPanahi/solar-pro-preview-instruct-GGUF

Text Generation • Updated Sep 13 • 2.21M • 21

Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8

Image-Text-to-Text • Updated Sep 24 • 2.09k • 7

Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9 • 2.22k • 2

Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 18 • 10.3k • 9

Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9 • 5.52k • 12

Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9 • 6k • 8

LoneStriker/Mistral-Small-Instruct-2409-8.0bpw-h8-exl2

Updated Sep 17 • 36 • 5

MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF

Text Generation • Updated Sep 18 • 2.2M • 2

MaziyarPanahi/Qwen2.5-7B-Instruct-GGUF

Text Generation • Updated Sep 18 • 2.21M • 8

brunopio/Llama3-8B-1.58-100B-tokens-GGUF

Text Generation • Updated Sep 19 • 1.02M • 12

MaziyarPanahi/Llama-3.2-3B-Instruct-GGUF

Text Generation • Updated Sep 25 • 2.22M • 8