Models: 10,815

Active filters: 8-bit

Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8

Image-Text-to-Text • Updated Sep 21, 2024 • 3.4k • 12

MaziyarPanahi/solar-pro-preview-instruct-GGUF

Text Generation • Updated Sep 13, 2024 • 2.59M • 23

Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8

Image-Text-to-Text • Updated Sep 24, 2024 • 5.07k • 8

Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9, 2024 • 1.2k • 7

Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9, 2024 • 1.91k • 2

Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9, 2024 • 14.5k • 12

Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8

Text Generation • Updated Oct 9, 2024 • 5.51k • 15

brunopio/Llama3-8B-1.58-100B-tokens-GGUF

Text Generation • Updated Sep 19, 2024 • 1.31M • 12

Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int8

Text Generation • Updated Nov 18, 2024 • 617 • 3

RichardErkhov/beomi_-_open-llama-2-ko-7b-8bits

Updated Sep 25, 2024 • 1

MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF

Text Generation • Updated Sep 25, 2024 • 2.59M • 7

neuralmagic/Llama-3.2-1B-Instruct-quantized.w8a8

Text Generation • Updated Oct 16, 2024 • 4.15k • 7

MikeRoz/ArliAI_Mistral-Small-22B-ArliAI-RPMax-v1.1-8.0bpw-h8-exl2

Updated Sep 26, 2024 • 8 • 2

qeternity/Qwen2.5-72B-Instruct-W8A8

Updated Oct 1, 2024 • 5 • 2

altomek/Llama-3.2-3B-Instruct-8bpw-EXL2

Text Generation • Updated Oct 8, 2024 • 31 • 1

MaziyarPanahi/Llama-3.1-8B-Lexi-Uncensored-V2-GGUF

Text Generation • Updated Oct 20, 2024 • 176 • 2

malenia1/ternary-weight-embedding

Updated 27 days ago • 900 • 2

mlx-community/SmolLM2-135M-Instruct-8bit

Text Generation • Updated Nov 1, 2024 • 26 • 1

automatedstockminingorg/expert-on-investment-valuation-mypricermodel

Text Generation • Updated Nov 2, 2024 • 18 • 1

MaziyarPanahi/Llama-3.2-1B-GGUF

Text Generation • Updated Nov 2, 2024 • 60 • 1

Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int8

Text Generation • Updated Nov 18, 2024 • 14.4k • 15

lmstudio-community/Qwen2.5-Coder-32B-Instruct-MLX-8bit

Text Generation • Updated Nov 13, 2024 • 199 • 1

noneUsername/Qwen2.5-Coder-14B-Instruct-abliterated-W8A8-Dynamic-Per-Token

Updated Nov 14, 2024 • 41 • 1

tiiuae/Falcon3-7B-Base-1.58bit

Text Generation • Updated 22 days ago • 79 • 2

PrunaAI/distributed-optimized-gpt2-1b-bnb-8bit-smashed

Updated Dec 10, 2024 • 28 • 1

Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct-MLX_8bit

Text Generation • Updated Nov 26, 2024 • 22 • 8

HuggingFaceTB/SmolLM2-135M-Instruct-Q8-mlx

Text Generation • Updated Nov 27, 2024 • 38 • 2

mlx-community/QwQ-32B-Preview-8bit

Text Generation • Updated Nov 27, 2024 • 172 • 5

marquesafonso/NuExtract-openvino-8bit

Updated Dec 2, 2024 • 4 • 1

MaziyarPanahi/Sailor2-1B-Chat-GGUF

Text Generation • Updated Dec 3, 2024 • 690k • 4