Models: 1,422

Active filters: 2-bit

ChenMnZ/Llama-2-13b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22 • 2

ChenMnZ/Llama-2-13b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 6

ChenMnZ/Llama-2-70b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22 • 3

ChenMnZ/Llama-2-70b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22 • 2

ChenMnZ/Llama-2-70b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 2

ChenMnZ/Llama-2-7b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22 • 8

ChenMnZ/Llama-2-7b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 5

ChenMnZ/Llama-3-70b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22 • 6

ChenMnZ/Llama-3-70b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 7

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22 • 4

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 2

ChenMnZ/Llama-2-7b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22 • 2

ChenMnZ/Llama-2-7b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22 • 4

ChenMnZ/Llama-3-70b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22 • 4

ChenMnZ/Llama-3-8b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22 • 10

ChenMnZ/Llama-3-8b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 4

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22 • 41 • 1

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22 • 3

ChenMnZ/Llama-3-70b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22 • 4

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22 • 6

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22 • 3

ChenMnZ/Llama-3-8b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22 • 6

ChenMnZ/Llama-3-8b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22 • 4

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22 • 6

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22 • 2

MaziyarPanahi/SmolLM-135M-Instruct-GGUF

Text Generation • Updated Jul 22 • 303 • 2

MaziyarPanahi/SmolLM-360M-Instruct-GGUF

Text Generation • Updated Jul 22 • 194 • 1

MaziyarPanahi/SmolLM-1.7B-Instruct-GGUF

Text Generation • Updated Jul 22 • 391 • 4

MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF

Text Generation • Updated Jul 23 • 1.04M • 14

MaziyarPanahi/Mistral-Large-Instruct-2407-GGUF

Text Generation • Updated Jul 26 • 37.3k • 20