Hugging Face
Models: 723
Sort: Trending
Active filters: compressed-tensors
v2ray/GPT4chan-8B-FP8 • Text Generation • Updated 14 days ago • 4
leon-se/Idefics3-8B-Llama3-FP8-Dynamic • Updated 14 days ago • 36
horheynm/TinyLlama-1.1B-Chat-v1.0-FP8_DYNAMIC-e2e • Updated 14 days ago • 7
nm-testing/TinyLlama-1.1B-Chat-v1.0-kv_cache_default_tinyllama-e2e • Updated about 9 hours ago • 12
nm-testing/Phi-3-mini-4k-instruct-kv_cache_default_phi3-e2e • Updated 6 days ago • 9
zygi/Qwen2.5-72B-Instruct-abliterated-FP8-Dynamic • Updated 14 days ago • 42
Infermatic/70B-L3.3-mhnnn-x1-FP8-Dynamic • Text Generation • Updated 13 days ago • 6
noneUsername/huihui-ai-phi-4-abliterated-W8A8 • Updated 13 days ago • 4
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-compressed • Updated 13 days ago • 26
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-uncompressed • Updated 13 days ago • 18
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-compressed • Updated 13 days ago • 28
nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-uncompressed • Updated 13 days ago • 12
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-compressed • Updated 13 days ago • 31
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-uncompressed • Updated 13 days ago • 15
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-compressed • Updated 13 days ago • 23
nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-uncompressed • Updated 13 days ago • 10
nm-testing/llama2.c-stories42M-quantized-fp8-Dynamic • Updated 13 days ago • 5
nm-testing/TinyLlama-1.1B-Chat-v1.0-sparse2of4_only-e2e • Updated about 9 hours ago • 28
nm-testing/TinyLlama-1.1B-Chat-v1.0-sparse2of4_fp8_dynamic-e2e • Updated 4 days ago • 5
nfunctor/SuperNova-Medius-FP8-Dynamic • Text Generation • Updated 12 days ago • 59
novita-ai/dolphin-2.9.2-qwen2-72b-fp8 • Updated 11 days ago • 47
soprasteria/Lucie-7B-Instruct-FP8-KV • Updated 11 days ago
soprasteria/Mistral-Nemo-Instruct-2407-FP8-KV • Updated 11 days ago • 70
neuralmagic/granite-3.1-2b-base-quantized.w8a8 • Text Generation • Updated 2 days ago • 15
neuralmagic/granite-3.1-2b-base-quantized.w4a16 • Text Generation • Updated 2 days ago • 22
neuralmagic/granite-3.1-2b-base-FP8-dynamic • Text Generation • Updated 2 days ago • 14
neuralmagic/granite-3.1-8b-base-quantized.w8a8 • Text Generation • Updated 2 days ago • 7
neuralmagic/granite-3.1-8b-base-quantized.w4a16 • Text Generation • Updated 2 days ago • 22
stan-hua/Qwen2.5-0.5B-Instruct-LC-RTN-W4A16 • Updated 11 days ago • 3
stan-hua/Qwen2.5-0.5B-Instruct-LC-RTN-W8A8 • Updated 11 days ago • 3