SmolLM-135M-Instruct
Original Model
HuggingFaceTB/SmolLM-135M-Instruct
Run with LlamaEdge
LlamaEdge version: v0.12.5 and above
Prompt template
Prompt type: chatml
Prompt string
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
Context size: 2048
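To make the chatml layout concrete, here is a minimal sketch that fills the prompt string with a system message and a user prompt. The function name `render_chatml` is hypothetical and exists only for illustration; LlamaEdge applies the template itself when `--prompt-template chatml` is set, so this is only a way to inspect what the model receives.

```shell
# Hypothetical helper (not part of LlamaEdge): fill the chatml template.
#   $1 = system message, $2 = user prompt
render_chatml() {
  # Each turn opens with <|im_start|>ROLE and closes with <|im_end|>.
  # The final assistant turn is left open for the model to complete.
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' \
    "$1" "$2"
}

render_chatml "You are a helpful assistant." "What is the capital of France?"
```

The rendered text, including both messages, has to fit within the 2048-token context size together with the generated reply.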
Run as LlamaEdge service
wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM-135M-Instruct-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template chatml \
  --ctx-size 2048 \
  --model-name SmolLM-135M-Instruct
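Once the service is running, it can be queried over HTTP. The sketch below assumes the server listens on `localhost:8080` and exposes an OpenAI-compatible `/v1/chat/completions` endpoint; neither detail is stated in this card, so adjust `API_URL` to match your setup. The helper names `chat_payload` and `ask` are hypothetical.

```shell
# Assumption: default address and OpenAI-compatible endpoint of llama-api-server.
API_URL="${API_URL:-http://localhost:8080/v1/chat/completions}"

# Build the JSON request body. The "model" value matches --model-name.
# This naive builder does not escape quotes or backslashes in its inputs.
#   $1 = system message, $2 = user prompt
chat_payload() {
  printf '{"model":"SmolLM-135M-Instruct","messages":[{"role":"system","content":"%s"},{"role":"user","content":"%s"}]}' \
    "$1" "$2"
}

# Send the request with curl and print the raw JSON response.
ask() {
  curl -s -X POST "$API_URL" \
    -H 'Content-Type: application/json' \
    -d "$(chat_payload "$1" "$2")"
}

# Usage (requires the service to be running):
#   ask "You are a helpful assistant." "What is the capital of France?"
```

For inputs that may contain quotes or newlines, a JSON-aware tool such as `jq` is a safer way to build the payload.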
Run as LlamaEdge command app
wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM-135M-Instruct-Q5_K_M.gguf \
  llama-chat.wasm \
  --prompt-template chatml \
  --ctx-size 2048
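Both commands map the current directory into the sandbox with `--dir .:.`, so the GGUF file and the `.wasm` app need to be present there first. The following sketch shows one way to fetch them. The download locations are assumptions: it presumes the GGUF files are hosted in the `second-state/SmolLM-135M-Instruct-GGUF` repository on Hugging Face and that the apps are attached to the latest LlamaEdge release on GitHub. The helper names are hypothetical.

```shell
# Assumed locations; verify them before relying on this script.
HF_REPO="second-state/SmolLM-135M-Instruct-GGUF"
LLAMAEDGE_RELEASE="https://github.com/LlamaEdge/LlamaEdge/releases/latest/download"

# Print the download URL for one quantized file, e.g. model_url Q5_K_M
model_url() {
  printf 'https://huggingface.co/%s/resolve/main/SmolLM-135M-Instruct-%s.gguf\n' \
    "$HF_REPO" "$1"
}

# Fetch the chosen model and both apps into the current directory.
fetch_all() {
  curl -LO "$(model_url "$1")"
  curl -LO "$LLAMAEDGE_RELEASE/llama-api-server.wasm"
  curl -LO "$LLAMAEDGE_RELEASE/llama-chat.wasm"
}

# Usage: fetch_all Q5_K_M
```

If a different quantization is downloaded, the file name passed to `--nn-preload` has to change accordingly.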
Quantized GGUF Models
Name | Quant method | Bits | Size | Use case |
---|---|---|---|---|
SmolLM-135M-Instruct-Q2_K.gguf | Q2_K | 2 | 88.2 MB | smallest, significant quality loss - not recommended for most purposes |
SmolLM-135M-Instruct-Q3_K_L.gguf | Q3_K_L | 3 | 97.5 MB | small, substantial quality loss |
SmolLM-135M-Instruct-Q3_K_M.gguf | Q3_K_M | 3 | 93.5 MB | very small, high quality loss |
SmolLM-135M-Instruct-Q3_K_S.gguf | Q3_K_S | 3 | 88.2 MB | very small, high quality loss |
SmolLM-135M-Instruct-Q4_0.gguf | Q4_0 | 4 | 91.7 MB | legacy; small, very high quality loss - prefer using Q3_K_M |
SmolLM-135M-Instruct-Q4_K_M.gguf | Q4_K_M | 4 | 105 MB | medium, balanced quality - recommended |
SmolLM-135M-Instruct-Q4_K_S.gguf | Q4_K_S | 4 | 102 MB | small, greater quality loss |
SmolLM-135M-Instruct-Q5_0.gguf | Q5_0 | 5 | 105 MB | legacy; medium, balanced quality - prefer using Q4_K_M |
SmolLM-135M-Instruct-Q5_K_M.gguf | Q5_K_M | 5 | 112 MB | large, very low quality loss - recommended |
SmolLM-135M-Instruct-Q5_K_S.gguf | Q5_K_S | 5 | 110 MB | large, low quality loss - recommended |
SmolLM-135M-Instruct-Q6_K.gguf | Q6_K | 6 | 138 MB | very large, extremely low quality loss |
SmolLM-135M-Instruct-Q8_0.gguf | Q8_0 | 8 | 145 MB | very large, extremely low quality loss - not recommended |
SmolLM-135M-Instruct-f16.gguf | f16 | 16 | 271 MB | |
Quantized with llama.cpp b3445.
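As a rough aid for choosing a file, the sketch below picks the largest K-quant or Q8_0 file that fits a given size budget, on the assumption that a larger file in the table generally means lower quality loss. The sizes are copied from the table; the legacy Q4_0 and Q5_0 rows and the f16 row are left out. The function name `pick_quant` is hypothetical.

```shell
# File sizes in MB, copied from the table (legacy and f16 rows omitted).
QUANTS="Q2_K:88.2 Q3_K_S:88.2 Q3_K_M:93.5 Q3_K_L:97.5 Q4_K_S:102 Q4_K_M:105 Q5_K_S:110 Q5_K_M:112 Q6_K:138 Q8_0:145"

# Print the largest quantization whose file fits within the budget ($1, in MB).
# Returns a non-zero status and prints nothing if no file fits.
pick_quant() {
  printf '%s\n' $QUANTS | awk -F: -v budget="$1" '
    $2 <= budget && $2 > best { best = $2; name = $1 }
    END { if (name) print name; else exit 1 }'
}

pick_quant 106   # prints Q4_K_M
```

Size is only a proxy here; the "Use case" column remains the better guide when two files are close in size.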
Model tree for second-state/SmolLM-135M-Instruct-GGUF
Base model
HuggingFaceTB/SmolLM-135M
Quantized
HuggingFaceTB/SmolLM-135M-Instruct