This is a converted weight of the L3-8B-Stheno-v3.2 model, produced as an Unsloth 4-bit dynamic quant using this Colab notebook.

About this Conversion

This conversion uses Unsloth to load the model in 4-bit precision and then force-save it in that same 4-bit format, so the quantized weights are stored directly in this repository; a minimal sketch of the flow follows.
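
This sketch assumes Unsloth's FastLanguageModel API; the base repository name and max_seq_length are illustrative assumptions, not values taken from this card.

```python
from unsloth import FastLanguageModel

# Load the base model with bitsandbytes 4-bit quantization applied at load time.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Sao10K/L3-8B-Stheno-v3.2",  # assumed base repo
    max_seq_length = 4096,                    # assumed; adjust as needed
    load_in_4bit = True,
)

# Force-save the weights in the same 4-bit format instead of
# dequantizing back to 16-bit before writing the safetensors.
model.save_pretrained_merged(
    "L3-8B-Stheno-v3.2-bnb-4bit",
    tokenizer,
    save_method = "merged_4bit_forced",
)
```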

How 4-bit Quantization Works

  • The actual 4-bit quantization is handled by bitsandbytes (bnb), one of the quantization backends (alongside, e.g., AutoGPTQ) that run under PyTorch; see the loading sketch below.
  • Unsloth acts as a wrapper around bnb, simplifying the workflow and adding its own optimizations for better efficiency.

This reduces memory usage and speeds up inference while keeping the model compact.
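
To show what bnb does underneath, here is a minimal sketch of loading the same base model with on-the-fly 4-bit quantization via plain Transformers. The BitsAndBytesConfig values shown (NF4, double quantization, bf16 compute) are common defaults used as assumptions here, not settings read from this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# bnb quantizes the linear-layer weights to 4-bit as the model loads;
# NF4 with nested (double) quantization is the usual setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumed quant type
    bnb_4bit_use_double_quant=True,        # assumed
    bnb_4bit_compute_dtype=torch.bfloat16, # assumed compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "Sao10K/L3-8B-Stheno-v3.2",  # assumed base repo
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Sao10K/L3-8B-Stheno-v3.2")
```

Because this repository already stores the weights in 4-bit, loading it directly with from_pretrained should pick up the baked-in quantization config without passing one explicitly.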
