This is a converted weight of the L3-8B-Stheno-v3.2 model, produced as an Unsloth 4-bit dynamic quant using this Colab notebook.

About this Conversion

This conversion uses Unsloth to load the model in 4-bit precision and then force-save it in that same 4-bit format, so the quantized weights are stored directly in this repository; a minimal sketch of the flow follows.
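
This sketch assumes Unsloth's FastLanguageModel API; the base repository name and max_seq_length are illustrative assumptions, not values taken from this card.

```python
from unsloth import FastLanguageModel

# Load the base model with bitsandbytes 4-bit quantization applied at load time.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Sao10K/L3-8B-Stheno-v3.2",  # assumed base repo
    max_seq_length = 4096,                    # assumed; adjust as needed
    load_in_4bit = True,
)

# Force-save the weights in the same 4-bit format instead of
# dequantizing back to 16-bit before writing the safetensors.
model.save_pretrained_merged(
    "L3-8B-Stheno-v3.2-bnb-4bit",
    tokenizer,
    save_method = "merged_4bit_forced",
)
```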

How 4-bit Quantization Works

  • The actual 4-bit quantization is handled by bitsandbytes (bnb), one of the quantization backends (alongside, e.g., AutoGPTQ) that run under PyTorch; see the loading sketch below.
  • Unsloth acts as a wrapper around bnb, simplifying the workflow and adding its own optimizations for better efficiency.

This reduces memory usage and speeds up inference while keeping the model compact.
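
To show what bnb does underneath, here is a minimal sketch of loading the same base model with on-the-fly 4-bit quantization via plain Transformers. The BitsAndBytesConfig values shown (NF4, double quantization, bf16 compute) are common defaults used as assumptions here, not settings read from this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# bnb quantizes the linear-layer weights to 4-bit as the model loads;
# NF4 with nested (double) quantization is the usual setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumed quant type
    bnb_4bit_use_double_quant=True,        # assumed
    bnb_4bit_compute_dtype=torch.bfloat16, # assumed compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "Sao10K/L3-8B-Stheno-v3.2",  # assumed base repo
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Sao10K/L3-8B-Stheno-v3.2")
```

Because this repository already stores the weights in 4-bit, loading it directly with from_pretrained should pick up the baked-in quantization config without passing one explicitly.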
