---
base_model:
- SicariusSicariiStuff/Wingless_Imp_8B
---
This is a converted weight of the [Wingless_Imp_8B](https://huggingface.co./SicariusSicariiStuff/Wingless_Imp_8B) model in [unsloth 4-bit dynamic quant](https://archive.is/EFz7P) format, produced with this [Colab notebook](https://colab.research.google.com/drive/1P23C66j3ga49kBRnDNlmRce7R_l_-L5l?usp=sharing).
## About this Conversion
This conversion uses **Unsloth** to load the model in **4-bit** format and force-save it in the same **4-bit** format.
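A minimal sketch of that load-and-force-save flow, assuming Unsloth's `FastLanguageModel` API and its `merged_4bit_forced` save method (the output directory name here is illustrative). It requires a CUDA GPU and downloads the full model, so treat it as a guide rather than a drop-in script:

```python
# Illustrative sketch, assuming the Unsloth FastLanguageModel API.
# Requires a CUDA GPU and downloads ~8B parameters of weights.
from unsloth import FastLanguageModel

# Load the source model directly in 4-bit (bitsandbytes quantization).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="SicariusSicariiStuff/Wingless_Imp_8B",
    load_in_4bit=True,
)

# Force-save the weights in the same 4-bit format instead of
# dequantizing back to 16-bit on save.
model.save_pretrained_merged(
    "Wingless_Imp_8B-bnb-4bit",  # hypothetical output directory
    tokenizer,
    save_method="merged_4bit_forced",
)
```

The `merged_4bit_forced` option is the one Unsloth expects when you deliberately want the saved checkpoint to stay quantized.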
### How 4-bit Quantization Works
- The actual **4-bit quantization** is handled by **BitsAndBytes (bnb)**, which runs on top of **PyTorch** through its Hugging Face **Transformers** integration.
- **Unsloth** acts as a wrapper, simplifying and optimizing the process for better efficiency.
This allows for reduced memory usage and faster inference while keeping the model compact.
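To make the memory-saving idea concrete, here is a simplified sketch of absmax 4-bit quantization in plain Python. Real bitsandbytes quantization uses an NF4 lookup table and block-wise scales, so this is only the general principle, not the bnb algorithm:

```python
# Simplified absmax 4-bit quantization: each float is stored as a signed
# 4-bit integer in [-7, 7] plus one shared float scale per block.
# (bitsandbytes actually uses NF4 codebooks with block-wise scaling.)

def quantize_4bit(weights):
    """Map floats to signed 4-bit codes plus a single scale factor."""
    scale = max(abs(w) for w in weights) / 7.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize_4bit(codes, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [c * scale for c in codes]

weights = [0.12, -0.5, 0.33, 0.07, -0.21]
codes, scale = quantize_4bit(weights)
approx = dequantize_4bit(codes, scale)

# Every code fits in 4 bits, and the rounding error is at most scale/2.
assert all(-7 <= c <= 7 for c in codes)
```

Storing a 4-bit code instead of a 16-bit float for each weight is where the roughly 4x memory reduction comes from, at the cost of a small per-weight rounding error.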