Can you post the script that was used to quantize this model please?

#2 opened by ctranslate2-4you

Can you post the script that was used to quantize this model please?

Unsloth AI org

Can you post the script that was used to quantize this model please?

You can quantize automatically using our Unsloth GitHub package


Can you please add documentation on -HOW- to do that to your website? There is nothing about this on your docs site: https://docs.unsloth.ai/
This was brought up on GitHub, and the response was effectively "go to Discord": https://github.com/unslothai/unsloth/issues/972

None of the process is properly described anywhere in your docs. Some of your quantized models are a single safetensors file; others are two or more.
Why is that the case? We don't know, because there is no explanation of what you are doing, how you are doing it, or why you are doing it that way.

Unsloth AI org


If you use Unsloth you can quantize models by saving directly to 4-bit. We will add a section on this to the docs, but it is already shown in our Google Colab notebooks.
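In code, the direct-to-4-bit save path looks roughly like this. This is a minimal sketch assuming the Unsloth API as used in the Colab notebooks; the model name, output directory, and quantization method below are placeholder choices, not the exact script used to produce this repo's files.

```python
# Hedged sketch: requires a CUDA GPU and the `unsloth` package. The model
# name, output directory, and quant method are illustrative placeholders.

# GGUF quantization methods commonly offered in the Unsloth notebooks.
QUANT_METHODS = ["q4_k_m", "q5_k_m", "q8_0", "f16"]

def quantize(model_name: str = "unsloth/Llama-3.2-1B-Instruct",
             out_dir: str = "model-4bit",
             gguf_method: str = "q4_k_m") -> None:
    from unsloth import FastLanguageModel  # heavy import, GPU required

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=2048,
        load_in_4bit=True,  # load weights as 4-bit on the way in
    )
    # Save a merged 4-bit checkpoint (safetensors shards):
    model.save_pretrained_merged(out_dir, tokenizer, save_method="merged_4bit")
    # Or export a llama.cpp GGUF quant instead:
    model.save_pretrained_gguf(out_dir, tokenizer, quantization_method=gguf_method)

if __name__ == "__main__":
    try:
        quantize()
    except ImportError:
        print("unsloth is not installed; see https://github.com/unslothai/unsloth")
```

The same flow is what the Colab notebooks walk through interactively.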

Some quantized models are a single safetensors file and others are two or more because a large checkpoint would be too big to download as one file, so it is divided into shards. This is what Hugging Face does by default as well.
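The sharding behaviour can be sketched in a few lines. Hugging Face's `save_pretrained` splits weights into shards once the checkpoint exceeds `max_shard_size` (by default around 5GB); the greedy loop below mimics that logic on toy sizes, so the names and numbers are illustrative only.

```python
# Hedged sketch of checkpoint sharding: pack tensors into shards no larger
# than max_shard_size, which is why small models are one file and large
# ones are several. Sizes here are toy numbers, not real byte counts.

def shard(sizes: dict[str, int], max_shard_size: int) -> list[dict[str, int]]:
    """Greedily pack named tensor sizes into shards under max_shard_size."""
    shards, current, current_size = [], {}, 0
    for name, size in sizes.items():
        if current and current_size + size > max_shard_size:
            shards.append(current)      # close the full shard
            current, current_size = {}, 0
        current[name] = size
        current_size += size
    if current:
        shards.append(current)
    return shards

# A small model fits in one file; a bigger one is split into portions.
small = shard({"embed": 2, "layer0": 3}, max_shard_size=10)
big = shard({"embed": 4, "layer0": 4, "layer1": 4, "lm_head": 4}, max_shard_size=10)
print(len(small))  # 1 -> a single safetensors file
print(len(big))    # 2 -> model-00001-of-00002.safetensors, etc.
```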

Unsloth AI org

For example, see our Google Colab notebook for Llama 3.2, which lets you quantize your model: https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9?usp=sharing

Thanks!

@BallisticAI did you figure out how to load the model and get it working?
