Requesting Support for GGUF Quantization of MiniMax-Text-01 through llama.cpp
Dear MiniMax Team,
I would like to request support for GGUF quantization through the llama.cpp library, as this would allow more users to run your new model.
The repo for llama.cpp can be found here: https://github.com/ggerganov/llama.cpp.
Thank you for considering this request.
Thank you for your suggestion. We are currently working on supporting our model on vLLM. We are also considering support for additional open-source frameworks. If there are any new developments, we will keep you informed.
@Doctor-Chad-PhD You can try my branch: https://github.com/fairydreaming/llama.cpp/tree/minimax-text-01
Note that it's still a work in progress; it currently only supports inference of a single token sequence.
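For anyone who wants to try the branch, here is a sketch of the standard llama.cpp convert-and-quantize workflow, assuming the branch follows it. The local model path, output filenames, and the `Q4_K_M` quantization type are placeholders; adjust them to your setup.

```shell
# Clone the work-in-progress branch (hypothetical workflow sketch).
git clone -b minimax-text-01 https://github.com/fairydreaming/llama.cpp
cd llama.cpp

# Convert the HF checkpoint to an F16 GGUF file.
# /path/to/MiniMax-Text-01 is a placeholder for your local model directory.
pip install -r requirements.txt
python convert_hf_to_gguf.py /path/to/MiniMax-Text-01 \
    --outfile minimax-text-01-f16.gguf --outtype f16

# Build the llama.cpp binaries.
cmake -B build && cmake --build build --config Release

# Quantize (Q4_K_M chosen as an example) and run a quick prompt.
./build/bin/llama-quantize minimax-text-01-f16.gguf \
    minimax-text-01-Q4_K_M.gguf Q4_K_M
./build/bin/llama-cli -m minimax-text-01-Q4_K_M.gguf -p "Hello"
```

This is only a sketch of the usual upstream workflow; since the branch is a work in progress, some steps may differ.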
Also, please add the model to ollama.com (with a couple of smaller variants as well). Thank you!
Thank you @sszymczyk and @MiniMax-AI.