deepseek small ones

#636
by kalle07 - opened

I queued deepseek-ai/deepseek-vl2-tiny and deepseek-ai/deepseek-vl2-small. The others will not work as you can't GGUF quant an already quantized model.

You can check the progress on http://hf.tst.eu/status.html

@mradermacher I don't think it actually got submitted. It mentioned something about "no architectures entry", but checking https://huggingface.co./deepseek-ai/deepseek-vl2-tiny/blob/main/config.json and https://huggingface.co./deepseek-ai/deepseek-vl2-small/blob/main/config.json shows the architecture is DeepseekV2ForCausalLM, which should be supported by llama.cpp if I remember correctly. I assume this happens because it is a vision model and so not supported by llama.cpp, despite the text part using DeepseekV2ForCausalLM.

nico1 ~# llmc add -2007 si https://huggingface.co./deepseek-ai/deepseek-vl2-tiny
submit tokens: ["-2007","static","imatrix","https://huggingface.co./deepseek-ai/deepseek-vl2-tiny"]
https://huggingface.co./deepseek-ai/deepseek-vl2-tiny
deepseek-ai/deepseek-vl2-tiny: no architectures entry ()

nico1 ~# llmc add -2007 si https://huggingface.co./deepseek-ai/deepseek-vl2-small
submit tokens: ["-2007","static","imatrix","https://huggingface.co./deepseek-ai/deepseek-vl2-small"]
https://huggingface.co./deepseek-ai/deepseek-vl2-small
deepseek-ai/deepseek-vl2-small: no architectures entry ()

Why are vision models like MLLMs not supported by llama.cpp? Is that still in development? It would add so much ^^

"no architectures entry" literally means the architectures key is missing from config.json. This can happen when huggingface fails to deliver the file (happens rarely) or on a network problem (happens very rarely). And llmc add should not run in the sandbox (which has no network), or does it...
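For illustration, the failing check boils down to parsing config.json and indexing its architectures key, as in the traceback further down. A minimal stdlib sketch (the function name and sample dict are mine, not from llama.cpp):

```python
def model_architecture(hparams: dict) -> str:
    """Return the first `architectures` entry from a parsed config.json,
    mirroring the converter line that raises KeyError: 'architectures'."""
    archs = hparams.get("architectures")
    if not archs:
        # this is the condition behind the "no architectures entry" message
        raise KeyError("architectures")
    return archs[0]

# A vl2-style config has no top-level architectures key,
# while its nested language config does (illustrative data):
vl2_top = {
    "model_type": "deepseek_vl_v2",
    "language_config": {"architectures": ["DeepseekV2ForCausalLM"]},
}
```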

btw. the nice level should be exactly -2000 normally (== it is fine to experiment), otherwise it will get absolute priority over other user-requested models.

must have been a network/hf problem, because it seems to work now (the submission part)

Ok, not sure why the submission part worked (have to check that), but the config.json clearly has no architectures key:

    model_architecture = hparams["architectures"][0]
                         ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'architectures'

It has a language_config (or similarly named) key which itself has an architectures key (and contains something that looks like a transformers config.json). In any case it's not supported by llama.cpp in this form and would need some form of doctoring. Maybe it's as easy as replacing config.json with the contents of the language_config...
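A minimal sketch of that doctoring, assuming the nested key is literally named language_config (key name and function are assumptions, not verified against the real file; back up the original first):

```python
import json
from pathlib import Path

def promote_language_config(config_path: str) -> dict:
    """Overwrite config.json with its nested language_config section so a
    converter looking for a top-level `architectures` key finds one.
    Purely illustrative."""
    path = Path(config_path)
    top = json.loads(path.read_text())
    lang = top["language_config"]  # assumed key name, see above
    path.write_text(json.dumps(lang, indent=2))
    return lang
```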

ValueError: Can not map tensor 'image_newline'

No, it's not so easy. At the very least, somehow the extra tensors would have to be ignored/removed.
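If one went further, the unmappable tensors would have to be stripped from the checkpoint before conversion. A hedged sketch of such a filter over a name-to-tensor mapping (only image_newline is confirmed by the ValueError above; the other prefixes are guesses):

```python
# Tensor names assumed to belong to the vision side; only
# 'image_newline' is confirmed by the ValueError above.
VISION_PREFIXES = ("image_newline", "vision.", "projector.")

def drop_vision_tensors(state_dict: dict) -> dict:
    """Return a copy of a name->tensor mapping without the entries a
    text-only DeepseekV2 graph cannot map."""
    return {name: t for name, t in state_dict.items()
            if not name.startswith(VISION_PREFIXES)}
```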

mradermacher changed discussion status to closed
