The small DeepSeek ones:
(tiny original)
https://huggingface.co./deepseek-ai/deepseek-vl2-tiny
(8bit)
https://huggingface.co./mlx-community/deepseek-vl2-tiny-8bit/tree/main
and
(small original)
https://huggingface.co./deepseek-ai/deepseek-vl2-small/tree/main
(8bit)
https://huggingface.co./mlx-community/deepseek-vl2-small-8bit/tree/main
MLX also provides 4-bit versions of both. There is also V3:
https://huggingface.co./mlx-community/DeepSeek-V3-4bit/tree/main
I think even the 4-bit V3 is still too big for normal users ^^
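For scale, a rough back-of-the-envelope (assuming the commonly cited 671B total parameters for DeepSeek-V3; real files are somewhat larger because of group-wise scales and unquantized layers):

```python
# Rough weight-size estimate for DeepSeek-V3 at 4-bit quantization.
params = 671e9          # commonly cited total parameter count (assumption)
bits_per_weight = 4
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.0f} GB of weights alone")  # ~336 GB
```

So even at 4 bits, the weights alone land in the ~340 GB range, well beyond normal desktop hardware.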
I queued deepseek-ai/deepseek-vl2-tiny and deepseek-ai/deepseek-vl2-small. The others will not work, as you can't GGUF-quant an already quantized model.
You can check the progress on http://hf.tst.eu/status.html
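(The mlx-community repos record their quantization settings in config.json, which is one quick way to tell such a model apart from an original-precision one. A minimal sketch, assuming the usual quantization/quantization_config keys:)

```python
import json
from huggingface_hub import hf_hub_download

def is_already_quantized(repo_id: str) -> bool:
    """Heuristic: MLX and transformers quantized repos record their
    settings in config.json under one of these keys (assumption)."""
    path = hf_hub_download(repo_id, "config.json")
    with open(path) as f:
        cfg = json.load(f)
    return any(k in cfg for k in ("quantization", "quantization_config"))

print(is_already_quantized("mlx-community/deepseek-vl2-tiny-8bit"))  # True
print(is_already_quantized("deepseek-ai/deepseek-vl2-tiny"))         # False
```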
@mradermacher I don't think it actually got submitted. It mentioned something about "no architectures entry", but checking https://huggingface.co./deepseek-ai/deepseek-vl2-tiny/blob/main/config.json and https://huggingface.co./deepseek-ai/deepseek-vl2-small/blob/main/config.json shows the architecture is DeepseekV2ForCausalLM, which should be supported by llama.cpp if I remember correctly. I assume this happens because it is a vision model and so not supported by llama.cpp, despite the text part using DeepseekV2ForCausalLM.
nico1 ~# llmc add -2007 si https://huggingface.co./deepseek-ai/deepseek-vl2-tiny
submit tokens: ["-2007","static","imatrix","https://huggingface.co./deepseek-ai/deepseek-vl2-tiny"]
https://huggingface.co./deepseek-ai/deepseek-vl2-tiny
deepseek-ai/deepseek-vl2-tiny: no architectures entry ()
nico1 ~# llmc add -2007 si https://huggingface.co./deepseek-ai/deepseek-vl2-small
submit tokens: ["-2007","static","imatrix","https://huggingface.co./deepseek-ai/deepseek-vl2-small"]
https://huggingface.co./deepseek-ai/deepseek-vl2-small
deepseek-ai/deepseek-vl2-small: no architectures entry ()
Why are vision models (MLLMs) like this not supported by llama.cpp? Is that still in development? There are so many of them ^^
"no architectures entry'" literally means the architectures
key is missing form config.js. This can happen when huggingface fails to deliver the file (happens rfarely) or a network problem (happens very rarely). And llmc add should not run in the dsandbox (which has no network), or does it...
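If one wanted to rule out a transient delivery problem before resubmitting, a hedged sketch of such a check (hypothetical, not how the queue actually fetches):

```python
import json
import time
import urllib.request

def fetch_config(repo_id: str, retries: int = 3) -> dict:
    """Fetch config.json from the hub, retrying on transient network errors."""
    url = f"https://huggingface.co./{repo_id}/raw/main/config.json"
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=30) as r:
                return json.load(r)
        except OSError:
            time.sleep(2 ** attempt)  # simple exponential backoff
    raise RuntimeError(f"could not fetch config.json for {repo_id}")

print("architectures" in fetch_config("deepseek-ai/deepseek-vl2-tiny"))
```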
btw. the nice level should normally be exactly -2000 (== it is fine to experiment); otherwise it will get absolute priority over other user-requested models.
must have been a network/hf problem, because it seems to work now (the submission part)
Ok, not sure why the submission part worked (have to check that), but the config.json clearly has no architectures key:
model_architecture = hparams["architectures"][0]
~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'architectures'
It has a language_config or so key which has an architectures key (and contains something that looks like a transformers config.json). In any case, it's not supported by llama.cpp in this form and would need some form of doctoring. Maybe it's as easy as replacing config.json with the contents of the language_config...
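If it really were that easy, the doctoring might look like this (a sketch only; the language_config key name is as observed above, and the extra tensors below show it is not sufficient by itself):

```python
import json
import shutil

# Promote the nested language-model sub-config to be the whole config.json
# (keeping a backup), so the converter finds a top-level architectures key.
with open("config.json") as f:
    cfg = json.load(f)

lang = cfg["language_config"]           # nested transformers-style config
assert "architectures" in lang
shutil.copy("config.json", "config.json.orig")
with open("config.json", "w") as f:
    json.dump(lang, f, indent=2)
```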
ValueError: Can not map tensor 'image_newline'
No, it's not so easy. At the very least, somehow the extra tensors would have to be ignored/removed.
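Ignoring them might look roughly like this, per shard (a sketch with safetensors; apart from image_newline, which appears in the error above, the vision/projector prefixes are guesses that would need checking against the actual checkpoint):

```python
from safetensors.torch import load_file, save_file

# Keep only language-model weights; drop vision-side tensors.
tensors = load_file("model.safetensors")
keep = {
    name: t for name, t in tensors.items()
    if name != "image_newline"
    and not name.startswith(("vision", "projector"))  # guessed prefixes
}
save_file(keep, "model.language-only.safetensors")
```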
Seems that it works with MLX (never tried it):
https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md
as the creator of the Hugging Face repo did:
https://huggingface.co./mlx-community/deepseek-vl2-small-8bit/tree/main
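For reference, the text-generation flow from that README looks like this (untested here; a vision model like deepseek-vl2 presumably needs the separate mlx-vlm package rather than plain mlx-lm):

```python
from mlx_lm import load, generate

# Load an MLX-quantized model from the Hub and generate text.
model, tokenizer = load("mlx-community/deepseek-vl2-small-8bit")
print(generate(model, tokenizer, prompt="Hello", max_tokens=64))
```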