mradermacher/Vikhr-Nemo-dostoevsky-saiga-12b-i1-GGUF

It uses the same standard dataset as all our other models, which contains some, but very little, Cyrillic text. We don't know how this affects the quants, but it would be prudent to assume some cost to the Russian language capabilities, so you might want to go for the static quants instead. We'd be happy if somebody made some objective measurements of this, by the way, because as far as I can see nobody knows how big the effect is.

You can check which set was used (at least for quants made in the last few months), because every quant contains the filename of its imatrix training data (the Hugging Face quant browser should be able to show it). Our current standard set is called "imatrix-training-full-3".
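If you'd rather check a downloaded file locally than use the browser, here is a minimal sketch that reads the GGUF header with only the Python standard library and prints the imatrix-related entries. It assumes a GGUF v2 or v3 file and assumes the information is stored under keys starting with "quantize.imatrix." (for example "quantize.imatrix.dataset"), which is what recent llama.cpp quantize builds write; older quants may not have these keys at all. The function names are made up for this example.

```python
import struct

# GGUF scalar value type ids mapped to little-endian struct formats.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one fixed-size value from the binary stream."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of GGUF header")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f, vtype):
    """Read a metadata value of the given GGUF type id."""
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(f):
    """Return all metadata key/value pairs from an open binary GGUF stream."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses a different header layout")
    _read(f, "<Q")  # tensor count, not needed here
    kv_count = _read(f, "<Q")
    meta = {}
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _read(f, "<I")
        meta[key] = _read_value(f, vtype)
    return meta


def imatrix_info(f):
    """Keep only the entries under the assumed 'quantize.imatrix.' prefix."""
    prefix = "quantize.imatrix."
    return {k: v for k, v in read_gguf_metadata(f).items() if k.startswith(prefix)}
```

To use it, open a quant in binary mode and pass the file object, e.g. `imatrix_info(open(path, "rb"))` with `path` pointing at one of the downloaded .gguf files. An empty result means the quant either predates this metadata or is a static quant.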

calibration dataset language