---
language:
- ru
base_model:
- t-tech/T-lite-it-1.0
tags:
- llama-cpp
---

# T-lite-it-1.0-Q8_0-GGUF

**🚨 T-lite is designed for further fine-tuning and is not intended as a ready-to-use conversational assistant. Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.**

## Description

This repository contains the [`T-lite-it-1.0`](https://huggingface.co./t-tech/T-lite-it-1.0/) model, quantized to the GGUF format with [`llama.cpp`](https://github.com/ggerganov/llama.cpp).

## 📊 Benchmarks

Detailed evaluation results for the original model can be found in our [Habr post](https://habr.com/ru/companies/tbank/articles/865582/).

| Benchmark     |     T-lite-it-1.0     |  T-lite-it-1.0-Q8_0  |
|---------------|:---------------------:|:--------------------:|
| Arena-Hard-Ru | **64.38** (-2.1, 2.5) |  64.21 (-2.2, 2.7)   |

## Llama.cpp usage

### Server

From HF:

```bash
llama-server --hf-repo t-tech/T-lite-it-1.0-Q8_0-GGUF --hf-file t-lite-it-1.0-q8_0.gguf -c 8192
```

Or locally:

```bash
./build/bin/llama-server -m t-lite-it-1.0-q8_0.gguf -c 8192
```

### POST

```bash
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "<|im_start|>user\nРасскажи мне чем отличается Python от C++?\n<|im_end|>\n<|im_start|>assistant\n",
        "n_predict": 256
    }'
```

## ollama usage

### Serve

```bash
ollama serve
```

### Run

From HF:

```bash
ollama run hf.co/t-tech/T-lite-it-1.0-Q8_0-GGUF:Q8_0 "Расскажи мне про отличия C++ и Python"
```

Or locally:

```bash
ollama create example -f Modelfile
ollama run example "Расскажи мне про отличия C++ и Python"
```

where `Modelfile` is

```bash
FROM ./t-lite-it-1.0-q8_0.gguf
```
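
## Calling llama-server from Python

The `curl` request from the llama.cpp section can also be sent from a script. The block below is a minimal sketch, not part of the original model card. It assumes `llama-server` is already running on `http://localhost:8080` as in the server commands above, and it reuses the same prompt layout and `n_predict` field as the `curl` example. The helper names (`build_prompt`, `build_payload`, `complete`) are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumption: llama-server listens on its default local port, as in the curl example.
SERVER_URL = "http://localhost:8080/completion"


def build_prompt(user_message: str) -> str:
    """Wrap one user turn in the same template the curl example uses."""
    return (
        "<|im_start|>user\n"
        f"{user_message}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def build_payload(user_message: str, n_predict: int = 256) -> dict:
    """Build the JSON body for the /completion endpoint."""
    return {"prompt": build_prompt(user_message), "n_predict": n_predict}


def complete(user_message: str, n_predict: int = 256, url: str = SERVER_URL) -> str:
    """POST the prompt to llama-server and return the generated text."""
    body = json.dumps(build_payload(user_message, n_predict)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # The /completion response carries the generated text in the "content" field.
    return result["content"]
```

With the server running, `complete("Расскажи мне чем отличается Python от C++?")` should return the model's answer as a string. Keeping prompt construction separate from the HTTP call makes it easy to swap in a different template if the one shown here does not suit your fine-tuned variant.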